I've actually always used C++ for embedded (AVR8 / STM32 / PIC32 / automotive PowerPC) and for cluster/HPC stuff. It can blow your foot off if you don't know what you are doing, but it can be much nicer than C with very little extra overhead.
Most of the slow features are optional. My favorite is when old guard C programmers gripe about C++ constructors being slow... not only are they optional, but every real C library I have seen has an init function to set the struct members, or a memset() call to zero them. Not really a difference then.
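To make the constructor point concrete, here is a rough sketch. The UART config struct and its field names are made up for illustration and do not come from any particular library. Both versions end up doing the same handful of stores; the C++ one just can't be forgotten.

```cpp
#include <cstdint>
#include <cstring>

// C style: a plain struct plus an init function the caller must remember to call.
// (Hypothetical struct, for illustration only.)
struct UartConfigC {
    std::uint32_t baud;
    std::uint8_t data_bits;
    std::uint8_t stop_bits;
};

inline void uart_config_init(UartConfigC* cfg) {
    std::memset(cfg, 0, sizeof *cfg);  // zero everything first
    cfg->baud = 115200;
    cfg->data_bits = 8;
    cfg->stop_bits = 1;
}

// C++ style: default member initializers do the same stores on construction.
struct UartConfig {
    std::uint32_t baud = 115200;
    std::uint8_t data_bits = 8;
    std::uint8_t stop_bits = 1;
};
```

With optimization on, I would expect both to compile to roughly the same code, but check the disassembly on your own toolchain before taking my word for it.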
Things like virtual functions are not much slower than a C call through a function pointer: typically one extra load to fetch the vtable pointer, then the same indirect call, which is usually fast.
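A small side-by-side sketch of what I mean, with hypothetical names (CDevice, Device, FixedDevice). The C-style version stores the function pointer in the struct by hand; the C++ version lets the compiler manage the table.

```cpp
#include <cstdint>

// C-style "interface": a struct carrying its own function pointer.
struct CDevice {
    std::uint32_t (*read)(const CDevice* self);
    std::uint32_t value;
};

inline std::uint32_t c_read_impl(const CDevice* self) { return self->value; }

// C++ equivalent: the compiler builds and maintains the function table.
struct Device {
    virtual std::uint32_t read() const = 0;
    virtual ~Device() = default;
};

struct FixedDevice final : Device {
    explicit FixedDevice(std::uint32_t v) : value(v) {}
    std::uint32_t read() const override { return value; }
    std::uint32_t value;
};

// Both callers make one indirect call. On typical implementations the
// virtual version first loads the vtable pointer from the object.
inline std::uint32_t poll_c(const CDevice& d) { return d.read(&d); }
inline std::uint32_t poll_cpp(const Device& d) { return d.read(); }
```

If the compiler can see the concrete type (note the `final`), it may devirtualize the call entirely, which the hand-rolled function pointer usually doesn't get.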
Real-time and soft real-time are about provably bounding your execution time. That is a very different statement from saying C++ is slow. C++ is fine, just make sure you are using deterministic features, e.g. no heap, or only special heaps. Using C doesn't magically make you hard real-time safe.
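As an example of "no heap", here is a minimal fixed-capacity queue. StaticQueue is a name I made up for this sketch, not a standard container. The storage lives inside the object, so nothing is allocated at run time and each operation does a bounded amount of work (assuming copying T is itself bounded).

```cpp
#include <array>
#include <cstddef>

// Fixed-capacity FIFO: storage is part of the object, so the heap is never touched.
template <typename T, std::size_t Capacity>
class StaticQueue {
public:
    // Returns false instead of allocating when the queue is full.
    bool push(const T& item) {
        if (count_ == Capacity) return false;
        buffer_[(head_ + count_) % Capacity] = item;
        ++count_;
        return true;
    }

    // Returns false when there is nothing to pop.
    bool pop(T& out) {
        if (count_ == 0) return false;
        out = buffer_[head_];
        head_ = (head_ + 1) % Capacity;
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    std::array<T, Capacity> buffer_{};
    std::size_t head_ = 0;   // index of the oldest element
    std::size_t count_ = 0;  // number of stored elements
};
```

Something like `StaticQueue<int, 16> q;` can sit in static storage or on the stack. This is not interrupt-safe as written; sharing it between an ISR and the main loop needs extra care.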
C++ can generate code that is large in bytes, especially in debug mode, mostly through the bloat that templates create. No defense there.
Abstraction can make code that is slightly slower than no abstraction (as in, a few extra pointer indirections for a virtual function call), but having a set of peripheral drivers written against a base hardware abstraction class is really nice. It makes switching architectures, or even moving to e.g. Linux, simple. IMO abstraction isn't really optional. It is just a question of how you do it, and doing it will have a non-zero cost unless you use C macros to hard-compile in your HAL.
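Roughly what I mean by drivers written against a base hardware abstraction class. All the names here (GpioPin, Led, FakePin) are hypothetical, and a real HAL would have a lot more in it.

```cpp
#include <cstdint>

// Hypothetical hardware abstraction: the minimum a driver needs from a GPIO pin.
class GpioPin {
public:
    virtual void write(bool level) = 0;
    virtual bool read() const = 0;
    virtual ~GpioPin() = default;
};

// Driver written only against the abstraction; it never names a specific chip.
class Led {
public:
    explicit Led(GpioPin& pin) : pin_(pin) {}
    void on() { pin_.write(true); }
    void off() { pin_.write(false); }
    void toggle() { pin_.write(!pin_.read()); }

private:
    GpioPin& pin_;
};

// Host-side stand-in, e.g. for exercising the driver on a Linux box.
// A real port would touch the target's GPIO registers here instead.
class FakePin final : public GpioPin {
public:
    void write(bool level) override { level_ = level; ++writes_; }
    bool read() const override { return level_; }
    std::uint32_t writes() const { return writes_; }

private:
    bool level_ = false;
    std::uint32_t writes_ = 0;
};
```

Porting to a new architecture then means writing one new GpioPin subclass; Led and everything built on it stays untouched. The cost is the virtual call per pin access.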
For a contrived counterexample: on a desktop PC with -O3 -march=native turned on, I've seen C++ templated code run 6x faster than C because the compiler had more information to optimize with (alignment, etc.), so it used the perfect AVX instruction instead of the more general SSE code the C version used.
Template code can eat up all your L1 cache fast though. And all your flash. And take forever to compile. But it is very worth it to never have macros infest a program. It is a balance: depending on what you are doing, it might actually help L1 cache usage if you use the same templated function a lot.
C++11/C++17 added a lot of nice features. Templates + constexpr can make a lot of magic happen at compile time, and it is easy for the compiler to verify that they are functionally pure, so they tend to inline and optimize well, which generates nice machine code.
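One example of the compile-time magic: a CRC-8 lookup table built entirely by the compiler. This is a sketch assuming C++17; the function names and the choice of polynomial 0x07 are mine, picked for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// CRC-8 of a single byte for a given polynomial, evaluated bit by bit (MSB first).
template <std::uint8_t Poly>
constexpr std::uint8_t crc8_byte(std::uint8_t value) {
    for (int bit = 0; bit < 8; ++bit) {
        value = (value & 0x80u) ? static_cast<std::uint8_t>((value << 1) ^ Poly)
                                : static_cast<std::uint8_t>(value << 1);
    }
    return value;
}

// Build the 256-entry lookup table at compile time (needs C++17).
template <std::uint8_t Poly>
constexpr std::array<std::uint8_t, 256> make_crc8_table() {
    std::array<std::uint8_t, 256> table{};
    for (std::size_t i = 0; i < table.size(); ++i) {
        table[i] = crc8_byte<Poly>(static_cast<std::uint8_t>(i));
    }
    return table;
}

// A constant-initialized table: no startup code has to fill it in.
inline constexpr auto kCrc8Table = make_crc8_table<0x07>();

// The compiler checks these while building, not at run time.
static_assert(kCrc8Table[0] == 0x00, "CRC of a zero byte is zero");
static_assert(kCrc8Table[1] == 0x07, "a single set bit shifts out once and XORs in the polynomial");

// Table-driven CRC-8 with an initial value of 0 and no final XOR.
constexpr std::uint8_t crc8(const std::uint8_t* data, std::size_t len) {
    std::uint8_t crc = 0;
    for (std::size_t i = 0; i < len; ++i) {
        crc = kCrc8Table[static_cast<std::uint8_t>(crc ^ data[i])];
    }
    return crc;
}
```

The equivalent in C is usually a hand-pasted table or a generator script plus a macro. Here the polynomial is a template parameter, so a second table for a different polynomial is one more line, at the cost of another 256 bytes of flash.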