Higher-level languages with an actual runtime will have a hard time replacing truly low-level languages like C, because the engineering constraints are so different.
It is easy to miss how simple C really is. Almost all of its apparent complexity lives in the standard library, which is only provided in hosted environments; a freestanding environment is truly simple.
I have played with the notion of replacing C with something more appropriate for the tasks I have used it for (from microcontrollers to kernels to low-level libraries to applications), and I've come to realize that the features C is missing, or should replace with something better, are rather simple.
For example, we really need a notion of compile-time-only data structures, as a completely new concept. As a practical example, consider the I/O pin configuration on a microcontroller. We really do not care how the pins are initialized, just that they end up in a specific state. Order-ignoring foreach loops (like Fortran's DO CONCURRENT) are a small step in this direction: by decoupling the order in which iterations are done, they make it possible for a compiler to parallelize such loops easily. The fundamental idea is a way to represent final states, when intermediate states, the order of operations, and other side effects are unimportant.
Currently, C has two contexts: hosted and freestanding. Splitting these into facilities would make a lot of sense. For example, when using GCC on microcontrollers you actually use an interesting subset of C++. The Linux kernel uses a subset of C that excludes floating-point math unless special precautions are taken. And so on.
As of 2019, there are two approaches to synchronization primitives in hardware: load-link/store-conditional and compare-and-swap. If we consider synchronization primitives a facility, then LL/SC and CAS would be alternate implementations of that facility.
A lot of hardware can do atomic loads, stores, and even additions and subtractions. These should be exposed as a facility.
As you can see, the above points are really about the "standard library", and about modularizing it differently than now, in a way that better corresponds to the actual use cases we see across various types of software projects. Some of the facilities are provided by the user, some by the environment, and some by the compiler (consider e.g. the __udivdi3 helper when using GCC).
The language itself needs some new concepts: arrays with compile-time boundary checking, for example; arrays with run-time boundary checking; vector types. One important concept we need is splitting casting into conversion and reinterpretation. Conversion occurs when a value of one type is converted to the same, or as similar a value as possible, in the other type; reinterpretation occurs when the storage bit pattern is reused as a different type of the same size.
I do not believe atomic types are the right direction, because atomicity is a property of the access, not of the variable itself. I think a variable attribute indicating that atomic access is required would be much better. (Similarly, variable attributes specifying byte order would be useful, as would a way to specify data structure layout exactly.)
The sequence point semantics in C are too strong. In many cases, the order of operations and their side effects really does not matter, and it would be better if the compiler could choose and optimize them as it sees fit; a lot of initialization loops fall into this category. Also, instead of global atomicity, it would be useful to give operations compile-time "tags", and to provide primitives for compiler/hardware synchronization within each "tag". An example of such use is dual counters for opportunistic read-only access to rarely-modified data structures: modifiers increment one counter before, and the other after, modifying the data. An opportunistic reader reads the latter counter first, then the data, and then the first counter; the data is reliable only if the two counter values match. The order in which the counter values and their increments become visible is paramount, and is easily b0rked by caching strategies.
In practice, foreach-type order-ignoring loops, and some way to mark entire scopes as "side effect order and sequence points irrelevant", might suffice.
I suspect that turning the entire thing on its head, dropping sequence point order unless explicitly marked, would be better.
The biggest flaw in C is that it has no native facility exposing the status flags. For example, an internal ABI would be much more efficient if it used a status flag, say the carry flag, to indicate failure. That way, a function could have one return type for success and a completely different one for failure. In essence, the status flag would allow a kind of "exception" handling, but in a form native to basically all current hardware that can support C.
Even more interesting would be if the language allowed multiple ABIs in the same project -- it would have to do that on a per-callable basis. It would help a lot in implementing low-level libraries for various higher-level programming languages. Static introspection would be useful here too, for example with a new ELF section describing exported callables, so that higher-level languages could call these directly, without any special shims in between; much like ctypes for Python.
As a whole, these changes would neither bring C closer to assembly nor add higher-level abstractions; they are a step sideways.