Problems do arise when managers/businesses don't want to pay for thorough checks
Here we are in violent agreement.
...
I am interested in the latter, but do not believe newer languages will solve all the problems: they will just replace the old set of problems with a new one, because humans just aren't very good at this sort of design, not yet anyway. I do hope Rust and others will become better than what we have now, but they're not there yet.
There we are, again, in violent agreement.
It then becomes about the philosophy of what to do: demand perfection or expect imperfection.
Yep, and not necessarily. (I'm not saying you are wrong, I am saying I see this differently, but am not sure either one is correct or "more correct".)
I like to think I am constantly on the lookout for better tools, because a tool is always imperfect unless it is a simple statement of the answer. That is, given a particular problem, there are always at least small changes that could be applied to the tool to make it better suited for that particular problem. Perfection, therefore, is not a valid goal, unless we define it vaguely as the centerpoint of a set of problems.
As an example, I use five wildly different programming languages just about every day: bash, awk, C, Python, and JavaScript. Their domains are so different that I do not see how it is even possible for a single programming language to be better than each of them in their respective domains. I can see adding to the set when the type of things I do changes, and replacing any one with a better one in that domain. (Well, I've already done that a few times. None of these were my "first" programming language, and I've gotten paid to write code in half a dozen to a dozen other programming languages too.)
In the past, C tried to be a full-stack language, catering to everything from the lowest-level libraries to the highest-level abstractions. That didn't work, so Objective-C and C++ bubbled off it, created by people who used the language to solve particular kinds of problems, using abstraction schemes they thought would make the new programming language a better tool.
Currently, C is mostly used as a systems programming language, for low-level implementation (kernels, firmware) up to services (daemons in POSIX/Unix parlance) and libraries. In this domain, bugs related to memory accesses are prominent, and seen as a problem that needs fixing.
Thing is, memory safety is only one facet of the issues, and is not a sufficient criterion for being better than C.
Instead of introducing a completely new language, my logic is that since C has proven to be practical, but has these faults, fixing memory safety by adding the feature set I described, in a backwards-compatible manner with zero runtime overhead, is likely to yield a better tool than designing a completely new one from scratch.
Essentially, by doing this derivative-language bubble, which simultaneously would mean replacing the standard C library with something else (which is not that big of a deal, considering the C standard explicitly defines the freestanding environment for that case), I claim that typical memory safety issues in C code can be easily and effectively avoided, while requiring relatively little adjustment from C programmers.
The more interesting thing here is to look at why such changes have not been proposed before. (They might have; I just haven't found any yet.)
Nothing in it is "novel", as it is simply based on the fact that for arrays, C compilers already do effective bounds checking at runtime within a single scope, even for variably modified types. Variants based on C seem to have simply added new abstractions, rather than delving into fixing C's known deficiencies with respect to code quality and bug type tendencies.
Moreover, any new abstraction or feature brings in its own set of problems. Always.
An example of this is how initialization of static (global) C++ objects happens on microcontrollers. Under fully featured OSes using ELF binaries, there is actually a section (.init_array) that contains only initializer function pointers, called without arguments to initialize the objects in the correct order. (It can be used in C, too, via the GNU constructor function attribute.) On microcontrollers, the objects tend to be initialized as part of the RAM initialization process, copying or decompressing initial data from Flash/ROM to RAM, but a compiler may still generate similar initializer functions that you need to call after initializing the RAM contents, but before the execution of the firmware image begins. The order in which these initializer functions are called can be extremely important when an object refers to the state of another object at initialization time. (I am not sure if it is possible to construct a ring of dependencies that is impossible to implement in practice, although it would be a fun experiment; like proving the C++ template engine is Turing-complete.)
Any new feature will have its risks. A completely new programming language has an unproven track record, and an unknown set of risks and weaknesses. I am happy that others are developing the next generation of programming languages, even though I expect almost all of them to fail and lapse into niche use cases. It is unlikely that I will be using them in true anger (to solve real-world problems others are having) until they have at least a decade of development under their belt, though; they tend to take at least that long to find their "groove", and iron out their backwards-incompatible warts.
Because of the above, I do not believe a new language is likely to replace C anytime soon, but in the meantime, we might reap some significant rewards with relatively small backwards-compatible changes to C itself. This is the key. Why wait for the moon, when you can have a small asteroid now?
The feature that is analogous to the memory-safety issue here is the difference in line input functions between standard C (fgets()) and POSIX.1 (getline()/getdelim()). The latter can easily deal with even binary data and embedded nuls (\0) in input, and has no inherent line length limitations. It is also extremely well suited for tokenization of line-based inputs; for CSV and similar formats, where record separators can appear inside fields as long as they are quoted, you need slightly more complicated functions. Yet, if you look at POSIX.1 C examples and tutorials, very few if any show how to use getline() or getdelim(), and instead focus on fgets(). Even more so for opendir()/readdir()/closedir() vs. nftw()/scandir()/glob(). Better tools exist in POSIX.1 C, but because one company (Microsoft) rejected it, most tutorials and guides teach the inferior tools.
You could say that my contribution is limited to showing how small changes in how people use C in anger could affect their bug density, especially for memory-related bugs, in a worthwhile manner. I do not have the charisma or social skills to make any of that popular, though, which heavily colors my opinion as to what kind of results one can get in the programming-languages-as-tools arena in general. To effect a change, you need PR and social manipulation, not new technology. And definitely not political decrees as to what kind of programming languages developers should use.