Author Topic: Future Software Should Be Memory Safe, says the White House  (Read 3741 times)


Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19516
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Future Software Should Be Memory Safe, says the White House
« Reply #75 on: March 05, 2024, 12:53:26 pm »
If you can cast away noalias/const then the compiler can (and with higher optimisation levels, will) generate incorrect code.
More basically, if you can cast away const then const stuff can't go into NV memory, which would make C pretty useless for many kinds of machine. "const" was an absolutely essential feature for the MCU world. For the very first pass of standardising C in the 1980s, const had to go in to make it a suitable language for the embedded market.

And if you can't cast away constness then you can't write a debugger that pokes (ordinary) memory.

Damned if you can, damned if you can't => damned :)

The committee took years debating that in the early-mid 90s. That is damning in itself.
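To make the dilemma concrete: a minimal sketch, assuming a typical MCU toolchain that places const-qualified objects in flash (poke_limit is a made-up name for illustration):
    #include <stdint.h>

    static const uint16_t limit = 1000;    // toolchain places this in flash (.rodata)

    void poke_limit(void)
    {
        uint16_t *p = (uint16_t *)&limit;  // casting away const compiles, but...
        *p = 2000;                         // ...writing to an object defined const is
                                           // undefined behaviour; on an MCU the store
                                           // to flash typically just has no effect
    }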
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online coppice

  • Super Contributor
  • ***
  • Posts: 8652
  • Country: gb
Re: Future Software Should Be Memory Safe, says the White House
« Reply #76 on: March 05, 2024, 12:56:23 pm »
If you can cast away noalias/const then the compiler can (and with higher optimisation levels, will) generate incorrect code.
More basically, if you can cast away const then const stuff can't go into NV memory, which would make C pretty useless for many kinds of machine. "const" was an absolutely essential feature for the MCU world. For the very first pass of standardising C in the 1980s, const had to go in to make it a suitable language for the embedded market.

And if you can't cast away constness then you can't write a debugger that pokes (ordinary) memory.

Damned if you can, damned if you can't => damned :)

The committee took years debating that in the early-mid 90s. That is damning in itself.
That was an issue in the 80s. These days, with most NV memory being flash, the debuggers just rewrite a page.
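A hedged sketch of that read-erase-rewrite sequence (the flash_* functions are placeholder names for whatever primitives the probe and target expose, not any particular vendor's API):
    #include <stdint.h>

    #define PAGE_SIZE 512u                 // device-specific, from the MCU's config data

    // Placeholders: supplied by the debug probe / target support code.
    void flash_read_page(uint32_t addr, uint8_t *buf);
    void flash_erase_page(uint32_t addr);
    void flash_program_page(uint32_t addr, const uint8_t *buf);

    // Patch one byte of flash by rewriting its whole page.
    void flash_poke(uint32_t addr, uint8_t value)
    {
        uint32_t page = addr & ~(uint32_t)(PAGE_SIZE - 1u);
        uint8_t  buf[PAGE_SIZE];

        flash_read_page(page, buf);        // 1. copy the page to RAM
        buf[addr - page] = value;          // 2. modify the copy
        flash_erase_page(page);            // 3. erase the page
        flash_program_page(page, buf);     // 4. write it back
    }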
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19516
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Future Software Should Be Memory Safe, says the White House
« Reply #77 on: March 05, 2024, 01:21:49 pm »
Problems do arise when managers/businesses don't want to pay for thorough checks
Here we are in violent agreement.  :-+
...
I am interested in the latter, but do not believe newer languages will solve all the problems: they will just replace the old set of problems with a new one, because humans just aren't very good at this sort of design, not yet anyway.  I do hope Rust and others will become better than what we have now, but they're not there yet. 

There we are, again, in violent agreement.

It then becomes about the philosophy of what to do: demand perfection or expect imperfection.

Quote
For the reasons I described earlier, I suggest better overall results would be achieved faster by smaller incremental changes, for example those I described for C.  History provides additional examples: Objective-C and C++ arose when those who wrote code decided the C tool they had was insufficient. 

There we disagree. Sometimes the only winning move is not to play.

While I liked Objective-C in the mid-late 80s as being an honorable pragmatic way to import the Smalltalk philosophy to C, I rapidly decided C++ was something I wanted to avoid. Things like the C++ committee refusing to realise they had created a language where a valid C++ program could never be compiled just cemented my opinion.

C++ FQA is exceedingly amusing from a distance.

Modern C and C++ are castles carefully and deliberately built on the sand of obsolete presumptions about technology.

Quote
PHP is a horrible example of what kind of a mess you may end up with if you try to cater for all possible paradigms: for some things, like string manipulation, it has at least two completely separate interfaces (imperative and object-oriented).  Unfortunately, Python shows partial signs of this too, what with its strings/bytes separation, and increasing number of string template facilities. 

C++ deliberately took the decision that kind of thing was a benefit!

It seems all popular languages accrete features over time, growing like Topsy until they appear cancerous. Even Java is getting creeping featuritis, but at least it has a solid starting point.

Quote
To fix a popularity problem, a social problem, like programmers and companies being happy to produce buggy code and customers being happy to pay for buggy code, you need to apply social/human tools, not technological ones.

The principal social tool/technique is to choose tools that make it easy to avoid classes of problems.

EDIT: this directly relevant ACM article has just come to my attention: https://queue.acm.org/detail.cfm?id=3648601
"Based on work at Google over the past decade on managing the risk of software defects in its wide-ranging portfolio of applications and services, the members of Google's security engineering team developed a theory about the reason for the prevalence of defects: It's simply too difficult for real-world development and operations teams to comprehensively and consistently apply the available guidance, which results in a problematic rate of new defects. Commonly used approaches to find and fix implementation defects after the fact can help (e.g., code review, testing, scanning, or static and dynamic analysis such as fuzzing), but in practice they find only a fraction of these defects. Design-level defects are difficult or impractical to remediate after the fact. This leaves a problematic residual rate of defects in production systems.
We came to the conclusion that the rate at which common types of defects are introduced during design, development, and deployment is systemic—it arises from the design and structure of the developer ecosystem, which means the end-to-end collection of systems, tooling, and processes in which developers design, implement, and deploy software. This includes programming languages, software libraries, application frameworks, source repositories, build and deployment tooling, the production platform and its configuration surfaces, and so forth.
...
Guidance for developers in memory-unsafe languages such as C and C++ is, essentially, to be careful: For example, the section on memory management of the SEI CERT C Coding Standard stipulates rules like, "MEM30-C: Do not access freed memory" (bit.ly/3uSMBSk).
While this guidance is technically correct, it's difficult to apply comprehensively and consistently in large, complex codebases. For example, consider a scenario where a software developer is making a change to a large C++ codebase, maintained by a team of dozens of developers. The change intends to fix a memory leak that occurs because some heap-allocated objects aren't deallocated under certain conditions. The developer adds deallocation statements based on the implicit assumption that the objects will no longer be dereferenced. Unfortunately, this assumption turns out to be incorrect, because there is code in another part of the program that runs later and still dereferences pointers to this object.
"

Quote
We do not get better software developers by teaching them the programming language du jour; we get better software developers by convincing them to try harder to not create bugs, to use the tools they have available to support them in detecting and fixing issues when they do happen.  But most of all, we'd need to convince customers and business leadership that buying and selling buggy code is counterproductive, and that we can do better if we choose to.  All we need to do is choose to.

Two relevant quotes from the 80s, but I can't find a source:
  • if you make it possible for English to be a programming language, you will find programmers cannot write English
  • (after "losing" a programming contest to a faster program that was mostly correct) if I had known it was allowable to generate incorrect answers, I could have written a much faster program much sooner
The former has to be re-learned every generation; currently ML-generated programs are the silver bullet. Expect more Air Canada chatbot experiences :(
The latter wasn't originally about C/C++, but it is clearly and horribly relevant.

Quote
In a very real way, software companies today are in a very similar position to mining companies a century and a half ago.  They, too, could do basically what they pleased, and had their own "company towns" where employees had to rent from the company and buy company wares to survive.  Campaign contributions to politicians kept those companies operations untouched, until people fed up with it.  I'm waiting for people to get fed up with how crappy software generally speaking is.  I just don't want a bloody revolution, just incremental changes that help fair competition, if that is what people want.

The only route out of the mess will be legal liability. Hopefully the Air Canada chatbot case is the start of that. (See: I'm an optimist!)
« Last Edit: March 05, 2024, 01:36:54 pm by tggzzz »
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19516
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Future Software Should Be Memory Safe, says the White House
« Reply #78 on: March 05, 2024, 01:23:50 pm »
If you can cast away noalias/const then the compiler can (and with higher optimisation levels, will) generate incorrect code.
More basically, if you can cast away const then const stuff can't go into NV memory, which would make C pretty useless for many kinds of machine. "const" was an absolutely essential feature for the MCU world. For the very first pass of standardising C in the 1980s, const had to go in to make it a suitable language for the embedded market.

And if you can't cast away constness then you can't write a debugger that pokes (ordinary) memory.

Damned if you can, damned if you can't => damned :)

The committee took years debating that in the early-mid 90s. That is damning in itself.
That was an issue in the 80s. These days, with most NV memory being flash, the debuggers just rewrite a page.

Oh, yuck. Q1: what happens if the debugger gets the page size wrong?
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online coppice

  • Super Contributor
  • ***
  • Posts: 8652
  • Country: gb
Re: Future Software Should Be Memory Safe, says the White House
« Reply #79 on: March 05, 2024, 01:55:31 pm »
If you can cast away noalias/const then the compiler can (and with higher optimisation levels, will) generate incorrect code.
More basically, if you can cast away const then const stuff can't go into NV memory, which would make C pretty useless for many kinds of machine. "const" was an absolutely essential feature for the MCU world. For the very first pass of standardising C in the 1980s, const had to go in to make it a suitable language for the embedded market.

And if you can't cast away constness then you can't write a debugger that pokes (ordinary) memory.

Damned if you can, damned if you can't => damned :)

The committee took years debating that in the early-mid 90s. That is damning in itself.
That was an issue in the 80s. These days, with most NV memory being flash, the debuggers just rewrite a page.

Oh, yuck. Q1: what happens if the debugger gets the page size wrong?
Flash pages are fixed size. How could the debugger get them wrong? Page read, erase, and rewrite with modifications is normal practice in debuggers these days.
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19516
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Future Software Should Be Memory Safe, says the White House
« Reply #80 on: March 05, 2024, 02:01:56 pm »
If you can cast away noalias/const then the compiler can (and with higher optimisation levels, will) generate incorrect code.
More basically, if you can cast away const then const stuff can't go into NV memory, which would make C pretty useless for many kinds of machine. "const" was an absolutely essential feature for the MCU world. For the very first pass of standardising C in the 1980s, const had to go in to make it a suitable language for the embedded market.

And if you can't cast away constness then you can't write a debugger that pokes (ordinary) memory.

Damned if you can, damned if you can't => damned :)

The committee took years debating that in the early-mid 90s. That is damning in itself.
That was an issue in the 80s. These days, with most NV memory being flash, the debuggers just rewrite a page.

Oh, yuck. Q1: what happens if the debugger gets the page size wrong?
Flash pages are fixed size. How could the debugger get them wrong? Page read, erase, and rewrite with modifications is normal practice in debuggers these days.

All MCUs and memory devices have exactly the same page size? That would surprise me.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online coppice

  • Super Contributor
  • ***
  • Posts: 8652
  • Country: gb
Re: Future Software Should Be Memory Safe, says the White House
« Reply #81 on: March 05, 2024, 02:06:12 pm »
If you can cast away noalias/const then the compiler can (and with higher optimisation levels, will) generate incorrect code.
More basically, if you can cast away const then const stuff can't go into NV memory, which would make C pretty useless for many kinds of machine. "const" was an absolutely essential feature for the MCU world. For the very first pass of standardising C in the 1980s, const had to go in to make it a suitable language for the embedded market.

And if you can't cast away constness then you can't write a debugger that pokes (ordinary) memory.

Damned if you can, damned if you can't => damned :)

The committee took years debating that in the early-mid 90s. That is damning in itself.
That was an issue in the 80s. These days, with most NV memory being flash, the debuggers just rewrite a page.

Oh, yuck. Q1: what happens if the debugger gets the page size wrong?
Flash pages are fixed size. How could the debugger get them wrong? Page read, erase, and rewrite with modifications is normal practice in debuggers these days.

All MCUs and memory devices have exactly the same page size? That would surprise me.
No. Many MCUs even have some small and some large pages within one chip. However, that's part of the MCU's spec, which the debugger knows about.
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19516
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Future Software Should Be Memory Safe, says the White House
« Reply #82 on: March 05, 2024, 02:10:32 pm »
If you can cast away noalias/const then the compiler can (and with higher optimisation levels, will) generate incorrect code.
More basically, if you can cast away const then const stuff can't go into NV memory, which would make C pretty useless for many kinds of machine. "const" was an absolutely essential feature for the MCU world. For the very first pass of standardising C in the 1980s, const had to go in to make it a suitable language for the embedded market.

And if you can't cast away constness then you can't write a debugger that pokes (ordinary) memory.

Damned if you can, damned if you can't => damned :)

The committee took years debating that in the early-mid 90s. That is damning in itself.
That was an issue in the 80s. These days, with most NV memory being flash, the debuggers just rewrite a page.

Oh, yuck. Q1: what happens if the debugger gets the page size wrong?
Flash pages are fixed size. How could the debugger get them wrong? Page read, erase, and rewrite with modifications is normal practice in debuggers these days.

All MCUs and memory devices have exactly the same page size? That would surprise me.
No. Many MCUs even have some small and some large pages within one chip. However, that's part of the MCU's spec, which the debugger knows about.

That makes sense. The issue is then to ensure the config information for the MCU is correct, and that the debugger is using the config related to the correct MCU.

That's "do-able", but obviously is not the most pressing issue.
« Last Edit: March 05, 2024, 02:13:27 pm by tggzzz »
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online coppice

  • Super Contributor
  • ***
  • Posts: 8652
  • Country: gb
Re: Future Software Should Be Memory Safe, says the White House
« Reply #83 on: March 05, 2024, 02:19:32 pm »
That makes sense. The issue is then to ensure the config information for the MCU is correct, and that the debugger is using the config related to the correct MCU.

That's "do-able", but obviously is not the most pressing issue.
Modern debuggers get an update each time relevant new chips are released. They can read the chip ID out of most chips, so they match up the config data with the hardware in a fairly robust manner.
 
The following users thanked this post: Siwastaja

Online Siwastaja

  • Super Contributor
  • ***
  • Posts: 8178
  • Country: fi
Re: Future Software Should Be Memory Safe, says the White House
« Reply #84 on: March 05, 2024, 05:01:10 pm »
It can be interesting to look at: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=rust

Can someone shed light on what's happening here? Use-after-free, heap buffer overflows. Wasn't Rust supposed to completely get rid of exactly these types of memory errors? What went wrong?
 

Online coppice

  • Super Contributor
  • ***
  • Posts: 8652
  • Country: gb
Re: Future Software Should Be Memory Safe, says the White House
« Reply #85 on: March 05, 2024, 05:09:58 pm »
It can be interesting to look at: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=rust
Can someone shed light on what's happening here? Use-after-free, heap buffer overflows. Wasn't Rust supposed to completely get rid of exactly these types of memory errors? What went wrong?
This is why I wrote against the stupidity of calling something memory safe or type safe. Try to stop one kind of corruption issue, and some new threading, DMA, GPU or other complexity will soon pick up the slack and keep the bug reporters in safe employment.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6264
  • Country: fi
    • My home page and email address
Re: Future Software Should Be Memory Safe, says the White House
« Reply #86 on: March 05, 2024, 05:26:31 pm »
Problems do arise when managers/businesses don't want to pay for thorough checks
Here we are in violent agreement.  :-+
...
I am interested in the latter, but do not believe newer languages will solve all the problems: they will just replace the old set of problems with a new one, because humans just aren't very good at this sort of design, not yet anyway.  I do hope Rust and others will become better than what we have now, but they're not there yet. 

There we are, again, in violent agreement.

It then becomes about the philosophy of what to do: demand perfection or expect imperfection.

Yep, and not necessarily.  (I'm not saying you are wrong, I am saying I see this differently, but am not sure either one is correct or "more correct".)

I like to at least think I am on the constant lookout for better tools, because a tool is always imperfect unless it is a simple statement of the answer.  That is, given a particular problem, there are at least small changes possible to apply to the tool to make it even better suited for that particular problem.  Perfection, therefore, is not a valid goal, unless we define it as the vague centerpoint related to a set of problems.

As an example, I use five wildly different programming languages just about every day: bash, awk, C, Python, and JavaScript.  Their domains are so different I do not see it is even possible for a single programming language to be better than each of them in their respective domains.  I can see adding to the set when the type of things I do changes, and replacing any one with a better one at that domain.  (Well, I've already done that a few times.  None of these were my "first" programming language, and I've gotten paid to write code in half a dozen to dozen other programming languages too.)

In the past, C tried to be a full-stack language, catering for everything from the lowest-level libraries to the highest-level abstractions.  That didn't work, so Objective-C and C++ bubbled off it, created by people who used the language to solve particular kinds of problems, using abstraction schemes they thought would make the new programming language a better tool.

Currently, C is mostly used as a systems programming language, for low-level implementation (kernels, firmwares) up to services (daemons in POSIX/Unix parlance) and libraries.  In this domain, bugs related to memory accesses are prominent, and seen as a problem that needs fixing.

Thing is, memory safety is only one facet of the issues, and is not a sufficient criterion to be better than C.

Instead of introducing a completely new language, my logic is that since C has proven to be practical, but has these faults, fixing memory safety by adding the feature set I described in a backwards-compatible manner with zero runtime overhead, is likely to yield a better tool than designing a completely new one from scratch.

Essentially, by creating this derivative-language bubble, which simultaneously would mean replacing the standard C library with something else (which is not that big of a deal, considering the C standard explicitly defines the free-standing environment for that case), I claim that typical memory safety issues in C code can be easily and effectively avoided, while requiring relatively little adjustment from C programmers.

The more interesting thing here is to look at why such changes have not been proposed before.  (They might have; I just haven't found any yet.)
Nothing in it is "novel", as it is simply based on the fact that for arrays, C compilers already do effective bounds checking at runtime within a single scope, even for variably modified types.  Variants based on C seem to have simply added new abstractions, rather than delving into fixing C's known deficiencies wrt. code quality and bug type tendencies.

Moreover, any new abstraction or feature brings in its own set of problems.  Always.

An example of this is how initialization of static (global) C++ objects happens on microcontrollers.  Under fully featured OSes using ELF binaries, there is actually a section (.init_array) that contains only initializer function pointers that are called without arguments to initialize objects in the correct order.  (It can be used in C, too, via the GNU constructor function attribute.)  On microcontrollers, the objects tend to be initialized as part of the RAM initialization process, copying or decompressing initial data from Flash/ROM to RAM, but a compiler may still generate similar initializer functions you need to call after initializing the RAM contents, but before the execution of the firmware image begins.  The order in which these initializer functions are called can be extremely important when an object refers to the state of another object at initialization time.  (I am not sure if it is possible to construct a ring of dependencies that is impossible to implement in practice, although it would be a fun experiment; like proving the C++ template engine is Turing-complete.)
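For illustration, a hedged sketch of both halves, assuming GNU toolchain conventions (the __init_array_start/__init_array_end symbol names follow common linker scripts, but yours may differ):
    // The constructor attribute appends a pointer to this function to
    // .init_array, so the runtime calls it before main().
    __attribute__((constructor))
    static void early_setup(void)
    {
        // initialise something other code depends on
    }

    // Bare-metal startup code walks the array itself, after .data has been
    // copied and .bss zeroed, but before entering main().
    typedef void (*init_fn)(void);
    extern init_fn __init_array_start[], __init_array_end[];

    static void call_init_array(void)
    {
        for (init_fn *fn = __init_array_start; fn < __init_array_end; ++fn)
            (*fn)();
    }
The loop runs in array order, so the link order of the object files is what determines cross-object initialization order; that is exactly the fragility described above.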

Any new feature will have its risks.  A completely new programming language has an unproven track record, and an unknown set of risks and weaknesses.  I am happy that others are developing the next generation of programming languages, even though I expect almost all of them to fail and lapse into niche use cases.  It is unlikely that I will be using them in true anger (to solve real-world problems others are having) until they have at least a decade of development under their belt, though; they tend to take at least that long to find their "groove", and iron out their backwards-incompatible warts.

Because of the above, I do not believe a new language is likely to replace C anytime soon, but in the meantime, we might reap some significant rewards with relatively small backwards-compatible changes to C itself.  This is the key.  Why wait for the moon, when you can have a small asteroid now?

The feature that is analogous to the memory-safety issue here is the difference in line input functions in standard C (fgets()) and POSIX.1 (getline()/getdelim()).  The latter can easily deal with even binary data and embedded nuls (\0) in input, and has no inherent line length limitations.  It is also extremely well suited for tokenization in line-based inputs; for CSV and similar formats, where record separators can appear inside fields as long as they are quoted, you need slightly more complicated functions.  Yet, if you look at POSIX.1 C examples and tutorials, very few if any show how to use getline() or getdelim(), and instead focus on fgets().  Even more so for opendir()/readdir()/closedir() vs. nftw()/scandir()/glob().  Better tools exist in POSIX.1 C, but because one company (Microsoft) rejected it, most tutorials and guides teach the inferior tools.
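A minimal sketch of the getline() idiom, assuming a POSIX.1-2008 hosted environment:
    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        char   *line = NULL;   // getline() allocates and grows the buffer as needed
        size_t  size = 0;
        ssize_t len;

        while ((len = getline(&line, &size, stdin)) != -1) {
            // len is the exact number of bytes read, so embedded '\0' bytes
            // and arbitrarily long lines are handled correctly
            fwrite(line, 1, (size_t)len, stdout);
        }

        free(line);            // one final free(), even after end of input
        return 0;
    }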

You could say that my contribution is limited to showing how small changes to how people use C in anger could affect their bug density, especially memory-related bugs, in a worthwhile manner.  I do not have the charisma or social skills to make any of that popular, though, which heavily colors my opinion as to what kind of results one can get in the programming-languages-as-tools arena in general.  To effect a change, you need PR and social manipulation, not new technology.  And definitely not political decrees as to what kind of programming languages developers should use.
 
The following users thanked this post: Siwastaja

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19516
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Future Software Should Be Memory Safe, says the White House
« Reply #87 on: March 05, 2024, 06:17:43 pm »
It can be interesting to look at: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=rust
Can someone shed light on what's happening here? Use-after-free, heap buffer overflows. Wasn't Rust supposed to completely get rid of exactly these types of memory errors? What went wrong?
This is why I wrote against the stupidity of calling something memory safe or type safe. Try to stop one kind of corruption issue, and some new threading, DMA, GPU or other complexity will soon pick up the slack and keep the bug reporters in safe employment.

You're quite right, but you don't go far enough. Everything should be written in assembler.

Personally I prefer to apply my thought and concentration to my unique application, and prefer not to have to (re)do boring stuff that can be done by machines.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19516
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Future Software Should Be Memory Safe, says the White House
« Reply #88 on: March 05, 2024, 06:19:35 pm »
It can be interesting to look at: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=rust

Can someone shed light on what's happening here? Use-after-free, heap buffer overflows. Wasn't Rust supposed to completely get rid of exactly these types of memory errors? What went wrong?

Thank dog we have other compilers which are always bug-free and completely implement other languages, as defined by their standard.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19516
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Future Software Should Be Memory Safe, says the White House
« Reply #89 on: March 05, 2024, 06:46:10 pm »
In the past, C tried to be a full-stack language, catering for everything from the lowest-level libraries to the highest-level abstractions.  That didn't work, so Objective-C and C++ bubbled off it, created by people who used the language to solve particular kinds of problems, using abstraction schemes they thought would make the new programming language a better tool.

When Objective-C and C++ started in the mid-80s, C was only a systems programming language.
Numerical programming: use Fortran (still better, I'm informed).
Business programming: use COBOL (I decided against that before university)
IDEs: use Smalltalk, which showed the way for the next 15 years!

Quote
Currently, C is mostly used as a systems programming language, for low-level implementation (kernels, firmwares) up to services (daemons in POSIX/Unix parlance) and libraries.  In this domain, bugs related to memory accesses are prominent, and seen as a problem that needs fixing.

The problem was that in the early 80s C faced a choice: to be a systems programming language or to be a general purpose application language. Either would have been practical and reasonable. But in attempting to satisfy both requirements, it accumulated compromises and complexity that made it bad at both.

Now people have correctly decided it is deficient as a general purpose application language, and abandoned that usage. But it still has a lot of baggage.

Quote
Thing is, memory safety is only one facet of the issues, and is not a sufficient criterion to be better than C.

In many cases it is sufficient to be regarded as better than C. People have (correctly, IMHO) voted with their feet, er, keyboards.

Quote
Instead of introducing a completely new language, my logic is that since C has proven to be practical, but has these faults, fixing memory safety by adding the feature set I described in a backwards-compatible manner with zero runtime overhead, is likely to yield a better tool than designing a completely new one from scratch.

The stuff added to C to try to bring it out of the 70s is baroquely complex. Better to start afresh with concepts and technology developed and proven since then.

Simplicity is a virtue; KISS.

Quote
Moreover, any new abstraction or feature brings in its own set of problems.  Always.

Agreed.

A well-conceived group of abstractions that work together harmoniously brings far more benefits than problems, and is thus a good tradeoff.

None of that applies to modern C or modern C++.

Quote
Any new feature will have its risks.  A completely new programming language has an unproven track record, and an unknown set of risks and weaknesses.  I am happy that others are developing the next generation of programming languages, even though I expect almost all of them to fail and lapse into niche use cases.  It is unlikely that I will be using them in true anger (to solve real-world problems others are having) until they have at least a decade of development under their belt, though; they tend to take at least that long to find their "groove", and iron out their backwards-incompatible warts.

We agree.

My career has consisted of evaluating languages/technologies - and choosing to ignore them because they are merely "shiny chrome" variations rather than fundamentally different and better.

Rust is getting there, after starting in 2006/2009/2015 depending on your preference.

Quote
Because of the above, I do not believe a new language is likely to replace C anytime soon, but in the meantime, we might reap some significant rewards with relatively small backwards-compatible changes to C itself.  This is the key.  Why wait for the moon, when you can have a small asteroid now?

True.

COBOL isn't going away, and the PDP-11 will continue to be used and supported until 2050 at least. https://www.theregister.com/2013/06/19/nuke_plants_to_keep_pdp11_until_2050/

And the B-52 BUFF is still being upgraded.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 
The following users thanked this post: Nominal Animal

Offline baldurn

  • Regular Contributor
  • *
  • Posts: 189
  • Country: dk
Re: Future Software Should Be Memory Safe, says the White House
« Reply #90 on: March 05, 2024, 06:52:07 pm »
It can be interesting to look at: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=rust

Can someone shed light on what's happening here? Use-after-free, heap buffer overflows. Wasn't Rust supposed to completely get rid of exactly these types of memory errors? What went wrong?

The first CVE from that link is:

"CVE-2024-27284   cassandra-rs is a Cassandra (CQL) driver for Rust. Code that attempts to use an item (e.g., a row) returned by an iterator after the iterator has advanced to the next item will be accessing freed memory and experience undefined behaviour. The problem has been fixed in version 3.0.0."

I followed the link, which took me to a GitHub pull request that fixes this bug. The freed memory they are talking about is freed by a C driver that is called from the Rust code. Draw your own conclusions about C and Rust from that :-)
 
The following users thanked this post: Siwastaja

Online coppice

  • Super Contributor
  • ***
  • Posts: 8652
  • Country: gb
Re: Future Software Should Be Memory Safe, says the White House
« Reply #91 on: March 05, 2024, 06:54:06 pm »
It can be interesting to look at: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=rust
Can someone shed light on what's happening here? Use-after-free, heap buffer overflows. Wasn't Rust supposed to completely get rid of exactly these types of memory errors? What went wrong?
This is why I wrote against the stupidity of calling something memory safe or type safe. Try to stop one kind of corruption issue, and some new threading, DMA, GPU or other complexity will soon pick up the slack and keep the bug reporters in safe employment.

You're quite right, but you don't go far enough. Everything should be written in assembler.

Personally I prefer to apply my thought and concentration to my unique application, and prefer not to have to (re)do boring stuff that can be done by machines.
You give some very weird replies that seem to miss the point entirely.
 
The following users thanked this post: Siwastaja

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6264
  • Country: fi
    • My home page and email address
Re: Future Software Should Be Memory Safe, says the White House
« Reply #92 on: March 05, 2024, 08:50:44 pm »
One of the patterns I like to use in C is
    type *p = NULL;
and after possible dynamic allocation and resizing,
    free(p);
    p = NULL;
The key point is that free(NULL) is safe, and does nothing.

This does not fix the double-free or use-after-free cases that occur because the same thing is used in different threads with insufficient locking, or because a function uses a temporary copy of p while the original gets destroyed, but it does expose the common use-after-free cases and defuses double-free bugs using the original pointer.

Yet, for some reason, most C programmers see the initial NULL assignment and the final p = NULL; as superfluous/ugly/unstylish, even though it is just defensive programming, and costs minimal machine code.  (Their cost will be lost within optimizer noise.)

(I'd prefer free() to always return (void *)0, so one could do p = free(p);, which may look odd initially to some, but tends to generate quite sensible code on many architectures, and I feel could easily become a habit.  I don't like freep(&p);, because it hints at timing promises it cannot provide; the implicit "p is freed before it is NULLed" is useful to me as a pattern behaviour reminder.)
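The wished-for behaviour is easy enough to emulate with a tiny wrapper (free_null is a made-up name, not a standard function):
    #include <stdlib.h>

    // free() the block and return NULL, so the pointer can be
    // overwritten in the same statement:  p = free_null(p);
    static inline void *free_null(void *p)
    {
        free(p);
        return NULL;
    }
After p = free_null(p);, a stale access trips immediately on a NULL dereference instead of silently reading freed memory.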

Similarly, my grow-as-needed pattern tends to be initialized to explicit "none" via
    type  *data = NULL;
    size_t size = 0;
    size_t used = 0;
with additional room allocated via
    if (used + need > size) {
        size_t  new_size = allocation_policy(used + need);
        void  *new_data = realloc(data, new_size * sizeof data[0]);
        if (!new_data) {
            // realloc() failure leaves the original block valid,
            // so release it explicitly before reporting the error.
            free(data); data = NULL; used = 0; size = 0;
            return error;
        }
        data = new_data; // *dataptr = new_data;
        size = new_size; // *sizeptr = new_size;
    }
If data and size are aliases of pointers supplied by the caller (as in e.g. getline()), they're assigned initially and updated after each reallocation; otherwise all accesses are via data, size, and used.  (I omitted the overflow checks for used+need, allocation_policy(), and new_size*sizeof data[0] for simplicity; see the sketch just below.)
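For completeness, a hedged sketch of those elided overflow checks (growth_ok is a made-up helper name):
    #include <stddef.h>
    #include <stdint.h>

    // Returns 1 if used+need elements of esize bytes each can be counted
    // and sized without overflow, 0 otherwise.
    static int growth_ok(size_t used, size_t need, size_t esize)
    {
        if (need > SIZE_MAX - used)                    // used + need wraps
            return 0;
        if (esize && used + need > SIZE_MAX / esize)   // byte count wraps
            return 0;
        return 1;
    }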

This pattern is pretty much bullet-proof in terms of memory access safety.  I often use it to read data from potentially large data sets, in chunks of up to 2 MiB or so (configurable at compile time), with each additional read reading up to (size-used-n) bytes to data+used, where n is the number of additional trailing bytes needed when processing the input, and need > n, need <= configurable_chunk_size_limit.  For now, this balances the number of syscalls used against the overhead of setting up or updating the virtual memory mapping for the file contents.
As I use pipes extremely often to pass data to/from my programs, I don't use memory mapping unless I know the target/source is an actual file.

One detail to realize is that if nothing is added to the array, it may not be allocated at all.  I often avoid this by having a final optimizing realloc whenever used+n>size or (size>used && (size-used-n)*sizeof data[0]>limit), i.e. whenever needed or more than limit bytes would be wasted.  (Most hosted C libraries will only return allocated memory back to the OS if it was large enough originally.)

Yet, this pattern seems surprising to many C programmers, because they are not aware that realloc(NULL,N) is exactly equivalent to malloc(N).  In many cases, they believe the initial allocation must be done using malloc(), which tends to complicate the code.

My own dynamically allocated and passed-around structures often have a data-ish C99 flexible array member,
    struct typename {
        // whatever elements I need, plus then
        size_t  size;
        size_t  used;
        type    data[];
    };
where size is the number of elements allocated for the flexible data array, and used is the number of initial elements currently in use there.
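Allocating one of these is then a single call; a sketch, using the struct definition above (the overflow check on the multiplication is again omitted; see growth_ok above):
    #include <stdlib.h>

    struct typename *typename_new(size_t size)
    {
        struct typename *t = malloc(sizeof *t + size * sizeof t->data[0]);
        if (!t)
            return NULL;
        t->size = size;   // elements allocated in data[]
        t->used = 0;      // none in use yet
        return t;
    }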

These are the tools I use to write "memory-safe" code in C.  It is not perfect über-skillz stuff.  It is just a set of sensible patterns, and a healthy dose of suspicion against any assumption on what a given parameter or variable value might be.  I like to use data aliasing to my advantage, so tend to check for it at run time when it matters.  I can see how some find this lot of effort, if they've not learned the defensive approaches from the get go.
My biggest peeve on that front is the ubiquitous "We'll add error checking later, when we have more time", which is just a complicated way of saying "We're not gonna bother", because robustness is not something you add on top, it is something you either design in, or don't have.

Fact is, these kinds of practical defensive patterns are rare in C code.  They could be used, and if used they would reduce the number of memory-related bugs, but they aren't.  I'm very comfortable with them, and don't think they take any more "effort" than any other approach.   The reason these are not used is not technical, just social/cultural/habit.

I don't know how to change programmers' habits.  Examples only sway those who are already looking for better tools, and they'd likely have found all these on their own given enough time.  :-[
« Last Edit: March 05, 2024, 08:53:49 pm by Nominal Animal »
 

Offline SiliconWizardTopic starter

  • Super Contributor
  • ***
  • Posts: 14481
  • Country: fr
Re: Future Software Should Be Memory Safe, says the White House
« Reply #93 on: March 05, 2024, 10:01:38 pm »
To elaborate on that, I've personally not used standard memory allocation functions in C *directly* in ages.
I developed, a long time ago, a set of macros that pretty much do what you describe above in a streamlined way.
Too bad for macro haters, this has worked wonderfully well as far as I'm concerned for many, many years.

I have also written my own allocators, which I use in some specific cases as a replacement for the standard ones (and I do that more and more these days). They are not general-purpose, global allocators such as malloc(), and thus require more thought when using them, about lifetime in particular.
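As an illustration only (not the actual code referred to above), the simplest useful special-purpose allocator is a linear arena: every object allocated from it shares one lifetime and everything is released in a single call, which is exactly why lifetime needs more thought up front:
    #include <stddef.h>

    typedef struct {
        unsigned char *base;   // backing storage, owned by the caller
        size_t         size;   // total bytes available
        size_t         used;   // bytes handed out so far
    } arena;

    // Bump-allocate len bytes, 8-byte aligned; NULL when exhausted.
    static void *arena_alloc(arena *a, size_t len)
    {
        size_t at = (a->used + 7u) & ~(size_t)7u;
        if (at > a->size || len > a->size - at)
            return NULL;
        a->used = at + len;
        return a->base + at;
    }

    // All objects share the arena's lifetime: release them in one go.
    static void arena_reset(arena *a)
    {
        a->used = 0;
    }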
 

Offline Wil_Bloodworth

  • Regular Contributor
  • *
  • Posts: 185
  • Country: us
Re: Future Software Should Be Memory Safe, says the White House
« Reply #94 on: March 05, 2024, 11:20:56 pm »
My biggest peeve on that front is the ubiquitous "We'll add error checking later, when we have more time", which is just a complicated way of saying "We're not gonna bother", because robustness is not something you add on top, it is something you either design in, or don't have.
Truth!

I don't know how to change programmers' habits.  Examples only sway those who are already looking for better tools, and they'd likely have found all these on their own given enough time.  :-[
I have run into this issue as well and have threatened the team with required pull requests if the laziness continues. Our team knows better... they're just lazy.  The beatings will continue until morale improves! LOL

- Wil
 

Offline SiliconWizardTopic starter

  • Super Contributor
  • ***
  • Posts: 14481
  • Country: fr
Re: Future Software Should Be Memory Safe, says the White House
« Reply #95 on: March 05, 2024, 11:34:46 pm »
Laziness well applied can be a virtue in engineering. It's what drives you to design architectures that make further development much easier after that initial effort, and similarly to factor code so that you avoid a lot of repetition and tedious coding later. It's also what pushes towards leaner designs, rather than overbloated ones.

I think the whole point is in understanding that this initial effort is required to enjoy your laziness in the longer run. And so, IMO the main problem is not software developers being lazy per se, but the need for immediate reward, which prevents them from investing the initial effort that would make their lives much easier afterwards.

This appeal for immediate rewards is what plagues software engineering in particular, and our whole society in general.
 
The following users thanked this post: nctnico, Siwastaja, newbrain, Nominal Animal, DiTBho

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19516
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Future Software Should Be Memory Safe, says the White House
« Reply #96 on: March 06, 2024, 08:53:58 am »
Laziness well applied can be a virtue in engineering. It's what drives you to design architectures that make further development much easier after that initial effort, and similarly to factor code so that you avoid a lot of repetition and tedious coding later. It's also what pushes towards leaner designs, rather than overbloated ones.

I think the whole point is in understanding that this initial effort is required to enjoy your laziness in the longer run. And so, IMO the main problem is not software developers being lazy per se, but the need for immediate reward, which prevents them from investing the initial effort that would make their lives much easier afterwards.

This appeal for immediate rewards is what plagues software engineering in particular, and our whole society in general.

Just so, but don't forget to add "show me the reward structure and I'll tell you how people will behave".

I know my implementation is going well when the number of lines of code goes down. It has the bonus of confusing the hell out of idiot managers who measure productivity by the number of lines of code :)

Luckily I managed to avoid that all but once, and that place was an unpleasant place to work, with code strategies that made people laugh in disbelief!
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Offline Wil_Bloodworth

  • Regular Contributor
  • *
  • Posts: 185
  • Country: us
Re: Future Software Should Be Memory Safe, says the White House
« Reply #97 on: March 06, 2024, 05:01:03 pm »
Has the bonus of confusing the hell out of idiot managers who measure productivity by the number of lines of code :)
Luckily, we are seeing the classical hierarchical employment structure dissolving away as companies have realized that people who are not producing value have no place in their company. Managers who generally do nothing but bark at people are [thankfully] becoming a scarcity these days; at least in the environments I have seen.

"Scrum style" work places where the entire team meets [called the "standup"] for 15 minutes each day at the same time and place and go around the circle saying this, "Yesterday, I worked on X. Today, I am working on Y. I have 0..N blocks."... very quickly makes it very obvious who is not contributing value to the team and thus, the company.  Is there a place for managers? Absolutely.  Do I think most of them sit on their butt most of the day and do nothing... also yes.
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19516
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Future Software Should Be Memory Safe, says the White House
« Reply #98 on: March 06, 2024, 05:29:30 pm »
"Scrum style" work places where the entire team meets [called the "standup"] for 15 minutes each day at the same time and place and go around the circle saying this, "Yesterday, I worked on X. Today, I am working on Y. I have 0..N blocks."... very quickly makes it very obvious who is not contributing value to the team and thus, the company.  Is there a place for managers? Absolutely.  Do I think most of them sit on their butt most of the day and do nothing... also yes.

That too can be a problem.

It is fine for a boring project, by which I mean one where it is obvious how to do it because you have done something very similar before. In such projects you can just steam ahead throwing lots of little bits of functionality together, in the knowledge they will all work together as expected. CRUD projects are classics (create read update delete).

It fails for interesting projects where any of these apply:
  • you are inventing something
  • you are finding a path through new territory, using new concepts
  • it is reasonable to expect that earlier work will have to be undone, as requirements/benefits become apparent
  • "thinking before doing" is more productive than "doing and finding it didn't work", a.k.a. "no time do do it right in the first place but always time to do it over"

I've made my luck: most of my projects had at least one of those characteristics :)
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 
The following users thanked this post: bpiphany

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 26907
  • Country: nl
    • NCT Developments
Re: Future Software Should Be Memory Safe, says the White House
« Reply #99 on: March 06, 2024, 05:54:50 pm »
"thinking before doing" is more productive than "doing and finding it didn't work", a.k.a. "no time do do it right in the first place but always time to do it over"
This typically ends up as: "no time to do it right in the first place and NO time to do it over"

In some projects I have consulted on, I had to put up quite a fight to convince management that the only way forward was to take a few steps back and do the design properly, to avoid problems piling up so high that the company goes under. Things get very ugly when people try to design hardware scrum style..  :scared:
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

