Author Topic: Future Software Should Be Memory Safe, says the White House (Read 3744 times)

tggzzz · « **Reply #50 on:** March 04, 2024, 10:36:27 am »

Quote from: nctnico on March 04, 2024, 01:04:28 am

Quote from: tggzzz on March 02, 2024, 09:13:15 am
To make that tractable you need to constrain the problem, and that is best done by having a constrained language and environment. The trick is to have constraints which are easy enough to work with and still offer useful guarantees for a large number of problems.
Which is why it makes much sense to have a thin C/ C++ layer and use sand-boxed languages like Lua or Python to implement the logic. A program can still crash, but the C / C++ layer can do a graceful recovery.

That's one workable approach. There are others, and there will need to be more invented in the future.

Marco · « **Reply #51 on:** March 04, 2024, 02:22:24 pm »

Quote from: Siwastaja on March 04, 2024, 08:28:32 am

A practical simple example would be inertial navigation system which does dead reckoning.

Or a centrifuge controller.

Nominal Animal · « **Reply #52 on:** March 04, 2024, 03:50:25 pm »

If a C compiler propagated array bounds checking across function prototypes at compile time, allowed modifier variables to be listed after the variably modified array instead of only before, then it would be rather easy to write memory-safe code in C, too.

In practice, it'd mean you'd declare e.g. strncmp() as
int strncmp(const char s1[n], const char s2[n], size_t n);
which would be compatible with all existing C code, except that the compiler could check at compile time that n does not exceed the array bounds.

This is valid today, ever since ISO C99, although one must declare n before the array parameters. Unfortunately, compilers don't bother to check at compile time whether the array bounds thus specified exceed the array bounds it knows about. They could; it's just that it isn't considered important or useful for anybody to implement it yet.
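As a minimal sketch of the form that is valid since C99 (size declared first), something like the following compiles today. The name bounded_compare() is hypothetical, a stand-in for strncmp() with the parameter order changed, and the sketch assumes a compiler that supports variably modified parameter types:

```c
#include <assert.h>
#include <stddef.h>

/* bounded_compare() is a hypothetical stand-in for strncmp().
 * In C99 the size n has to be declared before the arrays that use it. */
int bounded_compare(size_t n, const char s1[n], const char s2[n])
{
    assert(s1 != NULL && s2 != NULL);

    for (size_t i = 0; i < n; i++) {
        const unsigned char a = (unsigned char)s1[i];
        const unsigned char b = (unsigned char)s2[i];
        if (a != b)
            return (a < b) ? -1 : 1;
        if (a == '\0')
            break;  /* both strings ended at the same position */
    }
    return 0;
}
```

The declaration documents which size belongs to which array; whether a given compiler uses that information for diagnostics at the call site is a separate matter.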

The only functions this cannot help with are those designed to work with all kinds of arrays via void pointers, i.e. memcpy(), memmove(), memset(), memcmp(), qsort(). For those, the compiler would need to support an array of unsigned char type (byte [] below, with noalias array attribute corresponding to restrict pointer attribute) with void * casting semantics, so that they could be declared as
byte[n] memcpy(noalias byte dest[n], const noalias byte src[n], size_t n);
byte[n] memmove(byte dest[n], const byte src[n], size_t n);
byte[n] memset(byte dest[n], const byte value, size_t n);
int memcmp(const byte src1[n], const byte src2[n], size_t n);
void qsort(byte base[n * size], size_t n, size_t size, int (*compare)(const byte src1[size], const byte src2[size]));

For the cases where data is obtained in the form of a pointer and size –– consider malloc(), realloc() for example –– a construct that associates a pointer and size (in unsigned chars/bytes, or elements) to form an array, would complete the set of features.
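No such construct exists in standard C, so the closest thing one can write today is a run-time stand-in: a small structure that keeps the pointer and its size together. The sketch below is only that, with all names (byte_span, span_alloc, span_set, span_get, span_free) being hypothetical. It checks bounds at run time, which is not the zero-overhead compile-time check proposed above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type: a pointer and its size in bytes, kept together. */
typedef struct {
    unsigned char *data;
    size_t         size;
} byte_span;

/* Allocate size zero-initialized bytes; on failure the span is empty. */
byte_span span_alloc(size_t size)
{
    byte_span s = { calloc(size, 1), size };
    if (!s.data)
        s.size = 0;
    return s;
}

/* Store value at index i; returns 0 on success, -1 if out of bounds. */
int span_set(byte_span s, size_t i, unsigned char value)
{
    if (i >= s.size)
        return -1;
    s.data[i] = value;
    return 0;
}

/* Load the byte at index i into *out; returns 0 on success, -1 if out of bounds. */
int span_get(byte_span s, size_t i, unsigned char *out)
{
    assert(out != NULL);
    if (i >= s.size)
        return -1;
    *out = s.data[i];
    return 0;
}

/* Release the memory and leave the span empty. */
void span_free(byte_span *s)
{
    assert(s != NULL);
    free(s->data);
    s->data = NULL;
    s->size = 0;
}
```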

With these features, programmers would still actually need to choose to write memory-access-safe code using the above pattern instead of always working with pointers, though.

All of the above generates the exact same machine code as the current declarations and implementations, too: there is absolutely no run-time overhead associated with any of this. The entire idea is to expose the implicit array bounds at compile time, propagate them through function calls by letting the compiler know how each pointer and size are associated with each other (forming an "array"), and check for possible bounds breaking at compile time.

The fact that we have Objective-C and C++, but lack even compile-time bounds checking for variably-modified array bounds in C, tells me memory safety is more a niche and politics and research subject than a practical request by those doing programming to implement things in practice. In practice, memory safety is really just a programming attitude, an approach, rather than some magical property of the programming language itself.

Marco · « **Reply #53 on:** March 04, 2024, 04:28:30 pm »

Quote from: Nominal Animal on March 04, 2024, 03:50:25 pm

practical request by those doing programming to implement things in practice

Indeed, some check writers have decided to stop listening to them though. In practice, the programmers will request money before C.

Nominal Animal · « **Reply #54 on:** March 04, 2024, 04:48:27 pm »

Quote from: Marco on March 04, 2024, 04:28:30 pm

Quote from: Nominal Animal on March 04, 2024, 03:50:25 pm
practical request by those doing programming to implement things in practice

Indeed, some check writers have decided to stop listening to them though. In practice, the programmers will request money before C.

I'm including the check writers in "those doing programming to implement things in practice", i.e. as in entire companies.

Put simply, those using C to get stuff done, aren't interested in memory safety.

Those who are interested in memory safety and are paid to get stuff done, either target a specialized niche (aviation, medical, etc) or have specialized in a particular programming language (Ada, Fortran, COBOL, Java, JavaScript, etc).

Those who are interested in memory safety seem to be talking politics, pushing their new programming language, or doing research; and are not being paid to create real-world software tools, utilities and applications. It is one thing to stand next to workers and talk about how they should be doing their work, and quite another to do the work well and find ways to do it even better.

tggzzz · « **Reply #55 on:** March 04, 2024, 05:24:44 pm »

Quote from: Nominal Animal on March 04, 2024, 03:50:25 pm

If a C compiler propagated array bounds checking across function prototypes at compile time, allowed modifier variables to be listed after the variably modified array instead of only before, then it would be rather easy to write memory-safe code in C, too.

Don't forget the consequences of aliasing.

Don't forget the consequences of what happens inside the library code that your [remarkably and completely competent] programmers and organisation didn't write.

Quote

The fact that we have Objective-C and C++, but lack even compile-time bounds checking for variably-modified array bounds in C, tells me memory safety is more a niche and politics and research subject than a practical request by those doing programming to implement things in practice. In practice, memory safety is really just a programming attitude, an approach, rather than some magical property of the programming language itself.

Memory safety is "ignored" because antiquated tools don't offer it. That's the equivalent of putting hands over your eyes and fingers in your ears.

Where safety does matter, rules have been created to sub-set the antiquated languages. Adherence to the rules is not trivial.

If modern tools avoid problems, they should be used.

Nominal Animal · « **Reply #56 on:** March 04, 2024, 05:40:54 pm »

Quote from: tggzzz on March 04, 2024, 05:24:44 pm

Don't forget the consequences of aliasing.

Read the post. noalias keyword would be equivalent to restrict for pointers.

You can argue whether aliasing should be marked the opposite way (i.e., aliased data forbidden unless marked by may_alias or similar), but that is not a technical point: it is a social one, and revolves around how to entice programmers to use the tools they have properly.

Quote from: tggzzz on March 04, 2024, 05:24:44 pm

Don't forget the consequences of what happens inside the library code that your [remarkably and completely competent] programmers and organisation didn't write.

You mean, because the world is full of shit, it makes no sense to generate anything better than shit?

This is compile-time bounds checking via declared interfaces. If a library exposes a function foo(byte a[n], size_t n, byte b[m], size_t m), we can only assume it is correctly implemented and only accesses the data at indexes 0 through n-1 and 0 through m-1, inclusive. This applies to all compiled languages, even Rust. Thing is, the compiler can check at the caller whether specifying such arrays is safe, at compile time.

If that library was implemented with the same approach, then the entire chain is bounds-safe. The compiler verified that the library function does not access the arrays out-of-bounds, when compiling the library. It just is that damn simple.

tggzzz · « **Reply #57 on:** March 04, 2024, 07:21:48 pm »

Quote from: Nominal Animal on March 04, 2024, 05:40:54 pm

Quote from: tggzzz on March 04, 2024, 05:24:44 pm
Don't forget the consequences of aliasing.
Read the post. noalias keyword would be equivalent to restrict for pointers.

Last I looked, which was a long time ago, noalias means that the compiler can assume there is no aliasing, and can optimise the shit out of the code. Whether there is aliasing is a completely different issue. Halting problem springs to mind.
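For reference, the qualifier that standard C actually has is restrict (since C99); noalias is not a C keyword. A minimal sketch of how it is used follows, with scale_into() being a hypothetical function. The qualifier is a promise made by the programmer, and the language does not require the compiler to verify it at the call site:

```c
#include <assert.h>
#include <stddef.h>

/* scale_into() is a hypothetical example. The restrict qualifiers tell the
 * compiler that dst and src refer to non-overlapping storage, so it may
 * reorder or vectorize the loads and stores freely. */
void scale_into(size_t n, double *restrict dst, const double *restrict src, double k)
{
    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}
```

Calling this with overlapping ranges of the same array has undefined behavior, and nothing in the language obliges the compiler to diagnose it.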

Dennis Ritchie, who knows far more about such things than I do, apparently "dislikes" noalias.
https://www.yodaiken.com/2021/03/19/dennis-ritchie-on-alias-analysis-in-the-c-programming-language-1988/

Quote

Quote from: tggzzz on March 04, 2024, 05:24:44 pm
Don't forget the consequences of what happens inside the library code that your [remarkably and completely competent] programmers and organisation didn't write.
You mean, because the world is full of shit, it makes no sense to generate anything better than shit?

The world is - and always will be - full of shit programmers and business practices. Deal with it accordingly.

Don't cover your eyes and hope it improves. It won't.

Siwastaja · « **Reply #58 on:** March 04, 2024, 07:36:17 pm »

Quote from: tggzzz on March 04, 2024, 07:21:48 pm

Last I looked, which was a long time ago, noalias means

Maybe you missed the fact there is no "noalias" keyword in C. This was a suggestion.

Marco · « **Reply #59 on:** March 04, 2024, 07:48:07 pm »

Quote from: Nominal Animal on March 04, 2024, 04:48:27 pm

Those who are interested in memory safety and are paid to get stuff done

As I said, Apple made an entire memory safe C dialect for the iOS bootloader. Few companies will go to that much effort to maintain their own system programming language. The check writers have been led by the nose by the programmers for decades, they simply had no alternative and were propagandised by the programmers to prevent them from even investing in possible alternatives.

Alternatives are arising now, decades late. American government is a huge check writer, if the press release gets translated into procurement requirements the alternatives will see a lot of investment.

tggzzz · « **Reply #60 on:** March 04, 2024, 07:59:23 pm »

Quote from: Siwastaja on March 04, 2024, 07:36:17 pm

Quote from: tggzzz on March 04, 2024, 07:21:48 pm
Last I looked, which was a long time ago, noalias means

Maybe you missed the fact there is no "noalias" keyword in C. This was a suggestion.

What makes you think I missed it?

I haven't spent much time on C since the committee spent years debating whether it should be possible or impossible to "cast away const". There are good arguments for and against either decision, which is a good indication that there are fundamental problems lurking in the language.

Now, is it possible, within the language specification, to "cast away noalias"? If you or a library/debugger/etc does, are there any guarantees about what happens - or are nasal daemons possible?

uliano · « **Reply #61 on:** March 04, 2024, 08:11:46 pm »

Quote from: baldurn on March 02, 2024, 02:59:55 pm

Quote from: Siwastaja on March 02, 2024, 02:43:22 pm
While a good design goal for an abstraction layer, I have never seen any of this in real life.

In the case of Rust and common MCUs like STM32 the open source people already solved this. We have good crates with HALs with all the bells and whistles.

https://docs.rs/stm32-hal2/latest/stm32_hal2/

And then you follow the link and read this:

Errata

SDIO and ethernet unimplemented
DMA unimplemented on F4, and L552
H7 BDMA and MDMA unimplemented
H5 GPDMA unimplemented
USART interrupts unimplemented on F4
CRC unimplemented for F4
High-resolution timers (HRTIM), Low power timers (LPTIM), and low power usart (LPUSART) unimplemented
ADC unimplemented on F4
Low power modes beyond csleep and cstop aren't implemented for H7
WB and WL are missing features relating to second core operations and RF
L4+ MCUs not supported
WL is missing GPIO port C, and GPIO interrupt support
If using PWM (or output compare in general) on an Advanced control timer (eg TIM1 or 8), you must manually set the TIMx_BDTR register, MOE bit.
Octospi implementation is broken
DFSDM on L4x6 is missing Filter 1.
Only FDCAN1 is implemented; not FDCAN2 or 3 (G0, G4, H7).
H5 is missing a lot of functionality, including DMA.

Nominal Animal · « **Reply #62 on:** March 04, 2024, 09:20:46 pm »

Quote from: tggzzz on March 04, 2024, 07:59:23 pm

I haven't spent much time on C since the committee spent years debating whether it should be possible or impossible to "cast away const". There are good arguments for and against either decision, which is a good indication that there are fundamental problems lurking in the language.

If you consider that relevant to memory-safety, then by the same logic Rust isn't memory-safe, because it allows the programmer to write unsafe code.

Quote from: tggzzz on March 04, 2024, 07:59:23 pm

Now, is it possible, within the language specification, to "cast away noalias"?

In general, I do not believe a programming language should cater to the least common denominator, i.e. to try and stop people from shooting themselves in the face with the code they write. I am not willing to trade any efficiency or performance for safety, because I can do that myself at run time.

(I often use C99 flexible array members in a structure with both the number of elements allocated for (size) and the number of elements in use (used) in the array. It is trivial to check accesses are within the range of used elements, and when additional room has to be allocated for additional elements. You can even use the same "trick" as ELF uses internally, and reserve the initial element for invalid or none; only positive indexes are valid then, and you don't need to abort at run time.)
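A minimal sketch of that pattern, assuming nothing beyond standard C99; the names intvec, intvec_append and intvec_get are hypothetical, and the doubling growth policy is just one reasonable choice:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable vector using a C99 flexible array member. */
struct intvec {
    size_t size;    /* number of elements allocated for */
    size_t used;    /* number of elements in use */
    int    item[];  /* flexible array member */
};

/* Append value, growing the allocation when needed.
 * Returns 0 on success, -1 if memory could not be obtained. */
int intvec_append(struct intvec **vp, int value)
{
    assert(vp != NULL);

    struct intvec *v = *vp;
    const size_t used = v ? v->used : 0;
    const size_t size = v ? v->size : 0;

    if (used >= size) {
        const size_t new_size = size ? 2 * size : 4;
        /* Reject sizes that would overflow the byte count. */
        if (new_size <= size ||
            new_size > (SIZE_MAX - sizeof *v) / sizeof v->item[0])
            return -1;

        struct intvec *grown = realloc(v, sizeof *v + new_size * sizeof v->item[0]);
        if (!grown)
            return -1;  /* the old vector, if any, is still valid */

        grown->size = new_size;
        grown->used = used;
        *vp = v = grown;
    }

    v->item[v->used++] = value;
    return 0;
}

/* Read element i; returns 0 on success, -1 if i is outside the used range. */
int intvec_get(const struct intvec *v, size_t i, int *out)
{
    if (!v || !out || i >= v->used)
        return -1;
    *out = v->item[i];
    return 0;
}
```

Starting from a NULL pointer, repeated intvec_append(&v, x) calls build the vector, and a single free(v) releases it.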

In practice, whether casting away const or noalias/restrict should be allowed or not, depends on the exact situation and context. It is more about style and a part of code quality management tools an organization might use.

If you care enough, you can use _Generic() since C11 to map the calls with non-const/non-noalias/non-restrict arguments to one variant that can deal with that, and the others to the optimized version which relies on const/noalias/restrict. You can even simply clone the symbol, and not even wrap the function calls with pre-vetted casts or copy-paste the code, when the two versions are effectively the same.

Very little C code uses _Generic(), though; typically only stuff effectively like
#define sqrt(x) _Generic(x, float: sqrtf, long double: sqrtl, _Float128: sqrtf128, default: sqrt)(x)
but it does work for qualifiers. For example, you could have
size_t strnlen_cuc(const unsigned char s[n], size_t n);
size_t strnlen_csc(const signed char s[n], size_t n);
size_t strnlen_cc(const char s[n], size_t n);
size_t strnlen_uc(unsigned char s[n], size_t n);
size_t strnlen_sc(signed char s[n], size_t n);
size_t strnlen_c(char s[n], size_t n);
#define strnlen(s, n) _Generic(s, const unsigned char *: strnlen_cuc, const signed char *: strnlen_csc, const char *: strnlen_cc, unsigned char *: strnlen_uc, signed char *: strnlen_sc, char *:strnlen_c)(s, n)
with all six being just aliases to the same strnlen function symbol (because the machine-code implementation stays exactly the same in all six cases).

For functions like strstr() you'd have many more symbols, yes, but the return value would have the correct qualifiers (based on the first argument), which the compiler can enforce.
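A minimal sketch of the return-qualifier idea, using strchr() instead of strstr() to keep it short. The names find_char, find_char_c and find_char_m are hypothetical, and the two wrappers are written out as separate functions here rather than aliased symbols, since symbol aliasing is toolchain-specific:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrappers: identical work, different qualifiers. */
const char *find_char_c(const char *s, int c)
{
    assert(s != NULL);
    return strchr(s, c);
}

char *find_char_m(char *s, int c)
{
    assert(s != NULL);
    return strchr(s, c);
}

/* Select the variant from the type of the first argument, so that the
 * result is const-qualified exactly when the argument is. */
#define find_char(s, c) _Generic((s),  \
        const char *: find_char_c,     \
        char *:       find_char_m)(s, c)
```

With this, assigning find_char(const_string, 'x') to a plain char * draws a compiler diagnostic about the discarded qualifier, while the mutable case needs no cast at all.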

So, it's not like we don't already have tools to solve the issues like avoiding having to cast away const'ness, simply by declaring all the acceptable variants and selecting the appropriate one at compile time using _Generic(); the issue is that C programmers do not want to.

Simply put, the issue is social, not technological. If a C programmer wants to write memory-safe code, they will need to replace the standard C library with something better, or just use trivial wrappers (basically generating no extra code) around them, but then they absolutely can if they wish to.

Nominal Animal · « **Reply #63 on:** March 04, 2024, 09:26:56 pm »

Now, if the intent is to get even idiots and LLMs to write "safe" code, then the language needs to be designed from the get go for those who otherwise would be sitting in a quiet corner stuffing crayons up their nostrils and eating white glue.

I'm not interested in those. Given the choice between dangerous-but-unlimited and safe-but-limited, I always choose the first one, because I can do "safe" myself. Again, the large majority of existing C code is crappy not because C itself is crappy, but because the majority of C users are not interested in writing non-crappy code. One can write robust, safe code even in PHP (gasp!), although it does require some configuration settings to be set to non-insane values.

(Anyone else remember magic quotes? More like "set this if you like to eat white glue and don't recognize all the letters of the alphabet yet, or are in too much of a hurry to even read what you wrote".)

JPortici · « **Reply #64 on:** March 04, 2024, 09:56:51 pm »

Quote from: uliano on March 04, 2024, 08:11:46 pm

Quote from: baldurn on March 02, 2024, 02:59:55 pm
Quote from: Siwastaja on March 02, 2024, 02:43:22 pm
While a good design goal for an abstraction layer, I have never seen any of this in real life.

In the case of Rust and common MCUs like STM32 the open source people already solved this. We have good crates with HALs with all the bells and whistles.

https://docs.rs/stm32-hal2/latest/stm32_hal2/

And then you follow the link and read this:
Errata

SDIO and ethernet unimplemented
DMA unimplemented on F4, and L552
H7 BDMA and MDMA unimplemented
H5 GPDMA unimplemented
USART interrupts unimplemented on F4
CRC unimplemented for F4
High-resolution timers (HRTIM), Low power timers (LPTIM), and low power usart (LPUSART) unimplemented
ADC unimplemented on F4
Low power modes beyond csleep and cstop aren't implemented for H7
WB and WL are missing features relating to second core operations and RF
L4+ MCUs not supported
WL is missing GPIO port C, and GPIO interrupt support
If using PWM (or output compare in general) on an Advanced control timer (eg TIM1 or 8), you must manually set the TIMx_BDTR register, MOE bit.
Octospi implementation is broken
DFSDM on L4x6 is missing Filter 1.
Only FDCAN1 is implemented; not FDCAN2 or 3 (G0, G4, H7).
H5 is missing a lot of functionality, including DMA.

That post and the other were a bit of propaganda that I chose not to answer.
What's the point of HALs anyway? An MCU embedded system is not a Linux SBC that can accommodate tons of different applications from the same base board plus daughterboard (or that is running Linux, which requires special considerations when writing firmware anyway because it's being as generic as possible). When your company designs a board and writes firmware for it, the people designing the hardware will KNOW if the pin can do this or that function, then the firmware writer will KNOW what function to assign, and what peripheral to use. A HAL is not an excuse for never reading the datasheet, hence it's basically a waste of time.
I'm sure Rust can bring other benefits to the table, but in my opinion not when you are too close to the metal, have to write drivers, or do anything that interacts a lot with peripherals. In a few months I'll probably give it yet another try and see if things have improved. They have, since last time.

SiliconWizard · « **Reply #65 on:** March 04, 2024, 11:24:52 pm »

Quote from: Nominal Animal on March 04, 2024, 09:26:56 pm

Now, if the intent is to get even idiots and LLMs to write "safe" code, then the language needs to be designed from the get go for those who otherwise would be sitting in a quiet corner stuffing crayons up their nostrils and eating white glue.

I'm not interested in those. Given the choice between dangerous-but-unlimited and safe-but-limited, I always choose the first one, because I can do "safe" myself. Again, the large majority of existing C code is crappy not because C itself is crappy, but because the majority of C users are not interested in writing non-crappy code. One can write robust, safe code even in PHP (gasp!), although it does require some configuration settings to be set to non-insane values.

Yeah, how dare you? You're a memory safety denier.

tggzzz · « **Reply #66 on:** March 04, 2024, 11:46:19 pm »

Quote from: Nominal Animal on March 04, 2024, 09:20:46 pm

Quote from: tggzzz on March 04, 2024, 07:59:23 pm
I haven't spent much time on C since the committee spent years debating whether it should be possible or impossible to "cast away const". There are good arguments for and against either decision, which is a good indication that there are fundamental problems lurking in the language.
If you consider that relevant to memory-safety, then by the same logic Rust isn't memory-safe, because it allows the programmer to write unsafe code.

There are degrees of unsafety, as you are well aware.

Quote

Quote from: tggzzz on March 04, 2024, 07:59:23 pm
Now, is it possible, within the language specification, to "cast away noalias"?
In general, I do not believe a programming language should cater to the least common denominator, i.e. to try and stop people from shooting themselves in the face with the code they write. I am not willing to trade any efficiency or performance for safety, because I can do that myself at run time.

I accept you are a perfect programmer that writes all the code in your application, and that you work with perfect compilers that correctly implement the full standard.

Lucky you.

Quote

In practice, whether casting away const or noalias/restrict should be allowed or not, depends on the exact situation and context. It is more about style and a part of code quality management tools an organization might use.

You miss the point.

If you can't cast away noalias/const then you can't write some tools, debuggers being the classic example.
If you can cast away noalias/const then the compiler can (and with higher optimisation levels, will) generate incorrect code.
That is an insoluble dilemma with only one sensible answer: the dilemma should be un-asked.

Now, what's your response to

Quote from: tggzzz on March 04, 2024, 07:59:23 pm

... If you or a library/debugger/etc does [cast away const/noalias], are there any guarantees about what happens - or are nasal daemons possible?

That question is valid, and can't be ignored.
Do other parties all agree with your response?

Quote

Simply put, the issue is social, not technological.

I start from assuming this world and its inhabitants.

I would like to live in A Better World, but so far I haven't succeeded.

Nominal Animal · « **Reply #67 on:** March 05, 2024, 12:00:50 am »

Quote from: SiliconWizard on March 04, 2024, 11:24:52 pm

Quote from: Nominal Animal on March 04, 2024, 09:26:56 pm
Now, if the intent is to get even idiots and LLMs to write "safe" code, then the language needs to be designed from the get go for those who otherwise would be sitting in a quiet corner stuffing crayons up their nostrils and eating white glue.

I'm not interested in those. Given the choice between dangerous-but-unlimited and safe-but-limited, I always choose the first one, because I can do "safe" myself. Again, the large majority of existing C code is crappy not because C itself is crappy, but because the majority of C users are not interested in writing non-crappy code. One can write robust, safe code even in PHP (gasp!), although it does require some configuration settings to be set to non-insane values.

Yeah, how dare you? You're a memory safety denier.

Joking aside, there are use cases for domain-specific and general purpose scripting languages like Lua and Python and JavaScript, that can be embedded in a "native" application or service, and let nontechnical users specify business logic or UI elements and actions in a "safe" manner.

I'm personally happy to provide for them, and embed whatever interpreter they prefer; I like such designs, actually. I'm just not interested in limiting myself to working with such programming languages only. What I want, is maximal compile-time verification without runtime overhead.

For those familiar with Fortran 95 and later, it is interesting to consider its array implementation. Essentially every array is passed as a triplet (origin, stride, count), which allows all regular linear slice operations to work without having to copy data. There is no reason one could not implement the same in C. In fact, I've mentioned a tiny variable-size vector-matrix library I created, where matrices and vectors are just views to arbitrary data –– lots of aliasing here! –– which relies on the same. For matrix element r,c, the offset relative to data origin is r*rowstride+c*colstride. This allows one to have a matrix, and separately a vector of its diagonal or off-diagonal elements. On desktop-class machines, the extra multiplication per element access is basically irrelevant, and for array scanning, it converts to additive (signed) constants anyway. Any kind of mirrored or transposed view is just a modification of the strides and data origin. As each vector is one-dimensional with a number of elements, and each matrix has the number of rows and columns it contains, runtime bounds checking is lightweight (two unsigned comparisons).
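A minimal sketch of such strided views follows. It is not the library mentioned above; the names matview, vecview, mat_at, vec_at, mat_transposed and mat_diagonal are hypothetical, and the sketch only shows the offset formula r*rowstride + c*colstride together with the two unsigned comparisons used for bounds checking:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view types: they refer to data owned elsewhere. */
typedef struct {
    double   *origin;
    ptrdiff_t rowstride, colstride;
    size_t    rows, cols;
} matview;

typedef struct {
    double   *origin;
    ptrdiff_t stride;
    size_t    count;
} vecview;

/* Pointer to element (r, c), or NULL if out of bounds. */
double *mat_at(matview m, size_t r, size_t c)
{
    assert(m.origin != NULL);
    if (r >= m.rows || c >= m.cols)
        return NULL;
    const ptrdiff_t offset = (ptrdiff_t)r * m.rowstride + (ptrdiff_t)c * m.colstride;
    return m.origin + offset;
}

/* Pointer to element i, or NULL if out of bounds. */
double *vec_at(vecview v, size_t i)
{
    assert(v.origin != NULL);
    if (i >= v.count)
        return NULL;
    return v.origin + (ptrdiff_t)i * v.stride;
}

/* Transposed view: swap the strides and the dimensions; no data is copied. */
matview mat_transposed(matview m)
{
    matview t = { m.origin, m.colstride, m.rowstride, m.cols, m.rows };
    return t;
}

/* Main diagonal as a vector view over the same data. */
vecview mat_diagonal(matview m)
{
    vecview d = { m.origin, m.rowstride + m.colstride,
                  (m.rows < m.cols) ? m.rows : m.cols };
    return d;
}
```

For a row-major 2×3 array, the view is { data, 3, 1, 2, 3 }. Writing through the diagonal view changes the underlying matrix, which is exactly the deliberate aliasing described above.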

For an object-oriented, dynamically typed, memory-safe language, use JavaScript. The current JIT compilers generate surprisingly effective code, but it isn't something you want to run on, say, a microcontroller, unless it is compiled to some byte-code representation; those tend to be stack-based, though, and not register-based like AVRs and ARM Cortex-M cores are.

Quote from: tggzzz on March 04, 2024, 11:46:19 pm

I start from assuming this world and its inhabitants.

I start by claiming that it is impossible to create a world safe for all humans, and yet have any kind of free will or choice. Instead, I want to maximize the options each individual has. That includes tools that help, but do not enforce, with things like memory safety.

Quote from: tggzzz on March 04, 2024, 11:46:19 pm

I would like to live in A Better World, but so far I haven't succeeded.

I do not, because I cannot define exactly what a Better World would be, without modifying humans. (And that would be tyranny by definition.)

I am suggesting making incremental changes, with minimal disruption, so that programmers could apply the existing tools more effectively.
(Here, to apply compile-time bounds checking to all array accesses, after modifying the code to use array notation instead of pointer notation.)

You seem to suggest scrapping everything we have, and replacing it with something new that fixes all known problems at once. History shows that that rarely works, and usually leads to chaos and suffering. It has worked when the new has been a voluntary option, and the marketplace of human endeavours has preferred it, but even then, it has just replaced the old set of problems with new ones.
Therefore, the world of software not being A Better World is a social problem and not a technical one.

An incremental change for the better has better chances of effecting a Change towards Better, because it requires minimal effort from those using the language to adopt the new features. If the results have a competitive edge, then programmers will do it; otherwise they do not. Again, a social conundrum, not a technical one. Yet, the smaller the change, the smaller the social nudge needed.

I hope you are not suggesting to force a specific technical solution onto everyone? You'll have better luck becoming the Emperor of Earth than succeeding with that, I think. That kind of dreaming belongs in "What If?" fiction, not in any kind of technical discussion.

SiliconWizard · « **Reply #68 on:** March 05, 2024, 12:20:10 am »

Can be interesting to look at: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=rust

nctnico · « **Reply #69 on:** March 05, 2024, 01:01:57 am »

Quote from: Nominal Animal on March 04, 2024, 04:48:27 pm

Quote from: Marco on March 04, 2024, 04:28:30 pm
Quote from: Nominal Animal on March 04, 2024, 03:50:25 pm
practical request by those doing programming to implement things in practice

Indeed, some check writers have decided to stop listening to them though. In practice, the programmers will request money before C.
I'm including the check writers in "those doing programming to implement things in practice", i.e. as in entire companies.

I'm kind of trained that way early on in my career and still have lots of checks in my code to make it robust but it makes programming in C super tedious. But it is hard to convince others of programming with a similar approach. Mastering C to a level to do something useful is hard enough.

tggzzz · « **Reply #70 on:** March 05, 2024, 08:53:49 am »

Quote from: nctnico on March 05, 2024, 01:01:57 am

Quote from: Nominal Animal on March 04, 2024, 04:48:27 pm
Quote from: Marco on March 04, 2024, 04:28:30 pm
Quote from: Nominal Animal on March 04, 2024, 03:50:25 pm
practical request by those doing programming to implement things in practice

Indeed, some check writers have decided to stop listening to them though. In practice, the programmers will request money before C.
I'm including the check writers in "those doing programming to implement things in practice", i.e. as in entire companies.
I'm kind of trained that way early on in my career and still have lots of checks in my code to make it robust but it makes programming in C super tedious. But it is hard to convince others of programming with a similar approach. Mastering C to a level to do something useful is hard enough.

Just so, but it is impractical (if not impossible) for your code to check some things, e.g. aliasing.

Problems do arise when managers/businesses don't want to pay for thorough checks, since "non-functional" code looks like a waste, and is against TDD/agile religious practices.

(N.B. for the avoidance of doubt, there is some value in TDD/agile practices - but not in rigorous adherence to the religious tenets)

cosmicray · « **Reply #71 on:** March 05, 2024, 11:12:23 am »

Quote from: tggzzz on March 02, 2024, 12:18:56 pm

Quote from: cosmicray on March 02, 2024, 11:59:26 am
Because of this WH announcement (which I read yesterday), one of the footnotes led me to the Google Project Zero (which I have been aware of for a while), and that led me to a blog post about the NSO zero-click iMessage exploit https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-into-nso-zero-click.html. I was only vaguely aware of Pegasus and NSO (this was in the news a year or so back), but the actual exploit, and the mindset that it took to write it, is heart stopping.

This is likely a prime candidate for why software (in general) and those parts which are widely used (in particular) need to have a much cleaner attack footprint. Who knew that an image parser could be manipulated in this way?

It is also an illustration that, while your company/team may have perfectly adept C programmers, what about that library from another company and how it interacts with something else that your perfect programmers didn't develop.

Something that is under-appreciated about that exploit is that you don't need to be connected to the internet for it to run. If you have/had an air-gapped (or firewalled) phone / computer / laptop / etc, the mere fact that you rendered that specially crafted PDF document (which could be a datasheet) is all it took. Once it is infected, the air-gapped device might not be so air-gapped after all.

Nominal Animal · « **Reply #72 on:** March 05, 2024, 12:28:30 pm »

Quote from: tggzzz on March 05, 2024, 08:53:49 am

Problems do arise when managers/businesses don't want to pay for thorough checks

Here we are in violent agreement.

One could say there are two completely separate domains being discussed here. One is the one that affects the majority of code being written, basically being thrown together as fast as possible with the least effort, to get paid by customers who are satisfied with appearances. The other is the one I'm interested in, where reducing bug density and increasing the reliability of the software is a goal.

I am not really interested in the former. That does not mean it is an irrelevant domain; it is very relevant for real-life purposes. I do believe it is better served by high-level abstraction languages like Python and JavaScript, and not by low-level languages, based on my own experience as a full-stack developer for a few years. (One example of this is the pattern where I recommend writing user interfaces in Python + Qt, with heavy computation and any proprietary code in a native library referred to via the ctypes module.)

I suspect a large swath of the former domain will be covered by code generators based on LLMs, because so much of it is basically copy-paste code anyway, with very little to no "innovation" per se. Code monkey stuff.

I am interested in the latter, but do not believe newer languages will solve all the problems: they will just replace the old set of problems with a new one, because humans just aren't very good at this sort of design, not yet anyway. I do hope Rust and others will become better than what we have now, but they're not there yet. For the reasons I described earlier, I suggest better overall results would be achieved faster by smaller incremental changes, for example those I described for C. History provides additional examples: Objective-C and C++ came about when those who wrote code decided the C tool they had was insufficient. (It is important to realize how often this happens: git, for example. It does not happen by creating a perfect language, then forcing others to use it. It only happens if you use it yourself to do useful stuff, and others decide they find your way more effective than what they use now.)

There are languages with quite long histories that are memory-safe. Ada was already mentioned; another is Fortran 95. (The interesting thing about Fortran 95/2003 is the comparison to C pointers: in Fortran, arrays are the first-level object type, with pointers only an extension to arrays.) Yet, these tend to only live in their niches. PHP is a horrible example of what kind of a mess you may end up with if you try to cater for all possible paradigms: for some things, like string manipulation, it has at least two completely separate interfaces (imperative and object-oriented). Unfortunately, Python shows partial signs of this too, what with its strings/bytes separation, and increasing number of string template facilities. Point is, even though these languages can be shown to be technically superior to C in many ways, they are not nearly as popular. Why?
Because technical superiority does not correlate with popularity, when humans are involved. To fix a popularity problem, a social problem, like programmers and companies being happy to produce buggy code and customers being happy to pay for buggy code, you need to apply social/human tools, not technological ones.

We do not get better software developers by teaching them the programming language du jour; we get better software developers by convincing them to try harder to not create bugs, to use the tools they have available to support them in detecting and fixing issues when they do happen. But most of all, we'd need to convince customers and business leadership that buying and selling buggy code is counterproductive, and that we can do better if we choose to. All we need to do is choose to.

Now that low-quality tech gadgets are extremely easily available from online stores, some humans are realizing getting the lowest price ones may not be the smart choice long-term: you end up paying more, because you keep buying the same crappy tools again and again. Or renting software, in the hopes that the vendor will fix the issues you see, and you won't be stuck at the previous version with all the bugs in it because the vendor chose to fix them in the next version instead, which you'd need to pay to upgrade to.

In a very real way, software companies today are in a very similar position to mining companies a century and a half ago. They, too, could do basically what they pleased, and had their own "company towns" where employees had to rent from the company and buy company wares to survive. Campaign contributions to politicians kept those companies' operations untouched, until people got fed up with it. I'm waiting for people to get fed up with how crappy software, generally speaking, is. I just don't want a bloody revolution, just incremental changes that help fair competition, if that is what people want.

Apologies for the wall of text once again.

DiTBho · « **Reply #73 on:** March 05, 2024, 12:37:03 pm »

Quote from: tggzzz on March 04, 2024, 11:46:19 pm

If you can cast away noalias/const then the compiler can (and with higher optimisation levels, will) generate incorrect code.

this was precisely the case for MIPS5++... a bloodbath that I remember well and that I wouldn't wish on anyone

coppice · « **Reply #74 on:** March 05, 2024, 12:43:55 pm »

Quote from: tggzzz on March 04, 2024, 11:46:19 pm

If you can cast away noalias/const then the compiler can (and with higher optimisation levels, will) generate incorrect code.

More basically, if you can cast away const then const stuff can't go into NV memory, which would make C pretty useless for many kinds of machine. "const" was an absolutely essential feature for the MCU world. For the very first pass of standardising C in the 1980s, const had to go in to make it a suitable language for the embedded market.
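A minimal sketch of that point, assuming a typical embedded toolchain that places const objects with static storage duration in flash/ROM (the standard permits this but does not require it). The names sine_table and table_sum are hypothetical, chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup table. Because it is defined const, the toolchain
   is free to place it in non-volatile, read-only memory instead of RAM. */
static const uint8_t sine_table[8] = { 0, 49, 90, 117, 127, 117, 90, 49 };

/* Reading through a pointer-to-const is always fine. */
unsigned table_sum(const uint8_t *t, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += t[i];
    return sum;
}

/* Writing through a cast is undefined behaviour when the object itself
   was defined const, so it is deliberately left commented out:
       ((uint8_t *)sine_table)[0] = 1;
   On a part where the table lives in ROM, the store may fault or simply
   have no effect. */
```

Because the language leaves that write undefined, the compiler and linker are entitled to assume it never happens, which is what allows the table to live in NV memory in the first place.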

