Author Topic: Hidden gems of GCC and CLANG for a safer embedded development  (Read 4103 times)


Offline YTusernameTopic starter

  • Regular Contributor
  • *
  • Posts: 92
  • Country: 00
    • Atadiat
Hidden gems of GCC and CLANG for a safer embedded development
« on: October 15, 2023, 07:31:04 am »
Some of us may already know some of the flags mentioned in the Embedded Artistry article. If you want to extend your knowledge of GCC and Clang beyond '-Wall', then the article is for you. The flags presented are for GCC 12+ and Clang 18+. Some of the eye-catching ones:

  • -Wthread-safety: warns about potential race conditions.
  • _FORTIFY_SOURCE=<n>: compiler built-ins that add runtime bounds checking to common memory and string operations.
  • -ftrivial-auto-var-init=[uninitialized|pattern|zero]: initializes automatic variables that the code leaves uninitialized.
  • -fstack-protector: a “canary value” is placed on the stack when the function is entered. Before the function returns, the stored canary is checked; if it has changed, a stack buffer overflow has occurred, an error handler is invoked, and the program is terminated.
  • -fsanitize=signed-integer-overflow: checks that the result of +, *, and both unary and binary - does not overflow in signed arithmetic.
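In the same spirit as -fsanitize=signed-integer-overflow, GCC and Clang also expose checked-arithmetic built-ins that detect overflow without any sanitizer runtime. A minimal sketch (the wrapper name is mine):

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if a + b would overflow int; on success the sum is
 * written to *out.  __builtin_add_overflow is a GCC/Clang built-in
 * (siblings: __builtin_sub_overflow, __builtin_mul_overflow). */
bool add_checked(int a, int b, int *out)
{
    return __builtin_add_overflow(a, b, out);
}
```

Unlike the sanitizer, this works in release builds and lets the caller decide how an overflow should be handled.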

 
« Last Edit: October 15, 2023, 10:38:44 am by YTusername »
 
The following users thanked this post: paf

Offline betocool

  • Regular Contributor
  • *
  • Posts: 129
  • Country: au
Re: Hidden gems of GCC and CLANG for a safer embedded development
« Reply #1 on: October 15, 2023, 09:57:04 am »
The link is empty?

Cheers,

Alberto
 
The following users thanked this post: YTusername

Offline YTusernameTopic starter

  • Regular Contributor
  • *
  • Posts: 92
  • Country: 00
    • Atadiat
Re: Hidden gems of GCC and CLANG for a safer embedded development
« Reply #2 on: October 15, 2023, 10:39:06 am »
Thanks. Fixed.
 

Offline 5U4GB

  • Frequent Contributor
  • **
  • Posts: 796
  • Country: au
Re: Hidden gems of GCC and CLANG for a safer embedded development
« Reply #3 on: October 16, 2023, 01:07:27 pm »
Some more very important ones that you should always enable:

  • -fwrapv: Makes the compiler match what the hardware does rather than have the leeway to do whatever random thing the compiler writers feel like doing.
  • -fno-delete-null-pointer-checks: What the name says, and yes, gcc will silently delete checks for null pointers if you don't use this.
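A minimal sketch of the kind of code -fwrapv affects (the build line is illustrative): the classic wraparound test below depends on signed overflow wrapping, which standard C leaves undefined.

```c
/* Without -fwrapv, GCC may assume x + 1 never overflows and fold this
 * comparison to a constant 0; with -fwrapv (e.g. gcc -O2 -fwrapv)
 * two's-complement wrapping is guaranteed and the test is preserved. */
int wraps_on_increment(int x)
{
    return x + 1 < x;   /* intended to be true only when x == INT_MAX */
}
```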

Reference: "How ISO C became unusable for operating systems development", Victor Yodaiken, Proceedings of PLOS '21.
 
The following users thanked this post: YTusername

Offline zilp

  • Frequent Contributor
  • **
  • Posts: 351
  • Country: de
Re: Hidden gems of GCC and CLANG for a safer embedded development
« Reply #4 on: October 16, 2023, 06:14:17 pm »
-fwrapv: Makes the compiler match what the hardware does rather than have the leeway to do whatever random thing the compiler writers feel like doing.

Mind you that this can significantly cost you in performance. Also, that's a rather confused way to describe what the option does.

-fno-delete-null-pointer-checks: What the name says, and yes, gcc will silently delete checks for null pointers if you don't use this.

GCC will not silently delete checks for null pointers from C programs. Obviously.

GCC will only eliminate dead code when you check that a pointer that you have already dereferenced (and that therefore can not be NULL in a C program) is not NULL.
« Last Edit: October 16, 2023, 10:48:18 pm by zilp »
 
The following users thanked this post: newbrain, Jacon, SiliconWizard, Torsten Robitzki

Offline 5U4GB

  • Frequent Contributor
  • **
  • Posts: 796
  • Country: au
Re: Hidden gems of GCC and CLANG for a safer embedded development
« Reply #5 on: October 21, 2023, 11:55:07 am »
Mind you that this can significantly cost you in performance.  Also, that's a rather confused way to describe what the option does.

I've never noticed any difference in any code I've benchmarked.  Perhaps you meant "under exactly the right conditions it may be possible to notice a slight difference in performance on special test cases"?

For the description part, once the magic "UB" words are uttered the compiler can do whatever it wants, including reformat your hard drive.  So "do whatever random thing the compiler writers feel like" seems accurate.

GCC will not silently delete checks for null pointers from C programs. Obviously.

GCC will only eliminate dead code when you check that a pointer that you have already dereferenced (and that therefore can not be NULL in a C program) is not NULL.

In other words it'll silently delete null pointer checks from programs if certain, often nearly-impossible-to-detect, conditions are met, exactly as I said. I once spent the best part of a day tracking down a segfault in an OSS project that was caused by this; the developers still have no idea what it was that triggered gcc to delete the NULL pointer check.  Another OSS project changed the way it uses gcc because they felt the presence of this behaviour made it too dangerous to compile code with.

Other compilers (IAR, armcc, Keil, VS) don't do this, and even for gcc if you need to include a special compiler option just to fix something like this then the problem is the compiler, not its users.  Or at least the compiler's developers.

For a more detailed analysis, see for example the paper I referenced, "How ISO C became unusable for operating systems development", PLOS'21.
 

Offline zilp

  • Frequent Contributor
  • **
  • Posts: 351
  • Country: de
Re: Hidden gems of GCC and CLANG for a safer embedded development
« Reply #6 on: October 22, 2023, 01:25:08 am »
I've never noticed any difference in any code I've benchmarked.  Perhaps you meant "under exactly the right conditions it may be possible to notice a slight difference in performance on special test cases"?

No, I mean what I wrote.

For the description part, once the magic "UB" words are uttered the compiler can do whatever it wants, including reformat your hard drive.  So "do whatever random thing the compiler writers feel like" seems accurate.

That's not the confused part. Though it's kinda confused, too.

In other words it'll silently delete null pointer checks from programs if certain often nearly-impossible-to-detect conditions are met, exactly as I said

If your code is such that it's difficult to detect whether you are dereferencing NULL pointers, then ... your code should not be used in production anyway. What GCC does with your code is not your primary problem, and whether there is a NULL pointer check after the dereference is pretty much irrelevant.

I once spent the best part of a day tracking down a segfault in an OSS project that was caused by this, the developers still have no idea what it was that triggered gcc to delete the NULL pointer check.

So, which is it? They know that this caused the elimination of a NULL pointer check, or they don't know what caused the elimination of a NULL pointer check?

Also, it's just obvious nonsense that somehow this could cause segfaults. If you dereference a pointer, then check it for NULL, and then dereference it again, it's completely irrelevant whether the check is eliminated or not, as the first dereference will segfault, so execution never reaches the NULL pointer check anyway if the pointer is NULL.

Another OSS project changed the way it uses gcc because they felt the presence of this behaviour made it too dangerous to compile code with. 

... so what?

Other compilers (IAR, armcc, Keil, VS) don't do this, and even for gcc if you need to include a special compiler option just to fix something like this then the problem is the compiler, not its users.  Or at least the compiler's developers.

You don't need to include a special compiler option because there is nothing to fix. You are just completely confused about how compilers work. Also, other compilers also don't do anything defined when you dereference a NULL pointer, because there obviously aren't any sensible semantics for a NULL pointer dereference.

(The exception being when you write code that can access data at address 0, which just isn't a thing normally ... and that's the whole reason for why the option exists, so that you can use GCC to compile code for these exceptional cases if you need to.)
 
The following users thanked this post: newbrain

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 4106
  • Country: us
Re: Hidden gems of GCC and CLANG for a safer embedded development
« Reply #7 on: October 22, 2023, 04:49:04 am »
Also, it's just obvious nonsense that somehow this could cause segfaults. If you dereference a pointer, then check it for NULL, and then dereference it again, it's completely irrelevant whether the check is eliminated or not, as the first dereference will segfault, so execution never reaches the NULL pointer check anyway if the pointer is NULL.

That's just false.  The fact that you misunderstand this is a pretty good argument for the behavior being bad.  It is 100% possible to have code segfault on a null pointer exception because of a compiler removed null pointer check.

The key is that a null pointer dereference *may* cause a segfault but it isn't guaranteed to.  For instance, if the first dereference itself is optimized away for having no observable side effects.  Just doing pointer arithmetic on a null pointer is undefined even if you never dereference it.  Even though there is no illegal memory access and no segfault, it is still undefined behavior, and the compiler can still assume it doesn't happen and use that to optimize away future null checks.

Of course the code is still by definition wrong.  People aren't really upset about the segfault, they are upset about where the segfault happens: not in the code that actually has the bug but somewhere totally unrelated that has an explicit null pointer test.

Quote
If your code is such that it's difficult to detect whether you are dereferencing NULL pointers, then

The problem is it's not "your" code, it's someone else's code.  If your code receives a pointer from "somewhere" that the compiler thinks it can prove is non-null, but the pointer is in fact null, that's where you have a bad day.  And as compilers get more powerful they see more and more of the program and can make more inferences, especially with LTO.  So it gets harder and harder to see where there might be a null pointer dereference.

The solution to this is not to tell people they are wrong or call it nonsense; that only means you don't understand the issue.  The only real (if partial) solution to this is ubsan tools that can find the undefined behavior where it occurs, rather than where the program fails.  It's still hard to find bugs that only show up rarely in production, but ubsan has an advantage over other tools in that it can find behavior that is undefined even in situations where it is apparently benign.
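For the record, a minimal sketch of what such a run looks like (the file name and exact message wording are illustrative):

```c
#include <limits.h>

/* Build with UBSan to get a report at the point of UB rather than a
 * crash somewhere else, e.g.:
 *     gcc -g -fsanitize=undefined demo.c -o demo
 * An overflowing call then prints something like:
 *     demo.c:12: runtime error: signed integer overflow: ...
 */
int scaled(int x, int factor)
{
    return x * factor;    /* UBSan instruments this multiplication */
}
/* scaled(1000, 3) is fine; scaled(INT_MAX, 2) triggers the report. */
```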
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9774
  • Country: fi
Re: Hidden gems of GCC and CLANG for a safer embedded development
« Reply #8 on: October 22, 2023, 09:49:45 am »
What ejeffrey said. 5U4GB probably misrepresented and oversimplified the issue, but arrogantly oversimplifying it the opposite way does not solve anything. We should give them the benefit of the doubt; it is quite possible they had some very complex interaction where an esoteric bug somewhere goes unnoticed and then pops up elsewhere, a contributing factor being the compiler. People make mistakes, programming is difficult, and tools should be as helpful as possible.

Take a look at how air crashes are investigated. The answer almost never is "they did not follow the rules, their fault". Whether the design of the machines and processes helps prevent mistakes, or makes small mistakes worse, is under careful investigation every time.
« Last Edit: October 22, 2023, 09:51:38 am by Siwastaja »
 
The following users thanked this post: 5U4GB

Offline zilp

  • Frequent Contributor
  • **
  • Posts: 351
  • Country: de
Re: Hidden gems of GCC and CLANG for a safer embedded development
« Reply #9 on: October 22, 2023, 04:57:15 pm »
That's just false.  The fact that you misunderstand this is a pretty good argument for the behavior being bad.  It is 100% possible to have code segfault on a null pointer exception because of a compiler removed null pointer check.

The key is that a null pointer dereference *may* cause a segfault but it isn't guaranteed to.  For instance, if the first dereference itself is optimized away for having no observable side effects.  Just doing pointer arithmetic on a null pointer is undefined even if you never dereference it.  Even though there is no illegal memory access and no segfault, it is still undefined behavior, and the compiler can still assume it doesn't happen and use that to optimize away future null checks.

Of course the code is still by definition wrong.  People aren't really upset about the segfault, they are upset about where the segfault happens: not in the code that actually has the bug but somewhere totally unrelated that has an explicit null pointer test.

OK, admittedly, they didn't talk specifically only about dereferences causing DCE, and of course C doesn't guarantee a segfault in any case, even if you do dereference a NULL pointer. But their posts here just have this strong "compiler writers are dumb because their compiler doesn't compile my code into what I've decided it should mean, and instead strives to make code fail where the standard allows" vibe, which really isn't helpful for anything, and which also tends to be so vague, bordering on non-falsifiable, that you can't really address it fully in a reasonably short answer. But letting it stand as supposed wisdom isn't really helpful either.

The problem is it's not "your" code, it's someone else's code.  If your code receives a pointer from "somewhere" that the compiler thinks it can prove is non-null, but the pointer is in fact null, that's where you have a bad day.  And as compilers get more powerful they see more and more of the program and can make more inferences, especially with LTO.  So it gets harder and harder to see where there might be a null pointer dereference.

The solution to this is not to tell people they are wrong or call it nonsense; that only means you don't understand the issue.  The only real (if partial) solution to this is ubsan tools that can find the undefined behavior where it occurs, rather than where the program fails.  It's still hard to find bugs that only show up rarely in production, but ubsan has an advantage over other tools in that it can find behavior that is undefined even in situations where it is apparently benign.

Which is all true ... but the important point that needs to be pointed out to counter these compiler conspiracy myths is that compilers do all that in order to improve performance, not in order to make code malfunction, or just because the standard allows it.

Compilers try to infer as much as possible about a program in order to find the most efficient way to express it in machine code, and finding checks that, according to the definition of the language, can not be true, and eliminating the code that expresses such checks, is done because that, obviously, improves performance. And not only does it improve performance, but rather, it allows for more reliable/more secure code, because it allows you to add more sanity checks in your code (like, say, bounds checks) and to rely on the compiler to prove where they are unnecessary so that the performance impact of those checks is minimal. So, it's a *good* thing that a modern compiler with LTO might be able to eliminate unnecessary bounds checks across compilation units, say, where a human might have a hard time figuring out why exactly the check is unnecessary.
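A sketch of that trade-off (the helper names are mine): write the sanity check at every call site and let the optimizer prove it away where the index is known to be in range.

```c
#include <stddef.h>

/* A bounds-checked accessor used everywhere ... */
static int checked_get(const int *a, size_t n, size_t i)
{
    if (i >= n)          /* the sanity check */
        return -1;
    return a[i];
}

/* ... but with constant indices the compiler can prove i < 3 after
 * inlining and emit no comparison at all, so the checks cost nothing
 * on these paths. */
int sum3(const int a[3])
{
    return checked_get(a, 3, 0) + checked_get(a, 3, 1) + checked_get(a, 3, 2);
}
```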

Now, all of this *sometimes* causes code to be eliminated that actually might prevent bad things from happening if it were kept. Which is unfortunate, and sometimes there might even be good reasons to make the compiler selectively less aggressive, but the important point is that there are usually babies in the bath water, and no easy solution that reliably gets rid of just the water.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9774
  • Country: fi
Re: Hidden gems of GCC and CLANG for a safer embedded development
« Reply #10 on: October 22, 2023, 06:15:01 pm »
And not only does it improve performance, but rather, it allows for more reliable/more secure code, because it allows you to add more sanity checks in your code (like, say, bounds checks) and to rely on the compiler to prove where they are unnecessary so that the performance impact of those checks is minimal.

I pretty much agree with this. Even if it's frustrating, your program breaking because you made a mistake and compiler optimized something into unwanted territory is ultimately a good thing; it exposes the bug (even if in suboptimal (tedious) way), you spend a lot of time hunting for it, learn something, and remember it the next time - hopefully, unless you choose to blame the compiler. I have long ago stopped blaming the compiler for my own mistakes as I see how the benefits of modern-day optimization outweigh the problems it causes when compiling buggy code. Ultimately, what needs to be done is admit mistakes and start writing less buggy code.

Compiler writers have been extremely helpful in improving static analysis parts that can generate warnings about so many common and even rare bugs during compile time. This goes side by side with optimization; increasing performance targets (a good thing!) drive the static analysis capabilities of the compiler, which as a side effect allows giving better warnings, too. So the commonly seen idea that better optimizations make programming more dangerous and more prone to bugs is very wrong on average, although in some isolated cases that seems to happen.
 
The following users thanked this post: newbrain, Jacon, SiliconWizard, zilp

Offline 5U4GB

  • Frequent Contributor
  • **
  • Posts: 796
  • Country: au
Re: Hidden gems of GCC and CLANG for a safer embedded development
« Reply #11 on: November 05, 2023, 02:07:46 pm »
I've never noticed any difference in any code I've benchmarked.  Perhaps you meant "under exactly the right conditions it may be possible to notice a slight difference in performance on special test cases"?
No, I mean what I wrote.

And I meant what I wrote.  Isn't it great how we can both mean what we wrote but get the opposite results simply by asserting something?  Admittedly in my case I got that by taking measurements of actual production code, not sure where you got yours from.

If your code is such that it's difficult to detect whether you are dereferencing NULL pointers, then ... your code should not be used in production anyway.

It's not my code.  Also, the last time someone tried the "all you need to do is write perfectly correct code" line, I ran some of the code he'd written through a source code analyser and found quite a lot of UB and other errors in there.  He stopped responding to the thread after that.  Care to post links to some of the code you've written so we can have a look?  (See also the quote at the end by the main architect of LLVM.)

Also, it's just obvious nonsense that somehow this could cause segfaults. If you dereference a pointer, then check it for NULL, and then dereference it again, it's completely irrelevant whether the check is eliminated or not, as the first dereference will segfault, so execution never reaches the NULL pointer check anyway if the pointer is NULL.

It was something like (greatly simplified):

foo = bar->baz;
if (bar == NULL) return -1;
do_thing( foo );

The generated code would have been:

if (bar == NULL) return -1;
do_thing( bar->baz );

because foo was never used before the NULL pointer check.  So the code wouldn't have segfaulted if gcc hadn't silently removed the NULL pointer check without ever issuing a warning.

You don't need to include a special compiler option because there is nothing to fix. You are just completely confused about how compilers work. Also, other compilers also don't do anything defined when you dereference a NULL pointer, because there obviously aren't any sensible semantics for a NULL pointer dereference.

I know how compilers work, but I also know how the gcc developers work because I've had to interact with some of them in the past.  For compilers targeted at critical-applications use (the Keil, IAR, etc set I mentioned previously), the compiler developers tend to take the approach that the compiler should capture the intent of the software developers as much as possible.  For gcc, the compiler developers take the approach that if you squint at the C specs just right and know just how to interpret the nuances correctly then technically the behaviour they've implemented is allowed, and they have a great opportunity to show how much smarter they are than the idiots who write the code that gcc has to process by silently causing breakage.

Just out of curiosity, you're not a gcc developer are you?  You have the exact attitude of other gcc developers I've interacted with.

Finally, some quotes from the PLOS paper I cited:

Quote
it is currently argued that the standard interpretation allows implementations to take any action at all, not just for (say) an overflowing execution but for the entire program, if they detect a single feasible instance of undefined behavior. And there are lots of undefined behaviors.
[...]
Most C programs contain undefined behavior – certainly every operating system code base does. Perhaps more troubling, as [2] points out, this concept of undefined behavior makes C compilers unstable.
[...]
Kang [11] notes the "somewhat controversial practice of sophisticated C compilers reasoning backwards from instances of undefined behavior to conclude that, for example, certain code paths must be dead" can lead to "surprising non-local changes in program behavior and difficult-to-find bugs".
[...]
And by 2011, Chris Lattner, the main architect of the Clang/LLVM compilers was echoing Ritchie’s warning [16]: To me, this is deeply dissatisfying, partially because the compiler inevitably ends up getting blamed, but also because it means that huge bodies of C code are land mines just waiting to explode. This is even worse because [...] there is no good way to determine whether a large scale application is free of undefined behavior, and thus not susceptible to breaking in the future.
 

Offline 5U4GB

  • Frequent Contributor
  • **
  • Posts: 796
  • Country: au
Re: Hidden gems of GCC and CLANG for a safer embedded development
« Reply #12 on: November 05, 2023, 02:15:17 pm »
"they did not follow the rules, their fault"

You've pretty much captured the gcc design philosophy in one sentence there, or at least the attitude of the gcc developers.  With a corollary that a lot of these rules are so obtuse and obscure that even the members of the JTC1/SC22/WG14 standards committee can spend months if not years arguing over them and, typically, not reaching any conclusion apart from adding another entry to the ever-growing UB catalogue.
 

