Deciphering some RISC-V assembly code

“I made a discovery today. I found a computer. Wait a second, this is cool. It does what I want it to. If it makes a mistake, it's because I screwed it up. Not because it doesn't like me... Or feels threatened by me.. Or thinks I'm a smart ass.. Or doesn't like teaching and shouldn't be here...” — from “The Conscience of a Hacker” by Loyd Blankenship

Compilers almost never generate wrong code. The GCC and Clang bug trackers show that there are exceptions, but they are very rare and happen primarily with major new feature sets. Almost always it is the programmer who has wrong expectations of the code.

With C and C++, as well as any parallelized code, in my experience this is usually caused by a failure to understand the program’s meaning. The programmer attributes a meaning to the code that was never in it, typically because they never formally learned the tool. Instead, the knowledge is replaced with unfounded guesses: by analogy with similar constructs from elsewhere (e.g. basic arithmetic) or from incidental behavior observed in the past.

This is not limited to the tools mentioned above, but it is most visible there. In C and C++ it seems to me that it’s caused by three factors playing together:
* The languages are particularly abstract and detached from hardware, and have a very complex model at the same time. This bites hard if the actual meaning is replaced with guesses. It is also the primary source of the security problems attributed to the language itself, driving even seemingly experienced programmers (like Linus Torvalds) to madness. The problem is greatly inflated by the lack of easily available sources for studying the language, even if you are willing to devote years of your life to learning just a single tool. In the microcontroller world it is made even worse by vendors’ proprietary tools, which claim to be “flavors of C” but invent their own meanings.
* Undefined behavior can be found(1) in any Turing-complete imperative or functional language, but C and C++ are infamous for exhibiting it at the most basic level. Explicitly spelling it out in the specs does not really help if nobody knows them in the first place.(2) The result is that even a simple snippet of code may have no defined behavior. In Java I must at least, e.g., use java.util.HashSet, add something to it and then try to retrieve it to get UB. In C I can simply add two signed integers together. Since it happens at a much more basic level, the consequences may also be much more severe.
* Both languages are very bare and boast high statement-level performance, which leads toolchains to focus on optimizing at this level.(3) If you have a lot of code with no well-defined behavior at the basic level, and optimization focused on the same area, what you get is a lot of behavior that differs from what the programmer imagined it would be. Unlike the previous two, this is not a shortcoming of the language. It’s just probability conspiring against us. In other languages poorly constrained code has a high chance of matching expectations (even if the expectations are wrong) and errors are generally mild in their consequences; in C and C++ these chances are particularly low and the consequences are often catastrophic.
(1) This is a strong “can”, not a mere “may”.
(2) Even worse, it has led to theories that present these clarifications as UB being intentionally added as a feature. From what I have observed, this usually comes from misinterpreting the lack of such clarifications in other languages’ specs.
(3) <snide>Because otherwise the only thing keeping them in business would be the lack of better alternatives and the inertia of the industry.</snide>
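
To make the "add two numbers" point concrete, here is a minimal sketch in C. The function names naive_would_overflow and checked_add are made up for illustration and do not come from any library. The first one is the kind of guess-based check people write when they assume signed arithmetic wraps around; the second one only ever performs additions that are guaranteed to fit.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "guess-based" check. It assumes that a + b wraps around
 * when the result does not fit in int. Signed overflow is undefined
 * behavior in C, so for exactly the inputs this check is meant to catch,
 * the function has no defined meaning. An optimizer is allowed to assume
 * a + b >= a whenever b > 0 and reduce the whole body to "return false". */
bool naive_would_overflow(int a, int b)
{
    return b > 0 && a + b < a;
}

/* Hypothetical well-defined variant: the range is tested before the
 * addition, so a + b is only evaluated when it is known to fit.
 * Returns true and stores the sum in *out on success, false otherwise. */
bool checked_add(int a, int b, int *out)
{
    assert(out != NULL);
    if (b > 0 && a > INT_MAX - b)
        return false;   /* the sum would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return false;   /* the sum would go below INT_MIN */
    *out = a + b;
    return true;
}
```

Calling naive_would_overflow(INT_MAX, 1) is the interesting case: the programmer expects true, but the standard gives that call no meaning at all, so whatever you observe at one optimization level is incidental. checked_add(INT_MAX, 1, &r) simply returns false and leaves r untouched.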

