Author Topic: Previously unknown 'C' behavior (to me). (Read 18239 times)

andyturk · « **Reply #25 on:** April 23, 2017, 07:19:15 pm »

Quote from: tggzzz on April 22, 2017, 11:11:03 am

Some consider that if you don't know the many C/C++ "gotchas", then it is your fault for not understanding your tool. Others consider that it indicates the tool itself has problems.

tggzzz,

Your critiques of C and C++ are all over these forums. What would you recommend as an alternative?

rsjsouza · « **Reply #26 on:** April 23, 2017, 07:23:46 pm »

Quote from: nctnico on April 23, 2017, 07:07:44 pm

Quote from: tggzzz on April 23, 2017, 06:16:56 pm
Quote from: nctnico on April 23, 2017, 05:17:16 pm
This discussion is rather moot. At some point somebody is going to tell us we all need this in order not to hurt ourselves while eating:
The point is not whether a programmer hurts themselves with their tool, it is whether they hurt other people, e.g. the people that use the program directly or indirectly. And there are many many many examples of people being "disadvantaged" by programs written in C/C++ due to any of the many many many reasons they can cause nasal daemons to appear at the least convenient moment.
If you want to measure by that metric then Delphi and Java are far far worse because they claim to make programming easy. Anyway, I don't think anyone is claiming C and/or C++ are the best languages around but for some applications there simply isn't a good alternative. In case of C++ you can use Boost and STL libraries which offer solutions for commonly used constructs (design patterns seems to be the phrase 'du jour') so chances of screwing things up badly are greatly reduced.

I agree. In the case of embedded (my area of work), the use of peripheral or HW-centric libraries helps prevent screw-ups as well, as they intrinsically inherit years of experience in a particular platform. Obviously, such libraries must go through the regular cycle of fixing bugs and testing as well.

C is not perfect but, as mentioned by nctnico, other paradigms have been tried and are not free of the effects of bad or smart-assery programming. More than once in my life I have found the latter, which brings to mind the phrase: "Just because you can do really intricate constructs with C, it doesn't mean you should".

NorthGuy · « **Reply #27 on:** April 23, 2017, 08:21:28 pm »

Quote from: tggzzz on April 23, 2017, 06:16:56 pm

The point is not whether a programmer hurts themselves with their tool, it is whether they hurt other people, e.g. the people that use the program directly or indirectly. And there are many many many examples of people being "disadvantaged" by programs written in C/C++ due to any of the many many many reasons they can cause nasal daemons to appear at the least convenient moment.

I don't see it that way. If you program firmware for a product, you take total responsibility for the programming. If you made it buggy and didn't test it well, it is you who's hurting the clients. Not the C language, nor any other tool you used, but you.

Moreover, if you use any libraries, Linux, or whatever you decide to drag into your embedded project, you take full responsibility for all the bugs and vulnerabilities contained therein. And if these bugs and vulnerabilities happen to hurt your users, then this is absolutely, 100% your fault, because it is caused by your decision to drag all this stuff in.

It is, of course, your choice what language and what tools you want to use. You decide whether you want full freedom or an illusion of safety. And you're responsible for all the consequences. There's no tool of any sort which will prevent you from making bugs. And there will never be such a tool. Programming with less bugs (and less bloat for that matter) is a responsibility of a human.

hamster_nz · « **Reply #28 on:** April 23, 2017, 08:59:03 pm »

Quote from: brucehoult on April 23, 2017, 02:56:08 pm

Operations on unsigned, on the other hand, are much more rigidly defined. They happen to map efficiently to most modern CPUs, but other CPUs may be compelled to do something quite inefficient in order to get the results defined by C.

Signed/unsigned is a distraction - the same problem exists for "unsigned int" variables too:

Code:

$ cat check.c
#include <stdio.h>

int main(int argc, char *argv[])
{
  /* unsigned int is 32 bits */
  unsigned int a,b;
  unsigned int n0 = 24, n1 = 8;

  a = 0xFFFFFFFF;
  a = a << (n0+n1);   /* count is 32, the width of unsigned int: undefined behaviour */

  b = 0xFFFFFFFF;
  b = (b << n0) << n1;

  printf("%08X should equal %08X\n", a, b);
}
$ ./check
FFFFFFFF should equal 00000000

I think that there is something deeper here. The same problem exists with division. With unsigned numbers, do you not agree that 'shift right by n' is equivalent to division by 2^n (just as 'shift left by n' is multiplication by 2^n)?

Code:

$ cat check.c
#include <stdio.h>

int main(int argc, char *argv[])
{
  unsigned int a,b;
  unsigned int n0 = 1<<23, n1 = 1<<8;

  a = 0xFFFFFFFF;
  a = a / (n0*n1);

  b = 0xFFFFFFFF;
  b = (b / n0) / n1;

  printf("%08X should equal %08X\n", a, b);
  return 0;
}
$ ./check
00000001 should equal 00000001

Have a look at what happens when

unsigned int n0 = 1<<23, n1 = 1<<8;

is replaced with

unsigned int n0 = 1<<24, n1 = 1<<8;

I do know what is happening and why, but I quite like the 'equivalence breaking' between the shift and division operators - they are only the same within limited bounds then each breaks in a different way.
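For anyone who wants to try the n0 = 1<<24 case without actually dividing by zero, here is a rough sketch. The function names are made up for illustration, and I am assuming 32-bit values via uint32_t. The first function mimics the one-step division from the listing and reports the wrapped-to-zero divisor instead of crashing; the second widens the product to 64 bits so it always agrees with the two-step form.

```c
#include <assert.h>
#include <stdint.h>

/* One-step form as in the post: the 32-bit product n0*n1 wraps modulo 2^32,
 * so (1<<24)*(1<<8) becomes 0. Returns 0 (failure) instead of dividing by zero. */
static int div_wrapped(uint32_t x, uint32_t n0, uint32_t n1, uint32_t *out)
{
    uint32_t d = (uint32_t)((uint64_t)n0 * n1);  /* explicit wrap to 32 bits */
    if (d == 0)
        return 0;               /* x / 0 would be undefined behaviour */
    *out = x / d;
    return 1;
}

/* Widened form: the product is exact in 64 bits, so the result always
 * matches (x / n0) / n1 for non-zero divisors. */
static uint32_t div_widened(uint32_t x, uint32_t n0, uint32_t n1)
{
    assert(n0 != 0 && n1 != 0);
    return (uint32_t)(x / ((uint64_t)n0 * n1));
}
```

With n0 = 1<<23 both forms give 1, as in the output above. With n0 = 1<<24 the wrapped form has nothing sensible to return, while the widened form gives 0, the same as (b / n0) / n1.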

I don't think it is a 'C' vs 'a better language' problem. Any programming language that ends up generating or executing bit-shift opcodes will most likely exhibit this platform specific behavior unless the language designer has gone out of their way to make it defined - (e.g. by masking with the size of the data type being shifted). Even those writing in assembler might not see it coming.
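To make the "going out of their way to make it defined" option concrete, here is a minimal sketch of two possible definitions. The helper names are hypothetical, not from any library. One treats an over-wide count as shifting everything out, the other masks the count to 5 bits, which is what x86 does for 32-bit shifts and what Java specifies for int shifts.

```c
#include <assert.h>
#include <stdint.h>

/* Shift left with "mathematical" semantics: counts of 32 or more give 0. */
static uint32_t shl32_sat(uint32_t x, unsigned n)
{
    return (n < 32) ? (x << n) : 0u;
}

/* Shift left with the count masked to 5 bits: a count of 32 acts like 0. */
static uint32_t shl32_masked(uint32_t x, unsigned n)
{
    return x << (n & 31u);
}

/* The two results from the first listing, each now reproducible on purpose. */
static void shift_demo(void)
{
    assert(shl32_sat(0xFFFFFFFFu, 24 + 8) == 0x00000000u);
    assert(shl32_masked(0xFFFFFFFFu, 24 + 8) == 0xFFFFFFFFu);
}
```

Either definition is defensible; the point is only that the language (or the programmer) has to pick one, and both cost an extra instruction or branch on some CPUs.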

PS. We all know that "x>>1" is not the same as "x/2" for signed values? If not, try where x = -1.
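A small sketch for anyone who wants to try it (function names are mine). Division is guaranteed to truncate toward zero since C99, so -1/2 is 0. Right-shifting a negative value is implementation-defined, so I only state what most compilers do rather than a guaranteed result.

```c
#include <assert.h>

/* Division truncates toward zero (guaranteed since C99). */
static int half_by_division(int x) { return x / 2; }

/* Right-shifting a negative value is implementation-defined in C; most
 * compilers use an arithmetic shift, which rounds toward minus infinity. */
static int half_by_shift(int x) { return x >> 1; }

/* Portable "floor" halving that does not rely on shifting negatives. */
static int half_floor(int x)
{
    return x / 2 - (x % 2 < 0);
}

static void halving_demo(void)
{
    assert(half_by_division(-1) == 0);               /* truncation toward zero */
    assert(half_floor(-1) == -1);                    /* floor */
    assert(half_by_division(7) == half_by_shift(7)); /* same for non-negatives */
}
```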

tggzzz · « **Reply #29 on:** April 23, 2017, 09:02:59 pm »

Quote from: nctnico on April 23, 2017, 07:07:44 pm

Quote from: tggzzz on April 23, 2017, 06:16:56 pm
Quote from: nctnico on April 23, 2017, 05:17:16 pm
This discussion is rather moot. At some point somebody is going to tell us we all need this in order not to hurt ourselves while eating:
The point is not whether a programmer hurts themselves with their tool, it is whether they hurt other people, e.g. the people that use the program directly or indirectly. And there are many many many examples of people being "disadvantaged" by programs written in C/C++ due to any of the many many many reasons they can cause nasal daemons to appear at the least convenient moment.
If you want to measure by that metric then Delphi and Java are far far worse because they claim to make programming easy.

That is a bizarre chain of - and I use the word loosely - reasoning.

Quote

Anyway, I don't think anyone is claiming C and/or C++ are the best languages around but for some applications there simply isn't a good alternative. In case of C++ you can use Boost and STL libraries which offer solutions for commonly used constructs (design patterns seems to be the phrase 'du jour') so chances of screwing things up badly are greatly reduced.

There is an extremely perceptive comment by someone that will be remembered after we are long gone:
"There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult"
Tony Hoare, in his Turing Award lecture
http://zoo.cs.yale.edu/classes/cs422/2011/bib/hoare81emperor.pdf

In that he also notes about his Algol 60 implementation for a 2kIPS machine:
"Every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they urged us not to - they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law." (My emphasis)

tggzzz · « **Reply #30 on:** April 23, 2017, 09:04:45 pm »

Quote from: andyturk on April 23, 2017, 07:19:15 pm

Quote from: tggzzz on April 22, 2017, 11:11:03 am
Some consider that if you don't know the many C/C++ "gotchas", then it is your fault for not understanding your tool. Others consider that it indicates the tool itself has problems.

tggzzz,

Your critiques of C and C++ are all over these forums. What would you recommend as an alternative?

I wouldn't be dogmatic, because there are several alternatives, each with their own set of advantages and disadvantages. An engineer would be aware of them.

BTW, you do me too much credit. Excellent C/C++ critiques are all over the web, in far more detail and with far more understanding than I have.

I've repeatedly pointed people towards the FQA, but here's another from someone that has decades (since the 60s) of experience finding and debugging foul problems in hardware and software: "What is an Object in C Terms?" http://www.open-std.org/jtc1/sc22/wg14/9350 Note the "wg14" in the URL; I presume everybody commenting on this thread knows what WG14 is.

One statement, chosen more or less at random, indicates the scope of the problems...
"C99 introduced the concept of effective type (6.5 paragraph 6), but it has had the effect of making a confusing situation totally baffling. This is because it has introduced a new category of types, it has invented new terminology without defining it, its precise intent is most unclear, and it has not specified its effect on the library. "

tggzzz · « **Reply #31 on:** April 23, 2017, 09:06:51 pm »

Quote from: rsjsouza on April 23, 2017, 07:23:46 pm

Quote from: nctnico on April 23, 2017, 07:07:44 pm
Quote from: tggzzz on April 23, 2017, 06:16:56 pm
Quote from: nctnico on April 23, 2017, 05:17:16 pm
This discussion is rather moot. At some point somebody is going to tell us we all need this in order not to hurt ourselves while eating:
The point is not whether a programmer hurts themselves with their tool, it is whether they hurt other people, e.g. the people that use the program directly or indirectly. And there are many many many examples of people being "disadvantaged" by programs written in C/C++ due to any of the many many many reasons they can cause nasal daemons to appear at the least convenient moment.
If you want to measure by that metric then Delphi and Java are far far worse because they claim to make programming easy. Anyway, I don't think anyone is claiming C and/or C++ are the best languages around but for some applications there simply isn't a good alternative. In case of C++ you can use Boost and STL libraries which offer solutions for commonly used constructs (design patterns seems to be the phrase 'du jour') so chances of screwing things up badly are greatly reduced.
I agree. In the case of embedded (my area of work), the use of peripheral or HW-centric libraries helps prevent screw-ups as well, as they intrinsically inherit years of experience in a particular platform. Obviously, such libraries must go through the regular cycle of fixing bugs and testing as well.

C is not perfect but, as mentioned by nctnico, other paradigms have been tried and are not free of the effects of bad or smart-assery programming. More than once in my life I have found the latter, which brings to mind the phrase: "Just because you can do really intricate constructs with C, it doesn't mean you should".

Agreed. The only caveat is that with C/C++ you don't need to have intricate constructs - even simple ones cause problems, as illustrated by this thread!

tggzzz · « **Reply #32 on:** April 23, 2017, 09:12:27 pm »

Quote from: NorthGuy on April 23, 2017, 08:21:28 pm

Quote from: tggzzz on April 23, 2017, 06:16:56 pm
The point is not whether a programmer hurts themselves with their tool, it is whether they hurt other people, e.g. the people that use the program directly or indirectly. And there are many many many examples of people being "disadvantaged" by programs written in C/C++ due to any of the many many many reasons they can cause nasal daemons to appear at the least convenient moment.

I don't see it that way. If you program firmware for a product, you take total responsibility for the programming. If you made it buggy and didn't test it well, it is you who's hurting the clients. Not the C language, nor any other tool you used, but you.

Moreover, if you use any libraries, Linux, or whatever you decide to drag into your embedded project, you take full responsibility for all the bugs and vulnerabilities contained therein. And if these bugs and vulnerabilities happen to hurt your users, then this is absolutely, 100% your fault, because it is caused by your decision to drag all this stuff in.

It is, of course, your choice what language and what tools you want to use. You decide whether you want full freedom or an illusion of safety. And you're responsible for all the consequences. There's no tool of any sort which will prevent you from making bugs. And there will never be such a tool. Programming with less bugs (and less bloat for that matter) is a responsibility of a human.

That would be a valid point if, and only if, programmers were held responsible for their failures. As it is they easily shelter behind EULAs and very very long disclaimers. They demonstrably have neither personal nor corporate responsibility.

In engineering disciplines the practitioners are indeed held responsible in civil law and, in many cases, in criminal law.

IanB · « **Reply #33 on:** April 23, 2017, 09:20:59 pm »

Is it possible to write an incorrect program? Yes.

Is it possible for any programming environment to prevent someone writing an incorrect program? It is not possible to catch all potential errors, so the answer is no.

Given this, it is all about degrees of incorrectness.

Every program ever written must first exist as a set of algorithmic steps in the mind of its creator. These algorithmic steps must then be translated into whatever real world programming environment will be used for implementation.

If the algorithmic steps are wrong, then the actual program will be wrong.

If the translation to a real system has errors, then the program will be wrong again.

Processes and tools can help to prevent mistakes, but ultimately the responsibility for correctness lies with humans, not machines. If you write a bad program, do not blame your tools.

On the subject of this thread, why would an algorithmic step call for shifting a 32 bit quantity by 32 or more bits in one operation? It's not a question of what the compiler or hardware do with this operation, it is about why do you logically need this operation to occur in your design?

tggzzz · « **Reply #34 on:** April 23, 2017, 09:37:56 pm »

Quote from: IanB on April 23, 2017, 09:20:59 pm

Is it possible to write an incorrect program? Yes.

Is it possible for any programming environment to prevent someone writing an incorrect program? It is not possible to catch all potential errors, so the answer is no.

Given this, it is all about degrees of incorrectness.

Every program ever written must first exist as a set of algorithmic steps in the mind of its creator. These algorithmic steps must then be translated into whatever real world programming environment will be used for implementation.

If the algorithmic steps are wrong, then the actual program will be wrong.

If the translation to a real system has errors, then the program will be wrong again.

Agreed so far.

Quote

Processes and tools can help to prevent mistakes, but ultimately the responsibility for correctness lies with humans, not machines. If you write a bad program, do not blame your tools.

That misses the point. You should choose the right tool (and use it appropriately), and you should avoid using inappropriate tools where better tools are available.

Hardware example (off the top of my head): don't choose and/or recommend using a multimeter with insulation clearances that are known to be defective. Or don't recommend using a scope to look at RF signals (an SA is almost always a more appropriate instrument).

hamster_nz · « **Reply #35 on:** April 23, 2017, 11:01:22 pm »

Quote from: IanB on April 23, 2017, 09:20:59 pm

On the subject of this thread, why would an algorithmic step call for shifting a 32 bit quantity by 32 or more bits in one operation? It's not a question of what the compiler or hardware do with this operation, it is about why do you logically need this operation to occur in your design?

I am moving my GPS receiver code from processing one sample at a time to processing 32 bits at a time. Because of pesky things like Doppler shift, sometimes a chip spans 15 samples, or other times it might span 17 samples - the upshot is that the phase isn't fixed, and slowly creeps. Because the sample rate is 16x the nominal Gold code chip rate, 32 samples might cover up to three different code 'chips':

e.g. at one given point in time the 32 samples might be 11111111111100000000000000001111, with the first set of 1s from bit 1021 of the Gold code, the set of 0s from bit 1022, and the last set of 1s from bit 0 of the Gold code (the codes are 1023 bits long).

I also need to create a mask that represents which bits are in the next repetition of the Gold code. In this case the mask will be 00000000000000000000000000001111, as the last four 1s are from the next repeat of the Gold code.

I tripped over this issue, as I had 'n0' being the count of bits from the oldest code bit, 'n1' being the count of bits from the middle code bit, and 'n2' being the count of bits from the most recent code bit - so as you would expect n0+n1+n2 = 32.

To make the mask quickly I am taking 0xFFFFFFFF, and then shifting it right by (n0+n1) - wanting branchless code for speed:
mask = 0xFFFFFFFF;
mask >>= n0+n1;

When the phase was such (or the doppler shift was such) that no bits were used from the most recent code bit it would mask out all the bits.

So now I am using this, and it is fine.
mask = 0xFFFFFFFF;
mask >>= n0;
mask >>= n1;

So it is a real world need / use case. It just was an interesting previously unknown behavior to me!
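For reference, a minimal sketch of the two-shift mask as a function, plus one alternative that stays branchless. The function names and the use of uint32_t/uint64_t are my own assumptions for illustration. The alternative does the shift in a 64-bit type, where a count of 32 is perfectly legal, and then truncates.

```c
#include <assert.h>
#include <stdint.h>

/* Two-shift form from the post: defined as long as n0 and n1 are each 0..31. */
static uint32_t mask_two_shifts(unsigned n0, unsigned n1)
{
    uint32_t mask = 0xFFFFFFFFu;
    mask >>= n0;
    mask >>= n1;
    return mask;
}

/* Alternative sketch: shift in a 64-bit type, where counts 0..63 are defined,
 * so n = n0 + n1 = 32 simply yields 0. */
static uint32_t mask_wide_shift(unsigned n)
{
    assert(n <= 32);
    return (uint32_t)(UINT64_C(0xFFFFFFFF) >> n);
}
```

For the example above (n0 = 12, n1 = 16, n2 = 4) both give 0x0000000F. When n2 = 0 both give 0, which is the case the single 32-bit shift cannot express.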

TNorthover · « **Reply #36 on:** April 24, 2017, 01:09:41 am »

Quote from: IanB on April 23, 2017, 09:20:59 pm

On the subject of this thread, why would an algorithmic step call for shifting a 32 bit quantity by 32 or more bits in one operation?

I've found shifts the same as the type's width come up pretty naturally, implementing rotates for example.

Unfortunately it's one of the harder things to specify at zero cost. Most CPUs have settled on 2s-complement arithmetic so the undefined behaviour with signed overflow is mostly good for loop optimizations in compilers these days. But real CPUs do lots of weird and wonderful things with out of range shifts (even 2 different behaviours in the same CPU).
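The rotate case can be sketched like this (rotl32 is a name I picked, not a standard function). The naive form shifts by the full width when the rotate count is 0; masking both counts avoids that, and compilers commonly recognise this pattern as a rotate.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left by n (reduced modulo 32). The naive form
 * (x << n) | (x >> (32 - n)) shifts by 32 when n == 0, which is undefined. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

static void rotl32_demo(void)
{
    assert(rotl32(0x80000001u, 1) == 0x00000003u);
    assert(rotl32(0x12345678u, 0) == 0x12345678u);  /* no shift by 32 here */
}
```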

Kalvin · « **Reply #37 on:** April 24, 2017, 06:52:34 am »

Quote from: NorthGuy on April 23, 2017, 02:30:48 pm

Quote from: Kalvin on April 22, 2017, 04:24:01 pm
As the C/C++ languages are broken and unsafe by design (meaning they do not fail on invalid operation, but instead continue execution with the invalid result)

This is what makes them more efficient. It is nothing broken or unsafe about it.

There's a popular misconception that bugs need to be detected by compilers and tools, but not by the programmer. This produced lots of bloat, but the software doesn't appear to be less buggy.

If that is the case, then why do even simple microcontrollers have an exception trap for division by zero? Some more advanced controllers have exceptions for misaligned data access. More advanced microprocessors may have segmentation fault exceptions. All these errors and exceptions are due to the programmer's errors.

The programming language needs to be able to verify that the actions generated are valid - either during compilation time or during execution time. Firstly, the programmer cannot know what kind of code the compiler is generating, and secondly, adding the validation code manually throughout the source code makes it harder to read and maintain, and the validation code may obscure the meaning of the algorithm. Of course, there should be means to disable those features if the performance or code size is an issue, but they should be enabled by default.

If there is a violation during runtime, the system should raise an exception or call an exception handler, so the programmer gets notice of the problem and can decide what to do - either continue with the wrong result or restart the system.

Addition:
The programming language should be constructed so that it allows the programmer to be able to write code in a secure manner so that the compiler would work hard during the compilation time to detect possible errors as early as possible. And the compiler should be able to emit code which checks the errors during runtime, for example checking the array boundaries and variable ranges. Of course, the programmer should be able to choose how strict checking the compiler will perform - if one wants to write sloppy code, that is just fine - but if someone wants to write strictly checked code the compiler should provide sufficient capability to check type compatibility, variable ranges, array access etc.
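As a rough illustration of what such a runtime check looks like when it has to be written by hand in C, here is a sketch of a bounds-checked read. The checked_buf type and checked_get function are hypothetical; nothing like them is part of the C language or standard library, which is rather the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds-checked buffer: the length travels with the pointer. */
typedef struct {
    uint32_t *data;
    size_t    len;
} checked_buf;

/* Returns 1 and stores the element on success, 0 if the index is out of
 * range. The caller decides what to do: report, restart, or carry on. */
static int checked_get(const checked_buf *b, size_t i, uint32_t *out)
{
    assert(out != NULL);
    if (b == NULL || i >= b->len)
        return 0;
    *out = b->data[i];
    return 1;
}
```

Every access now needs an explicit call and a result check, which is the readability cost mentioned above; a language with built-in checks would generate the equivalent test itself.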

tggzzz · « **Reply #38 on:** April 24, 2017, 07:53:45 am »

Quote from: hamster_nz on April 23, 2017, 11:01:22 pm

So it is a real world need / use case. It just was an interesting previously unknown behavior to me!

The real nasty in your case is that it depended on the compiler optimisation setting. In any sane system, optimised externally-visible behaviour should be the same as unoptimised externally-visible behaviour (exception: faster or smaller code, or similar).

You should now be concerned that there is some other form of differing behaviour that you haven't spotted.
You should now be concerned that something might emerge when you recompile with the next version of the compiler (yup, that's a real life problem).
Ditto any library that you use.

hamster_nz · « **Reply #39 on:** April 24, 2017, 08:13:03 am »

Quote from: tggzzz on April 24, 2017, 07:53:45 am

Quote from: hamster_nz on April 23, 2017, 11:01:22 pm
So it is a real world need / use case. It just was an interesting previously unknown behavior to me!

The real nasty in your case is that it depended on the compiler optimisation setting. In any sane system, optimised externally-visible behaviour should be the same as unoptimised externally-visible behaviour (exception: faster or smaller code, or similar).

You should now be concerned that there is some other form of differing behaviour that you haven't spotted.
You should now be concerned that something might emerge when you recompile with the next version of the compiler (yup, that's a real life problem).
Ditto any library that you use.

I agree that it is nasty trap for those not in the know.

However the code I am now using is good - using two shifts, each less than 18 bits, will give the same result no matter which compiler... Also using uint32_t types to make sure the number of bits in each value is as expected.

janoc · « **Reply #40 on:** April 24, 2017, 09:18:58 am »

Quote from: tggzzz on April 24, 2017, 07:53:45 am

Quote from: hamster_nz on April 23, 2017, 11:01:22 pm
So it is a real world need / use case. It just was an interesting previously unknown behavior to me!

The real nasty in your case is that it depended on the compiler optimisation setting. In any sane system, optimised externally-visible behaviour should be the same as unoptimised externally-visible behaviour (exception: faster or smaller code, or similar).

You should now be concerned that there is some other form of differing behaviour that you haven't spotted.
You should now be concerned that something might emerge when you recompile with the next version of the compiler (yup, that's a real life problem).
Ditto any library that you use.

That's the ideal case, I agree. In the real world we have to deal with compiler/optimizer bugs, unfortunately.

And you still didn't show us that tool that would be up to your standards. E.g. I would love to write embedded code in Haskell where many of these classes of bugs are simply impossible due to the design of the language. But alas, the micro I am writing the code for is not able to run it.

So we can keep doing this until the cows come home, or get real work done in imperfect languages like C or even assembler instead. Heck, the Space Shuttle code was written in a high level assembly language and no critical error that caused an in-flight problem was ever found. The difference in the code quality was not the tool but the engineering process used to produce it (there is a well-known article on it: https://www.fastcompany.com/28121/they-write-right-stuff ).

Don't blame the tools for engineer's mistakes - if you know that the tool has flaws, build safeguards in your process to compensate for it. The same as you do for human errors.

hamster_nz · « **Reply #41 on:** April 24, 2017, 10:32:08 am »

Just to show that my project is heading somewhere: from running at 10 seconds of CPU time per GPS channel per second (or a minute to process one second of samples), it now takes 0.02 seconds to process each channel - about 500x quicker than the original single-sample-at-a-time processing. Here's about 0.4 s of the received I/Q values - you can clearly see the BPSK data. Next up is to add the two carrier and late/early tracking loops.

With a bit of luck and hard work I will soon have a real time GPS receiver up and running.

tggzzz · « **Reply #42 on:** April 24, 2017, 01:51:56 pm »

Quote from: hamster_nz on April 24, 2017, 08:13:03 am

Quote from: tggzzz on April 24, 2017, 07:53:45 am
Quote from: hamster_nz on April 23, 2017, 11:01:22 pm
So it is a real world need / use case. It just was an interesting previously unknown behavior to me!

The real nasty in your case is that it depended on the compiler optimisation setting. In any sane system, optimised externally-visible behaviour should be the same as unoptimised externally-visible behaviour (exception: faster or smaller code, or similar).

You should now be concerned that there is some other form of differing behaviour that you haven't spotted.
You should now be concerned that something might emerge when you recompile with the next version of the compiler (yup, that's a real life problem).
Ditto any library that you use.

I agree that it is nasty trap for those not in the know.

However the code I am now using is good - using two shifts, each less than 18 bits, will give the same result no matter which compiler... Also using uint32_t types to make sure the number of bits in each value is as expected.

Keep at the back of your mind that there are many many other such pitfalls with the definition and implementation of C/C++. I've already pointed to the "FQA" and "What is an object".

tggzzz · « **Reply #43 on:** April 24, 2017, 02:03:37 pm »

Quote from: janoc on April 24, 2017, 09:18:58 am

Quote from: tggzzz on April 24, 2017, 07:53:45 am
Quote from: hamster_nz on April 23, 2017, 11:01:22 pm
So it is a real world need / use case. It just was an interesting previously unknown behavior to me!

The real nasty in your case is that it depended on the compiler optimisation setting. In any sane system, optimised externally-visible behaviour should be the same as unoptimised externally-visible behaviour (exception: faster or smaller code, or similar).

You should now be concerned that there is some other form of differing behaviour that you haven't spotted.
You should now be concerned that something might emerge when you recompile with the next version of the compiler (yup, that's a real life problem).
Ditto any library that you use.

That's the ideal case, I agree. In the real world we have to deal with compiler/optimizer bugs, unfortunately.

Indeed, but that kind of starting point leads towards not trying to improve things. If you had a castle to build, what would you think of someone that suggested building it on sand rather than rock?

Do you think it is better to start by choosing tools that have more or fewer inherent problems?

Quote

And you still didn't show us that tool that would be up to your standards. E.g. I would love to write embedded code in Haskell where many of these classes of bugs are simply impossible due to the design of the language. But alas, the micro I am writing the code for is not able to run it.

That's an example of why I won't give a dogmatic answer.

Nonetheless I'm sure there are other less ambiguous/undefined languages that would run on your processor.

Quote

Don't blame the tools for engineer's mistakes - if you know that the tool has flaws, build safeguards in your process to compensate for it. The same as you do for human errors.

Indeed. But often the best thing is to avoid the unnecessarily dangerous tool in the first place.

Don't try and open a jamjar with a carving knife: use a jar opener or, if that isn't available, use an oyster knife

rstofer · « **Reply #44 on:** April 24, 2017, 03:34:59 pm »

The idea that there exists a perfect programming language with perfect run-time libraries (somebody wrote them!) that absolutely prevents programming errors is complete fantasy. Ada might be as close as it gets and it's a freaking nightmare! Even then, the error checking is totally dependent on the programmer. It's just as easy to code around the internal checks as it is with any other language. It is also possible to write very defensive code but that can be done in any language. Like taking the shift count modulo 32 and handling the possible zero shift count.

We don't need languages with more features (C++, Java) or oddball features (in my view, Python), we need simpler languages without features. K&R C comes to mind. Not that we couldn't screw up but at least the language didn't guide us down the path to destruction (objects).

Still, shifting a 32 bit value left by 32 places could reasonably be expected to be problematic. NO, I wouldn't have thought of it! I'm not that smart, before the fact. After the fact, I could see where the compiler would only allow 5 bits of shift count (0..31) and 32 modulo 32 is 0 so no shift occurs. Seems reasonable. After the fact... Why there are two results is interesting. I guess the compiler writer (another programmer) didn't consider the effect of optimization.

I think I would put the masks in an array. I think it might be faster to grab the pre-built mask than it would be to keep shifting a mask over and over. I don't KNOW that, but I might look at the assembly code to see.
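A sketch of what the array idea might look like (names are mine, and whether it is actually faster than shifting is untested, as said above). The table has 33 entries so that a count of 32 has its own slot and no shift by the full width is ever executed.

```c
#include <assert.h>
#include <stdint.h>

/* mask_table[n] == 0xFFFFFFFF >> n for n = 0..31, and 0 for n = 32. */
static uint32_t mask_table[33];

static void mask_table_init(void)
{
    for (unsigned n = 0; n < 32; n++)
        mask_table[n] = 0xFFFFFFFFu >> n;
    mask_table[32] = 0u;     /* the case a single 32-bit shift cannot express */
}

/* Branchless at the call site: one indexed load per mask. */
static uint32_t mask_lookup(unsigned n)
{
    assert(n <= 32);
    return mask_table[n];
}
```

Call mask_table_init() once at start-up, then mask_lookup(n0 + n1) replaces the pair of shifts.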

I always liked Pascal... And I still use Fortran...

tggzzz · « **Reply #45 on:** April 24, 2017, 04:19:49 pm »

Quote from: rstofer on April 24, 2017, 03:34:59 pm

The idea that there exists a perfect programming language with perfect run-time libraries (somebody wrote them!) that absolutely prevents programming errors is complete fantasy. Ada might be as close as it gets and it's a freaking nightmare! Even then, the error checking is totally dependent on the programmer. It's just as easy to code around the internal checks as it is with any other language. It is also possible to write very defensive code but that can be done in any language. Like taking the shift count modulo 32 and handling the possible zero shift count.

Of course it is possible to produce bad programs in any language; that's so trivially true that I didn't think it was worth saying! But that's not the point.

The question is whether the language is sufficiently well defined as to allow you to predict what will happen when executed - or whether there are surprises that language lawyers can shelter underneath.

Quote

We don't need languages with more features (C++, Java) or oddball features (in my view, Python), we need simpler languages without features. K&R C comes to mind.

That's a more interesting observation.

IMNSHO C++ is so baroque as to be unusable. Java-the-language is becoming more complex than necessary, but the language and VM definition is demonstrably pretty sound w.r.t. unexpected behaviour.

Now C. There's a reasonable argument to be made that many of the compromises in the later versions of C are due to it being pulled in two directions: as a low-level language for embedded programming, and as a high level language for general applications. Since the two directions are fundamentally at odds with each other, it is unsurprising that the language is neither "fish nor fowl" but is a camel (i.e. a horse designed by a committee) or an eierlegende-Wollmilchsau. In that case a return to (something near) K&R C would be attractive.

But it won't happen; hence the need to use alternatives.

Quote

Still, shifting a 32 bit value left by 32 places could reasonably be expected to be problematic. NO, I wouldn't have thought of it! I'm not that smart, before the fact. After the fact, I could see where the compiler would only allow 5 bits of shift count (0..31) and 32 modulo 32 is 0 so no shift occurs. Seems reasonable. After the fact... Why there are two results is interesting. I guess the compiler writer (another programmer) didn't consider the effect of optimization.

Don't feel too bad about it! The vast majority of C/C++ practitioners are the same, whether or not they like to admit it.

NorthGuy · « **Reply #46 on:** April 24, 2017, 04:51:04 pm »

Quote from: rstofer on April 24, 2017, 03:34:59 pm

Why there are two results is interesting. I guess the compiler writer (another programmer) didn't consider the effect of optimization.

In the first case, the shift is done at run-time. The amount of the shift is calculated, then the shift is done using a CPU instruction, which results in a shift by 0 bits (the CPU only uses the last 5 bits of the argument), that is no shift at all. The value of "a" was -1, which remains unchanged, and then minus one is assigned to the result:

Code:

$ gcc -o check check.c -Wall -pedantic
$ ./check
FFFFFFFF should equal 00000000

In the second case, with the higher optimization level, the compiler has figured out that all the operands of the expression are always the same, so the whole expression can be pre-calculated at compile time. The compiler is likely to use 64-bit arithmetic, so the result of the shift is zero. At run time, the program simply assigns zero to the destination variable:

Code:

$ gcc -o check check.c -Wall -pedantic -O4
$ ./check
00000000 should equal 00000000
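The two outcomes can be imitated in well-defined C. This is only a sketch of the explanation above, not the original check.c (which isn't reproduced in this post) and not a claim about what gcc does internally; the function names `shl_low5` and `shl_wide` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Imitates a CPU shifter that only looks at the low 5 bits of the count. */
static uint32_t shl_low5(uint32_t value, unsigned count)
{
    return value << (count & 31u);   /* count & 31 is always 0..31 */
}

/* Imitates doing the shift in 64-bit arithmetic and then truncating
   the result back to 32 bits. */
static uint32_t shl_wide(uint32_t value, unsigned count)
{
    assert(count < 64);              /* keep the 64-bit shift in range too */
    return (uint32_t)((uint64_t)value << count);
}
```

With a value of 0xFFFFFFFF and a count of 32, `shl_low5` leaves the value unchanged while `shl_wide` gives zero, which matches the two printouts. For counts of 0..31 the two functions agree.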

rstofer · « **Reply #47 on:** April 24, 2017, 04:54:37 pm »

Quote from: tggzzz on April 24, 2017, 04:19:49 pm

Quote from: rstofer on April 24, 2017, 03:34:59 pm
Still, shifting a 32 bit value left by 32 places could reasonably be expected to be problematic. NO, I wouldn't have thought of it! I'm not that smart, before the fact. After the fact, I could see where the compiler would only allow 5 bits of shift count (0..31) and 32 modulo 32 is 0 so no shift occurs. Seems reasonable. After the fact... Why there are two results is interesting. I guess the compiler writer (another programmer) didn't consider the effect of optimization.

Don't feel too bad about it! The vast majority of C/C++ practitioners are the same, whether or not they like to admit it.

How many people have ever read the formal definition of ANY language? About the only definition that might be readable by mere mortals is assembly language and even that will be based on the quality of the hardware description.

At the edges, what should happen when a 32 bit value is shifted left about a bazillion places? Let's suppose the definition actually provides an answer. How do we know that the compiler writer met the requirements? How can we ever test everything? Yes, I know GCC comes with a validation suite but I can't say that I ever looked at it. Is it complete? Well, it's pretty clear something slipped by...

This idea of correctness is a serious problem in the nuclear industry, as it should be. The verification programs were written in a very specific dialect of Fortran on a very specific computer with very specific libraries and there would be no changes. Ever! That created a situation where obsolete hardware was being maintained indefinitely simply because nobody wanted to go back through the task of verifying the verification programs. The NRC had already bought off on the existing code/hardware. I don't think there was a 'patch of the month' program.

A reactor is no place to find out that Intel has another bug in the FPU.

And what about that problem in the 14th decimal place of the SINH function? What's with that? No, I don't know if there is a real problem, I'm making it up. But you don't know that! Now everybody is going to go looking at the e^x code.

GeorgeOfTheJungle · « **Reply #48 on:** April 24, 2017, 05:01:54 pm »

Quote from: tggzzz on April 24, 2017, 04:19:49 pm

Don't feel too bad about it! The vast majority of C/C++ practitioners are the same, whether or not they like to admit it.

Yes, and the trap is quietly sitting there just waiting to catch the next unsuspecting poor guy.

rstofer · « **Reply #49 on:** April 24, 2017, 05:11:16 pm »

Quote from: GeorgeOfTheJungle on April 24, 2017, 05:01:54 pm

Quote from: tggzzz on April 24, 2017, 04:19:49 pm
Don't feel too bad about it! The vast majority of C/C++ practitioners are the same, whether or not they like to admit it.

Yes, and the trap is quietly sitting there just waiting to catch the next unsuspecting poor guy.

Or not...

A defensive programmer might have masked the sum of the two shift counts with 0x1F to force it into range or perhaps even deliberately handled the situation where the sum is greater than 31. I guess if you don't know how the compiler is going to react, you can take the position of defending against it.

I wish I were that smart and that, with luck, I might have done something like the above. Probably not...
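For what it's worth, the two defensive options might look something like this. It's a sketch only: the function names and the two separate count arguments are assumptions for illustration, not the code from the original post.

```c
#include <stdint.h>

/* Option 1: mask the sum of the two counts with 0x1F so it is forced
   into 0..31. A sum of 32 becomes 0, i.e. no shift at all. */
static uint32_t shl_masked(uint32_t value, unsigned a, unsigned b)
{
    return value << ((a + b) & 0x1Fu);
}

/* Option 2: deliberately handle a sum greater than 31. Every bit would
   be shifted out, so the result is taken to be zero. */
static uint32_t shl_checked(uint32_t value, unsigned a, unsigned b)
{
    unsigned count = a + b;
    return (count > 31u) ? 0u : value << count;
}
```

The two options give different answers once the sum reaches 32: `shl_masked(0xFFFFFFFF, 16, 16)` returns the value unchanged, `shl_checked(0xFFFFFFFF, 16, 16)` returns zero. Which one is "right" depends on what the caller meant, but either way the result no longer depends on the optimization level.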

What is more problematic is finding this kind of thing in somebody else's code. It's easy to see the problem when there are constants for the shift count. It might be more difficult if the count was a member of a struct pointed to by a pointer that was passed around shamelessly. Or, hey, it could have been a member of a union so the value could have different representations and the union could be in a struct. How cool is that?

