EEVblog Electronics Community Forum

Products => Computers => Programming => Topic started by: snarkysparky on December 01, 2020, 02:33:09 pm

Title: c compiler required to obey parenthesis or not
Post by: snarkysparky on December 01, 2020, 02:33:09 pm
So I have a function that evaluates a polynomial fit for a thermistor to give degC as a function of A/D count.

I implemented it in 32-bit integer arithmetic on an ARM Cortex-M0+ processor for speed.

The third order term     (Count * Count * Count) * 0.0000000038824

Is implemented by this code to have a 20 bit fractional part.


RetVal =  ((((74636 * Count)>>10) * Count)>> 15) * Count;

It is crucial that the compiler obey the ordering of my parentheses to prevent overflow.


Is it an obligation of a good C compiler to honor parenthesis order no matter what optimization level or other circumstance?

Thanks

Title: Re: c compiler required to obey parenthesis or not
Post by: tggzzz on December 01, 2020, 03:49:34 pm
I think you will find it can optimise the code in any way it sees fit, provided the end result is the same.

The problem is that very few people understand all the ramifications of what isn't required of a C compiler's output. Some surprising things can allow the compiler to generate nasal dæmons; one classic example is signed integer overflow.
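To make the signed-overflow point concrete, here is a minimal sketch of checking a multiplication before doing it, since inspecting the result afterwards is already too late once undefined behaviour has occurred. The helper name mul_would_overflow is made up for this illustration and is not from the thread.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: returns true if a * b would not fit in an int.
   Only divisions that cannot themselves overflow are used. */
bool mul_would_overflow(int a, int b)
{
    if (a == 0 || b == 0)
        return false;
    if (a > 0) {
        if (b > 0)
            return a > INT_MAX / b;   /* positive * positive */
        return b < INT_MIN / a;       /* positive * negative */
    }
    if (b > 0)
        return a < INT_MIN / b;       /* negative * positive */
    return a < INT_MAX / b;           /* negative * negative */
}
```

With a 32-bit int, mul_would_overflow(74636, 28773) is true while mul_would_overflow(74636, 28772) is false, so a signed Count would hit undefined behaviour in the very first multiplication well before the unsigned limit.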

I leave the rest of the discussion to the language lawyers.
Title: Re: c compiler required to obey parenthesis or not
Post by: golden_labels on December 01, 2020, 04:27:22 pm
Quote from: snarkysparky
"Is it an obligation of a good C compiler to honor parenthesis order no matter what optimization level or other circumstance?"
You have asked a question that is more complex and ambiguous than you may expect.

Any compiler calling itself a “C compiler” should compile the C language properly. And C has strict rules about both the associativity and precedence of operations in an expression. That includes sub-expressions in parentheses. Any expression that does not cause undefined behavior (UB) must produce exactly the same result on any compiler using types with the same properties. Therefore, if you write something, any properly working compiler will produce a result identical to what you have written. The usual issue is that people don’t really understand what they write. ;)

Now comes the problem of ambiguity in the question. You have asked about an order of operations. That has three meanings. It may mean the abstract order of operations, the one defined by operator associativity and precedence. In that case the above answer holds. You left out the variable types, so the picture is incomplete, but in general that expression is always equivalent to the one below, under the assumptions that there is no UB, that you have chosen suitable types, and that Count is never negative:
Code: [Select]
a = 74636;
a *= Count;
a /= 1024;
a *= Count;
a /= 32768;
a *= Count;
RetVal = a;
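As a quick sanity check of that equivalence, the sketch below writes both forms as functions and compares them over a range of inputs. It assumes Count is a uint32_t, which the original post does not state; the function names are invented for this example.

```c
#include <stdint.h>

/* The expression from the original post, with Count assumed to be uint32_t. */
uint32_t cubic_term_expr(uint32_t Count)
{
    return ((((74636 * Count) >> 10) * Count) >> 15) * Count;
}

/* The same computation written as one operation per statement. */
uint32_t cubic_term_steps(uint32_t Count)
{
    uint32_t a = 74636;
    a *= Count;
    a /= 1024;      /* same as >> 10 for an unsigned value */
    a *= Count;
    a /= 32768;     /* same as >> 15 for an unsigned value */
    a *= Count;
    return a;
}

/* Returns 1 if both forms agree for every Count in [0, limit].
   limit must be below UINT32_MAX, or the loop counter would wrap. */
int forms_agree(uint32_t limit)
{
    for (uint32_t c = 0; c <= limit; ++c) {
        if (cubic_term_expr(c) != cubic_term_steps(c))
            return 0;
    }
    return 1;
}
```

Because unsigned arithmetic wraps in a defined way, the two forms agree even for inputs where the intermediate products no longer fit in 32 bits; for example, both return 36720 for Count = 255.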

The second meaning is the actual, low-level implementation: the exact sequence of machine code instructions. A compiler is allowed to reorder them and shift code around freely, possibly even removing some of it completely, as long as the result on the virtual machine C implements is the same. In other words: as long as the abstract behavior, as observed from within that piece of code, is the same. For a 64-bit variable a and an 8-bit unsigned Count, a compiler is allowed to write it as below (assuming I haven’t made some mistake ;)):
Code: [Select]
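a = Count;
b = a;
prepare_stack_for_calling_some_function();
a *= 74636;   /* operands of the first multiplication swapped */
a /= 1024;    /* division instead of a shift; same result for non-negative a */
call_some_other_function();
a *= b;
a >>= 15;
cleanup_call_to_other_function();
a *= b;
RetVal = a;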
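One limit on such rearrangement is worth spelling out: each right shift discards low-order bits, so a compiler cannot simply merge the three multiplications into a single Count³ term followed by one shift, because the result would differ. A small sketch (function names are made up here, and 64-bit types are used so that nothing overflows):

```c
#include <stdint.h>

/* Original order: a shift after each of the first two multiplications. */
uint64_t staged(uint64_t Count)
{
    return ((((74636 * Count) >> 10) * Count) >> 15) * Count;
}

/* Algebraically merged form: Count^3 * 74636 / 2^25, truncated once. */
uint64_t merged(uint64_t Count)
{
    return (74636 * Count * Count * Count) >> 25;
}
```

For Count = 255, staged() gives 36720 and merged() gives 36882, so the two are observably different and an optimizer has to preserve the staged result.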

The third meaning is how it is actually executed. I can’t speak for any ARM M0+ processor specifically, but for example on x86_64 architectures you should expect that the processor may reorder instructions itself if that doesn’t affect the abstract outcome as seen from the perspective of the virtual machine the CPU is implementing. That may affect timings and cache contents.
Title: Re: c compiler required to obey parenthesis or not
Post by: Nominal Animal on December 01, 2020, 04:46:46 pm
Quote from: snarkysparky
"It is crucial that the compiler obey the ordering of my parentheses to prevent overflow."
Generally speaking, the C compiler does.  In ambiguous cases, you can enforce that by using numeric casts instead of plain parentheses: since C99, a cast to a numeric type, i.e. (type)(expression) or (type)value, limits the value or expression to the range and precision of that type.  The compiler can still generate whatever code it wants, but it must do so in a way that that part of the expression is limited to that range and precision.
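A minimal sketch of using a cast to pin down the type of an intermediate value: here the constant is cast to 64 bits so that every multiplication is carried out in 64-bit arithmetic. The function name and the choice of a 16-bit Count are assumptions for this illustration, not something stated in the thread.

```c
#include <stdint.h>

/* The cast forces the first product to be computed as uint64_t;
   a then carries that width through the remaining steps.
   With a 16-bit Count no intermediate value can exceed 64 bits. */
uint64_t cubic_term_wide(uint16_t Count)
{
    uint64_t a = (uint64_t)74636 * Count;   /* 64-bit multiply */
    a = (a >> 10) * Count;
    a = (a >> 15) * Count;
    return a;
}
```

On a Cortex-M0+ the 64-bit multiplications are noticeably slower than 32-bit ones, so this trades speed for range; it is shown only to illustrate what the cast guarantees.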

Note that given numeric arguments to an arithmetic operation (+ - * / %), the compiler is required to do the operation at the precision and range of the operands (after the type promotions dictated by the standard).  It is not allowed to willy-nilly do it at a lower precision or range.
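The promotion rules can themselves be a trap with narrow types. The sketch below assumes a platform with a 32-bit int: two uint16_t operands are promoted to (signed) int before multiplying, so 65535 * 65535 would overflow unless one operand is cast to an unsigned 32-bit type first. The function name is invented for this example.

```c
#include <stdint.h>

/* Without the cast, a and b would both be promoted to int and the
   product of two large values could overflow a signed 32-bit int.
   Casting one operand makes the multiplication unsigned 32-bit. */
uint32_t mul_u16(uint16_t a, uint16_t b)
{
    return (uint32_t)a * b;
}
```

With the cast in place, mul_u16(65535, 65535) yields 4294836225, which fits in 32 unsigned bits.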

In your case,
    RetVal =  ((((74636 * Count)>>10) * Count)>> 15) * Count;
will be evaluated in the order indicated by the parentheses.  (The reasons are language-lawyerism, and deal with the model used in the C standard, and how changing the order would reduce the range unexpectedly.  For the same reason, you can expect the multiplication to be done first for a*b/c.  For expressions involving function calls or the &&, ||, comma and ?: operators, sequence points add a bit of complexity on top.)
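The a*b/c case can be seen directly with small integers. A sketch, with function names chosen only for this example:

```c
/* Parsed as (a * b) / c, because * and / associate left to right. */
unsigned scale(unsigned a, unsigned b, unsigned c)
{
    return a * b / c;
}

/* Explicit parentheses force the division first, which truncates
   b / c before the multiplication. */
unsigned scale_div_first(unsigned a, unsigned b, unsigned c)
{
    return a * (b / c);
}
```

scale(7, 3, 2) gives 10 while scale_div_first(7, 3, 2) gives 7, so the grouping is part of the meaning of the expression, not a hint the compiler may ignore.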

In short, if one writes a + b, the compiler is expected to generate code that yields the correct result, even if it had some instruction that worked for a smaller range of values than what a and b can represent.  Most compilers do provide options, like GCC's -funsafe-math-optimizations (affecting floating-point arithmetic), that do allow some subset of such standard-breaking optimizations, but they are not normally enabled.
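Floating-point addition shows why such options are off by default: regrouping changes the result. A sketch, assuming IEEE 754 double arithmetic and a build without any fast-math style options; the function names are made up for this example.

```c
/* Sum grouped from the left: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Sum grouped from the right: a + (b + c). */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e20, b = -1e20 and c = 1.0, sum_left returns 1.0 but sum_right returns 0.0, because 1.0 is lost when added to -1e20. A compiler that reassociated freely would silently switch between the two.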

On an architecture with a 32-bit int, and with Count being an unsigned int (so that, by the usual arithmetic conversions, the whole expression is evaluated as 32-bit unsigned), the first multiplication overflows for Count>=57546.  However, the second multiplication already overflows for Count>=7677.  The third won't overflow for Count<=7676, so on such architectures, 0 <= Count <= 7676 is guaranteed to not overflow for that expression.
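Those limits can be checked by brute force, redoing the arithmetic in 64 bits and testing whether each intermediate product still fits in 32 unsigned bits. This is only a sketch; the function names are invented for the example.

```c
#include <stdint.h>

/* Returns 1 if every intermediate product of the expression fits in
   an unsigned 32-bit value for this Count, 0 otherwise. */
int fits_in_u32(uint64_t Count)
{
    uint64_t a = 74636 * Count;       /* first multiplication */
    if (a > UINT32_MAX)
        return 0;
    a = (a >> 10) * Count;            /* second multiplication */
    if (a > UINT32_MAX)
        return 0;
    a = (a >> 15) * Count;            /* third multiplication */
    return a <= UINT32_MAX;
}

/* Largest Count such that every value in [0, Count] is safe. */
uint32_t max_safe_count(void)
{
    uint32_t c = 0;
    while (fits_in_u32((uint64_t)c + 1))
        ++c;
    return c;
}
```

max_safe_count() returns 7676, matching the range given above.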