EEVblog Electronics Community Forum

Products => Computers => Programming => Topic started by: snarkysparky on November 05, 2020, 02:49:23 pm

Title: Using ampersand to take address with space or not
Post by: snarkysparky on November 05, 2020, 02:49:23 pm
I found this in some source code.

        /* Set the vector table base address */
        pSrc = (uint32_t *) & _sfixed;

There is a space between the & and _sfixed. Does the compiler still understand that it is taking the address, or could it treat the & as a bitwise AND?

Thanks
Title: Re: Using ampersand to take address with space or not
Post by: retiredfeline on November 05, 2020, 03:15:58 pm
There is no ambiguity, space or not. A cast on its own, such as (uint32_t *), is not a valid operand of a bitwise AND.
Title: Re: Using ampersand to take address with space or not
Post by: SiliconWizard on November 05, 2020, 03:58:20 pm
Operators in C are not defined in terms of the whitespace around them (if you can find a valid counter-example, please post it). Keep in mind that an operator in C can be several characters long, so "&=" is obviously not equivalent to "& =". There is no rule in C (AFAIK) that requires a unary or binary operator to form a single token with its operands (that is, with no separators in between) in order to be valid. Prefixes and suffixes are a different matter, but they are not operators per se, and the unary "&" operator is not a prefix.

Here, none of ")&", "&_" or even ")&_" is a C operator, so even without ANY whitespace around the "&", it compiles fine. Conversely, adding whitespace won't change the meaning.
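
For anyone who wants to check this on their own compiler, here is a minimal sketch. The names sfixed_stand_in, addr_spaced, addr_tight, addr_newline and check_spacing are made up for illustration; sfixed_stand_in merely stands in for the linker symbol _sfixed from the original post. It takes the same address with three different spacings and asserts that the results are identical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the linker-provided symbol _sfixed. */
static uint32_t sfixed_stand_in;

/* Spaces around the ampersand: still a cast applied to unary &. */
static uint32_t *addr_spaced(void)
{
    return (uint32_t *) & sfixed_stand_in;
}

/* No whitespace at all around the ampersand. */
static uint32_t *addr_tight(void)
{
    return (uint32_t *)&sfixed_stand_in;
}

/* Even line breaks between the tokens change nothing. */
static uint32_t *addr_newline(void)
{
    return (uint32_t *)
           &
           sfixed_stand_in;
}

/* All three spellings must yield the same pointer. */
static void check_spacing(void)
{
    assert(addr_spaced() == addr_tight());
    assert(addr_tight() == addr_newline());
}
```

Calling check_spacing() from a main() should pass silently; none of the three variants is parsed as a bitwise AND.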

This kind of reminds me of a recent thread, in which a "+" was taken as part of a constant's exponent instead of as an operator. Different context, but yet another example of why whitespace is not the criterion for determining operators.
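
The exact code from that thread is not reproduced here, so the following is only a sketch of the kind of case presumably meant, with made-up function names. A hex constant ending in E, directly followed by + or -, is lexed as a single preprocessing number, because "E+" looks like the start of an exponent. This is a tokenization rule, not a whitespace rule for operators, but it is one of the few places where adding a space changes the result.

```c
#include <assert.h>

/* With spaces, "0xE + 1" is three tokens: the constant 14, binary +, and 1. */
static int hex_e_plus_one_spaced(void)
{
    return 0xE + 1;
}

/* Parentheses separate the tokens just as well as spaces do. */
static int hex_e_plus_one_paren(void)
{
    return (0xE)+1;
}

/* A constant ending in F has no such problem: "F+" is not an exponent. */
static int hex_f_plus_one_tight(void)
{
    return 0xF+1;
}

/*
 * The tight form with E is deliberately left commented out. "0xE+1" is one
 * preprocessing number, which is not a valid constant, so a conforming
 * compiler is expected to reject it:
 *
 *     int broken = 0xE+1;
 */

static void check_hex_constants(void)
{
    assert(hex_e_plus_one_spaced() == 15);
    assert(hex_e_plus_one_paren() == 15);
    assert(hex_f_plus_one_tight() == 16);
}
```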

Point is: as retiredfeline said, the "&" here can't be an infix, binary operator in this context, because (uint32_t *) is not a valid left operand for such an operator. But for that to be true, the compiler has to know that (uint32_t *) is a cast and not a value.
Title: Re: Using ampersand to take address with space or not
Post by: retiredfeline on November 05, 2020, 09:25:32 pm
A very very long time ago, so long ago that many people don't even know about it, the assignment operators in C were =-, =+, and so forth. As you realise, this caused problems with expressions like a=-2; it could be that the author meant to assign -2 to a but longest token lexing made it =-. A change to the current forms -=, +=, and so forth fixed that.
Title: Re: Using ampersand to take address with space or not
Post by: SiliconWizard on November 06, 2020, 02:44:14 pm
Quote from: retiredfeline on November 05, 2020, 09:25:32 pm
A very very long time ago, so long ago that many people don't even know about it, the assignment operators in C were =-, =+, and so forth. As you realise, this caused problems with expressions like a=-2; it could be that the author meant to assign -2 to a but longest token lexing made it =-. A change to the current forms -=, +=, and so forth fixed that.

Are you sure? If so, would you have any reference of that?

This sounds odd, as C comes from the B language, and those compound operators already existed in B, in their current form:
https://www.thinkage.ca/gcos/expl/b/manu/manu.html#Section6_6
Title: Re: Using ampersand to take address with space or not
Post by: Rick Law on November 06, 2020, 07:12:13 pm
Quote from: retiredfeline on November 05, 2020, 09:25:32 pm
A very very long time ago, so long ago that many people don't even know about it, the assignment operators in C were =-, =+, and so forth. As you realise, this caused problems with expressions like a=-2; it could be that the author meant to assign -2 to a but longest token lexing made it =-. A change to the current forms -=, +=, and so forth fixed that.

Hence "when in doubt, add parentheses" is a good habit to keep. "a=(-2)" would be clearer than "a = -2", particularly with these "proportional space font" editors.
Title: Re: Using ampersand to take address with space or not
Post by: golden_labels on November 07, 2020, 02:45:45 am
Quote from: SiliconWizard on November 06, 2020, 02:44:14 pm
Are you sure? If so, would you have any reference of that?

It was like that in K&R C. See sections 7.14.2 – 7.14.11 of the “C Reference Manual” as published by Ritchie in 1975.

However, even in K&R C the only ambiguity would be of the same kind as reading the prefix decrement operator (--) as two numeric negations (-). The manual does not explicitly state that the parser should be greedy, but the way the syntax is expressed makes a non-greedy approach impossible. Therefore a=+b must be interpreted as a, followed by =+, followed by b, and not as a, =, (+b). That is of course under the assumption that C of that time had a unary +, which it did not.
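
To see how this plays out with a current compiler, here is a small sketch (all function names are made up for illustration). Since "=-" is no longer a token in modern C, "a=-2" assigns -2, while greedy, longest-token lexing is still visible in cases like "a+++b".

```c
#include <assert.h>

/* Modern C: "a=-2" is lexed as a, =, -, 2, so a becomes -2. */
static int assign_minus_two(void)
{
    int a = 10;
    a=-2;
    return a;
}

/* The current compound form: "a-=2" subtracts 2 from a. */
static int subtract_two(void)
{
    int a = 10;
    a-=2;
    return a;
}

/* Longest-token lexing: "a+++b" is a, ++, +, b, i.e. (a++) + b. */
static int greedy_plus(int a, int b)
{
    return a+++b;
}

/* With a space, "- -" is two negations; without it, "--" is a decrement. */
static int minus_minus(int a, int b)
{
    return a - -b;
}

static void check_lexing(void)
{
    assert(assign_minus_two() == -2);
    assert(subtract_two() == 8);
    assert(greedy_plus(1, 2) == 3);
    assert(minus_minus(5, 2) == 7);
}
```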

Quote from: Rick Law on November 06, 2020, 07:12:13 pm
Hence "when in doubt, add parentheses" is a good habit to keep. "a=(-2)" would be clearer than "a = -2", particularly with these "proportional space font" editors.

That is a good habit if there is ambiguity: either because the language introduces one, or because the reader is likely to misinterpret the expression, due to not knowing the language well enough or to other popular languages using a similar syntax with different precedence/associativity rules. Nowadays there is no such ambiguity in C, and even in the 70s using spaces was sufficient to make the expression unambiguous. Using too many parentheses is as harmful as using too few.
Title: Re: Using ampersand to take address with space or not
Post by: brucehoult on November 07, 2020, 04:17:11 am
Quote from: SiliconWizard on November 05, 2020, 03:58:20 pm
Point is: as retiredfeline said, the "&" here can't be an infix, binary operator in this context, because (uint32_t *) is not a valid left operand for such an operator. But for that to be true, the compiler has to know that (uint32_t *) is a cast and not a value.

More than that, the compiler has to know that "uint32_t *" is a type, which means it has to know that "uint32_t" is a type.

Code:
int uint32_t = 42;
pSrc = (uint32_t *) & _sfixed;

In this case, "(uint32_t *)" has to be a syntax error as the multiply has no right hand operand.

It is one of the big warts in C syntax that the parser must have access to the symbol table to understand many constructs. And the human programmer also.
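
A compilable variation on the same idea, as a sketch only: word_t, target, as_cast and as_bitand are hypothetical names, not taken from the original source. The token sequence "(word_t) & x" is a cast of an address when word_t names a type, and a bitwise AND when an inner declaration has turned word_t into a variable. Only the symbol table tells the two apart.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical typedef: at file scope, word_t names a type. */
typedef uintptr_t word_t;

/* Hypothetical object whose address gets taken. */
static int target;

/* Here word_t is a type, so "(word_t) & target" casts the address of target. */
static uintptr_t as_cast(void)
{
    return (word_t) & target;
}

/* Here word_t is a variable, so "(word_t) & value" is a bitwise AND. */
static uintptr_t as_bitand(uintptr_t value)
{
    uintptr_t word_t = 0xF0;    /* inner declaration hides the typedef */
    return (word_t) & value;    /* 0xF0 AND value */
}

static void check_symbol_table_dependency(void)
{
    assert(as_cast() == (uintptr_t)&target);
    assert(as_bitand(0x3C) == 0x30);
}
```

Shadowing a typedef name like this is legal C, though obviously not something to do in real code; it is used here only to keep both readings in one file.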
Title: Re: Using ampersand to take address with space or not
Post by: Syntax Error on November 09, 2020, 05:59:59 pm
At a glance, this suggests to me that _sfixed has overloaded types, and the one that ends up in pSrc is of type uint32_t. So maybe the compiler just discards the space character? What type or types is _sfixed defined as?
Title: Re: Using ampersand to take address with space or not
Post by: SiliconWizard on November 10, 2020, 04:04:55 pm
Quote from: brucehoult on November 07, 2020, 04:17:11 am
More than that, the compiler has to know that "uint32_t *" is a type, which means it has to know that "uint32_t" is a type. [...] It is one of the big warts in C syntax that the parser must have access to the symbol table to understand many constructs. And the human programmer also.

This definitely reminds us of this earlier thread with a small "fight" about parsing C.

One thing both cases show is that a "proper" (read: "ok for general use") C parser must be "biased" towards assuming that the source code it parses IS correct: if there is any "doubt", it must prioritize correct constructs over potential syntax errors. Otherwise there would be so many potentially erroneous constructs that programming in C would be a living hell.

In this example, we haven't even explored all possible interpretations:
In "(uint32_t *) & _sfixed", as you said, the "*" operator *could* be a multiply. Sure, since it has a type as its left-hand operand and no right-hand operand, it's more likely to be part of the type expression "uint32_t *". But it all comes down to this: "more likely". For all you know, it could be a double mistake from the programmer, who might have meant a multiply, with one bogus operand and one missing operand. Here, if you don't have access to the symbol table, you may still choose to assume "uint32_t" is a type, as anything else would make the expression invalid, and check later on whether it really is.

This bias is similar to the problem we saw in that other thread (with hex constants): there must be a priority for possibly correct constructs over possible syntax errors. And yes, again, it's not nearly as easy as it may seem.