EEVblog Electronics Community Forum

Products => Computers => Programming => Topic started by: peter-h on October 24, 2022, 08:46:03 am

Title: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 24, 2022, 08:46:03 am
This seems to be the standard construct

Quote
uint32_t fred;
uint8_t buf[512];
fred = *(volatile uint32_t*) &buf[24];

The 32 bit integer fred is stored in buf[24] to buf[27];

I know this is endian-dependent but let's forget that for now. If you wanted endian-proof code you would do something like

Code: [Select]
fred = buf[24]|(buf[25]<<8)|(buf[26]<<16)|((uint32_t)buf[27]<<24);

which runs fast enough on a CPU with a barrel shifter.

I've been programming in C for 2-3 years now and have written a lot of code, which works 100%, but I avoid stuff I can't understand, and in this case I don't get the thinking behind why

Code: [Select]
*(volatile uint32_t*)

is needed.

I use that construct all over the place e.g. flash programming in a boot loader, RAM tests, etc. Obviously it works.

I suppose one could have also used

Code: [Select]
memcpy (&fred, &buf[24],4);

 :)

Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: Nominal Animal on October 24, 2022, 09:53:03 am
If you wanted endian-proof code you would do something like

Code: [Select]
fred = buf[24]|(buf[25]<<8)|(buf[26]<<16)|((uint32_t)buf[27]<<24);

which runs fast enough on a CPU with a barrel shifter.
Wrong conclusion.  Your C compiler will take that expression, and optimize it to a single load or a single load with byte swaps if possible (if it can verify the alignment is sufficient for 32-bit loads), and only when it cannot, will it compile to single-byte loads.

The above pattern is ubiquitous, and therefore well recognized by both GCC and Clang on all architectures where it can be simplified to an unaligned 32-bit load.
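
For illustration, a minimal sketch of that pattern wrapped in a helper. The name load_le32 is made up for this example, and the casts to uint32_t are an addition so that the top byte is never shifted into the sign bit of a promoted int:

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from any byte address.
   Each byte is widened to uint32_t before shifting, so no shift can
   overflow a promoted (signed) int. */
static inline uint32_t load_le32(const unsigned char *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Usage would be something like fred = load_le32(&buf[24]); and the result does not depend on the byte order of the CPU or on the alignment of buf.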

An alternate option is to use two aligned 32-bit loads, and rotate the target 32-bit word into the upper or lower one.  This is rarer, and not easily optimized by current C and C++ compilers to optimal code, so this approach is usually only done for bit streams:
Code: [Select]
// Little-endian byte order and bit indexing, 32-bit architecture
uint32_t  get_u32c(const void *buf, uint32_t  bit_offset)
{
    const uint32_t  w0 = *(uint32_t *)(((uintptr_t)buf + bit_offset/8) & (~(uintptr_t)3)),
                    w1 = *(uint32_t *)(((uintptr_t)buf + bit_offset/8 + 4) & (~(uintptr_t)3));
    const uint_fast8_t  shift = (8 * (uintptr_t)buf + bit_offset) & 31;
    /* shift == 0 must be special-cased: shifting a 32-bit value by 32 is undefined */
    return shift ? (w0 >> shift) | (w1 << (32 - shift)) : w0;
}

// Little-endian byte order and bit indexing, 64-bit architecture
// (only correct when all 32 bits lie within one aligned 64-bit word)
uint32_t  get_u32d(const void *buf, uint32_t  bit_offset)
{
    const uint64_t  w = *(const uint64_t *)(((uintptr_t)buf + bit_offset/8) & (~(uintptr_t)7));
    return (uint32_t)(w >> ((8*(uintptr_t)buf + bit_offset) & 63));
}

A third option is to copy the data to an aligned unit:
Code: [Select]
uint32_t  get_u32b(const void *buf)
{
    union {
        uint32_t  u32;
        unsigned char  c[4];
    } result = { .c = { ((const unsigned char *)buf)[0],
                        ((const unsigned char *)buf)[1],
                        ((const unsigned char *)buf)[2],
                        ((const unsigned char *)buf)[3] } };
    return result.u32;
}
While this is not as common as the or-of-shifted-bytes one, GCC and Clang do optimize the last one to a single load on x86-64 (which does allow unaligned 32-bit loads).

You can explore all four versions here at Compiler Explorer (https://godbolt.org/z/PGhb5Tdh7) (includes the source of above).
You don't really need to be very assembler-savvy; comparing the number of instructions in the different variants on your preferred architecture and compiler is sufficient.

I don't get the thinking behind why
Code: [Select]
*(volatile uint32_t*)
is needed.
It tells the compiler that it is not allowed to deduce from surrounding code what the data in the buffer is, or ought to be.

It is necessary when the data in the buffer might be modified by something that the compiler does not see/cannot deduce is happening, for example an interrupt, DMA transfer, or hardware state changes.
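
A minimal sketch of the kind of code where this matters, assuming a status word that an interrupt handler or a DMA engine updates. The helper name wait_for_bits and its parameters are made up for this example:

```c
#include <stdint.h>

/* Hypothetical helper: poll a status word that something outside the
   compiler's view (interrupt, DMA, hardware) may change.
   Returns 1 as soon as any bit in mask is set, 0 after max_tries reads. */
static inline int wait_for_bits(const volatile uint32_t *status,
                                uint32_t mask, uint32_t max_tries)
{
    while (max_tries--) {
        if (*status & mask)   /* volatile: a fresh read on every iteration */
            return 1;
    }
    return 0;
}
```

Without the volatile qualifier the compiler would be free to read *status once, keep the value in a register, and turn the loop into a single test.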
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 24, 2022, 12:31:26 pm
Interesting - thanks.

I did know what "volatile" does but I have seen that construct used all over the place where there is no realistic/meaningful possibility of the source memory being modified by another process. For example ST use it for CPU FLASH programming. But I guess there may be a reason there because one is writing to a memory address which can change after being programmed!

So
Code: [Select]
fred = *(uint32_t*) &buf[24];
would have done the same job.

It sounds like
Code: [Select]
memcpy (&fred, &buf[24],4);
would also get optimised by the compiler into some simple code.
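
A minimal sketch of the memcpy variant as a helper (the name load_u32 is made up for this example). It reads the value in the CPU's native byte order, like the pointer cast does, but without any alignment or aliasing assumptions:

```c
#include <stdint.h>
#include <string.h>

/* Copy four bytes from any address into a properly aligned uint32_t.
   Byte order is whatever the CPU uses natively. */
static inline uint32_t load_u32(const void *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

Usage would be fred = load_u32(&buf[24]); which should be checked on Compiler Explorer for the target in question rather than taken on trust.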

I've done decades of assembler but never really done it on the arm32.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: magic on October 24, 2022, 12:39:26 pm
This seems to be the standard construct
I doubt it is even standard compliant, unless uint8_t happens to be equivalent to char, and it doesn't need to be.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: rstofer on October 24, 2022, 01:55:02 pm
I did know what "volatile" does but I have seen that construct used all over the place where there is no realistic/meaningful possibility of the source memory being modified by another process. For example ST use it for CPU FLASH programming. But I guess there may be a reason there because one is writing to a memory address which can change after being programmed!

I thought the 'volatile' part of the declaration prevented the compiler from optimizing away writes to memory/registers without subsequent reads.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: langwadt on October 24, 2022, 02:01:12 pm
Interesting - thanks.

I did know what "volatile" does but I have seen that construct used all over the place where there is no realistic/meaningful possibility of the source memory being modified by another process. For example ST use it for CPU FLASH programming. But I guess there may be a reason there because one is writing to a memory address which can change after being programmed!

volatile also tells the compiler to do the write even when it thinks it isn't needed
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: DavidAlfa on October 24, 2022, 02:10:48 pm
But how are you ensuring buf[24] is 32-bit aligned?
Unless you tell the compiler to align buf, it could be placed at any address; casting a byte address to a 32-bit pointer can then cause a misaligned access, which triggers an exception on some systems.
The only safe way I can think of would be:

__attribute__((aligned(4))) uint8_t buf[512];

That way you could cast buf[0, 4, 8, 12, 16, 20, 24...] as int32 safely.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: Nominal Animal on October 24, 2022, 02:30:50 pm
unless uint8_t happens to be equivalent to char, and it doesn't need to be.
It just happens to be equivalent to unsigned char on all architectures that GCC, Clang, et al. currently support.  Which is the reason it is so often used as equivalent to unsigned char, the char type having special provisions regarding storage representation in the C standard.

I did know what "volatile" does but I have seen that construct used all over the place where there is no realistic/meaningful possibility of the source memory being modified by another process.
Well, it is possible for the contents of the Flash memory to be changed in an interrupt context, or even due to a hardware fault or a cosmic ray (which is just radiation not absorbed by the atmosphere).

If code wants to be bulletproof, i.e. shield even against the unrealistic/nonsensical situation where the contents indeed are modified concurrently, it uses the (*(volatile type*)pointer) idiom.

Just because you and I do not see a realistic/meaningful possibility of the memory being modified in unseen ways, does not mean there is none such.  ;)

Besides, it costs nothing.  It is like the way I like to use static inline for accessor functions, and static for local functions, even though I know perfectly well the two are exactly equivalent: static inline is exactly as likely to be inlined or not as plain static is.  In some cases, it is used as a "cognitive load reducer" for those reading the code.

I can imagine being a vendor library developer, and sprinkling volatile even in places where not strictly required (because the compiler will have no ancillary knowledge to avoid the access anyway), just to avoid questions like "why isn't there a volatile here? I think it is causing a bug in my code" because they do not understand the true meaning of the volatile keyword.

I thought the 'volatile' part of the declaration prevented the compiler from optimizing away writes to memory/registers without subsequent reads.
Technically,
Quote from: C standard
Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.

In for example C11 footnote 111, regarding assignments, the standard says "The implementation is permitted to read the object to determine the value but is not required to, even when the object has volatile-qualified type."

So, thinking that volatile forces the compiler to do the access is not actually correct; it only forces the compiler to generate code that behaves strictly according to the rules of the C standard abstract machine.

It is only on current hardware architectures, including all architectures supported by GCC, Clang, Intel Compiler Collection, etc., that the only meaningful way to ensure the rules of the C standard abstract machine are followed, is to ensure that
    *(volatile type *)pointer;
and
    expression = *(volatile type *)pointer;
generates code that explicitly loads a value of type type from pointer, and that
    *(volatile type *)pointer = expression;
generates code that explicitly stores the value of the expression at pointer.

Neither is guaranteed by the C standard.  Both are just practical results on how the C abstract machine can be implemented in current architectures.
If one had followed e.g. LKML in the last decade or two, one would remember several threads on how speculative execution and compiler optimizations affect this.  Currently, there are several details where all the above compilers agree and produce effectively the same code, even though the C standard says (or can be interpreted as saying that) the Behaviour is Undefined; just because the compiler users managed to convince the compiler developers that there was a single sane useful use case, and that practical use case overrides any committee-sitters' opinions.

This is exactly why I do value the C standard, but believe the practical reality overrides the theory outlined by the standard.  (Indeed, in the past, up to C99, the standard only codified existing behaviour agreed upon by multiple compilers; it was only in C11 that we got Annex K and "new stuff" not implemented by any compiler, because of commercial interests by a single company.)

I do not mean to imply in any way that understanding the C standard would not be useful, because it definitely is, and I believe is quite important for anyone writing any kind of portable code, or code compiled with anything except a specific version of a specific compiler.  I just do not think it is the last word on anything: the last word is the actual tools we use, the practical real world.  I call those who do believe the standard is or should be the last word language-lawyers, and it is an unfair and derisive term, but as I see it, C is and has to be a practical tool and not a theoretical one, because it is still the closest thing we have to a good embedded and systems programming language.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: Nominal Animal on October 24, 2022, 02:44:45 pm
But how are you ensuring buf[24] is 32-bit aligned?
Unless you tell the compiler to align buf, it could be placed at any address; casting a byte address to a 32-bit pointer can then cause a misaligned access, which triggers an exception on some systems.
The only safe way I can think of would be:

__attribute__((aligned(4))) uint8_t buf[512];

That way you could cast buf[0, 4, 8, 12, 16, 20, 24...] as int32 safely.
I prefer making buffers with alignment requirements the native type, i.e. uint32_t buf[512/4]; here.  (That is, because the data is accessed as uint32_t in a key point, not because I assume uint32_t is 32-bit aligned; it may not be.)

To access any byte/char in the buffer, one can always use ((unsigned char *)buf)[index] (to access a value) or ((unsigned char *)buf + byte_offset) (to get a pointer to a specific byte/char).  So, it is not a limitation at all, but does affect how one intuitively thinks about the buffer.
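
A minimal sketch of that arrangement, assuming a 512-byte buffer. The names sector, sector_byte and sector_word are made up for this example:

```c
#include <stddef.h>
#include <stdint.h>

/* The buffer is declared as the type it is accessed as at the key point. */
static uint32_t sector[512 / 4];

/* Pointer to an individual byte; unsigned char may alias any object. */
static inline unsigned char *sector_byte(size_t byte_offset)
{
    return (unsigned char *)sector + byte_offset;
}

/* Whole 32-bit word; byte_offset is assumed to be a multiple of 4. */
static inline uint32_t sector_word(size_t byte_offset)
{
    return sector[byte_offset / 4];
}
```

With this, the OP's access becomes fred = sector_word(24); and no pointer cast to a wider type is needed anywhere.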

The reason for my preference is that it seems that compilers can more efficiently optimize access to the array members this way.  That is, they seem to generate better code, at least for the use cases I've tried.

Again, Compiler Explorer aka godbolt.org is an excellent resource for this, because it lets one explore the exact machine code different versions and different compilers generate for specific architectures with specific compiler options.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: newbrain on October 24, 2022, 02:59:31 pm
unless uint8_t happens to be equivalent to char, and it doesn't need to be.
It just happens to be equivalent to unsigned char on all architectures that GCC, Clang, et al. currently support.  Which is the reason it is so often used as equivalent to unsigned char, the char type having special provisions regarding storage representation in the C standard.
I know your stance about the standard, but as an answer to magic, the standard is what actually mandates the equivalence between uint8_t and unsigned chars.
It's not that it "just happens" to be equivalent...

Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: newbrain on October 24, 2022, 03:28:28 pm
Technically,
Quote from: C standard
Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.

In for example C11 footnote 111, regarding assignments, the standard says "The implementation is permitted to read the object to determine the value but is not required to, even when the object has volatile-qualified type."

So, thinking that volatile forces the compiler to do the access is not actually correct; it only forces the compiler to generate code that behaves strictly according to the rules of the C standard abstract machine.
I beg to differ here.
Footnote 111) needs to be read in the very specific context it is placed:
Quote from: C11 standard, 6.5.16 Assignment operators, §3
An assignment operator stores a value in the object designated by the left operand. An assignment expression has the value of the left operand after the assignment, 111) but is not an lvalue.
So, what the footnote is saying is that the abstract machine need not reread the value of a volatile lvalue to determine the value of the assignment expression.
This reading makes sense, because the text is talking about a write in an object (assignment operator), not a read.
As, e.g., in the following:
Code: [Select]
int plain_int;
int volatile volatile_int;
void f(int an_int)
{
    plain_int = volatile_int = an_int*3;
}
The abstract machine/implementation need not read volatile_int (after the mandatory write to it) to get the value to write to plain_int (regardless of whether the write to plain_int happens or not, as it's not volatile).
This can still be "surprising" as volatile_int might be a 'write 1 to clear' register (so the read value would have been different), but your examples have, in my reading, perfectly defined behaviour.

"5.1.2.3 Program execution", §2, guarantees that accessing a volatile object is regarded as a side effect.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 24, 2022, 03:30:15 pm
My CPU (32F417) supports unaligned access transparently, which may be why this works. It's an interesting point though; most of my instances of "buf[512]" are locally on some function stack, and I don't know if it will be aligned. Probably if buf[] is the first local variable declared, it will be, but if I have uint8_t x; before it, it may not be. Someone has just told me that members of structs are aligned on the machine word size (4 bytes in my case), unless "packed". But otherwise I have not worried about alignment, until I came across this
https://www.eevblog.com/forum/programming/packed-attribute-warning/ (https://www.eevblog.com/forum/programming/packed-attribute-warning/)

If one's CPU doesn't support unaligned access then there will be a huge number of gotchas all over the place, presumably.

BTW I think
Code: [Select]
__attribute__((aligned(4))) uint8_t buf[512];

is
Code: [Select]
uint8_t buf[512] __attribute__((aligned(4))) ;
in GCC.

I haven't played with unions but know the idea. I just haven't needed them.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: newbrain on October 24, 2022, 03:42:09 pm
But how are you ensuring buf[24] is 32-bit aligned?
Unless you tell the compiler to align buf, it could be placed at any address; casting a byte address to a 32-bit pointer can then cause a misaligned access, which triggers an exception on some systems.
The only safe way I can think of would be:

__attribute__((aligned(4))) uint8_t buf[512];

That way you could cast buf[0, 4, 8, 12, 16, 20, 24...] as int32 safely.
I would prefer, if C11 or later is used, to write:
Code: [Select]
_Alignas(uint32_t) uint8_t buf[512];
(or using 'alignas' after including <stdalign.h>)

This would guarantee safe alignment of 0,4,8 etc offsets in the array, and non-undefined behaviour casts, and is independent of the actual alignment value of uint32_t.
But, really, I like Nominal Animal proposal better.
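
A minimal sketch of the _Alignas form together with a run-time alignment check. The names aligned_buf and is_aligned_for_u32 are made up for this example:

```c
#include <stdint.h>

/* C11: give the byte array the alignment of uint32_t, whatever that is. */
static _Alignas(uint32_t) uint8_t aligned_buf[512];

/* Returns 1 if p is suitably aligned for a uint32_t access, else 0. */
static inline int is_aligned_for_u32(const void *p)
{
    return ((uintptr_t)p % _Alignof(uint32_t)) == 0;
}
```

This only takes care of alignment. Whether reading a uint8_t array through a uint32_t pointer is otherwise acceptable is the separate aliasing question discussed elsewhere in the thread.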
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: newbrain on October 24, 2022, 03:59:42 pm
Someone has just told me that members of structs are aligned on the machine word size (4 bytes in my case), unless "packed".
No, this is not true.
Though the standard does not impose any requirement that the padding be minimized ("6.7.2.1 Structure and union specifiers", §15, 17) apart from forbidding initial padding, members are in general aligned to their natural alignment.
See here (https://godbolt.org/z/cbPd5Gbsa).

As for arrays of char/(u)int8_t, they have no stricter requirement than a single char, so they can (and will) be misaligned for anything larger, as you correctly say.
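
A minimal sketch that lets the compiler confirm this, using offsetof and C11 _Static_assert. The struct and its member names are made up for this example:

```c
#include <stddef.h>
#include <stdint.h>

/* Example struct, for illustration only. */
struct sample {
    uint8_t  flag;
    uint16_t half;
    uint8_t  tag;
    uint32_t word;
};

/* Each member sits at a multiple of its own alignment requirement,
   which is not necessarily the machine word size. */
_Static_assert(offsetof(struct sample, flag) == 0,
               "no padding before the first member");
_Static_assert(offsetof(struct sample, half) % _Alignof(uint16_t) == 0,
               "half is aligned for uint16_t");
_Static_assert(offsetof(struct sample, word) % _Alignof(uint32_t) == 0,
               "word is aligned for uint32_t");
```

On the usual ARM and x86 ABIs I would expect half to land at offset 2 rather than 4, which is the counterexample to the "aligned on the machine word size" claim.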
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 24, 2022, 05:46:58 pm
Interesting. I found the 512 byte USB MSB buffer which was a member of a struct, nothing aligned, so I put the align attribute on that within the struct; I take it that is allowed.

I am sort of surprised one can't do something like

uint32_t fred;
uint8_t buf[512];
fred = buf[24];

because you can shoot yourself in the foot so easily in C so why can't you shoot yourself in the foot with that? Probably because it is valid but loads just the lowest byte of fred, from buf[24].

How about

fred = &buf[24];
fred = &fred;

That's how it would be done in assembler.

My problem is not understanding pointer notation :) I've seen so many bugs come from that so I avoid them.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: magic on October 24, 2022, 06:08:07 pm
It's not that it "just happens" to be equivalent...

  1. If uint8_t exists, it is exactly 8 bit wide (no padding allowed). "7.20.1.1 Exact-width integer types"
  2. 8 is also the minimum number of bits for any object that is not a bit field, namely CHAR_BIT. "5.2.4.2.1 Sizes of integer types <limits.h>"
  3. CHAR_BIT is also by definition the bit width of a char. "6.2.6 Representations of types"
  4. So, (unsigned) char cannot be larger than a uint8_t (as it would violate 2.) and cannot be smaller (as the minimum for CHAR_BIT is 8).
  5. Hence, if uint8_t exists, it must be equivalent to unsigned char
There is more to a type than bit count.

The reason I care about "char equivalence" is because char is subject to special aliasing rules which make casting to/from any other type always legal. I'm not that much of a language lawyer and not 100% sure if there is any guarantee that uint8_t will work correctly for this purpose if it exists, I always find such use suspicious. Other than char, using differently typed pointers to the same object is generally UB minefield.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: newbrain on October 24, 2022, 06:18:58 pm
Interesting. I found the 512 byte USB MSB buffer which was a member of a struct, nothing aligned, so I put the align attribute on that within the struct; I take it that is allowed.

I am sort of surprised one can't do something like

uint32_t fred;
uint8_t buf[512];
fred = buf[24];

because you can shoot yourself in the foot so easily in C so why can't you shoot yourself in the foot with that? Probably because it is valid but loads just the lowest byte of fred, from buf[24].

How about

fred = &buf[24];
fred = &fred;

That's how it would be done in assembler.

My problem is not understanding pointer notation :) I've seen so many bugs come from that so I avoid them.
The first snippet will do exactly what it says: assign the content of buf[24] (a uint8_t) to fred (a uint32_t) - i.e. fred will contain a value between 0 and 255.

The second snippet is not really meaningful, I imagine you intended something like:
Code: [Select]
fred_p = &buf[24];
fred = *fred_p;
Now what happens depends on the nature of fred_p. If it is declared as a uint32_t * you should get a warning from the compiler (at least, with sane options), and at the end fred will contain a uint32_t value, taken from buf[24..27].
If, instead, it is a uint8_t *, the result will be the same as the first snippet.

EtA: And yes, pointer notation might be a bit confusing at the beginning.
Unary & is the "address of" operator. It eats its operand and spits out its address.
Unary * is the "indirection" operator. It eats its operand (which must be a non-void pointer) and gives you back the pointed-to object.
& and * also have a use as bitwise AND and multiplication, respectively.
But, rejoice, if you were learning C++, you would also have references (using the same & character of the address operator) and r-value references using &&...
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: newbrain on October 24, 2022, 06:38:43 pm
There is more to a type than bit count.
[...]
I always find such use suspicious. Other than char, using differently typed pointers to the same object is generally UB minefield.
Yes, there is.
But not in this case.
uint8_t is not a type, but a typedef, i.e. an alias to an existing type ("7.20.1.1 Exact-width integer types").
That type, given all the constraints above, is bound to not have padding bits, and no trap representations.
It cannot be anything but unsigned char or an alias thereof.
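
For anyone who wants to check what their own toolchain does, a minimal sketch using C11 _Generic. The macro and function names are made up for this example:

```c
#include <stdint.h>

/* Evaluates to 1 if the expression has type unsigned char, else 0.
   A typedef of unsigned char matches, because a typedef is only an alias. */
#define IS_UNSIGNED_CHAR(x) _Generic((x), unsigned char: 1, default: 0)

/* Returns 1 when uint8_t is an alias of unsigned char on this toolchain. */
static inline int uint8_is_unsigned_char(void)
{
    return IS_UNSIGNED_CHAR((uint8_t)0);
}
```

With GCC and Clang and their usual C libraries I would expect this to return 1; it is a check of one implementation, not a proof about the standard.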

Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: ejeffrey on October 24, 2022, 06:56:21 pm
As far as I know, (u)int8_t, if it exists, is always a typedef for unsigned/signed char.  I don't know if there is a way to read the standard where it could be a distinct type that happens to have identical behavior to char, but for sure that is the way it is always actually implemented.  Systems where char is not 8 bits are not allowed to have uint8_t at all.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 24, 2022, 08:25:08 pm
I use & frequently; the pointer notation I still find confusing. Especially the double * at the start of this thread. Pure pointers I never use.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: magic on October 24, 2022, 09:09:06 pm
That type, given all the constraints above, is bound to not have padding bits, and no trap representations.
It cannot be anything but unsigned char or an alias thereof.
But does the standard guarantee that alias analysis will treat a uint8_t pointer the same as char pointers?
Never make a blind assumption that it cannot possibly alias some uint32_t pointer in the same scope?

I don't care about representation compatibility. What I mean is that char is a special type explicitly permitted to be used in the sort of code that has been posted, but uint8_t is not defined to be so.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: coppice on October 24, 2022, 09:39:22 pm
Someone has just told me that members of structs are aligned on the machine word size (4 bytes in my case), unless "packed".
If you think about it, that can't be true. It would allow doubles in structs to be misaligned, which nobody sane wants to happen.

There can be variations between compilers, but you usually get each item packed to its natural boundary if you don't specify the packing options. Beware when using arrays of structs when space is tight, as the waste can really add up. Pack will solve the space, but might degrade performance. Rearranging the order of the items in the struct can often get them packed snugly without the performance hit.
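
A minimal sketch of the reordering trick. Both structs are made up for this example and hold the same three members:

```c
#include <stdint.h>

/* Small members on either side of a large one: padding is usually
   needed after a (to align b) and after c (to round up the size). */
struct loose {
    uint8_t  a;
    uint32_t b;
    uint8_t  c;
};

/* Largest member first, small ones together: less padding needed. */
struct tight {
    uint32_t b;
    uint8_t  a;
    uint8_t  c;
};
```

Comparing sizeof(struct loose) with sizeof(struct tight) on the target shows the saving; on common 32-bit and 64-bit ABIs I would expect 12 bytes against 8, with no packing attribute involved.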
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: newbrain on October 24, 2022, 09:56:36 pm
But does the standard guarantee that alias analysis will treat a uint8_t pointer the same as char pointers?
Yes, as typedef does not introduce new types, only synonyms of existing type.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: newbrain on October 24, 2022, 10:08:38 pm
I use & frequently; the pointer notation I still find confusing. Especially the double * at the start of this thread. Pure pointers I never use.
Oh yes, you absolutely do use pointers.

'buf' is an array, but the name of an array in practically all contexts (Note£) is converted to a pointer to its first element.

When you write buf[24], you are actually using a shorthand for (*(buf+24)) - see also Note# - that is, getting the value of what is pointed to by buf+24. Since buf - in this context - has become a pointer to uint8_t, you are getting the value of the 24th uint8_t after the beginning of buf - see Note¤.

As for the "double" * in the OP:
Code: [Select]
*(volatile uint32_t*) &buf[24];
one '*' is inside the cast.
A cast indicates to the compiler that you want to convert a value of one type to a value of another type.

The value to be converted here is &buf[24] => &(*(buf+24)) => buf+24 (as & and * are like matter and antimatter) so, a pointer to a uint8_t.
The value you want to convert it to is the type inside () in the cast: a pointer (*) to a volatile uint32_t.
So, what the cast expression does is reinterpreting the bits that were a pointer of one kind, to be a pointer of different kind.

Now, you prepend '*' to the cast. '*' is, as said the indirection operator: it takes the pointer value to its right and retrieves the value of the object it's pointing to.
As the value to the right of * is a pointer to volatile uint32_t, what you get is a value of type volatile uint32_t, which you can now store in fred.
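
The same steps, spelled out one per line as a minimal sketch. The names storage and read_word_at_24 are made up for this example, and the backing store is deliberately declared as uint32_t so that the final read has no alignment or aliasing problem:

```c
#include <stdint.h>

static uint32_t storage[512 / 4];   /* backing store, aligned for uint32_t */

static inline uint32_t read_word_at_24(void)
{
    uint8_t *buf = (uint8_t *)storage;          /* byte view of the storage */
    uint8_t *byte_ptr = &buf[24];               /* same as buf + 24 */
    volatile uint32_t *word_ptr =
        (volatile uint32_t *)byte_ptr;          /* the cast: new pointer type */
    return *word_ptr;                           /* the leading '*': the load */
}
```

Collapsing the three intermediate variables gives back the one-liner from the first post.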


Note£ except when the operand of sizeof or & operators. So sizeof buf yields 512, and &buf a pointer to an array of 512 uint8_t .

Note# the [] operator takes two operands, and is commutative, so 24[buf] is in fact correct C and does exactly the same thing as buf[24].

Note¤ Pointer arithmetic in C adds or subtracts in units equal to the size of the object pointed to. So if p is a pointer to uint8_t, when you add 1 you get the address of the next byte in memory, if q is instead a pointer to uint16_t, when you add 1 you get an address two bytes higher than q, etc. etc.
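
The notes above can be checked with a minimal sketch like this one (function names made up for this example):

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when the different spellings of "element 24" agree.
   buf must point to at least 25 initialised elements. */
static inline int same_element(const uint8_t *buf)
{
    return &buf[24] == buf + 24
        && buf[24] == *(buf + 24)
        && buf[24] == 24[buf];
}

/* Distance in bytes between q and q + 1 for a pointer to uint16_t. */
static inline ptrdiff_t u16_step_in_bytes(void)
{
    static const uint16_t pair[2] = { 0, 0 };
    const uint16_t *q = pair;
    return (const char *)(q + 1) - (const char *)q;
}
```

same_element() returns 1 for any valid buffer, and u16_step_in_bytes() returns sizeof(uint16_t), not 1.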
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: magic on October 25, 2022, 05:59:38 am
But does the standard guarantee that alias analysis will treat a uint8_t pointer the same as char pointers?
Yes, as typedef does not introduce new types, only synonyms of existing type.
Okay, I gather you insist that on standard implementations uint8_t must be a proper typedef to one of the standard types, and this leaves only the chars because the ints are guaranteed to be wider.

That sounds fair enough and it probably is true, but as you see it took you some effort to prove. And I assure you that your proof will break down when they introduce short short int in C++31 and C33.

I guess I just don't see any sane reason to use uint8_t here.
If you know it's the same as char, simply use char.
If you require an 8 bit char, static assert on CHAR_BIT.
Most likely, you don't even require it, and with a few sizeof() here and there the code could be made more portable (and readable).
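
A minimal sketch of what that could look like, using plain unsigned char plus a C11 static assertion. The function name store_le32 is made up for this example:

```c
#include <limits.h>
#include <stdint.h>

/* Refuse to compile on a platform where a char is not 8 bits wide. */
_Static_assert(CHAR_BIT == 8, "byte-wise packing below assumes 8-bit chars");

/* Store the low 32 bits of v at p in little-endian byte order. */
static inline void store_le32(unsigned char *p, uint_least32_t v)
{
    p[0] = (unsigned char)( v        & 0xFFu);
    p[1] = (unsigned char)((v >> 8)  & 0xFFu);
    p[2] = (unsigned char)((v >> 16) & 0xFFu);
    p[3] = (unsigned char)((v >> 24) & 0xFFu);
}
```

No uint8_t appears anywhere, and the assumption about the width of a char is stated once, where the compiler can enforce it.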

I still don't like this fad.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 25, 2022, 07:09:53 am
I think people use uint8_t to clearly indicate that the value is used as 0-255.

And int8_t as -128 to +127 (or -127 to +127 in a lot of code); I have never used this because I am always "controlling" some hardware and 7 bits is pretty useless. Didn't stop Honeywell designing an autopilot using int8 for the control loop, with interesting results (google kfc225) ;)

Whereas a "char" is just ambiguous. I use it for storing text and stuff like that. It is less typing because a lot of standard functions expect a char and a uint8_t has to be cast with (char*) all over the place.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: DiTBho on October 25, 2022, 07:33:56 am
you insist that on standard implementations uint8_t must be a proper typedef to one of the standard types, and this leaves only the chars because the ints are guaranteed to be wider.

That sounds fair enough and it probably is true, but as you see it took you some effort to prove

Yes, * right * it makes me think it was for good reason that I developed my own C-like language  :D

that C-language stuff is confusing and all rotten.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: newbrain on October 25, 2022, 08:05:22 am
Yes, * right * it makes me think it was for good reason that I developed my own C-like language  :D

that C-language stuff is confusing and all rotten.
Why, then, did you develop a C-like language instead of an Ada-like or Modula2-like language?
(just being a prick, you should know me by now...)
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: DiTBho on October 25, 2022, 02:21:11 pm
Whereas a "char" is just ambiguous

"char" must be banned
casting must be banned, especially casting to/from char
char_t must  be used instead, and restricted ONLY to strings

uint8_t, uint16_t, uint32_t, uint64_t, uint128_t, uint256_t, uint512_t, ... ---> unsigned numbers
sint8_t, sint16_t, sint32_t, sint64_t, sint128_t, sint256_t, sint512_t, ... ---> signed numbers
char_t ---> strings

char_t must be part of the unsigned class
char_t must only have comparing operators { !=, ==, >, <, >=, <= }
char_t must use special uc-functions when you need to add or subtract something
            op={ +, - }           
            f(op, char_t)->(char_t):
                    -> (op,autotype<-sizeof(char_t))
                    -> char_t
            autotype and its inner operator must be automatically managed by the language
            so, char_t can be { ASCII-7bit, ASCII-8bit, something-fancy-16bit, ...}

my-C is very "nazy" about this  :D
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: DiTBho on October 25, 2022, 02:37:59 pm
Why, then, did you develop a C-like language instead of an Ada-like or Modula2-like language?

This way I can recycle old projects without the need to rewrite them from scratch  :D

Practically, everything that is already MISRA{95,2000} & DO178{B,C}-level{A,B} compliant will also be  already 90% compliant with my-C

       raw -> MISRA -> DO178 -> commit: 100% C compliant, 90% my-C compliant

The remaining 10% means ... adapt the subtleties, which is always a pleasure, never a pain like converting a raw C source into a MISRA-C-compliant source

DO178{B,C}-level{A,B} adds other layers of polishing on the top of a MISRA-compliant source
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: SiliconWizard on October 25, 2022, 09:34:38 pm
That's a reasonable rationale, although it obviously limits you. I had done some work on some evolution of C myself, bringing mainly modules and some generics, but I gave up and moved on (so far). In the end, I wasn't so sure it was worth the trouble.

One language I would consider as an inspiration could be Modula-3. The language report is available if you look a little. Pretty neat in a number of aspects. The CM3 project is back to active: https://github.com/modula3/cm3 . As is, the language is interesting but there sure are some things I would change.

Or, a "leanified" version of Ada. Of course, whatever would need to be removed is yet to be defined.

So, we're still back to C. Endlessly. ;D

Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: ejeffrey on October 26, 2022, 06:32:18 am
But does the standard guarantee that alias analysis will treat a uint8_t pointer the same as char pointers?
Yes, as typedef does not introduce new types, only synonyms of existing type.
Okay, I gather you insist that on standard implementations uint8_t must be a proper typedef to one of the standard types, and this leaves only the chars because the ints are guaranteed to be wider.

That sounds fair enough and it probably is true, but as you see it took you some effort to prove. And I assure you that your proof will break down when they introduce short short int in C++31 and C33.

I guess I just don't see any sane reason to use uint8_t here.
If you know it's the same as char, simply use char.
If you require an 8 bit char, static assert on CHAR_BIT.
Most likely, you don't even require it, and with a few sizeof() here and there the code could be made more portable (and readable).

I still don't like this fad.

What fad?  Data types with an explicit well defined size and behavior?  And I don't really get what you are saying about using sizeof() instead?

(u)int8_t is preferable to char in all situations except text.  The biggest reason is that the standard allows char to be signed or unsigned.  Any time the language specifies promotion to int (which is very often) it can be zero or sign extended, it's an implementation/platform choice.  So for instance the  version of the code in the OP that does byte assembly for endian conversion:

Code: [Select]
uint32_t fred = buf[24] | (buf[25]<<8) | (buf[26]<<16) | (buf[27]<<24);

is wrong if buf is defined as char[] instead of uint8_t[]. Yet it will behave correctly on some platforms while failing on others.  This is not some theoretical "allowed by the standard but never exists in practice" detail, ARM usually uses unsigned char and x86/amd64 uses signed.  GCC has an option to change it because a lot of code is written expecting one behavior and gets ported to platforms with the other.

You can write out "signed char" or "unsigned char" everywhere, but (u)int8_t is shorter and more common.  Even for text data people often do comparisons that could depend on the sign behavior, so it would be ideal if all char variables had a defined range, but they don't, and all the standard library string handling functions use char in their signatures.  All basic ASCII characters are positive and integer value comparison is mostly meaningless for non-ASCII data, so it's not a big deal in practice.

If you are strictly using a generic buffer and always casting to another type before manipulating the data, then obviously the type doesn't really matter.  But if you ever might want to treat the data as an array of bytes, it makes sense to use a type that has defined behavior.
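A minimal sketch of the signedness problem described above. It uses explicit signed char and unsigned char buffers so the outcome does not depend on what plain char happens to be on the platform. The helper names are hypothetical, and only the low byte is left unmasked, so the sketch itself never shifts a negative value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 16-bit little-endian load from a signed-char buffer,
   with the low byte handled the "naive" way. */
uint32_t load16_from_signed(const signed char *b)
{
    assert(b != NULL);
    /* b[0] keeps its sign: a byte of -128 (bit pattern 0x80) becomes
       0xFFFFFF80 when converted to uint32_t, polluting the high bits. */
    return (uint32_t)b[0] | ((uint32_t)(unsigned char)b[1] << 8);
}

/* Same load from an unsigned-char buffer: no sign extension can occur. */
uint32_t load16_from_unsigned(const unsigned char *b)
{
    assert(b != NULL);
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8);
}
```

With the bytes 0x80, 0x12 the unsigned version gives 0x1280, while the signed version gives 0xFFFFFF80.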
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: DiTBho on October 26, 2022, 08:46:20 am
is wrong if buf is defined as char[] instead of uint8_t[]. Yet it will behave correctly on some platforms while failing on others.  This is not some theoretical "allowed by the standard but never exists in practice" detail, ARM usually uses unsigned char and x86/amd64 uses signed.  GCC has an option to change it because a lot of code is written expecting one behavior and gets ported to platforms with the other.

Indeed on hppa2 it fails.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 26, 2022, 09:13:16 am
This is all really interesting.

The stdlib string compare funcs, expecting "char", must do weird things for char codes > 127. You will still get the right result for equality but the < or > will be unreliable.

Quote
uint32_t fred = buf[24] | (buf[25] << 8 ) | (buf[26]<<16) | (buf[27]<<24);
is wrong if buf is defined as char[] instead of uint8_t[]. Yet it will behave correctly on some platforms while failing on others. 

That's another lesson for me. I ought to check if all the buffers are uint8_t.

Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: DiTBho on October 26, 2022, 09:26:35 am
That's another lesson for me

People who gain the constant support of others on the forums and then still produce shit code.
That's *THE* problem with opensource.
Ignorant and lazies usually never grow, because they can always gain the constant support of others.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 26, 2022, 10:54:51 am
Quote
People who gain the constant support of others on the forums and then still produce shit code.
That's *THE* problem with opensource.
Ignorant and lazies usually never grow, because they can always gain the constant support of others.

Who are you referring to?

If you mean me, well thank you, but I am working totally alone, with nobody to ask for help. So I use forums quite a bit. I learnt C from almost zero, 2 years ago, after decades of assembler. The fact that my posts often generate a lot of responses indicates that these are real issues which catch people out.

What regular expression would pick up
Code: [Select]
 char * [*]
?
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: Nominal Animal on October 26, 2022, 12:00:23 pm
The stdlib string compare funcs, expecting "char", must do weird things for char codes > 127.
Well, look at the <ctype.h> isclass(code) (https://man7.org/linux/man-pages/man3/isalpha.3.html) functions.  Given char ch, you do NOT call them isclass(ch), you explicitly must use isclass((unsigned char)ch) instead.  So, it's not "weird".  The only reason the interface specifies an int is because they have to work for all character codes, plus EOF.

Besides, the C standard itself says that <string.h> functions "shall interpret [each character] as if it had type unsigned char" (e.g. C11 7.24.1p3).

In other words, practical implementations of current C libraries internally treat char for strings as unsigned char; it's just that the API is fixed to char.
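As a small illustration of the <ctype.h> rule above, here is a sketch of a character-counting loop. The function name count_alpha is made up for this example; only isalpha() is the standard library call.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count alphabetic characters in a string. */
size_t count_alpha(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s; s++) {
        /* Cast to unsigned char first: passing a negative char value
           (other than EOF) to isalpha() is undefined. */
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}
```

The cast costs nothing at run time, and keeps the call defined on platforms where plain char is signed and the text contains codes above 127.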

Quote
uint32_t fred = buf[24] | (buf[25] << 8 ) | (buf[26]<<16) | (buf[27]<<24);
is wrong
, period.  It relies on integer promotions, which is not reliable if buf elements are of a signed type.  If any of them have a negative value, then the high bits of fred will be set, because uintN_t types by definition use twos complement format.

The correct expression, and the one you really, really should be using, is
Code: [Select]
    uint32_t  fred =  (uint32_t)(buf[24])
                   | ((uint32_t)(buf[25]) << 8)
                   | ((uint32_t)(buf[26]) << 16)
                   | ((uint32_t)(buf[27]) << 24);
The reason is twofold:
Feel free to disagree, anyone, but relying on automatic type promotions and incorrect assumptions in C, leads to annoying bugs.  The most common example is the assumption that int and pointers are the same size, and that given void *p, then (void *)((int)p) == p.  This particular bug is common enough that most recognize it in this form, and yet most use int for array index variables, and end up wondering when their code suddenly either locks up (neverending loop) or produces garbage (code loops over only part of the data) given sufficient amounts of data.

In short, I've seen time and time again how avoiding having to write the full expressions leads to annoying bugs.  In the balance, you have known correct expressions on one side, and your preference for brevity on the other.  Do be honest about what really matters to you.



In this and related threads, I've suggested to use the explicit type appropriate for how the data is used.  Here, it means that if you do intend to alias the buffer contents by other types –– i.e., that it actually consists of fields of varying types and sizes (in bytes) ––, you use unsigned char.

unsigned char is a special type. Since C11 (6.2.6.1p3, p4), the standard guarantees that unsigned char is a binary type of CHAR_BIT bits with no padding bits; that it can describe values between 0 and 2^CHAR_BIT - 1, exactly; no more, no less.  Any object of size n bytes can be copied to an array of unsigned char [n] to obtain its object representation.  In other words, the C11 standard says that unsigned char is the type you can use for manipulating the representation of any object.

Now, if the data is consumed in aligned 32-bit units, and most fields do not cross a 32-bit boundary, then the better choice is uint32_t buf[bytes/4]; instead.  Remember that you can use a simple cast expression, ((unsigned char *)buf) to access the same buffer as if you had declared it as unsigned char buf[bytes];, so this is a choice whose indirect effects –– it will be aligned, up to 32-bit access via uint32_t of elements that do not cross a 32-bit boundary is trivial, and so on –– should be considered carefully.
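A short sketch of both points, assuming 8-bit bytes. The helper names roundtrip_u32 and xor_bytes are hypothetical; the first copies an object to its unsigned char representation and back, the second walks a uint32_t buffer through an unsigned char pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy a uint32_t into an unsigned char array (its
   object representation) and rebuild the value from those bytes. */
uint32_t roundtrip_u32(uint32_t value)
{
    unsigned char repr[sizeof value];
    uint32_t copy;

    memcpy(repr, &value, sizeof value);   /* object -> bytes */
    memcpy(&copy, repr, sizeof copy);     /* bytes -> object */
    return copy;
}

/* Hypothetical helper: XOR together all bytes of a uint32_t buffer, viewing
   it through an unsigned char pointer (which may alias any object). */
unsigned char xor_bytes(const uint32_t *words, size_t count)
{
    const unsigned char *p = (const unsigned char *)words;
    unsigned char acc = 0;
    size_t i;

    assert(words != NULL);
    for (i = 0; i < count * sizeof *words; i++)
        acc ^= p[i];
    return acc;
}
```

The XOR result does not depend on byte order, which is why it is used here instead of printing individual bytes.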
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 26, 2022, 01:16:35 pm
Quote
By explicitly casting the data element to a sufficiently large unsigned integer type you ensure the correctness of the result, instead of relying on implicit integer promotions.  Although you could use some other type than the result of the entire expression, using the same type avoids any surprises.

May I clarify that

Code: [Select]
uint32_t  fred =  (uint32_t)(buf[24])
                   | ((uint32_t)(buf[25]) << 8)
                   | ((uint32_t)(buf[26]) << 16)
                   | ((uint32_t)(buf[27]) << 24);

is equivalent to the above without the uint32_t casts if buf is type uint8_t ?

I also tend to AND values with 0xff e.g.

Code: [Select]
                uint32_t fact_size = xxxxxxx
                uint8_t ssa_buf[512];

ssa_buf[4]=(fact_size & 0x000000ff);
ssa_buf[5]=(fact_size & 0x0000ff00) >> 8;
ssa_buf[6]=(fact_size & 0x00ff0000) >> 16;
ssa_buf[7]=(fact_size & 0xff000000) >> 24;

fact_size is a uint32_t but the above shifts should leave all bits other than the byte of interest at zero.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: Nominal Animal on October 26, 2022, 01:35:03 pm
May I clarify
It is not a clarification.  It is an admission that you'd rather make an assumption than write a few more characters.  That's fine; it's perfectly acceptable business decision.  But don't make it sound like it is anything different than an assumption to avoid having to write the longer expressions.

      ssa_buf[4]=(fact_size & 0x000000ff);
Byte masking is only useful if the code will work even when individual elements of ssa_buf can be larger than an 8-bit byte, but still have the same semantics (i.e., the low 8 bits contain the information, even when the array elements themselves are larger than that).

If it does not, then the byte masking is not only superfluous, it is misleading.

(Typically, no extra code will be generated if any optimizations are enabled, so the main drawback is how it can mislead us humans.)
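For comparison, a sketch of the same store written with casts instead of masks, assuming the destination really is an array of uint8_t. The name store_le32 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value as four little-endian bytes. */
void store_le32(uint8_t *dst, uint32_t value)
{
    assert(dst != NULL);
    /* The cast to uint8_t keeps only the low 8 bits (modular conversion),
       so no explicit & 0xff is needed. */
    dst[0] = (uint8_t)(value);
    dst[1] = (uint8_t)(value >> 8);
    dst[2] = (uint8_t)(value >> 16);
    dst[3] = (uint8_t)(value >> 24);
}
```

Called as store_le32(&ssa_buf[4], fact_size), this would write the same four bytes as the masked version.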
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 26, 2022, 01:50:05 pm
Quote
It is not a clarification.  It is an admission that you'd rather make an assumption than write a few more characters.  That's fine; it's perfectly acceptable business decision.  But don't make it sound like it is anything different than an assumption to avoid having to write the longer expressions.

What I was getting at is the definition of C.

If you do

fred = buf[25] << 8

then

- buf[25] is extracted as a uint8_t byte
- promoted to an int32 (on arm32/gcc) - the "integer promotion" thingy
- shifted left 8, so the original byte is now in bits 8-15 of the int32
- cast into the destination type (uint32)
- loaded into the destination


Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: Nominal Animal on October 26, 2022, 02:00:30 pm
Consider this function I often use to parse a command-line argument of type int:
Code: [Select]
#include <stdlib.h>
#include <string.h>
#include <errno.h>

int  parse_int(const char *from, int *to)
{
    const char *ends;
    long  val;

    if (!from || !*from)
        return -1;  /* NULL or empty string */

    ends = from;
    errno = 0;
    val = strtol(from, (char **)&ends, 0);
    if (errno || ends == from)
        return -1;  /* Error in conversion */

    while (*ends == '\t' || *ends == '\n' || *ends == '\v' ||
           *ends == '\f' || *ends == '\r' || *ends == ' ')
        ends++;
    if (*ends)
        return -1;  /* Garbage at end of string */

    if ((long)((int)val) != val)
        return -1;  /* ??? */

    if (to)
        *to = val;
    return 0;
}
The question is, given that we have long val, what use does the expression ((long)((int)val) != val) have?

The answer is, because the C standard says that a cast to a numeric type limits the range and precision of the cast expression to that of the cast type, that aforementioned expression is true if and only if val cannot be represented by an int.

C casts are powerful, because they convey specific limitations or requirements for a value, without generating any particular machine code (like a function call or anything like that) for it, only change the code they generate related to the cast expression itself.

(Just consider how you would have checked against overflow there yourself.  INT_MIN and INT_MAX from <limits.h>, perhaps?  That would indeed be more readable, but would also generate additional code.)

Okay, but what does any of this have to do with the thread at hand?

If you were to explore with the various suggestions posted at Compiler Explorer (https://godbolt.org/z/PGhb5Tdh7), you'd see that I'm pushing for unambiguous code that compiles to acceptable machine code.  Casts and their effects are a perfect example of that.  (I am not going for any "perfect" or "best", because I'm perfectly willing to trade a cycle here, a dozen there, for readable, unambiguous, easily maintained –– if verbose –– code.)

And okay, I might be proselytizing a bit about how we should not be afraid of being verbose when it is useful; that being succinct by relying on not very well known implicit behaviour (like the exact integer promotion rules) is not optimal.  I'd go as far as saying it is hacky.  Feel free to disagree, however; I've hopefully made the reasons for these statements clear.  (Other than making a list of links to example bugs created when people rely on assumptions and incorrect assumptions, but that would be too depressing for me  :'(.)
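For reference, the <limits.h> alternative mentioned above could look something like this sketch. The helper name narrow_to_int is made up; it is only meant to show the explicit range comparison next to the cast-based test.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: store val in *out if it is representable as an int.
   Returns 0 on success, -1 if val is out of range. */
int narrow_to_int(long val, int *out)
{
    assert(out != NULL);
    /* Explicit range check instead of (long)((int)val) != val. */
    if (val < INT_MIN || val > INT_MAX)
        return -1;
    *out = (int)val;
    return 0;
}
```

On platforms where long and int have the same range, the check can never fail and a compiler is free to drop it.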
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: newbrain on October 26, 2022, 02:29:42 pm
May I clarify that

Code: [Select]
uint32_t  fred =  (uint32_t)(buf[24])
                   | ((uint32_t)(buf[25]) << 8)
                   | ((uint32_t)(buf[26]) << 16)
                   | ((uint32_t)(buf[27]) << 24);

is equivalent to the above without the uint32_t casts if buf is type uint8_t ?
The above code, without the cast and with an uint8_t buf, is another case where Undefined Behaviour rears its ugly head.

You yourself analysed correctly what happens in a similar case:
fred = buf[25] << 8

then

- buf[25] is extracted as a uint8_t byte
- promoted to an int32 (on arm32/gcc) - the "integer promotion" thingy
- shifted left 8, so the original byte is now in bits 8-15 of the int32
- cast into the destination type (uint32)
- loaded into the destination

In the code without the cast, this is fine for values 0, 8, and 16 of the shift amount (and an unsigned type for buf[]), but falls apart when we shift by 24 a value in [0..256).
The operands of the << operator are now of type int (as you say!), and according to "6.5.7 Bitwise shift operators", §3 and 4:
Quote from: C11, emphasis mine
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand.[...]
The result of E1 << E2 is E1 left-shifted E2 bit positions; [...]
If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is
the resulting value; otherwise, the behavior is undefined.

So, no the code is not equivalent and elicits UB for all values of buf[27] > 127.

Will it work? Probably, on all practical architectures, but it's still not conforming and UB.
A (too) smart compiler might notice the UB and produce optimized code that does not give the expected result.
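A sketch of the conforming version wrapped in a function, assuming a uint8_t buffer and a 32-bit int as on arm32. The name load_le32 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble a little-endian 32-bit value from four bytes. */
uint32_t load_le32(const uint8_t *src)
{
    assert(src != NULL);
    /* Each byte is converted to uint32_t before shifting, so the shift is
       done on an unsigned type and cannot overflow a signed int. */
    return  (uint32_t)src[0]
         | ((uint32_t)src[1] << 8)
         | ((uint32_t)src[2] << 16)
         | ((uint32_t)src[3] << 24);
}
```

With a top byte of 0x92 (greater than 127) the result is still well defined, which is exactly the case the uncast expression does not cover.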
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 26, 2022, 03:01:39 pm
So I got away with the x[y] << 24 because the 32 bit value (in my application it is stuff like code size; always< 1MB on a 32F417) cannot ever be bigger than 3 bytes. In fact the code would work for any 32 bit value which has bit 31 = 0.

But I went through the code in my project and a lot of cases omit that cast, yet seem to work, in code written by others, including ST, and which ought to have failed because the MS byte will sometimes be > 127. So I don't get it.

Anyway I searched for
>>24
>> 24
and edited in that (uint32_t) and will now do a ton of testing to make sure it still works.

Unbelievable!

Well, the mod has broken various things...
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: Nominal Animal on October 26, 2022, 03:31:31 pm
If one wanted to be really, really explicit, then
Code: [Select]
uint32_t  fred =  (uint32_t)((uint8_t)(buf[24]))
               | ((uint32_t)((uint8_t)(buf[25])) << 8)
               | ((uint32_t)((uint8_t)(buf[26])) << 16)
               | ((uint32_t)((uint8_t)(buf[27])) << 24);
In this case, each array element is first cast into an 8-bit unsigned integer type.  Then, each is cast to a sufficiently large unsigned integer type (best use the same type as the result of the entire expression, although fast unsigned types –– uint_fast32_t here –– would also work equally well), and then shifted to their final position.  The four are then binary-OR'd together to get the final result.

When the cast is to an unsigned integer type, the conversion is done using modulo arithmetic (C11 6.3.1.3p2), regardless of whether the original integer type is signed or unsigned.  Thus, a cast to uint8_t type is effectively equivalent to a binary AND with 255.  Similarly, a cast to uint16_t is effectively equivalent to a binary AND with 65535, and a cast to uint32_t to a binary AND with 4294967295.  (The same applies to all unsigned integer types: unsigned char and & UCHAR_MAX, unsigned short and & USHRT_MAX, unsigned int and & UINT_MAX, unsigned long and & ULONG_MAX, and if supported, unsigned long long and & ULLONG_MAX.)
The key to remember is that it is modular and not saturating conversion.

Why "effectively equivalent"?  The practical result in terms of the C abstract machine are the same, but the cast is often easier for the compiler to optimize than the binary AND.
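To make the modular-versus-saturating distinction concrete, here is a small sketch. All three function names are hypothetical, and the mask version assumes a twos complement int for negative inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Modular conversion: the cast reduces the value modulo 256. */
uint8_t low8_by_cast(int value)
{
    return (uint8_t)value;
}

/* The same result via a binary AND (assumes twos complement for value < 0). */
uint8_t low8_by_mask(int value)
{
    int masked = value & 0xFF;

    assert(masked >= 0 && masked <= 255);
    return (uint8_t)masked;
}

/* For contrast: a saturating conversion, which a cast does NOT perform. */
uint8_t to_u8_saturating(int value)
{
    if (value < 0)
        return 0;
    if (value > UINT8_MAX)
        return UINT8_MAX;
    return (uint8_t)value;
}
```

For example, 300 converts to 44 with the cast or the mask, but to 255 with the saturating helper; -1 converts to 255 and 0 respectively.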

So I got away with the x[y] << 24 because the 32 bit value (in my application it is stuff like code size; always< 1MB on a 32F417) cannot ever be bigger than 3 bytes. In fact the code would work for any 32 bit value which has bit 31 = 0.
You "got away" with it, because the elements are of unsigned integer type, and your compiler uses modular arithmetic for left shift on the signed int type.

First, each element gets promoted to int (but never become negative; they'd be promoted to unsigned int if the value could not be represented by int – the C integer promotion rules are quite explicit).  Then, each element is shifted left.  If the seventh bit in the highest byte is set, we invoke Undefined Behaviour.

To repeat what Newbrain already wrote above, but in a different form just so that readers here understand (because it is kinda important):

Consider the case where int E1 = 128, and we do result = E1 << 24.  Technically, since INT_MAX = 2147483647 on your architecture, E1 << 24 is Undefined Behaviour, because:
Quote from: C11 6.5.7p4
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
We fall in the otherwise, the behaviour is undefined case, because E1 is signed (but nonnegative value), but E1×2^24 = 2^31 = INT_MAX+1 > INT_MAX.

However, there is so much code out there relying on the same behaviour that you are, that all current C compilers behave as if E1 was cast to the corresponding unsigned type first, then the shift applied, and finally the result cast to the original signed type.  So, I wouldn't be too worried about a compiler generating wrong machine code (compared to the obvious programmer intent) here, although a strict reading of the C11 standard says it could do anything it wants, including produce nasal daemons.  If they generate wrong machine code for this, a lot of other existing code will miscompile too, and not work; compiler users are quite unhappy about such changes, and tend to switch to using a different compiler instead.

Anyway, I don't like the subtext of "getting away with it" here, because really, it is all about what the logic of your expressions is based on.
Put bluntly, I don't think you are "getting away with" something, I just think you are relying on things without knowing you are relying on them, and want to inform you of exactly what you are relying on when using such expressions.  (It is not wrong per se to rely on them, but for long-term maintenance et cetera, you do want to document them at least.)

This is also exactly the reason why I sometimes insist on technically incorrect/incomplete, but intuitively constructive, analogs and explanations.
The worth of intuitively understanding exactly what you are basing your expectations on when writing such seemingly simple expressions in C is, in my opinion, very high.  The same applies to pointers and aliasing, and unions and type punning as well.  It is what leads to better code, in my opinion.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 26, 2022, 03:34:49 pm
Thank you - it explains why it works.

Now I will restore a dozen files from yesterday's backup :)

Quote
a lot of other existing code will miscompile too

Yeah - I found a lot of examples, and not from me either.

Quote
First, each element gets promoted to int (but never become negative; they'd be promoted to unsigned int if the value could not be represented by int – the C integer promotion rules are quite explicit).  Then, each element is shifted left.  .

That suggests to me that the (uint32_t) cast is potentially needed only on the byte being shifted left by 24.

Quote
If the seventh bit in the highest byte is set, we invoke Undefined Behaviour

How does the compiler know if bit 7 will be 1, at compile time?
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: newbrain on October 26, 2022, 05:47:04 pm
That suggests to me that the (uint32_t) cast is potentially needed only on the byte being shifted left by 24.
[...]
How does the compiler know if bit 7 will be 1, at compile time?
1. Yes, the other shifts are safe - that's why I singled that out.

2. It does not and cannot. The UB is potential depending on the runtime data.
Still, it might be able to understand that there is a potential UB, where the standard makes no requirement on the emitted code behaviour.
In this specific example, no big deal, but in others the compiler can (and will) take shortcuts so the code works best for the defined case, and anything can happen in the UB case, giving a "wrong" result (the result cannot really be "wrong", as any result is right when UB comes into play).

If one is absolutely certain that buf[27] will never, ever, exceed 127 then you are fine. But:
a. it's quite difficult to prove (in general)
b. this kind of assumption should be well commented/documented
So, in the end, better write compliant code. It's less work, manual and mental.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 26, 2022, 06:04:49 pm
So if I have say

Code: [Select]
uint32_t data=buffer1[i]|(buffer1[i+1]<<8)|(buffer1[i+2]<<16)|(buffer1[i+3]<<24);

it is more correct to have

Code: [Select]
uint32_t data=buffer1[i]|(buffer1[i+1]<<8)|(buffer1[i+2]<<16)|((uint32_t)buffer1[i+3]<<24);

There are many cases where the former way is used e.g. in handling of an IP (pre-IPV6 obviously; that is a lot more involved), whose bytes can and are just about any value. Yet, this code is widely used and it works.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: DiTBho on October 26, 2022, 06:25:33 pm
Who are you referring to?

those to whom this patch was aimed
Quote
fbdev/sis: use explicitly signed char
the same problem, over and over again, after spending hours on forums and mailing lists telling people the difference between signed and unsigned.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: Nominal Animal on October 26, 2022, 07:41:25 pm
Quote
If the seventh bit in the highest byte is set, we invoke Undefined Behaviour
How does the compiler know if bit 7 will be 1, at compile time?
Consider a hardware architecture that has an arithmetic shift left, i.e. multiplication by a power of two, that either saturates or wraps around to a positive value.  There could be a non-C reason why an architecture has such an instruction.  We already have binary and arithmetic shifts right, where binary shift shifts in zeroes, and arithmetic shift shifts in copies of the most significant bit on twos complement architectures.

In that case, a compiler would be well within the standard and use that instruction.  The end result would be that the sign bit of the int would always be zero, and assuming 32-bit architecture, so would be the most significant bit of the result.

There are many cases where the former way is used e.g. in handling of an IP (pre-IPV6 obviously; that is a lot more involved), whose bytes can and are just about any value. Yet, this code is widely used and it works.
Yes, sure.  My point is that it works when a specific set of assumptions are fulfilled, only.

Because relying on unstated and undocumented assumptions leads to bugs –– just consider the "ints are the same size as pointers" debacle we had to deal with when porting code to first LP64 architectures! ––, I'm trying to show you and others what those assumptions are.

This is part of my nefarious plan.

You see, the next step in my nefarious plan is to convince you and others to document these assumptions, centrally, in a README-like file, and with comments referencing each such item in that set wherever applicable (perhaps at the beginning of each source file).  This reduces the long-term maintenance cost, and makes combating bit-rot (as assumptions change as time passes, again recall sizeof(int) == sizeof(void *) which used to be the rule) much easier and lower cost.

At some point, at least some of us will port at least some of our embedded ILP32 code to Aarch64 (ARMv8-a) or later 64-bit architectures.  Sure, GCC and Clang do support ILP32 ABI at compile/link time, but that'll bite you if you ever have more than 4GiB address space: then, you really do have to switch to LP64.  Why assume you'll only ever work in ILP32 in C, and limit yourself?  It is extremely easy to ossify oneself and stop learning and adapting, if you start choosing the "least effort" way.

We do know from the Cobol folks that as long as you are good and experienced enough, there will be some demand.  So as a business decision, it is a valid choice to focus on ILP32 only, though.
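One way to keep such assumptions next to the code is to have the compiler check them. A minimal sketch, assuming a C11 compiler; the particular assumptions listed and the name pointer_roundtrip are only examples, not a recommendation for any specific project.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stdint.h>

/* Assumptions this (hypothetical) module relies on, checked at compile time. */
static_assert(CHAR_BIT == 8, "code assumes 8-bit bytes");
static_assert(sizeof(uintptr_t) >= sizeof(void *),
              "uintptr_t must be able to hold a pointer");

/* Round-trip a pointer through an integer without assuming that int is
   wide enough: uintptr_t works on both ILP32 and LP64. */
void *pointer_roundtrip(void *p)
{
    uintptr_t bits = (uintptr_t)p;

    return (void *)bits;
}
```

If an assumption stops holding after a port, the build fails with the message instead of the code misbehaving at run time.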
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 26, 2022, 08:51:39 pm
Quote
Consider a hardware architecture that has an arithmetic shift left, i.e. multiplication by a power of two, that either saturates or wraps around to a positive value.

I have never (significantly) programmed arm32 in assembler but normally CPUs have a shift left with a zero fed in from the right. A 1 fed in from the right would be completely pointless. Then you have rotation (usually via the carry bit) instructions but one would never use those for a left shift because one needs to clear the carry at each step. But surely in the modern context a << 24 would be done with either a barrel shifter or with byte extraction.

How about a right shift i.e. uint8 >> 24. Is there not a similar problem there?

I will for sure document this issue; I have an extensive hardware/software design document for this project.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: Nominal Animal on October 27, 2022, 07:54:00 am
Quote
Consider a hardware architecture that has an arithmetic shift left, i.e. multiplication by a power of two, that either saturates or wraps around to a positive value.

I have never (significantly) programmed arm32 in assembler but normally CPUs have a shift left with a zero fed in from the right.
That has nothing to do with this.  What I mean is this:

Signed arithmetic shift left with a positive argument:
      0b01111000'00000000 << 1 = 0b01110000'00000000
      0b01110000'00000000 << 1 = 0b01100000'00000000

Signed arithmetic shift left with a negative argument:
      0b11111000'00000000 << 1 = 0b11110000'00000000
      0b11110000'00000000 << 1 = 0b11100000'00000000

A saturating one would simply do an additional binary OR with the carry (bit shifted out) and all low-order bits.

Essentially, signed arithmetic shift left keeps the sign bit intact, the same exact way an arithmetic shift right does.
The above implements modular arithmetic on both signed and unsigned values, keeping the sign intact.  The saturating one would guarantee the magnitude of the value never decreases.

What use would such an instruction have, besides symmetry?  Several, especially considering how often wraparound to negative values yields unwanted results.  Again, this is arithmetic shift left, not binary.  A single-cycle saturating or modular signed multiplication would achieve the same thing, though, with even more uses.
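For readers who want to experiment, here is a minimal C sketch of the two shifts described above, emulated on 16-bit values. The function names asl1_keep_sign and asl1_saturate are made up for this illustration, and clamping at INT16_MAX / INT16_MIN is just one assumed reading of "saturating"; no real instruction set is being modelled.

```c
#include <stdint.h>

/* Hypothetical helper: sign-preserving ("modular") arithmetic shift left by one.
 * Bit 15 (the sign) is kept as is; only the low 15 bits are shifted. */
static uint16_t asl1_keep_sign(uint16_t bits)
{
    return (uint16_t)((bits & 0x8000u) | (((uint32_t)bits << 1) & 0x7FFFu));
}

/* Hypothetical helper: saturating variant, written on values rather than bits.
 * Doubling clamps to the int16_t range, so the magnitude never decreases. */
static int16_t asl1_saturate(int16_t x)
{
    if (x > INT16_MAX / 2)
        return INT16_MAX;
    if (x < INT16_MIN / 2)
        return INT16_MIN;
    return (int16_t)(x * 2);
}
```

With these, asl1_keep_sign(0x7800) gives 0x7000 and asl1_keep_sign(0xF800) gives 0xF000, matching the bit patterns listed above, while asl1_saturate(0x7800) clamps to 0x7FFF.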
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: nctnico on October 27, 2022, 03:50:44 pm
Reading the thread it becomes clear to me that 'undefined behaviour' in the C standard should read as 'compiler-dependent behaviour', which in turn has to be driven by the programmer. Any of the 3 methods (binary shift, arithmetic shift and saturating arithmetic shift) can be correct. It is up to the compiler to implement default behaviour and up to the programmer to decide what is preferred if there is a choice.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: SiliconWizard on October 27, 2022, 09:34:45 pm
Nah. The std makes a difference between implementation-defined, and undefined behavior. For good reasons.
While it's acceptable (as long as full portability is not a concern) to rely on the former, relying on any specific implementation for UB is just wrong. Don't do it. One good reason is that, if anything is tagged as UB in the std, compilers are free to implement those UBs any way they see fit - including not even caring about what happens - and CHANGE actual behavior at will without ANY notice. UBs are, by definition, non-binding stuff.

Now if you keep insisting and come up with bright arguments such as "but xxx compiler has always treated this one UB in this particular way", this goes in the same basket as the "It compiles, so it works™" category. ;D
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: magic on October 28, 2022, 05:47:33 am
UB never happens.

If you write
Code: [Select]
if (x)
  do some UB;
else
  do other stuff;
the compiler may not only replace this block with simply
Code: [Select]
do other stuff;
but compile other code after and before this block under assumption that x is surely false.

Try to deal with this :box:
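A concrete C sketch of the same effect, using division by zero as the UB. The function names are hypothetical, and whether a given compiler actually drops the check depends on its version and optimization level, so treat this as an illustration of what is permitted, not of what any particular compiler does.

```c
/* Hypothetical example: the division comes first, so d == 0 is already UB
 * by the time the check runs. A compiler may assume d != 0 and remove it. */
static int div_check_too_late(int n, int d)
{
    int q = n / d;
    if (d == 0)
        return 0;   /* may be optimized away */
    return q;
}

/* Checking before the operation keeps the d == 0 case well defined. */
static int div_check_first(int n, int d)
{
    if (d == 0)
        return 0;
    return n / d;
}
```

Only div_check_first can safely be called with d == 0; calling div_check_too_late that way is exactly the "UB never happens" assumption at work.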
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: newbrain on October 28, 2022, 10:31:18 am
Quote
Nah. The std makes a difference between implementation-defined, and undefined behavior. For good reasons.
While it's acceptable (as long as full portability is not a concern) to rely on the former, relying on any specific implementation for UB is just wrong. Don't do it. One good reason is that, if anything is tagged as UB in the std, compilers are free to implement those UBs any way they see fit - including not even caring about what happens - and CHANGE actual behavior at will without ANY notice. UBs are, by definition, non-binding stuff.

Now if you keep insisting and come up with bright arguments such as "but xxx compiler has always treated this one UB in this particular way", this goes in the same basket as the "It compiles, so it works™" category. ;D
Good points.
Not caring: clang does that, at least here:
See this example (https://godbolt.org/z/3vrKsfcrY) vs the same with gcc (https://godbolt.org/z/czPMnPfKv)

I would only add that the standard also describes unspecified behaviour.
The major difference from implementation-defined behaviour is that documentation is mandatory for implementation-defined behaviour, but not for unspecified behaviour.
In both cases we are considering the behaviour of conforming code (though not strictly conforming code, if its output depends on unspecified or implementation-defined behaviour).

Definitions from C11:
Quote
3.4
behavior
external appearance or action
3.4.1
implementation-defined behavior
unspecified behavior where each implementation documents how the choice is made
EXAMPLE An example of implementation-defined behavior is the propagation of the high-order bit when a signed integer is shifted right.
3.4.2
locale-specific behavior
behavior that depends on local conventions of nationality, culture, and language that each implementation documents
EXAMPLE An example of locale-specific behavior is whether the islower function returns true for characters other than the 26 lowercase Latin letters.
3.4.3
undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
EXAMPLE An example of undefined behavior is the behavior on integer overflow.
3.4.4
unspecified behavior
use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance
EXAMPLE An example of unspecified behavior is the order in which the arguments to a function are evaluated.
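The three EXAMPLE lines in the definitions above map onto C roughly as follows. This is only a sketch; all helper names are invented for the illustration.

```c
#include <limits.h>
#include <stdbool.h>

static int calls;
static int next_call(void) { return ++calls; }
static int subtract(int a, int b) { return a - b; }

/* Implementation-defined: right shift of a negative value.
 * The implementation must document the result (commonly an arithmetic shift). */
static int shift_right_one(int v)
{
    return v >> 1;
}

/* Unspecified: the order in which the two arguments are evaluated.
 * The result is -1 or 1 depending on which next_call() runs first. */
static int argument_order(void)
{
    calls = 0;
    return subtract(next_call(), next_call());
}

/* Undefined: signed integer overflow. Test before adding, not after. */
static bool add_would_overflow(int a, int b)
{
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}
```

A program may rely on shift_right_one(-8) only after checking the compiler's documentation, must work with either result of argument_order(), and should use something like add_would_overflow() so that the overflowing addition never executes.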
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: Nominal Animal on October 28, 2022, 10:59:57 am
Quote
Now if you keep insisting and come up with bright arguments such as "but xxx compiler has always treated this one UB in this particular way", this goes in the same basket as the "It compiles, so it works™" category. ;D
Do note that my arguments are not that.

I only point out that if basically all C compilers have treated something implementation-defined or undefined behaviour the same way, so that there is a lot of existing C code out there relying on that behaviour, then it is safe to pile on and rely on it, regardless of how someone interprets the standard, because real world always beats theory and the text of the standard.  Sure, a new version of a compiler may decide to do something stupid and break a lot of existing code, but its users won't be happy about the breakage, and will shift to something else.  You can find some of these in POSIX and BSD, especially POSIX threads.

That does not mean that one can just try a small program and see if it works like one thinks it should, and if it does, go ahead and rely on it; NOT AT ALL.

What it means, actually, is that the C standard is a human contract between compiler developers and compiler users, and as such, is not perfect.  Some argue that whenever there is a conflict between the C standard and existing code, the C standard should always be considered the authority.  I disagree, because the standard text is not always unambiguous, and even when it is, it is written by humans and thus not necessarily correct.  In such cases, I argue that the huge mass of existing code and the expectations they embody, is the authority, because only that way is a compiler useful.

It is for this reason I do recommend reading the C standard, but also the sources of widely used C projects, and perhaps the POSIX.1 standard (IEEE Std. 1003.1 (https://pubs.opengroup.org/onlinepubs/9699919799/), freely available on the net), specifically the system interfaces (https://pubs.opengroup.org/onlinepubs/9699919799/idx/functions.html) section.  Conflicts do exist between these.  If we start arguing about them in the context of the C standard, we get bogged down in "language lawyerism", which yields no results: convincing others that your specific interpretation of the standard text is correct is unlikely to make them spend a lot of effort changing their code to conform to it.
Instead, I say that we accept an imperfect world, and look at the consensus among the kind of C projects we see ourselves in, and conform to that instead.  That way, if we fall, we all fall; and in that case, we just switch to a "better" C compiler (in reality, one that agrees that our consensus is the useful one).

Better yet, knowing the risks lets us talk to the compiler developers, so that if our favourite compiler(s) are veering towards an unfortunate choice (from our perspective), we can inform them, and ask them to gauge the importance of their specific interpretation of the text of the standards, the benefits of the new interpretation, and the benefits of not breaking a lot of existing code.  It is a human decision, not a technical one.  Even the standards themselves are developed at this level!

I did not invent any of this, of course; I am just describing the kind of attitude that lets a long-term C developer at all complexity levels (from freestanding environments on microcontrollers to GUI application development using Gtk) solve the conflicts in a rational manner without falling into despair.

When there is no conflict, do trust the C standard; I definitely do.
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: peter-h on October 28, 2022, 03:08:32 pm
The problem is that so many things can bite you when developing a product.

Unlike probably 99% of people here, I have never worked for anybody since leaving univ in 1978 (Z80 days) so never been "paid to do something". What happens instead is that I get to deal with any s**t a customer discovers in one of my products :) So... I am bloody careful how I do stuff, because the buck doesn't stop anywhere else. This is dramatically different to being a "paid dev" who gets paid for doing the job and then gets paid a second time for fixing his mistakes :) And unless he is absolutely spectacularly crap, or the company went bust as a result of some cockup, he will not get fired.

So I would rank staying with the same compiler, on a given project family, forever, at around the same level (of risk avoidance) as going to huge trouble to address some undefined behaviour areas. I think I got bitten only once by not knowing C, and it was integer promotion. I posted about it; it was the if x == ~y thing where y was a uint8 which got silently promoted to int, so the bit inversion gave 0xffffff-something. Once I learnt that (the code never worked) I knew it. I still think it is a stupid feature of C.
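For anyone who has not run into it, the promotion trap looks roughly like this. The function names are hypothetical; the only assumption is the standard guarantee that int is wider than 8 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* y is promoted to int before ~ is applied, so ~y is negative
 * (0xFFFFFFxx with a 32-bit int) and can never equal x, which is 0..255. */
static bool is_inverse_naive(uint8_t x, uint8_t y)
{
    return x == ~y;
}

/* Casting the result back to uint8_t discards the promoted high bits. */
static bool is_inverse(uint8_t x, uint8_t y)
{
    return x == (uint8_t)~y;
}
```

is_inverse_naive(0xF0, 0x0F) is false even though the bytes are bitwise inverses, while is_inverse(0xF0, 0x0F) is true. Many compilers warn about the naive form if sign-comparison warnings are enabled.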

However against the above I've had to deal with all kinds of timing dependencies. A 168MHz CPU can easily break min timings on various chips, so code you used, or found on github (some ex 16MHz AVR code, which had only a 50% chance of ever having worked anyway), doesn't work. Or works sometimes. And then nasty stuff like loop replacement by stdlib functions which aren't in your project, if -O2 or -O3 are used (but at least you get a proper crash then).
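As an illustration of the loop replacement mentioned above: a plain fill loop such as the hypothetical zero_fill below is the kind of code GCC may turn into a call to memset when optimizing. As far as I know the relevant GCC option is -ftree-loop-distribute-patterns, and building with -fno-tree-loop-distribute-patterns turns the transformation off; check the documentation for your compiler version before relying on that.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte-wise clear of a buffer.
 * An optimizing compiler may recognize this loop and emit a memset() call,
 * which is a problem if the project does not provide a memset. */
static void zero_fill(uint8_t *dst, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = 0;
}
```

The behaviour of the function is the same either way; only the generated code (loop versus library call) differs.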
Title: Re: Why is loading a multi byte scalar variable from some address so complicated?
Post by: JPortici on October 28, 2022, 03:49:35 pm
Quote
Unlike probably 99% of people here, I have never worked for anybody since leaving univ in 1978 (Z80 days) so never been "paid to do something".
well, i'm paid to do something

Quote
What happens instead is that I get to deal with any s**t a customer discovers in one of my products
and that is one of the things i'm paid for. Trust me that i do not add bugs intentionally, because fixing old stuff is a huge waste of time when my job is developing new stuff, but when there is a problem i HAVE to find a solution, or else. I don't see much of a difference from your situation, other than earning much less money while having many more benefits as i'm an employee. The product either works or it doesn't; there is no "mostly works", that's an excuse we tell the boss/customer to get them off our backs for a while.

Regarding C, as you keep finding out there is too much freedom in the language which means that you have to tell it exactly what to do. I hate implicit casts and conversions, they bite you whenever you're not looking, so i cast variables when necessary, or choose the appropriate type. Also reminds me what i wanted to do in the first place when i look at that part of the code again next year.

Write clean C, be as precise as you can in your intentions, let the compiler figure out how to implement stuff, and don't try to be too clever unless you are absolutely certain you need to. Don't be surprised if your cleverly written C using hoops and tricks performs only as well as a simple and clear sequence of statements; optimizers are really good these days.
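Applied to the expression that started this thread, "say exactly what you mean" could look like the sketch below. load_le32 is a hypothetical helper name; the casts make each byte a uint32_t before shifting, so nothing is shifted into the sign bit of a promoted int.

```c
#include <stdint.h>

/* Hypothetical helper: read a little-endian 32-bit value from four bytes.
 * Works for any alignment and on any host byte order. */
static uint32_t load_le32(const uint8_t *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Usage would be along the lines of fred = load_le32(&buf[24]); and, as noted earlier in the thread, GCC and Clang commonly reduce this pattern to a single load where the target allows it.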