EEVblog Electronics Community Forum

Topic started by: etnel on July 15, 2021, 06:50:03 pm

Title: 2 byte value into 2 separate 1 byte values
Post by: etnel on July 15, 2021, 06:50:03 pm
Hello, I'm currently trying to store 2 values of 2 bytes each in the EEPROM of my PIC16F1827. The EEPROM stores single bytes. How would I divide my value into 2 bytes to write them? And how do I then read them back into one single value again?
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: Kleinstein on July 15, 2021, 07:39:43 pm
In C there is the union type. This way the same memory space can be accessed in different ways, e.g. as an integer or as 2 bytes.
Which byte is the high byte and which is the low byte depends on the CPU used, but for storage this should not be an issue.

An alternative would be to use a pointer and do a cast on the pointer type.

A portable way is to use bit masks and shifts for the conversion.
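
A minimal sketch of the union option mentioned above, assuming a C99 compiler with stdint.h. The type name `word_bytes_t` and both function names are made up for illustration; they do not come from any PIC library.

```c
#include <stdint.h>

/* Hypothetical overlay type: the same two bytes viewed either as one
   16-bit value or as two separate bytes. */
typedef union {
    uint16_t word;
    uint8_t  bytes[2];
} word_bytes_t;

/* Returns the byte stored at the lowest address. Whether that is the
   low or the high byte of `value` depends on the CPU's byte order. */
static uint8_t first_stored_byte(uint16_t value)
{
    word_bytes_t u;
    u.word = value;
    return u.bytes[0];
}

/* Rebuilds the 16-bit value from the two bytes in storage order. */
static uint16_t rebuild_from_bytes(uint8_t b0, uint8_t b1)
{
    word_bytes_t u;
    u.bytes[0] = b0;
    u.bytes[1] = b1;
    return u.word;
}
```

As long as the bytes are written and read back by the same program on the same target, the byte order cancels out and the round trip returns the original value.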
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: golden_labels on July 16, 2021, 01:42:03 am
(Language not specified; assuming C from the context)

The second option offered by Kleinstein is fine. Casting to a char* is portable and well-defined in this particular scenario: the data is not expected to be exchanged with other code, so endianness, ranges, alignment and trap representations are not an issue. The exception to this is using a different compiler or options and wishing to maintain the same EEPROM data. If that’s the case, you may consider the third option. But I would not waste time on doing so until such a necessity arises. Not laziness; quite the opposite: code quality. This is the simplest method, with no potential for introducing subtle bugs, and the easiest for compilers to implement efficiently.

Unions in this case give no advantages: that would only cause more noise in the code. The behaviour is identical.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: TheCalligrapher on July 16, 2021, 03:06:50 am
Hello, I'm currently trying to store 2 values of 2 bytes each in the EEPROM of my PIC16F1827. The EEPROM stores single bytes. How would I divide my value into 2 bytes to write them? And how do I then read them back into one single value again?

Very simple. If `V` is your two-byte value, then the lower [8-bit] byte is obtained as `V % 256` and the upper byte is obtained as `V / 256`. If you prefer, you can replace division by shifting and masking. Makes no difference. One might also add that it is preferable to perform operations like that in the domain of an unsigned integer type.

This is always at least as efficient as any alternative approach based on type punning through unions, casting to `unsigned char *` or somesuch.  Never use type punning unless you absolutely have to.
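
A minimal sketch of the arithmetic approach described above, assuming stdint.h is available. The function names are illustrative only.

```c
#include <stdint.h>

/* Split a 16-bit value with plain unsigned arithmetic. */
static uint8_t low_byte(uint16_t v)
{
    return (uint8_t)(v % 256u);
}

static uint8_t high_byte(uint16_t v)
{
    return (uint8_t)(v / 256u);
}

/* Inverse operation: rebuild the value from its two bytes.
   The 256u constant keeps the arithmetic unsigned. */
static uint16_t join_bytes(uint8_t high, uint8_t low)
{
    return (uint16_t)(high * 256u + low);
}
```

For example, 0xABCD splits into a high byte of 0xAB and a low byte of 0xCD, and `join_bytes(0xAB, 0xCD)` gives 0xABCD again, whatever the byte order of the CPU.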
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: AntiProtonBoy on July 16, 2021, 05:33:50 am
memcpy is your friend here.
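
What that could look like, as a rough sketch assuming stdint.h and string.h are available on the target. The names `to_bytes` and `from_bytes` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Copy the in-memory representation of `value` into out[0] and out[1]. */
static void to_bytes(uint16_t value, uint8_t out[2])
{
    memcpy(out, &value, sizeof value);
}

/* Copy two stored bytes back into a 16-bit value. */
static uint16_t from_bytes(const uint8_t in[2])
{
    uint16_t value;
    memcpy(&value, in, sizeof value);
    return value;
}
```

The order of the two bytes in the buffer follows the byte order of the CPU, so the stored data is only meaningful to code built for the same target.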
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: BBBbbb on July 16, 2021, 07:28:17 am
just cast as suggested in first 2 comments
Avoid complicated solutions and ones depending on additional libraries.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: janoc on July 16, 2021, 07:47:58 am
I don't get some of those answers. Why to mess with pointers and structures? Uh and memcpy? Seriously?

Assuming this is C and your data are really 16 bit, this is all you need to do:

Code:
uint16_t original_data;

uint8_t b1,b2;

b1 = (uint8_t) (original_data & 0xff);
b2 = (uint8_t) ((original_data >> 8) & 0xff);

Using a struct/union is actually potentially buggy, because of alignment issues - the compiler can and will insert invisible "padding", depending on the constraints of the target platform. So that is non-portable and may break on some platforms and work on others. It will probably be OK in the OP's case but then they may try to serialize the data and will wonder why on some platform the struct has a size of 4 bytes and on another just 3 ... (no idea about that OP's PIC - PIC compilers can do weird stuff).

Using modulo/division actually does not behave the same as shifting/masking - it implicitly depends on the byte order of the platform/data! With shifting and masking you are making that choice explicit. Again, may work in this case but it is a poor habit to get into, doesn't scale to wider data types and it is a good source of endian-related bugs.

Also, don't use types like "char" or "int" when you need exact sizes. Those are poorly defined, the standard only says that char is at least 8 bits and that int needs to be wider than char but not how many bits they really have.
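
To make the padding point concrete, here is a small sketch with a made-up `struct record`. How much padding appears is up to the compiler and target, so the code computes it instead of assuming a number.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one byte followed by a 16-bit field. */
struct record {
    uint8_t  flag;
    uint16_t value;
};

/* Number of padding bytes the compiler added to the struct
   (0 on a target that packs it into 3 bytes, 1 if sizeof is 4). */
static size_t record_padding(void)
{
    return sizeof(struct record) - (sizeof(uint8_t) + sizeof(uint16_t));
}
```

Serializing the struct as a raw block of `sizeof(struct record)` bytes would include that padding, which is why the stored size may differ between platforms.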
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: BBBbbb on July 16, 2021, 07:59:37 am
I don't get some of those answers. Why to mess with pointers and structures? Uh and memcpy? Seriously?
I (very strongly) agree about the memcpy comment, but a pointer to an array has its advantages.
Usually there are functions available for multiple-byte writes, where you just pass the address of the first byte.
Just doing something like:
e2pW( (char*)&dataIn16bits , address, #ofBytes2write)
is simple, quick and easy.

Of course, checking the byte order is left to the user.

Not saying your proposition is bad, it's clean and simple to understand, real textbook (even with the precautionary masking), but the cast one is quick and efficient, very useful when working on targets with little resources.
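
A sketch of that pattern, with the EEPROM replaced by a RAM array so it can run anywhere. `e2p_write`, `e2p_read` and `fake_eeprom` are stand-ins invented for this example, not functions from a real vendor library.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the EEPROM: a small RAM array (hypothetical). */
static uint8_t fake_eeprom[16];

/* Hypothetical multi-byte write routine; no bounds checking in this sketch. */
static void e2p_write(const unsigned char *src, size_t address, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        fake_eeprom[address + i] = src[i];
    }
}

/* Hypothetical multi-byte read routine, mirror image of e2p_write. */
static void e2p_read(unsigned char *dst, size_t address, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        dst[i] = fake_eeprom[address + i];
    }
}

/* Store a 16-bit value by passing the address of its first byte. */
static void store_u16(uint16_t value, size_t address)
{
    e2p_write((const unsigned char *)&value, address, sizeof value);
}

/* Load it back the same way. */
static uint16_t load_u16(size_t address)
{
    uint16_t value;
    e2p_read((unsigned char *)&value, address, sizeof value);
    return value;
}
```

The bytes land in the EEPROM in the CPU's native order, which is fine as long as the same firmware reads them back.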

Title: Re: 2 byte value into 2 separate 1 byte values
Post by: TheCalligrapher on July 16, 2021, 08:52:45 am
is simple, quick and easy.

... and incorrect. Type-punning solutions are vastly inferior to arithmetic solutions in more ways than one. But if you are dead set on using memory reinterpretation, at least remember not to use `char`. It is either `unsigned char` or `uint8_t`.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: BBBbbb on July 16, 2021, 01:45:09 pm
is simple, quick and easy.

... and incorrect. Type-punning solutions are vastly inferior to arithmetic solutions in more ways than one. But if you are dead set on using memory reinterpretation, at least remember not to use `char`. It is either `unsigned char` or `uint8_t`.
Could you elaborate on the vast inferiority (apart from "safer" and more readable) and focus on the exact use case?
I'm really interested if you have another angle, mostly efficiency related.

Also, what would be the difference in the above example between signed and unsigned char (save to EEPROM)?
Anyway, you should cast to whatever the receiving parameter is in the provided e2p write function.
regarding types we can play the target/compiler/C standard game all day, and I could say he might not have uint8_t available... there is not much data shared by the OP to discuss it, yes uintX_t is absolutely superior, but have your own typedef header file checked and adjusted for each project is probably the only universal way for portability
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: janoc on July 16, 2021, 01:55:00 pm
...yes uintX_t is absolutely superior, but have your own typedef header file checked and adjusted for each project is probably the only universal way for portability

Uuuuh, no! That's actually a pretty good way to shoot yourself in the foot when an algorithm assumes one size and you have helpfully typedeffed it to something else in some obscure header.

stdint.h is available pretty much everywhere with a reasonably recent C compiler (reasonably recent - released in the last 15 years or so). It is part of the now old C99 standard, not even the more recent C11 one.

Title: Re: 2 byte value into 2 separate 1 byte values
Post by: BBBbbb on July 16, 2021, 03:03:57 pm
...yes uintX_t is absolutely superior, but have your own typedef header file checked and adjusted for each project is probably the only universal way for portability

Uuuuh, no! That's actually a pretty good way to shoot yourself in the foot when an algorithm assumes one size and you have helpfully typedeffed it to something else in some obscure header.

stdint.h is available pretty much everywhere with a reasonably recent C compiler (reasonably recent - released in the last 15 years or so). It is part of the now old C99 standard, not even the more recent C11 one.
I can tell you from personal experience that that is not an option in many automotive projects.
And in many more applications std is unwelcomed.

Yes, typedeffing requires caution, and that should be on your checklist when porting to a new target, but it gives you a universal solution in a single file, even for legacy projects (yes, there are projects still obeying C89/90).

The OP should be using uintX_t in his personal projects, but since we're all here nitpicking on nuances of C ...

I stand by my original statement - for his use case casting is a simple and effective solution. Based on the "difficulty" of his question it does not look like a serious commercial project with a future of porting. More like a learning project. And understanding what's going on behind this casting would teach him some tricks about C and memory (as would this argument here).
His biggest problem might be running out of resources (which I noticed in some hobby projects to be an issue) due to many unoptimized code reuse from online examples, or unnecessary importing of libraries and here casting helps a little bit.

Title: Re: 2 byte value into 2 separate 1 byte values
Post by: TheCalligrapher on July 16, 2021, 04:35:56 pm
could you elaborate on the vast inferiority (apart from "safer" and more readable) and focus the exact use case.

I'm just referring to the general principle: involving memory in an operation that doesn't have to involve memory is always a pessimization. In more abstract terms, it seems that many participants here somehow assumed that the original operation is applied to an lvalue, while in reality there's absolutely nothing lvalue-specific either in the general problem or in the original question.

(I won't even mention byte-ordering dependency. Within the context of the original question it is not clear whether it is a problem at all.)

One can try to defend type-punning solutions by claiming that their compiler always generates the most optimal code for such solutions, but in reality this is just the compiler being smart enough to see through the intent and undo the pessimization.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: janoc on July 16, 2021, 06:34:30 pm
I can tell you from personal experience that that is not an option in many automotive projects.

You mean that automotive projects are still being built (as opposed to old machines maintained) with C compilers from before the year 2000? I think in that case there are bigger problems to worry about than the lack of stdint.h ...

And in many more applications std is unwelcomed.

Not sure what you mean by "std". stdint.h is literally a header full of typedefs for your compiler & platform, nothing else. It is part of the C language spec, not even the standard library, unlike e.g. malloc or the string handling functions.


His biggest problem might be running out of resources (which I noticed in some hobby projects to be an issue) due to many unoptimized code reuse from online examples, or unnecessary importing of libraries and here casting helps a little bit.

If you are squeezing the code so much that saving a few bytes of flash (because that's all you could save by pointer casting instead of shifting/masking, if that) makes the difference, then really there are bigger issues there.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: BBBbbb on July 16, 2021, 08:11:00 pm
...

Seems you have not worked on products that require longer support... nor safety critical ones...

Ok imagine you drive a car made in early 2010s. What do you think when the development for different ECUs started for that car? 5 years before could easily be the case, but it could be a reuse of the ECU from the previous generation (even older) with just added feature, where the manufacturer is looking to keep the development on a tight budget and does not want to migrate to newer tools, drivers, standards... (this costs a lot on safety critical projects)

Now imagine you need to comply with an industry standard like MISRA 2004 (remember it's 2005, so this is the latest and greatest one...), but oh MISRA 2004 does not support C99? how come? Well no commercial C99 embedded compiler is available at the time of its publication.

OK now fast forward to 2021 and your car from early 2010s. You'd be pretty pissed if you can't buy a spare part, now wouldn't you?
So the manufacturer is still making ECUs for your car, but somewhere along the line sourcing of some components became difficult, changes were made to the design, so some modifications of code had to be made to reflect those changes (e.g. a new OCX with longer settling time, that now needs to be accounted for in code)
So there is a programmer somewhere working on your new spare ECU, that is making some modifications on a C89 compliant code...

Medical might be even worse if it is an FDA approved product. Any single change is an enormous PITA in money, time and effort... That's actually why US is 2-3 generations behind Europe regarding some specialized medical devices that don't have consumables (the cash cow of the medical device industry). Manufacturers don't see the ROI on some of those devices.

Oh and byte counting, yup that's common too in low-level embedded that ships in high volume. I was on one project counting free bytes manually in the Intel hex file, because the memory map tool was not doing a good job for the specific uC. (And no, a better one was not available, as is often the case with uCs not available to the public.)
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: Siwastaja on July 17, 2021, 07:45:31 am
In a nutshell: "Always implement your own stdint.h because in some automotive environment unchanged from 1990's, stdint.h may not be available".

How terrible advice.

Definitely always use the standardized typedefs in stdint.h. If not available in some esoteric environment, indeed write your own. I strongly suggest then following the standard naming of stdint.h.
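
A rough sketch of such a fallback for a pre-C99 toolchain, assuming a target where unsigned char is 8 bits and unsigned short is 16 bits. The `check_...` typedef names are invented for this example, and the typedefs would need adjusting per target.

```c
#include <limits.h>

/* Only define the types when <stdint.h> has not already provided them. */
#ifndef UINT8_MAX
#if UCHAR_MAX != 0xFF || USHRT_MAX != 0xFFFF
#error "adjust these typedefs for this target"
#endif
typedef unsigned char  uint8_t;   /* same names as in stdint.h */
typedef unsigned short uint16_t;
#endif

/* C89-compatible compile-time checks: the array size becomes -1,
   and compilation fails, if a type does not have the expected width. */
typedef char check_uint8_size[(sizeof(uint8_t) == 1) ? 1 : -1];
typedef char check_uint16_size[(sizeof(uint16_t) * CHAR_BIT == 16) ? 1 : -1];
```

Keeping the standard names means the fallback header can simply be dropped once the project moves to a compiler that ships stdint.h.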

This is an interesting thread to read; such basic beginner questions make people pop up left and right showing off some very interesting bubbles they live in.

janoc's piece of code is the most sane and most easily understood; also most widely used.

Aliasing packed structs in unions has its uses, though, and should be considered when the manual maintenance of manual shift operations becomes too big and error-prone a task, but initially it needs some care and understanding.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: AndersJ on July 17, 2021, 08:27:07 am
Code:
uint16_t original_data;

uint8_t b1,b2;

b1 = (uint8_t) (original_data & 0xff);
b2 = (uint8_t) ((original_data >> 8) & 0xff);

Is the mask with FF necessary?
Isn’t the typecast to uint8_t all that is needed?
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: BBBbbb on July 17, 2021, 03:00:09 pm
In a nutshell: "Always implement your own stdint.h because in some automotive environment unchanged from 1990's, stdint.h may not be available".

How terrible advice.

And who advised that?
Please go back and read the thread with a bit more concentration.
It was a following sentence from me:
Quote
regarding types we can play the target/compiler/C standard game all day, and I could say he might not have uint8_t available... there is not much data shared by the OP to discuss it, yes uintX_t is absolutely superior, but have your own typedef header file checked and adjusted for each project is probably the only universal way for portability

I explicitly said that if you want to play the nitpicking game - saying always use stdint has its issues as well in some cases, and I gave an example. And all this originated from the bashing of type casting in a very specific example, where it just gets the job done efficiently. I really don't like bashing features without looking into the exact context. I asked for a bit more detail on the downsides in the exact example, not sure I got it (maybe I missed it), just standard "general" arguments against...
The packed directive gets the same treatment, but it has its applications, as you pointed out. Yes, it's bad in more than 99% of cases, but there is an application.

Another thing that I find a bit surprising is how some perceive standards and their propagation to projects.
Look at C++20, not sure any compiler apart from VS supports it fully. How long do you think C99 needed to become an industry standard? (not talking only about automotive)
You might have had wide support for x86 in 1 or 3 years, but what about compilers for application specific MCUs? Do you think those are a small part of the embedded industry? And then you have industry specific standards that come...
Also not all projects have a development cycle of up to a year, some are developed for 3-5 years, and then put into production for 10+years...
So saying oooh you're working in the 90s is very shortsighted.

If I had to guess, most used standard today in the embedded industry world is C99 (not counting things like mobile phones and other devices that have SoCs more powerful than my previous PC and most dev being done in Java or some other higher level language).

...such basic beginner questions make people pop up left and right showing off some very interesting bubbles they live in
That's the beauty of this forum
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: SiliconWizard on July 17, 2021, 05:40:11 pm
In a nutshell: "Always implement your own stdint.h because in some automotive environment unchanged from 1990's, stdint.h may not be available".

How terrible advice.

Indeed. It's mind-bogglingly wrong. :horse:

If you work with compilers that do not support C99 at least, well obviously you don't have access to stdint (or it may be that they do but your team's rules are to write in pure C90 for instance). In that case, your team probably has come up with ad-hoc solutions for years already, or sometimes decades. Of course you'll use them then - and you'll have no choice either. Unless you're the boss, in heavily regulated industries, you usually just shut up and do as you're told.

Now if you have access to compilers supporting C99 and can (or are mandated to) use C99 or later, not using stdint would be completely dumb.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: SiliconWizard on July 17, 2021, 05:54:50 pm
As to the original question, if we have to assume that the OP meant to use C (which they did not specify), there are of course ways that are more or less portable.
The most portable way would just be to use MSB = N / 256 and LSB = N % 256 (and conversely use arithmetic for the inverse operation). That's guaranteed to work on any platform, whatever the endianness is and whatever the compiler is. Playing with memory access "usually" works, but is not portable. Using a union is even worse: nothing guarantees that "overlaid" members in unions actually point to the same memory area exactly (in practice they may be aligned differently). The standard explicitly says that IIRC. It may work on a specific target with a specific compiler and specific options (yes, that's a lot of "specifics"), but is 100% non-portable. Avoid this.

Also note that memory access or union tricks may remotely look more efficient to you (so you may want to trade off portability), but it's not even. Any decent compiler will usually optimize any of the 3 possibilities above yielding pretty much the exact same object code. Likewise, using shifts and bitwise 'and' won't make a difference. Any decent compiler will translate "MSB = N / 256; LSB = N % 256;" and "MSB = N >> 8; LSB = N & 0xFF" in exactly the same way. Do not bother trying to micro-optimize this.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: etnel on July 17, 2021, 06:19:00 pm
Wow, some of you made me think this was harder than it actually was.
Well, this is my shot at it, works flawlessly. I don't know about the speed etc, but it is fine for my current application.
Code:
uint16_t original;

uint8_t left,right;

//to read

//*reading 2 bytes from eeprom*
original = 0;

original = right | (left<<8);



//to write

right = original;
left = original >> 8;

//*writing 2 bytes to eeprom*

Title: Re: 2 byte value into 2 separate 1 byte values
Post by: Cerebus on July 17, 2021, 06:24:58 pm
Code:
uint16_t original_data;

uint8_t b1,b2;

b1 = (uint8_t) (original_data & 0xff);
b2 = (uint8_t) ((original_data >> 8) & 0xff);

Is the mask with FF necessary?
Isn’t the typecast to uint8_t all that is needed?

In terms of making the compiler happy and getting the semantics you want - probably not necessary.

In terms of clearly indicating your intentions to the next programmer to read the code - priceless.
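
For anyone who wants to check the "probably not necessary" part, here is a small sketch comparing both spellings over every 16-bit input. It relies on the fact that converting to an unsigned 8-bit type keeps only the low 8 bits. All function names are illustrative.

```c
#include <stdint.h>

static uint8_t low_with_mask(uint16_t v)  { return (uint8_t)(v & 0xff); }
static uint8_t low_cast_only(uint16_t v)  { return (uint8_t)v; }
static uint8_t high_with_mask(uint16_t v) { return (uint8_t)((v >> 8) & 0xff); }
static uint8_t high_cast_only(uint16_t v) { return (uint8_t)(v >> 8); }

/* Returns 1 if both spellings agree for every 16-bit input, 0 otherwise. */
static int mask_is_redundant(void)
{
    uint32_t v;
    for (v = 0; v <= 0xFFFFu; v++) {
        uint16_t x = (uint16_t)v;
        if (low_with_mask(x) != low_cast_only(x)) {
            return 0;
        }
        if (high_with_mask(x) != high_cast_only(x)) {
            return 0;
        }
    }
    return 1;
}
```

The mask changes nothing in the result here; what it adds is a visible statement of intent.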
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: SiliconWizard on July 17, 2021, 06:57:37 pm
Code:
uint16_t original_data;

uint8_t b1,b2;

b1 = (uint8_t) (original_data & 0xff);
b2 = (uint8_t) ((original_data >> 8) & 0xff);

Is the mask with FF necessary?
Isn’t the typecast to uint8_t all that is needed?

In terms of making the compiler happy and getting the semantics you want - probably not necessary.

In terms of clearly indicating your intentions to the next programmer to read the code - priceless.

Yes. A couple things to note:
- The compiler (at least GCC) will be happy even if you omit the '(uint8_t)' cast and the '& 0xFF' mask. Even at '-Wall'. Yep. For it to emit a warning at all, you need 1/ to enable the '-Wconversion' option (which is not enabled by default with '-Wall') and 2/ to do neither the cast nor the masking. Even just the masking without the cast will silence warnings. Conversion issues are actually very important to catch IMHO, and are particularly easy to introduce in C - so I almost always enable the '-Wconversion' flag.
- Failing to understand integer promotion rules can lead to bugs as well. Wouldn't come into play in the above code, but for the inverse operation, it could: 'original = right | (left<<8);'. In this expression, left, which is a uint8_t, will be promoted to int before getting shifted left. In this case, it shouldn't be a problem as long as 'int' on the given platform is at least 16-bit. If it isn't, then you'll just get 0. ('int' type is at least 16-bit on most platforms these days, even 8-bit targets, but this can cause issues for wider integers.) To mitigate this, you'd have to explicitly cast 'left' to the wider type before shifting it.
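
A sketch of the mitigation from the last point: cast each byte to the destination type before shifting, so the shift never depends on how wide 'int' happens to be. The function names are made up for illustration.

```c
#include <stdint.h>

/* Rebuild a 16-bit value; 'left' is widened to uint16_t before the shift. */
static uint16_t join_u16(uint8_t left, uint8_t right)
{
    return (uint16_t)(((uint16_t)left << 8) | right);
}

/* Same idea for 32 bits, where shifting a promoted 'int' by 24 would be
   a real problem on targets with a 16-bit int. */
static uint32_t join_u32(uint8_t b3, uint8_t b2, uint8_t b1, uint8_t b0)
{
    return ((uint32_t)b3 << 24) | ((uint32_t)b2 << 16) |
           ((uint32_t)b1 << 8)  | (uint32_t)b0;
}
```

With these casts, `join_u32(0xDE, 0xAD, 0xBE, 0xEF)` yields 0xDEADBEEF regardless of the width of 'int'.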
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: golden_labels on July 18, 2021, 12:04:53 am
Many similar comments have appeared since my post and I am late to answer each separately, so this is an aggregate response.

Type punning in general is unportable and, unless unions are used, is actually undefined in C. I would completely agree with you and usually I strongly oppose such solutions, but this is not a general case. This is a constrained scenario in which the goal is to obtain a byte-array representation of a value, a representation never to be used by a different piece of code. And C has that very operation defined. The word “bytes” is the key here; C treats char specially: it has no trap representations (if one would care about that), it has no alignment requirements, every object can have each and every char of it read and written, and every byte-copy of an object is always guaranteed to be fine. The operation is explicitly defined by C as being always valid. There is no way it can ever go wrong.

The alternative solution, based on arithmetic, may be valid, but may as well contain a bug. Merely understanding what the arithmetic solution does in a single (platform, compiler, settings) triple takes more time and effort than applying the byte-access version, which is always valid. On top of that most C programmers never learned C and may easily misinterpret such code.(1)

To sum it up, we have two solutions. One explicitly defined to always work properly, the other that may contain bugs. The first one is also “visually” doing what it is asked to do. Which one should be chosen in that situation? I opt for the first one.

As for performance, modern compilers are likely to generate code that is nearly as fast for both, possibly even identical code. In particular, creating a temporary array and copying data to it using memmove is most likely to generate optimal code: a no-op introducing zero cost, reading directly from registers used to pass the serialized value.
____
(1) If you don’t believe me, ask people here to explain what data types are involved in etnel’s code and whether line 6 of that code contains signed overflow.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: Siwastaja on July 18, 2021, 08:09:20 am
The alternative solution, based on arithmetic, may be valid, but may as well contain a bug.

This is a very good point. Personally, I worked with hand-maintained bit shift fests until one day I hunted for a bug for hours; a bug which was caused by mistyping < instead of <<. Then I started thinking, there has to be a better way.

The problem with the arithmetic version, whether based on / % * or << >> | &, doesn't matter, is that it is error prone manual work. This is the preferred solution when the operations are small, simple and the number of such operations is small. Good for beginners, test programs, and so on. Just typecast everything using the stdint typedefs, and use a lot of parentheses everywhere.

This is also why I think the OP of this thread should go forward with janoc's approach. But it's not the generic solution.

Because when you start writing a full parser with dozens of data fields and you have 100 lines full of such bit shifting fest, it just falls apart. Then, you need indeed to look at the C standard to find out that interpreting any arbitrary data as an array of bytes, or char*, is indeed well defined; it's not only normal, it's also allowed by even the strictest views of language lawyers. Now if you extend this to a more generic pattern involving reverse operation, which is indeed required to take the advantage of the pattern, you need to understand the traps and their workarounds. Tough luck, C ain't easy. And practical C is different to "standard C". And microcontroller code is never perfectly portable anyway. Maybe you can accept the fact that you won't migrate the code from a little endian machine to a big endian one, or vice versa.

So really, I do see the bitshifting data partitioning as a beginner pattern. It's not bad, but it's not scalable. When you learn more about C, you start understanding how to avoid such an error-prone process and use better tools; but with more powerful tools, there are traps you need to know about.

Regarding alignment; most proper CPUs give ALIGNMENT ERROR interrupt (for example, busfault), making it easy to see what happened, why, and where, and fix it. Instead, a mistyped bitshift somewhere is completely hidden and only causes some certain parameter or field to act strangely. Almost impossible to debug; you may not even know you have a problem.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: TheCalligrapher on July 18, 2021, 04:02:11 pm
Regarding alignment; most proper CPUs give ALIGNMENT ERROR interrupt (for example, busfault), making it easy to see what happened, why, and where, and fix it. Instead, a mistyped bitshift somewhere is completely hidden and only causes some certain parameter or field to act strangely. Almost impossible to debug; you may not even know you have a problem.

Wow... Defending an error-prone hardware-level hack on the basis that this hack is likely to fail hard, thus alerting the author that his error-prone hardware-level hack actually lived up to its name and ended up with an error!

It is like building cars with wheels that fall off. You know "It doesn't go far, but at least you always know what the problem is!"

I'm having a hard time convincing myself that the above-quoted author is not intending this as a tongue-in-cheek statement.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: PlainName on July 18, 2021, 06:51:18 pm
Quote
The problem with the arithmetic version, whether based on / % * or << >> | &, doesn't matter, is that it is error prone manual work.

Pretty much all programming is error-prone manual work! We know that = is often used when == is intended, but we don't use hardware-level hacks instead of assignments or comparison. This (the merge/separation of bytes) is so common that it should be easy to spot a mistype, but even easier to use a macro or inline function. And this is also why we have unit tests - if you're not sure your code does what you mean it to do, you can just test it before locking it and never having to give it another thought.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: SiliconWizard on July 18, 2021, 07:15:48 pm
Agree with you two here. With that reasoning, a simple incrementation is also error-prone and should be avoided. Indeed, "n += 1;"... but you could mess it up and write "n += 2;" or "n -= 1;" instead.

All programming is error-prone manual work. Actually, if some programmer couldn't get a basic arithmetic operation right, I would have little reason to trust any of their work.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: AndersJ on July 18, 2021, 07:35:06 pm
Agree with you two here. With that reasoning, a simple incrementation is also error-prone and should be avoided. Indeed, "n += 1;"... but you could mess it up and write "n += 2;" or "n -= 1;" instead.

Why are increments written in that error prone and incomprehensible way?
Why not write N = N + 1?
Avoids all confusion, is readable and impossible to misunderstand.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: fourfathom on July 18, 2021, 07:49:51 pm
Why not write N = N + 1?

Because "N" is likely to actually be "someRidicuouslyLongVariableName", and I'd rather not have to type that twice. Anyway, I'll use the "++" operator if it's available.

As for the two separate byte problem, I prefer the shift and mask technique, but then I'm a hardware engineer.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: PlainName on July 18, 2021, 07:50:54 pm
That's a good question. Ignoring stuff like saving space (which is actually useful if names are long), I think the added benefit is that they are easily differentiated. For instance,

ThisVariable = ThatVariable + 2;

You gotta actually read these to figure out that it's not += 2, and generally we don't read but just look at the shapes. Carrying on...

ThisVariable++;

Straight away we know it's a simple increment, not add 2 or 3 or whatever.

Perhaps not definitive reasons, but I think they add enough to allow them without resorting to 'lazy typists' stuff :)

Title: Re: 2 byte value into 2 separate 1 byte values
Post by: golden_labels on July 19, 2021, 03:16:10 am
All programming is prone to error, but there is no reason to make bugs more likely.

In this situation I am less concerned about mistyping code and more about having to take care of data types and their ranges. If it weren't that most architectures use two’s complement and compilers use the simplest approach to implementing arithmetic, accidentally masking errors in arithmetic, a huge portion of C code would explode. :)
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: brucehoult on July 19, 2021, 04:35:57 am
The alternative solution, based on arithmetic, may be valid, but may as well contain a bug.

This is a very good point. Personally, I worked with hand-maintained bit shift fests until one day I hunted for a bug for hours; a bug which was caused by mistyping < instead of <<. Then I started thinking, there has to be a better way.

Having even *one* single unit test with well designed random-looking input values and the expected output will catch < instead of <<, or shifting by the wrong amount, or extracting field 3 into output 2, or or or ...
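
As a rough illustration of such a test (the function join_u16 and the constants are invented for this sketch, not taken from the thread): with inputs 0xA5 and 0xC3, writing < instead of <<, shifting by 4 instead of 8, or swapping the two bytes all produce something other than 0xA5C3, so a single assertion catches them.

```c
#include <assert.h>
#include <stdint.h>

/* Function under test (hypothetical): combine two bytes, high byte first. */
uint16_t join_u16(uint8_t hi, uint8_t lo)
{
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}

/* One test with "random-looking" input: the two bytes differ and neither
   is symmetric, so a swapped or mis-shifted field changes the result. */
void test_join_u16(void)
{
    assert(join_u16(0xA5, 0xC3) == 0xA5C3u);
}
```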


Thirty years ago I thought correct programs could be written by using a bondage-and-discipline language and strictly following type checking and not using "unsafe" constructs and so forth. I looked down on such hacks as using macros instead of templates, or #defines and conditional compilation, or running sed over source code, or writing little bits of perl to generate C source, or doing hacky address arithmetic, or using an untyped or dynamically typed language. If you couldn't do something strictly in-language in Pascal or Ada or OCaml then you shouldn't be doing it.

I was wrong.

Even following all the disciplines in those languages, there are far more errors you can still make than errors they prevent. And they put a whole lot of barriers in the way of effective programming.

It's much better to write things in the way that lets you have the smallest amount of hand-written code. For example, if you're tempted to manually copy&paste&edit 20 copies of something -- totally evil! -- because you can't make that into a function or it's beyond the capabilities of macro processing (or your language doesn't have macros) *don't* *do* *that*. Automate the copying and editing using perl or sed so that you can fix a bug in the code being duplicated in one place, or so that you can change the way you edit it.

Put unit tests in place to check that the end result works as expected. Add preconditions and postconditions to functions and loops.
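
A small sketch of what a precondition and a postcondition can look like in C using assert(). The helper low_bits is hypothetical; the point is only that the valid range of the size argument is checked on entry and the result is checked before returning.

```c
#include <assert.h>
#include <stdint.h>

/* Return the low `size` bits of `value` (hypothetical helper). */
uint32_t low_bits(uint32_t value, unsigned size)
{
    assert(size >= 1 && size <= 32);            /* precondition */

    /* Shifting a 32-bit value by 32 is undefined, so handle it apart. */
    uint32_t mask = (size == 32) ? UINT32_MAX
                                 : ((UINT32_C(1) << size) - 1u);
    uint32_t result = value & mask;

    assert((result & ~mask) == 0);              /* postcondition */
    return result;
}
```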


It's basically impossible to make code correct by design -- at least in a normal commercial setting. It's different if you're programming medical or aerospace or nuclear equipment where they are happy to spend 100x more per line of code. But in a normal setting, the only way to ensure code is correct is testing, and ONCE YOU'RE TESTING APPROPRIATELY it makes very little difference what programming language or technique you use.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: Siwastaja on July 19, 2021, 07:00:05 am
It's much better to write things in the way that lets you have the smallest amount of hand-written code. For example, if you're tempted to manually copy&paste&edit 20 copies of something -- totally evil! -- because you can't make that into a function or it's beyond the capabilities of macro processing (or your language doesn't have macros) *don't* *do* *that*

This is EXACTLY what I was saying above. Apparently others than brucehoult and golden_labels did not get it.

No, I strongly disagree with the others. If you have a nicely designed struct with 57 variables holding your state and you are constructing each of the 57 variables by writing 57 LoC each with various combinations of casting, bitshifting and masking, you are really doing it wrong. Yes, programming requires careful work, but repeated manual copy-paste-like work (it doesn't matter if you avoid the actual copy-pasting and rewrite manually) should still be avoided. The human brain is good at creative work, but makes a lot of mistakes (say, several %) when repeating mechanical "simple" work. This is why automation exists.

Yes, you can solve the problem of being error-prone by rigorously always writing complete unit tests for such construction/deconstruction functions, but this is a lot of work just to test a mechanical piece which only exists due to the lack of suitable language features. Also, unit testing does not easily reveal portability issues, because arithmetic in C is non-portable (unless you Do It Right, but you can't test against this just by simple unit tests. Maybe a good static analysis tool is available, I don't know, or you can automate running unit tests on a gazillion compilers and platforms...)

If type-punning a packed struct through a union - a hack that exists on the borderline between "standard C" and "real-world, constrained C" - is out of the question and the arithmetic way is the only acceptable one, there is really no other option than to auto-generate the structs and the parser from a common description language. This is an actually useful real-world solution I have nothing against. On microcontrollers and other well-defined restricted environments where endianness is fixed and all compilers used support the packed attribute, I go the most writable, most readable, most maintainable, least risky way of using the structs directly, but I do accept it isn't a generic solution either. Call it a "hack" if you wish, I don't care.

Finally, the whole Internet works based on C code sharing packed structs like this, and it works, given the programmer knows what they are doing, and what the limitations are. (For example, endianness conversion macros are used.)
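
To make the pattern concrete, here is a minimal sketch under stated assumptions: the struct wire_header and the helpers from_be16 and header_length are invented for illustration, and __attribute__((packed)) is a GCC/Clang extension rather than standard C. Real network code would typically use ntohs() and friends instead of a hand-written conversion.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 3-byte wire header; the length field is sent big-endian.
   __attribute__((packed)) is a GCC/Clang extension, not standard C. */
struct wire_header {
    uint8_t  type;
    uint16_t length;   /* stored in network (big-endian) byte order */
} __attribute__((packed));

/* Convert a 16-bit big-endian field to host order on any host. */
uint16_t from_be16(uint16_t be)
{
    uint8_t b[2];
    memcpy(b, &be, sizeof b);          /* b[0] is the first byte in memory */
    return (uint16_t)(((uint16_t)b[0] << 8) | b[1]);
}

/* Overlay the struct on received bytes via memcpy, then fix endianness. */
uint16_t header_length(const uint8_t *rx)
{
    struct wire_header h;
    memcpy(&h, rx, sizeof h);
    return from_be16(h.length);
}
```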

And yes, I'm serious about "fail hard". The "hacky" way has approximately 2-4 possible problems (padding, alignment, endianness...), most of which are either trivially dealt with (just one correctly placed attribute needed!) or cause a serious, easily noticed error (instant bus fault). The arithmetic way seems easy on the surface but has more possible traps, not limited to typing errors but also operator precedence, integer promotion rules, numerical ranges, implicit conversions, all the implementation-defined and undefined behavior such as signed integer overflow, and so on and so on. This is so risky that unit testing for every input combination is highly advisable.

I would never trust a programmer who claims they are so excellent they make zero mistakes in large sets of "simple" arithmetic operations. Take note, SiliconWizard. I would have expected better commentary from you with your excellent track record of your posts.

Actually, the reason I do recommend the arithmetic version for beginners is to expose them to all the problems; it's highly likely they won't get it right the first time. I didn't. I hit different problems for years until finally nailing it, and after nailing it all that's left are simple accidents. Because arithmetic in C can't be avoided, it's better to expose oneself to it.

The arithmetic version is not guaranteed to be portable either, because in the original C standard type widths are implementation-defined, implicit conversions convert into these types, and number literals are of varying widths... So you need to do exactly the right thing with explicit casting. So portability comes from careful work and understanding. Having seen some portability problems, I'd hazard a guess that most of the actual portability problems are related to type handling in arithmetic, NOT "low-level hacks". The latter are typically not everywhere but confined to a few places and easily dealt with.
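
One commonly hit instance of the promotion trap, as a sketch: a uint8_t operand is promoted to int before a shift, so with a 32-bit int the expression b << 24 is undefined when bit 7 of b is set, and with a 16-bit int (common on 8-bit MCUs) even b << 8 can overflow. Casting to the destination type before shifting avoids this. The function name load_le32 is made up for illustration.

```c
#include <stdint.h>

/* Assemble a little-endian 32-bit value from four bytes.
   Each byte is cast to uint32_t BEFORE shifting; without the cast the
   operand would be promoted to int, and shifting a set bit into (or
   past) the sign bit of int is undefined behavior. */
uint32_t load_le32(const uint8_t b[4])
{
    return  (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```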

Sadly, I think there is no generic solution.

What I don't like in this recurring discussion is the intellectual dishonesty of double standards. If the programmer makes a mistake in the direct struct usage pattern, it's proof of the pattern being unusable and dangerous. But if the programmer makes a mistake with the arithmetic solution, it's just a problem of the programmer not being careful enough, and the solution is to be more careful and not make any mistakes.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: brucehoult on July 19, 2021, 10:05:13 am
It's much better to write things in the way that lets you have the smallest amount of hand-written code. For example, if you're tempted to manually copy&paste&edit 20 copies of something -- totally evil! -- because you can't make that into a function or it's beyond the capabilities of macro processing (or your language doesn't have macros) *don't* *do* *that*

This is EXACTLY what I was saying above. Apparently others than brucehoult and golden_labels did not get it.

No, I strongly disagree with the others. If you have a nicely designed struct with 57 variables holding your state and you are constructing each of the 57 variables by writing 57 LoC each with various combinations of casting, bitshifting and masking, you are really doing it wrong. Yes, programming requires careful work, but repeated manual copy-paste-like work (it doesn't matter if you avoid the actual copy-pasting and rewrite manually) should still be avoided. The human brain is good at creative work, but makes a lot of mistakes (say, several %) when repeating mechanical "simple" work. This is why automation exists.

Here's an example of what I'm talking about, using that task of extracting fields from a stream of bytes as an example.

I'm assuming the fields can be any size from 1 to 32 bits, are little-endian, and arbitrarily aligned on bit boundaries.

Rather than ad-hoc extracting each field I'd make a macro (or, better, inline function if you can force your compiler to reliably do that):

Code: [Select]
#include <stdint.h>
#include <stddef.h>

// max size 32 bits
#define EXTRACT_FIELD(FIELD, BASE, OFFSET, SIZE) \
do { \
  size_t byteOff = (OFFSET)/8, bitOff = (OFFSET)%8; \
  char *CHAR_BASE = (char*)(BASE); \
  uint32_t val = CHAR_BASE[byteOff] >> bitOff; \
  if (bitOff+(SIZE) >  8) val |= (uint32_t)(CHAR_BASE[byteOff+1]) << ( 8-bitOff); \
  if (bitOff+(SIZE) > 16) val |= (uint32_t)(CHAR_BASE[byteOff+2]) << (16-bitOff); \
  if (bitOff+(SIZE) > 24) val |= (uint32_t)(CHAR_BASE[byteOff+3]) << (24-bitOff); \
  if (bitOff+(SIZE) > 32) val |= (uint32_t)(CHAR_BASE[byteOff+4]) << (32-bitOff); \
  /* build the mask in 64 bits: 1<<32 would be undefined for SIZE == 32 */ \
  FIELD = val & (uint32_t)((UINT64_C(1) << (SIZE)) - 1); \
} while (0)

int foo(char *p){
  int r;
  EXTRACT_FIELD(r, p, 40, 8);
  return r;
}

int bar(char *p){
  int r;
  EXTRACT_FIELD(r, p, 10, 8);
  return r;
}

If there are bugs in the macro (maybe!), at least they can be fixed in one place. The same if you come up with a more efficient scheme.
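
One candidate for such a bug: the macro reads the buffer through plain char, which is signed on some targets, so a byte with its top bit set is sign-extended before the shift and can leak 1 bits into the field. A possible inline-function variant that reads through unsigned char is sketched below. The name extract_field is made up, 8-bit bytes are assumed, and the caller is assumed to guarantee that the buffer covers every byte the field touches.

```c
#include <stddef.h>
#include <stdint.h>

/* Extract `size` bits (1..32) starting at bit `offset` of a little-endian
   bit stream. Hypothetical function name; assumes 8-bit bytes. */
static inline uint32_t extract_field(const void *base, size_t offset,
                                     unsigned size)
{
    const unsigned char *p = (const unsigned char *)base + offset / 8;
    unsigned bitOff = (unsigned)(offset % 8);
    unsigned nbits = bitOff + size;          /* bits spanned, at most 39 */

    /* Gather up to five bytes into a 64-bit accumulator, so no shift
       count ever reaches the width of the operand. */
    uint64_t acc = 0;
    for (unsigned i = 0; i * 8 < nbits; i++)
        acc |= (uint64_t)p[i] << (8 * i);

    uint64_t mask = (UINT64_C(1) << size) - 1;   /* size <= 32 < 64 */
    return (uint32_t)((acc >> bitOff) & mask);
}
```

With constant offset and size, an optimizing compiler should typically be able to fold this down much like the macro, though that is worth checking on the target compiler.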

The code generated is not too bad with -O1.

With RV32 gcc cross-compiler:
 
Code: [Select]
00000000 <foo>:
   0:   00554503                lbu     a0,5(a0)
   4:   8082                    ret

00000006 <bar>:
   6:   00154783                lbu     a5,1(a0)
   a:   8789                    srai    a5,a5,0x2
   c:   00254503                lbu     a0,2(a0)
  10:   051a                    slli    a0,a0,0x6
  12:   8d5d                    or      a0,a0,a5
  14:   0ff57513                andi    a0,a0,255
  18:   8082                    ret

Native system compiler (Clang) on my Mac:

Code: [Select]
0000000000000000 ltmp0:
       0: 00 14 40 39                   ldrb    w0, [x0, #5]
       4: c0 03 5f d6                   ret

0000000000000008 _bar:
       8: 08 04 c0 39                   ldrsb   w8, [x0, #1]
       c: 09 08 40 39                   ldrb    w9, [x0, #2]
      10: 29 65 1a 53                   lsl     w9, w9, #6
      14: 28 09 48 2a                   orr     w8, w9, w8, lsr #2
      18: 00 1d 00 12                   and     w0, w8, #0xff
      1c: c0 03 5f d6                   ret

Title: Re: 2 byte value into 2 separate 1 byte values
Post by: PlainName on July 19, 2021, 11:06:57 am
Quote
What I don't like in this recurring discussion is the intellectual dishonesty of double standards.

Quote
If you have a nicely designed struct with 57 variables holding your state and you are constructing each of the 57 variables by writing 57 LoC each

Well, actually, the problem we really get is when edge cases are presumed to be the deciding factor. AFAIA, no-one was suggesting using shifts or rotates or whatever to achieve the 3249 lines you suggest would happen [1]. Solutions pretty much always depend on the problem to be solved, and in the case of a humongous struct the solution is likely to be somewhat different to a single 16-bit variable.

Don't forget - the OP asked about a single instance of two-byte data in external storage, and that's what the solutions are (or should have been) aimed at resolving. If you think those solutions would automatically be applied to, say, an ISO 9660 in-memory filesystem then you're just picking obscure cases to have a fight, IMO.

---
[1] OK, there might be a macro that implements such shifts, but then the compiler might do that under the bonnet anyway. Let's not end up working through the assembler then microcode then transistor gates in an attempt to prove something.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: Siwastaja on July 19, 2021, 02:34:51 pm
Well, actually, the problem we really get is when edge cases are presumed to be the deciding factor. AFAIA, no-one was suggesting using shifts or rotates or whatever to achieve the 3249 lines you suggest would happen [1].
...
Don't forget - the OP asked about a single instance of two-byte data in external storage,

I don't know where you got 3249 from, maybe by cutting the quote in a way that changes the meaning of the last "each". But no, that was exactly what was suggested, whether on purpose or not. I was completely clear in my earlier post, where I started by stating that the ad-hoc arithmetical approach is best for the OP's "one 16-bit variable case", that it only breaks down when it comes to 100 LoC constructed the same way (showing that the approach is not generic or scalable), and that the correct tool depends on the task. Read again, there's nothing unclear in that; I never forgot the OP's case.

Yet exactly all that was shot down by ridiculous straw man arguments ignoring basically everything I wrote.

If people can't handle a wall of text, they can choose not to comment on it. Don't do hit-and-run posting, looking for a few key words and then assuming what is meant instead of reading.

OP's question was already answered and I extended on it and stand by everything I wrote because the big picture is all correct if you take the time to actually read and understand it.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: SiliconWizard on July 19, 2021, 04:40:16 pm
Just a thought, but next to religion and politics, there's usually nothing more heated than a discussion about programming. It really tends to trigger very strong and opinionated discussions. It just keeps amazing me. Thing is, any topic on programming has a very high probability of ending up in a fight.

I don't think I've ever seen similar things happening with discussions on pure electronics, although I reckon topics on digital electronics also tend to be that way much more than on analog electronics. It may be due to the set of possibilities being so much greater in terms of how to implement one particular feature, or even about methods. Maybe it's also due to concepts often being more complex, and thus harder to express "right" and easier to misunderstand.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: SiliconWizard on July 19, 2021, 04:56:37 pm
In this situation I am less concerned about mistyping code and more about having to take care of data types and their ranges. If most architectures didn’t use two’s complement, and compilers didn’t use the simplest approach to implementing arithmetic, which accidentally masks errors in arithmetic, a huge portion of C code would explode. :)

Arithmetic *done right* in C is actually nothing trivial. That's a fact. A number of rules to know and master. Not rocket science, but I admit some can be confusing. And you sometimes have to brush up on them even when you're experienced. That's a very general issue and is of course not a reason for avoiding arithmetic. But that's yet another point in C for which you definitely can't avoid *learning*. As some of us said earlier, for some reason, many people seem to have a problem with actually learning C.

With that said, cases for which you would nitpick using arithmetic or not using it are extremely subjective - that's something, apart from the above point, that you need to realize. And that's probably because of this high level of subjectivity that discussions get very heated.

Quote from: Siwastaja
but I was completely clear in my earlier post where I started by stating that the ad-hoc arithmetical approach is best for the OP's "one 16-bit variable case", and also that it only breaks down when that comes to 100 LoC constructed the same way, showing that the approach is not generic or scalable;

As with absolutely any possible construct, except maybe for the most trivial ones. Any form of code duplication is to be avoided at all costs.
If you have to repeat a given conversion/extraction even more than just once, use macros, functions, or whatever else is available in a given language that lets you factor this out.

With that said, the problem, as I mentioned above, is that whatever one considers "trivial" enough not to require factorization is relatively subjective, so even with the things above in mind, we will never avoid strong arguments about a particular approach in a particular case.

Title: Re: 2 byte value into 2 separate 1 byte values
Post by: brucehoult on July 20, 2021, 12:55:33 am
No comments on the wonders (or evils) of modern C compiler optimisation?

Is it not remarkable to see really quite a lot of code in that macro (could have been an inlined function) reduce down in some cases to just a single machine code instruction?

Constant expression evaluation and dead code elimination do most of the work.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: PlainName on July 20, 2021, 02:24:08 am
Quote
No comments on the wonders (or evils) of modern C compiler optimisation?

Is it relevant?

Let's say we take a look and don't like what the compiler spits out. Are we going to rewrite the compiler to out preference? Perhaps the output is pretty damn good. Is out C code going to produce the same output with a different compiler? Will we refuse to use compiler A because of the not awfully good optimisation?

Or, horrors of horrors, would we change out C source to persuade the compiler to do a certain thing? If we did that, why the hell are we writing in C and not assembler?

Sometimes optimisation is important, but until then it doesn't really matter what comes out so long as it works. And didn't a certain D Knuth note that premature optimisation is the root of all evil...

(edit: seems that my preferred Dell SK8115 keyboard is throwing a sulk and putting 't' where I really put 'r'. Perhaps it thinks I'm about to replace it.)
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: AntiProtonBoy on July 20, 2021, 03:13:48 am
I don't get some of those answers. Why to mess with pointers and structures? Uh and memcpy? Seriously?

Assuming this is C and your data are really 16 bit, this is all you need to do:

Yes. Seriously.

1. Because memcpy is simpler than your shift suggestion:

Code: [Select]
uint16_t original_data;
uint8_t b[2];

memcpy( b, &original_data, sizeof( b ) );

2. In some situations, when strict aliasing rules are in effect, memcpy is the only safe way to transfer a multi-byte POD structure and preserve its layout.

3. Compilers routinely use memcpy for copying blocks of data to a sequence of bytes, and optimise such operations really well.
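
Applied to the OP's EEPROM case, the memcpy approach could look roughly like this. The EEPROM is simulated with a RAM array, and fake_eeprom_write / fake_eeprom_read are stand-ins invented for this sketch, not a real PIC or compiler API; on real hardware they would be replaced by the vendor's byte-wide EEPROM routines. The bytes are stored in the CPU's native order, which is fine as long as the same code reads them back.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the real EEPROM: a RAM array plus two hypothetical
   byte-wide accessors. */
static uint8_t fake_eeprom[256];

static void fake_eeprom_write(uint8_t addr, uint8_t data) { fake_eeprom[addr] = data; }
static uint8_t fake_eeprom_read(uint8_t addr) { return fake_eeprom[addr]; }

/* Store a 16-bit value as two consecutive bytes, in native byte order. */
void store_u16(uint8_t addr, uint16_t value)
{
    uint8_t b[sizeof value];
    memcpy(b, &value, sizeof b);
    for (size_t i = 0; i < sizeof b; i++)
        fake_eeprom_write((uint8_t)(addr + i), b[i]);
}

/* Read the two bytes back and reassemble the value the same way. */
uint16_t load_u16(uint8_t addr)
{
    uint8_t b[sizeof(uint16_t)];
    uint16_t value;
    for (size_t i = 0; i < sizeof b; i++)
        b[i] = fake_eeprom_read((uint8_t)(addr + i));
    memcpy(&value, b, sizeof value);
    return value;
}
```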
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: brucehoult on July 20, 2021, 03:27:53 am
Quote
No comments on the wonders (or evils) of modern C compiler optimisation?

Is it relevant?

Yes.

Quote
Let's say we take a look and don't like what the compiler spits out. Are we going to rewrite the compiler to out preference?

I do that from time to time, yes.

Quote
Perhaps the output is pretty damn good. Is out C code going to produce the same output with a different compiler?

It's going to produce the same answer with any C compiler, certainly. The exact code generated may be different.

Quote
Will we refuse to use compiler A because of the not awfully good optimisation?

Hell, yes!

Quote
Or, horrors of horrors, would we change out C source to persuade the compiler to do a certain thing?

Absolutely, in the performance-critical places in our code.

Quote
If we did that, why the hell are we writing in C and not assembler?

Because guiding a C compiler is a heck of a lot easier and less error-prone than writing in assembler, and usually gives faster code too -- certainly over a large body of code.

Quote
Sometimes optimisation is important, but until then it doesn't really matter what comes out so long as it works. And didn't a certain D Knuth note that premature optimisation is the root of all evil...

Quite apart from speed, code size is often very important, and correctness always is.

Using a generic and tested macro or function is much more likely to give correct code than is writing ad-hoc C code to (in this case) extract aligned and unaligned bitfields of various sizes and locations from a buffer.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: brucehoult on July 20, 2021, 03:30:35 am
I don't get some of those answers. Why to mess with pointers and structures? Uh and memcpy? Seriously?

Assuming this is C and your data are really 16 bit, this is all you need to do:

Yes. Seriously.

1. Because memcpy is simpler than your shift suggestion:

Code: [Select]
uint16_t original_data;
uint8_t b[2];

memcpy( b, &original_data, sizeof( b ) );

I agree with this, but note that it works only if the data is packed to the byte level, not the bit level.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: DiTBho on July 20, 2021, 08:17:24 am
sizeof(char) returns 2 on DSP43.
never trust that "char" is 1 byte
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: DiTBho on July 20, 2021, 08:37:02 am
It is like building cars with wheels that fall off. You know "It doesn't go far, but at least you always know what the problem is!"

that's precisely what happened to me when I built a go-kart driven by a kerosene turbine  :o
by crashing on the hay bale along the track I learned how to avoid some design mistakes.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: PlainName on July 20, 2021, 10:24:35 am
Quote
    Is it relevant?

Quote
Yes.

Interesting answers, thanks!

I would dispute more than one of them, but I think we are in a different class (for instance, often I don't have a choice of compiler and I don't yet recall ever having the option to write (or rewrite) one).

However, the big one is changing the C source to suit the compiler output. Now, I have done this to circumvent a bug (choice: do it and it works, don't do it and it will crash) but normally I think we consider C to be the lowest level multi-device compatible language. So 'optimising' the C code for the compiler is just making it non-portable and adding in stuff that makes no sense unless one is hip to the why. And doing that when it's not even needed (that is, before you know the code is too slow or whatever) seems quite wrong - it is making it compiler-specific for no practical gain.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: brucehoult on July 20, 2021, 11:44:07 am
sizeof(char) returns 2 on DSP43.
never trust that "char" is 1 byte

That absolutely violates the C standard -- sizeof(char) is *defined* to be 1.

Not 1 byte. Just 1. Whatever size a char is, is the unit you measure other things in.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: DiTBho on July 20, 2021, 12:42:13 pm
That absolutely violates the C standard -- sizeof(char) is *defined* to be 1.
Not 1 byte. Just 1. Whatever size a char is, is the unit you measure other things in.

feel free to send a legal letter to Texas Instruments :D
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: SiliconWizard on July 20, 2021, 05:03:46 pm
No comments on the wonders (or evils) of modern C compiler optimisation?

To be fair, I did mention the fact modern compilers would often compile either of the approaches suggested in this thread producing almost the same code.
In particular, using bit fields (which is a landmine in terms of portability) with GCC usually yields the exact same assembly code as using bitwise operations. At least I've seen this in tests I had done a couple years ago when I was trying to figure out if one method would be more efficient (regardless of portability or readability) than the other. Turned out the answer was that it made no difference in the general case.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: gf on July 20, 2021, 08:00:55 pm
sizeof(char) returns 2 on DSP43.
never trust that "char" is 1 byte

That absolutely violates the C standard -- sizeof(char) is *defined* to be 1.
Not 1 byte. Just 1. Whatever size a char is, is the unit you measure other things in.

Mostly correct, but not exactly.

sizeof operator (https://en.cppreference.com/w/c/language/sizeof)
Quote
...
Returns the size, in bytes, of the object representation...
...

sizeof(char) is indeed always 1 (byte), but a byte (in terms of C) is not necessarily required to have a size of eight bits (for instance if the smallest addressable unit of memory in a particular computer has a different size).

Arithmetic types (https://en.cppreference.com/w/c/language/arithmetic_types)
Quote
...
1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
...
Note: this allows the extreme case in which bytes are sized 64 bits, all types (including char) are 64 bits wide, and sizeof returns 1 for every type.
...

[ cppreference.com is admittedly not the official C standard document, but I'd be surprised if the latter would say anything different. ]
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: SiliconWizard on July 20, 2021, 09:48:11 pm
Yes.
C defines the byte like so:
Quote
byte
addressable unit of data storage large enough to hold any member of the basic character
set of the execution environment
NOTE 1 It is possible to express the address of each individual byte of an object uniquely.
NOTE 2 A byte is composed of a contiguous sequence of bits, the number of which is implementation-
defined.  The least significant bit is called the low-order bit; the most significant bit is called the high-order
bit.

But sizeof(char) returning 2 would indeed be completely non-compliant.

Now of course one may question this definition of "byte" used in the C standard, as it's otherwise widely recognized as a word made of 8 bits, but the definition is given and still clear.
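
In code, the distinction shows up as CHAR_BIT from <limits.h>. The sketch below (C11, because of _Static_assert; the helper bits_in is made up) states the "8-bit byte" assumption explicitly, so code relying on it fails to compile on a target where it does not hold. On a target with 16-bit chars, uint8_t would not be provided at all.

```c
#include <limits.h>
#include <stdint.h>

/* sizeof(char) is 1 by definition; what varies is the number of bits
   in that unit, which <limits.h> reports as CHAR_BIT. */
_Static_assert(sizeof(char) == 1, "guaranteed by the C standard");

/* Code that stores a uint16_t as exactly two 8-bit bytes can state
   that assumption explicitly and refuse to compile elsewhere. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
_Static_assert(sizeof(uint16_t) == 2, "uint16_t must occupy two bytes");

/* Number of bits in an object of `n` bytes (hypothetical helper). */
unsigned bits_in(unsigned n) { return n * CHAR_BIT; }
```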
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: golden_labels on July 20, 2021, 11:00:44 pm
2. In some situations, when strict aliasing rules are in effect, memcpy is the only safe way to transfer multi-byte POD structure and preserve layout.
Not only memmove (or the old memcpy). Any access through a char* and its (un)signed versions is always valid, as those are an exception to the aliasing rules.(1) And I doubt that exception is likely to disappear, as the language’s internal consistency depends on it. Otherwise all the guarantees WRT byte representation would become meaningless.
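
A sketch of what that character-pointer access looks like for the 16-bit case. The function names are hypothetical, and the byte order in the array is whatever the CPU uses, so this suits storage that the same program reads back.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the object representation of a uint16_t out byte by byte.
   Reading any object through unsigned char* is permitted by the
   aliasing rules; the byte order is whatever the CPU uses. */
void u16_to_bytes(const uint16_t *value, unsigned char out[2])
{
    const unsigned char *p = (const unsigned char *)value;
    for (size_t i = 0; i < sizeof *value; i++)
        out[i] = p[i];
}

/* The reverse direction: write the bytes back into a uint16_t object. */
void bytes_to_u16(const unsigned char in[2], uint16_t *value)
{
    unsigned char *p = (unsigned char *)value;
    for (size_t i = 0; i < sizeof *value; i++)
        p[i] = in[i];
}
```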

sizeof(char) returns 2 on DSP43.
never trust that "char" is 1 byte
I believe everyone here assumes the compiler works properly, as not making that assumption adds a whole new level of complexity that can’t even be dealt with without specifying the exact situation and compiler.
____
(1) 9899:2011 6.5§7
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: DiTBho on July 21, 2021, 06:40:10 am
I believe everyone here assumes the compiler works properly, as not making that assumption adds a whole new level of complexity that can’t even be dealt with without specifying the exact situation and compiler.

The Texas Instruments one? Yes, for the DSP43 it works well; you just have to deal with some quirks. It took me over a week to understand why my imported code didn't work as expected, then I found that sizeof(char) returns 2, and ... frankly it was more frustrating than shocking: you cannot change the toolchain, you cannot modify it, and you find that in the meantime the managers in the US have already approved the toolchain for the job (without knowing anything  :palm: ), so you cannot say "hey? it violates this and that ... do ... do you hear me? ... ", who cares? You have been hired for the job, now you have to deal with it, so you can only push your C-quirks under a layer of assembly and go on.

C can be frustrating, this is the typical example.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: brucehoult on July 21, 2021, 06:47:57 am
C can be frustrating, this is the typical example.

That is *not* C.

It might look very like C, but it's not C.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: DiTBho on July 21, 2021, 06:54:33 am
Quote
specifying the exact situation and compiler

that's why our projects all come with a project builder, which checks if the installed toolchain and ecosystem is qualified for the job

Code: [Select]
# b

   firmware-5.128(ppc64.le/f2)

0) clean
1) configure
2) compile
3) analyze
4) ICE@192.168.1.21
5) misc

name: firmware-5.128-64bit-f2
image: elf
note: engineering version
qualified_host: passed
qualified_toolchain: passed

All the binaries of the toolchain have a certificate, you cannot alter a single bit of them, and "qualified" means someone has tested and checked them, and signed a certificate under his/her name. This is also why I couldn't "hack" or change the Texas Instruments' toolchain.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: DiTBho on July 21, 2021, 07:04:18 am
In 25 years of experience, the only honest company that said "my partner's tool is not C but C-like" was Motorola, back in 1996, when ICC11 was released for a MIT project.
90% of its syntax was C-like, but
- pointers were different and limited
- structures and unions were guaranteed to not be stuffed or padded
- etc
I remember they warned the users, inviting them to carefully read the manual, with a quick section which briefly summarized the most important differences

Well done  :-+
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: golden_labels on July 21, 2021, 09:23:29 am
then I found that sizeof(char) returns 2
And that is a bug. sizeof is defined to return the number of char elements that hold its argument. Therefore sizeof(char) can by definition be 1 and 1 only. If a compiler, claiming to be a C compiler, fails to properly understand valid C code, it’s a bug. Period.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: SiliconWizard on July 21, 2021, 04:16:25 pm
C can be frustrating, this is the typical example.
That is *not* C.
It might look very like C, but it's not C.

Yep indeed.
That's kind of funny. As we said in other threads, C users tend to consider C as something you don't need to learn - and thus largely ignore standards - but some tool vendors (fortunately, not many) do the exact same.

Poor C... so mistreated. :(
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: Siwastaja on July 21, 2021, 04:24:54 pm
No compiler is completely perfect and bug-free but they still can be called "C compilers".

But indeed sizeof(char)==2 is such an elementary issue, breaking such an important base assumption behind the C language, that it will have serious consequences in most codebases and will make most developers' brains spin. This needs a lot of working around.

So yes, you can assume sizeof(char)==1 by default. Exceptions to this rule are super special; that is likely the only compiler in the world that does this. Don't waste your time always questioning everything. You need to be able to trust some basic things, like table salt being table salt when the package says so; you don't send it to a lab every time you fix something to eat just in case this particular batch accidentally ships some poison which happens to taste like table salt.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: SiliconWizard on July 28, 2021, 04:37:17 pm
No compiler is completely perfect and bug-free but they still can be called "C compilers".

Vendors can call their compilers however they like.
If they are not clearly stating which C standard they are compliant with (or they do, but they are actually not compliant, which is way worse), then I'll either not use the compiler, or if I have no choice, I'll treat it as a compiler for a language that I don't know, and will thoroughly read the manual.
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: TheCalligrapher on July 28, 2021, 06:14:38 pm
sizeof(char) returns 2 on DSP43.

I find it hard to believe. Any reference to the compiler docs?
Title: Re: 2 byte value into 2 separate 1 byte values
Post by: PlainName on July 28, 2021, 07:16:31 pm
Quote
... and will thoroughly read the manual.

How quaint  ;)

Won't be long until that becomes "and watched half a youtube video" out of necessity. :(