Author Topic: text / strings in C  (Read 2568 times)


Offline Simon

  • Global Moderator
  • *****
  • Posts: 14986
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics
Re: text / strings in C
« Reply #75 on: April 13, 2020, 10:21:30 am »
not really. ARM addresses memory in bytes. I have saved 3 bytes of RAM on the 4 you suggest. Next I could use 16 bits but really i can't see myself at the moment with an array of more than 256 elements. I can of course make this an 8 bit version and write the 16 bit version when required.

No, that's not correct. Those function arguments (the first 4 of them anyway, on ARM) and local variables (up to 12 or so total that are live at the same time) are in CPU registers, not RAM. The registers are always 32 bits even if you put an 8 bit value into them.


So what is the difference if I give the register up to 8 bits or up to 32 bits? Or is it the time taken to fill in the other 24 bits? But then the register should be empty before the 8 bits are put into it?
 

Offline Simon

  • Global Moderator
  • *****
  • Posts: 14986
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics
Re: text / strings in C
« Reply #76 on: April 13, 2020, 10:25:48 am »
Here is a practical example why local variables and function parameters should generally be ints and not chars or shorts or explicit-length types:
Code: [Select]
int  my_slen1(const char *p) { return __builtin_strlen(p); }
char my_slen2(const char *p) { return __builtin_strlen(p); }
When compiling the above using GCC for Cortex-M4 using Thumb instructions, the two compile essentially to
Code: [Select]
    .text
    .thumb

my_slen1:
    b   strlen

my_slen2:
    push {r3, lr}
    bl   strlen
    uxtb  r0, r0
    pop  {r3, pc}
Instead of saving memory, my_slen2() generates extra code, because the return value must be limited to 8 bits!

Similar effects happen when calling functions that take parameters that are limited to smaller range of values than the native registers support.
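
A minimal sketch of the parameter and return-value case, assuming a typical 32-bit target such as a Cortex-M. The function names add_u8 and add_uint are made up for illustration. Because add_u8() must hand back a value that fits in 8 bits, the compiler has to narrow the result (on ARM, typically with an extra uxtb), while add_uint() can return the register as it is.

```c
#include <stdint.h>

/* Hypothetical example: the sum is computed as int, then narrowed to
 * 8 bits on return, which usually costs an extra instruction on a
 * 32-bit CPU. */
uint8_t add_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Same operation with the native unsigned type: no narrowing needed. */
unsigned int add_uint(unsigned int a, unsigned int b)
{
    return a + b;
}
```

With these definitions, add_u8(200, 100) yields 44 (300 modulo 256), while add_uint(200, 100) yields 300.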

So, what to do?

It is a portability issue.  On 8-bit Arduinos, int is a 16-bit type, and technically requires two registers.  On 64-bit architectures, int may be just 32-bits.

Although the int_fastN_t and uint_fastN_t types would be the technically best option (with N being the smallest reasonable value for that particular variable; 8, 16, 32, or 64), not many programmers use them.  If others work on your code, they may be surprised and get it wrong, which incurs a maintenance burden.
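
As a rough illustration of how the fast types are meant to be used, here is a sketch of an 8-bit checksum whose accumulator is a uint_fast8_t. The function name checksum8 is an assumption made for this example, not something from the code discussed in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 8-bit additive checksum over a byte buffer.
 * The accumulator only needs 8 bits, so uint_fast8_t lets the compiler
 * pick whatever width is fastest on the target. */
uint8_t checksum8(const uint8_t *data, size_t len)
{
    uint_fast8_t sum = 0;

    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }

    /* The fast type may be wider than 8 bits; keep only the low 8. */
    return (uint8_t)(sum & 0xFFu);
}
```

The buffer itself stays uint8_t because it lives in memory; only the temporary that sits in a register uses the fast type.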

Current POSIXy systems I use are either ILP32 or LP64, so long is the "native register type".  However, for in-memory sizes and counts, I use size_t .

Some projects define their own types, using preprocessor #if directives to choose the best mapping.  For example, you might have iregN and uregN, analogously to int_fastN_t and uint_fastN_t.  Because of the nonstandard type name, other programmers might actually read the documentation or comments on how the types are intended to be used.  However, don't fall into the WORD/DWORD/QWORD trap; better assume the types are binary (two's complement if signed) with a fixed number of bits, and have that number (or lower limit) in the type.
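
A sketch of what such a project-local mapping could look like. The names ureg_t, ireg_t and UREG_BITS are hypothetical, and the code assumes that pointer width is a reasonable stand-in for native register width, which is not true everywhere (8-bit AVRs have 16-bit pointers, for example).

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical project-local "native register" types.
 * ASSUMPTION: pointer width is used as a stand-in for register width. */
#if UINTPTR_MAX > 0xFFFFFFFFu
typedef uint64_t ureg_t;
typedef int64_t  ireg_t;
#define UREG_BITS 64
#elif UINTPTR_MAX > 0xFFFFu
typedef uint32_t ureg_t;
typedef int32_t  ireg_t;
#define UREG_BITS 32
#else
typedef uint16_t ureg_t;
typedef int16_t  ireg_t;
#define UREG_BITS 16
#endif

/* Catch a wrong mapping at compile time (C11). */
_Static_assert(sizeof(ureg_t) * CHAR_BIT == UREG_BITS,
               "ureg_t width does not match UREG_BITS");
```

Keeping the bit count next to the type lets other code size its masks and shifts without guessing.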

Most current C code seems to use size_t (for in-memory sizes and counts) and int for everything else.

So should i just use int/uint instead? and let the compiler choose?
 

Offline Yansi

  • Super Contributor
  • ***
  • Posts: 3283
  • Country: 00
  • STM32, STM8, AVR, 8051
Re: text / strings in C
« Reply #77 on: April 13, 2020, 10:35:15 am »
Ultimately my buffer is being pumped out on an 8 bit serial port, are there any better variable types that I should use? what does GCC for ARM support?

Well, on ARM, most of the time you get more cleverly built peripherals. Having an 8-bit-wide SPI, UART, or other interface bus does not necessarily mean you need to feed it 8 bits at a time. Most peripherals (you need to RTFM the exact manual specific to your device!) support wider accesses to feed in data in a more efficient manner, that is, with fewer memory accesses and fewer bus cycles. Also, you have the DMA to help with all this.

Let me demonstrate a common one:

For example on STM32 microcontrollers, it is a well known case that numerous programmers fight with SPI peripheral. That's because the data register is 16bit wide and writing to the register just like this

Code: [Select]
uint8_t data = 0x55;
SPIx->DR = data;

triggers a transmission of 2 bytes instead of one.  That is because DR is 16bit and compiler does exactly as is supposed to: uses a 16bit write to 16bit memory location (DR register).

To correct it, one must typecast the data register to 8bit, to tell the compiler to produce just 8bit memory write access:

Code: [Select]
*(uint8_t*)&(SPIx->DR) = data;
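
The difference between the two access widths can be tried out on a PC with a stand-in register in ordinary RAM. This is only a sketch: mock_spi_t, spi_write16 and spi_write8 are hypothetical names, and whether a real peripheral accepts byte-wide accesses is device-specific, so the reference manual for the exact part has the final word.

```c
#include <stdint.h>

/* Hypothetical stand-in for a peripheral register block; on real
 * hardware this layout comes from the vendor header. */
typedef struct {
    volatile uint16_t DR;   /* 16-bit data register */
} mock_spi_t;

/* 16-bit store: the whole register is written. */
static inline void spi_write16(mock_spi_t *spi, uint16_t value)
{
    spi->DR = value;
}

/* 8-bit store: the cast makes the compiler emit a byte-wide write,
 * so only one byte of the register location is touched. */
static inline void spi_write8(mock_spi_t *spi, uint8_t value)
{
    *(volatile uint8_t *)&spi->DR = value;
}
```

In RAM the byte write leaves the other half of the 16-bit location unchanged; on a peripheral, what the hardware does with each access width is up to the peripheral.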
 

Offline Simon

  • Global Moderator
  • *****
  • Posts: 14986
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics
Re: text / strings in C
« Reply #78 on: April 13, 2020, 10:53:30 am »
The SPI port I am using is 8 bits wide. If it were 16 or 32 bits wide I would just store my predefined texts in 32 bit variables so that I have 1/4 the interrupt calls when I get round to driving this with interrupts.
 

Offline Jeroen3

  • Super Contributor
  • ***
  • Posts: 3500
  • Country: nl
  • Embedded Engineer
    • jeroen3.nl
Re: text / strings in C
« Reply #79 on: April 13, 2020, 10:56:10 am »
For example on STM32 microcontrollers, it is a well known case that numerous programmers fight with SPI peripheral. That's because the data register is 16bit wide and writing to the register just like this

Code: [Select]
uint8_t data = 0x55;
SPIx->DR = data;

triggers a transmission of 2 bytes instead of one.  That is because DR is 16bit and compiler does exactly as is supposed to: uses a 16bit write to 16bit memory location (DR register).

To correct it, one must typecast the data register to 8bit, to tell the compiler to produce just 8bit memory write access:

Code: [Select]
*(uint8_t*)&(SPIx->DR) = data;
False. The SPI can only be accessed with a 16 or 32 bit load/store. Using an 8 bit load/store instruction on the SPI will cause a bus error. (The entire APB can't do 8 bits, I think.)

Simon is still struggling with C string pointers and you're doing micro optimizations?
 

Offline Simon

  • Global Moderator
  • *****
  • Posts: 14986
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics
Re: text / strings in C
« Reply #80 on: April 13, 2020, 11:04:55 am »
I'm still trying to find out what data types I am supposed to be using, another one of those best kept secrets. Is it a case of making a compromise on memory use and speed of execution? I still don't get why putting 8 bits on a 32 bit bus is slower than putting 32 bits.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 1528
  • Country: us
  • Formerly SiFive, Samsung R&D
Re: text / strings in C
« Reply #81 on: April 13, 2020, 12:08:56 pm »
I'm still trying to find out what data types I am supposed to be using, another one of those best kept secrets. Is it a case of making a compromise on memory use and speed of execution? I still don't get why putting 8 bits on a 32 bit bus is slower than putting 32 bits.

It's not. It's usually the same.

Putting 8 bits four times in a row will be slower than putting 32 bits once. But that can complicate the programming a lot. And usually unnecessarily.

Just keep doing what you're doing. Write code that makes sense for your problem, not for the exact CPU you're using. It will be *good enough*.
 

Offline Simon

  • Global Moderator
  • *****
  • Posts: 14986
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics
Re: text / strings in C
« Reply #82 on: April 13, 2020, 12:22:40 pm »
Yes, which is what I said earlier. I did consider the idea of putting all port and pin information into one byte but soon decided that the time required to separate the port information and pin information made it a really silly idea. I can't see how there is any way to encode anything into larger variables and save time over single memory transfers.
 

Offline Yansi

  • Super Contributor
  • ***
  • Posts: 3283
  • Country: 00
  • STM32, STM8, AVR, 8051
Re: text / strings in C
« Reply #83 on: April 13, 2020, 12:29:01 pm »
For example on STM32 microcontrollers, it is a well known case that numerous programmers fight with SPI peripheral. That's because the data register is 16bit wide and writing to the register just like this

Code: [Select]
uint8_t data = 0x55;
SPIx->DR = data;

triggers a transmission of 2 bytes instead of one.  That is because DR is 16bit and compiler does exactly as is supposed to: uses a 16bit write to 16bit memory location (DR register).

To correct it, one must typecast the data register to 8bit, to tell the compiler to produce just 8bit memory write access:

Code: [Select]
*(uint8_t*)&(SPIx->DR) = data;
False. The SPI can only be accessed with a 16 or 32 bit load/store. Using an 8 bit load/store instruction on the SPI will cause a bus error. (The entire APB can't do 8 bits, I think.)

Simon is still struggling with C string pointers and you're doing micro optimizations?

I think Simon is making quite some progress already, so what is wrong with giving a little bit of insight into why things are done the way they are? The worst that can happen is that he won't understand and will ask for clarification. Better that than to stay dumb and know nothing.


Regarding the SPI what you call "false", you should look it up first.
https://electronics.stackexchange.com/questions/324439/stm32-spi-slave-send-data-problem
https://community.st.com/s/question/0D50X00009kHTHu/bug-in-stm32f0-spi-lowlevel-driver-llspireceivedata8-and-llspitransmitdata8
etc...  just look how a STM32F0_LL_HAL driver implements LL_SPI_TransmitData8 ...
« Last Edit: April 13, 2020, 12:33:25 pm by Yansi »
 

Offline Simon

  • Global Moderator
  • *****
  • Posts: 14986
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics
Re: text / strings in C
« Reply #84 on: April 13, 2020, 12:38:57 pm »
I do need to tackle one topic at a time. The SPI is 8 or 9 bit on the SMAC, no point in putting data into 32 bit variables. I would incur code overhead to get it in and out. Fact is much of the data in an embedded system is 8 bit, I do like the fact that with ARM when I do come to deal in 16 and 32 bit variables they will take it in their stride unlike 8 bitters.
 

Offline Simon

  • Global Moderator
  • *****
  • Posts: 14986
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics
Re: text / strings in C
« Reply #85 on: April 13, 2020, 01:25:47 pm »
silly question related to another thing I wrote some code for. If I have a function that works on an array variable but only on one can I do a return variable at the end of the function or should I just access the variable externally. It's an external variable anyway.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 1724
  • Country: fi
    • My home page and email address
Re: text / strings in C
« Reply #86 on: April 13, 2020, 01:50:22 pm »
So should i just use int/uint instead? and let the compiler choose?
Like I wrote, it depends on how important portability is to you, and to what sort of systems.

For code running on 32-bit processors, using int or unsigned int works well.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 1724
  • Country: fi
    • My home page and email address
Re: text / strings in C
« Reply #87 on: April 13, 2020, 02:29:51 pm »
silly question related to another thing I wrote some code for. If I have a function that works on an array variable but only on one can I do a return variable at the end of the function or should I just access the variable externally. It's an external variable anyway.
If there is a real reason why it is always the same array, just access the global variable.

If there is a possibility the same operation could be used for different arrays, pass the array (and length) as a parameter.

For example, consider:
Code: [Select]
char *uint_string(unsigned int  u)
{
    static char buffer[12]; /* Max. 99,999,999,999 */
    char *p = buffer + sizeof buffer;

    if (u > 99999999999) {
        /* Not enough room in buffer! */
        return NULL;
    }

    *(--p) = '\0';
    do {
        *(--p) = '0' + (u % 10);
        u /= 10;
    } while (u);

    return p;
}
This uses 12 bytes of RAM for the buffer, and each consecutive invocation overwrites the previous value.  The return value is a pointer to a string (in RAM) containing the unsigned integer value as a decimal string.

It would be perfectly okay to move the static char buffer[]; outside the function, for example if several functions could use the same buffer.

However, what if one needs more than one result at the same time?  We could instead do
Code: [Select]
char *uint_buffer(char *buf, size_t len, unsigned int u)
{
    char *ptr = buf + len;

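    if (len > 0) {
        /* Terminate buffer; the terminator uses up one byte of len,
           otherwise the last digit could be written before buf[0] */
        *(--ptr) = '\0';
        --len;
    }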

    do {
        if (len < 1) {
            /* Not enough room in buffer! */
            return NULL;
        }

        --len;
        *(--ptr) = '0' + (u % 10);
        u /= 10;
    } while (u);

    return ptr;
}

char *uint_string(unsigned int u)
{
    static char buffer[12]; /* 11 digits is enough for everyone! ;-) */
    return uint_buffer(buffer, sizeof buffer, u);
}
which provides the same uint_string(), but only as a wrapper around uint_buffer(), which takes the buffer array as a parameter.
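
To show why the caller-supplied buffer matters, here is a sketch where two results have to exist at the same time. It carries its own copy of uint_buffer() so that it compiles on its own, and that copy counts the terminator against len so a too-small buffer is rejected. The caller same_digit_count() is a hypothetical example, not part of the code above.

```c
#include <stddef.h>

/* Copy of uint_buffer() so the example is self-contained. */
char *uint_buffer(char *buf, size_t len, unsigned int u)
{
    char *ptr = buf + len;

    if (len > 0) {
        *(--ptr) = '\0';    /* Terminator uses one byte of the buffer */
        --len;
    }

    do {
        if (len < 1) {
            return NULL;    /* Not enough room in buffer */
        }
        --len;
        *(--ptr) = (char)('0' + (u % 10));
        u /= 10;
    } while (u);

    return ptr;
}

/* Hypothetical caller: returns 1 if a and b have the same number of
 * decimal digits, 0 if not, -1 if a conversion failed.  Each value
 * gets its own buffer, so neither result overwrites the other. */
int same_digit_count(unsigned int a, unsigned int b)
{
    char abuf[12], bbuf[12];
    const char *as = uint_buffer(abuf, sizeof abuf, a);
    const char *bs = uint_buffer(bbuf, sizeof bbuf, b);

    if (!as || !bs) {
        return -1;
    }
    /* Digits run from the returned pointer to the end of each buffer. */
    return (abuf + sizeof abuf - as) == (bbuf + sizeof bbuf - bs);
}
```

With a single static buffer, the second conversion would have overwritten the first before the comparison could be made.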

For embedded code, I'd lean towards using the array directly (first example).

It does mean that if you realize you could reuse the function with a different array, instead of just copy-pasting and tweaking the copy to "just work", you really need to do refactoring, to extract the common code to a single function, possibly with wrapper functions for the particular cases, similar to uint_buffer(). It's not a lot of extra work, but there is the temptation to just copy-paste-tweak it; and that leads to unmaintainable bloated blob of spaghetti code, which is no fun.
« Last Edit: April 13, 2020, 02:35:11 pm by Nominal Animal »
 

Offline IanB

  • Super Contributor
  • ***
  • Posts: 9693
  • Country: us
Re: text / strings in C
« Reply #88 on: April 13, 2020, 06:01:22 pm »
So should i just use int/uint instead? and let the compiler choose?

For the most part, yes, you should use the ordinary types. The design of the C language is that the compiler will try to choose a representation that is most efficient for the target hardware. The only time you should really use the special types like uint8_t is for example when you very specifically want the value to fit into a hardware register of known size.

For general programming, use char, int, long, unsigned int, etc.
I'm not an EE--what am I doing here?
 

Offline Simon

  • Global Moderator
  • *****
  • Posts: 14986
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics
Re: text / strings in C
« Reply #89 on: April 13, 2020, 07:31:19 pm »
so if I use uint the compiler will decide if i need 8, 16 or 32 bits? sounds a bit dodgy for things like writes to an spi peripheral.
 

Offline IanB

  • Super Contributor
  • ***
  • Posts: 9693
  • Country: us
Re: text / strings in C
« Reply #90 on: April 13, 2020, 07:59:13 pm »
so if I use uint the compiler will decide if i need 8, 16 or 32 bits? sounds a bit dodgy for things like writes to an spi peripheral.

No, an int is always at least 16 bits, but it may be more if the compiler thinks that would be more efficient.

If you need at least 8 bits use "char".
If you need at least 16 bits use "int".
If you need at least 32 bits use "long int".
If you need at least 64 bits use "long long int".

If you need it to be unsigned put "unsigned" in front of it. Therefore unsigned integers of at least 16 bits are "unsigned int".

As I mentioned above, when dealing with hardware you may need to construct bit patterns of exact size and order. You can do this in C if needed, but for general programming it is better to use the regular datatypes and let the compiler have freedom to optimize your code.
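
The minimum widths listed above can be checked against <limits.h> on any given compiler. The following is a small sketch using C11 _Static_assert; the helper uint_bits() is a made-up name that reports how many value bits unsigned int really has on the target.

```c
#include <limits.h>

/* These hold on every conforming C compiler. */
_Static_assert(CHAR_BIT >= 8, "char has at least 8 bits");
_Static_assert(INT_MAX >= 32767, "int has at least 16 bits");
_Static_assert(LONG_MAX >= 2147483647L, "long has at least 32 bits");
_Static_assert(LLONG_MAX >= 9223372036854775807LL,
               "long long has at least 64 bits");

/* Hypothetical helper: number of value bits in unsigned int here. */
unsigned int uint_bits(void)
{
    unsigned int bits = 0;

    for (unsigned int v = UINT_MAX; v != 0; v >>= 1) {
        bits++;
    }
    return bits;
}
```

On an 8-bit AVR this reports 16, and on a 32-bit ARM it reports 32; the static assertions compile on both.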
 

Offline Simon

  • Global Moderator
  • *****
  • Posts: 14986
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics
Re: text / strings in C
« Reply #91 on: April 13, 2020, 08:25:55 pm »
Well this is sort of what i asked above. If using 8 bit variables causes code overhead does using the generic data types allow the compiler to optimise, so if I only need 8 bits but there is plenty of memory to use 32 bits and it's faster it does that ?
 

Offline IanB

  • Super Contributor
  • ***
  • Posts: 9693
  • Country: us
Re: text / strings in C
« Reply #92 on: April 13, 2020, 09:31:28 pm »
Well this is sort of what i asked above. If using 8 bit variables causes code overhead does using the generic data types allow the compiler to optimise, so if I only need 8 bits but there is plenty of memory to use 32 bits and it's faster it does that ?

It's best not to overthink it. If you are sending text to an LCD you are using (probably) ASCII characters, so use "char".

If you are doing integer arithmetic or bit manipulation, use "int" or "unsigned int".

Contrary to what I think I saw earlier in the thread, it is perfectly OK to do this:

Quote
    const char* const message = "Some menu item";

This will allow the compiler to store the text string in some area of memory reserved for constant data and you promise the compiler you will never try to overwrite it. The two consts also make sure you never accidentally reassign the message pointer to point somewhere else and thus lose access to the message.

This construct is very good for menus, since the menu items do not change during the running of the program.

You could also do this:
Code: [Select]
    const char* menu_items[] = { "First item", "Second item", "Third Item" };
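
A slightly fuller sketch of the menu table, with both consts applied to the array as well, plus a bounds-checked accessor. MENU_COUNT and menu_item() are names invented for this example.

```c
#include <stddef.h>

/* Both the characters and the pointers are const, so the whole table
 * can be placed in read-only memory (flash on most microcontrollers). */
static const char *const menu_items[] = {
    "First item",
    "Second item",
    "Third item",
};

/* Number of entries, computed by the compiler from the initializer. */
#define MENU_COUNT (sizeof menu_items / sizeof menu_items[0])

/* Hypothetical accessor: returns NULL for an out-of-range index. */
const char *menu_item(size_t index)
{
    return (index < MENU_COUNT) ? menu_items[index] : NULL;
}
```

Adding a fourth string to the initializer updates MENU_COUNT automatically, so the bounds check never goes stale.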
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 1724
  • Country: fi
    • My home page and email address
Re: text / strings in C
« Reply #93 on: April 14, 2020, 12:22:03 am »
Well this is sort of what i asked above. If using 8 bit variables causes code overhead does using the generic data types allow the compiler to optimise, so if I only need 8 bits but there is plenty of memory to use 32 bits and it's faster it does that ?
If we consider code that will only run on 32-bit microcontrollers, and ignore the portability issues to other types of architectures (64-bit, 8-bit), then:
  • For function parameters and temporary variables like loop indexes, use int or unsigned int.  The latter can never be negative, so if you have something that cannot be negative, use unsigned int .
  • For state variables, buffers, arrays, structures, and so on, use size-appropriate specific-size integer types.

If you need at least 8 bits use [...]
Noooooo!

The proper types are provided by the compiler when you include <stdint.h> (this header file is provided by the compiler; GCC in Simon's case).

Here is the more complete, portable but complex, criterion for choosing the best integer type for each variable or structure member:
  • Choose N between 8, 16, 32, and 64, depending on the range of values you need.
    Unsigned integer types can hold values from 0 to 2^N - 1, and signed integer types from -2^(N-1) to 2^(N-1) - 1, inclusive.
  • For in-memory sizes (array lengths, array indexes, counts), use size_t . It is an unsigned type.
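    For historical reasons, some standard C functions use int instead: strchr() takes the character to search for as an int, fgets() takes its buffer size as an int, and printf() returns its character count as an int.  strlen() itself returns size_t.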
  • If you convert pointers to integers or vice versa, use uintptr_t or intptr_t, depending on whether you want an unsigned or signed integer type.
  • For function parameters and temporary variables, use size_t for sizes and indexes, int_fastN_t for at least N-bit signed integers, and uint_fastN_t for unsigned integers.
  • For character arrays, use either char or unsigned char.  If you use any of the character classification functions (isspace() et cetera), you must cast the character code to unsigned char when passing to the function, or failures will occur with non-ASCII character codes.
  • For other arrays, state variables, and memory structures, use explicit-width signed (intN_t) or unsigned (uintN_t) integer types.
  • For best alignment within structures, put doubles and 64-bit integer types first, then pointers and size_ts, then floats and 32-bit integer types, then 16-bit integer types, and 8-bit integer types and char types last.  The compiler is not allowed to rearrange the order of the members in memory, only to add unused padding (bytes) if necessary.
  • For bit arrays, use a preprocessor macro to define the bit array word type, and another macro to hold the number of bits in each word.
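    The best type depends on the word size, which can often be determined at compile time using the __WORDSIZE preprocessor macro.  Note that __WORDSIZE comes from the C library headers (glibc, for example) rather than from the compiler itself, so check that your toolchain defines it; an undefined macro evaluates to 0 in #if.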
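
Two of the points in the list above, sketched in code: the unsigned char cast for the character classification functions, and member ordering inside structures. All names here (count_spaces, sample_unordered, sample_ordered) are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count whitespace characters in a string.
 * The cast to unsigned char keeps isspace() well defined even when
 * char is signed and the string contains bytes above 127. */
size_t count_spaces(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (isspace((unsigned char)*s)) {
            n++;
        }
    }
    return n;
}

/* Members in arbitrary order: the compiler typically has to insert
 * padding before count and before level. */
struct sample_unordered {
    uint8_t  flag;
    uint32_t count;
    uint8_t  mode;
    uint16_t level;
};

/* Same members, largest first: typically no padding is needed. */
struct sample_ordered {
    uint32_t count;
    uint16_t level;
    uint8_t  flag;
    uint8_t  mode;
};
```

On common 32-bit and 64-bit ABIs, sizeof(struct sample_unordered) is typically 12 while sizeof(struct sample_ordered) is 8, which adds up quickly in an array of such structures.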

The bit array word size autodetection is actually very simple:
Code: [Select]
#if !defined(BITGROUP) && !defined(BITGROUP_BITS)
#if __WORDSIZE < 16
#define  BITGROUP  uint8_t
#define  BITGROUP_BITS  8
#elif __WORDSIZE < 32
#define  BITGROUP  uint16_t
#define  BITGROUP_BITS  16
#elif __WORDSIZE < 64
#define  BITGROUP  uint32_t
#define  BITGROUP_BITS  32
#else
#define  BITGROUP  uint64_t
#define  BITGROUP_BITS  64
#endif
#endif
#define  BITGROUPSFOR(bits)  (((bits) + BITGROUP_BITS - 1) / BITGROUP_BITS)

static inline BITGROUP bitgroup_getbit(const BITGROUP *const map, const size_t bit)
{
    const BITGROUP  mask = ((BITGROUP)1) << (bit % BITGROUP_BITS);
    return !!(map[bit / BITGROUP_BITS] & mask);
}

static inline void bitgroup_clearbit(BITGROUP *const map, const size_t bit)
{
    const BITGROUP  mask = ((BITGROUP)1) << (bit % BITGROUP_BITS);
    map[bit / BITGROUP_BITS] &= ~mask;
}

static inline void bitgroup_setbit(BITGROUP *const map, const size_t bit)
{
    const BITGROUP  mask = ((BITGROUP)1) << (bit % BITGROUP_BITS);
    map[bit / BITGROUP_BITS] |= mask;
}

static inline void bitgroup_togglebit(BITGROUP *const map, const size_t bit)
{
    const BITGROUP  mask = ((BITGROUP)1) << (bit % BITGROUP_BITS);
    map[bit / BITGROUP_BITS] ^= mask;
}

static inline void bitgroup_setbit_to(BITGROUP *const map, const size_t bit, BITGROUP state)
{
    if (state) {
        map[bit / BITGROUP_BITS] |= ((BITGROUP)1) << (bit % BITGROUP_BITS);
    } else {
        map[bit / BITGROUP_BITS] &= ~(((BITGROUP)1) << (bit % BITGROUP_BITS));
    }
}
so that to declare an array of words enough for e.g. 293 bits, you use
Code: [Select]
BITGROUP mybits[BITGROUPSFOR(293)];
The BITGROUPSFOR(bits) macro calculates the number of elements needed for bits bits, rounding up.  To get, set, clear, or toggle specific bits, you can use the bitgroup_getbit()/_setbit()/_setbit_to()/_clearbit()/_togglebit() inline helper functions.  I normally add also bit range functions, for clearing ranges of bits.  To clear the entire map, I use memset(mybits, 0, sizeof mybits);.

You can also use the BITGROUP type for unsigned integers of native register size.  You might wish to name it better, though.

The default type and size can be overridden by defining the preprocessor macros BITGROUP and BITGROUP_BITS before the above code; for example, at the GCC command line, using -DBITGROUP=type -DBITGROUP_BITS=size .
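
A compact usage sketch of the same idea, with the word type fixed to uint32_t by hand as if it had been overridden. The shortened helper names (bit_get, bit_set, bit_toggle, bit_count) are made up for this example so that it stands on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Word type fixed by hand here, as if overridden on the command line. */
#define BITGROUP            uint32_t
#define BITGROUP_BITS       32
#define BITGROUPSFOR(bits)  (((bits) + BITGROUP_BITS - 1) / BITGROUP_BITS)

static inline int bit_get(const BITGROUP *map, size_t bit)
{
    return (int)((map[bit / BITGROUP_BITS] >> (bit % BITGROUP_BITS)) & 1u);
}

static inline void bit_set(BITGROUP *map, size_t bit)
{
    map[bit / BITGROUP_BITS] |= (BITGROUP)1 << (bit % BITGROUP_BITS);
}

static inline void bit_toggle(BITGROUP *map, size_t bit)
{
    /* XOR with the single-bit mask flips only that bit. */
    map[bit / BITGROUP_BITS] ^= (BITGROUP)1 << (bit % BITGROUP_BITS);
}

/* Hypothetical helper: count the set bits among the first nbits. */
size_t bit_count(const BITGROUP *map, size_t nbits)
{
    size_t n = 0;

    for (size_t i = 0; i < nbits; i++) {
        n += (size_t)bit_get(map, i);
    }
    return n;
}
```

A 293-bit map declared as BITGROUP bits[BITGROUPSFOR(293)] = {0}; takes ten 32-bit words, and the helpers then address individual bits by index.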
 

Offline IanB

  • Super Contributor
  • ***
  • Posts: 9693
  • Country: us
Re: text / strings in C
« Reply #94 on: April 14, 2020, 12:46:12 am »
If you need at least 8 bits use [...]
Noooooo!

The proper types are provided by the compiler when you include <stdint.h> (this header file is provided by the compiler; GCC in Simon's case).

Well I think that depends on your definition of the word "proper"  :)

If I have one of the special cases you list, I should use one of the special types provided for those cases (e.g. size_t for sizes and indices).

If I am just doing simple integer arithmetic with small numbers I will use "int". As long as I know that I am only guaranteed support for values from -32768 to 32767 this will be fine and will be perfectly portable. Code should never contain more decoration and special cases than are truly needed, otherwise readers will ask, "Why is this here?"
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 1724
  • Country: fi
    • My home page and email address
Re: text / strings in C
« Reply #95 on: April 14, 2020, 04:46:34 am »
Well I think that depends on your definition of the word "proper"  :)
For sure.  There wasn't a good emoticon to express it.  I vacillated between "noooo" and "yes, but, butbut"  :).

Basically, I too recommended int + size_t to Simon in a message or two earlier, for the very same reasons.

Yet, the simplistic char/short/int/long int (+ long long int for compilers that support it) selection rule has failed me before (and caused me pain in existing old projects).  It works fine on any single architecture, except for the "if you choose a too small type for the native register size, you get extra machine code" detail, that might or might not matter.

However, when porting code to different architectures, with different type sizes (in particular, from an ILP32 to LP64, i.e. from 32-bit ints, longs and pointers to 32-bit ints but 64-bit longs and pointers), the types don't work that well.  That's why the "new" integer types were added to C99, after all.

So, consider the "nooooo" as a quiet sob, not a yell.

The "new" integer types were designed to solve the exact questions Simon has posed here.  When moving from 16-bit to 32-bit architectures, and again when moving from 32-bit to 64-bit architectures, code that used the simple size-based rules became inefficient.  (The integer/pointer size disparities between e.g. ILP32 and LP64 did lead to bugs, and the intptr_t/uintptr_t types are designed to fix those now and in the future, but I'm talking about inefficiencies as in compiler generating unneeded extra code to follow the standard C integer type rules because it does not know the programmer intent.  These "new" integer types "size behaviour" better expresses programmer intent, even if the hardware implementation is unchanged.)

In particular, with these "new" types, there is no reason why int_fast16_t, int16_t, or int_least16_t should all/any be the same type: the first one is the one that is suitable for temporary variables and function parameters (i.e., is of native integer arithmetic size), second is exactly 16 bits, and the third is at least 16 bits, but the architecture can use a larger type if keeping it to just 16 bits would mean extra code (say, accessing the upper 16 bits of a 32-bit word would require bit shifting).
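
The three families can be compared directly on any compiler. This is a minimal sketch assuming a C11 compiler for _Static_assert; the helper names fast16_bits() and least16_bits() are invented for illustration.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* int16_t, where it exists, is exactly 16 bits wide. */
_Static_assert(sizeof(int16_t) * CHAR_BIT == 16, "int16_t is 16 bits");

/* The least and fast variants only promise at least 16 bits. */
_Static_assert(INT_LEAST16_MAX >= INT16_MAX, "least16 holds 16 bits");
_Static_assert(INT_FAST16_MAX >= INT16_MAX, "fast16 holds 16 bits");

/* Hypothetical helpers: report the widths chosen on this target. */
size_t fast16_bits(void)
{
    return sizeof(int_fast16_t) * CHAR_BIT;
}

size_t least16_bits(void)
{
    return sizeof(int_least16_t) * CHAR_BIT;
}
```

Printing fast16_bits() on different targets shows the point: the fast variant is commonly wider than 16 bits on 32-bit and 64-bit machines, while int16_t never changes.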

While there is nothing special in their hardware implementation, the interesting part in them is the rules on how their sizes are defined on different architectures, and how useful they can be -- for portable code.  But few C programmers write truly portable code, so not many C programmers know this.  (Which is kinda why I'm harping about it here, even though this is just a thread where Simon is looking for quick hints on how to progress.)
« Last Edit: April 14, 2020, 04:49:34 am by Nominal Animal »
 

Offline Simon

  • Global Moderator
  • *****
  • Posts: 14986
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics
Re: text / strings in C
« Reply #96 on: April 14, 2020, 08:33:39 am »
so who created the uint8_t stuff?
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 5344
  • Country: fr
Re: text / strings in C
« Reply #97 on: April 14, 2020, 03:45:47 pm »
However, when porting code to different architectures, with different type sizes (in particular, from an ILP32 to LP64, i.e. from 32-bit ints, longs and pointers to 32-bit ints but 64-bit longs and pointers), the types don't work that well.  That's why the "new" integer types were added to C99, after all.

This was a longgg overdue addition. For a standardized language that was supposed to be good for low-level stuff, not having standard exact-width (and guaranteed minimal width, with the least* variants) types was mind-boggling.

Even much higher-level languages such as Ada had this.

 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 1724
  • Country: fi
    • My home page and email address
Re: text / strings in C
« Reply #98 on: April 15, 2020, 02:43:56 am »
so who created the uint8_t stuff?
I'm not sure who came up with them first, but by the mid-nineties, everyone agreed they were a good, necessary thing, and the C standard included them in C99, also known as the ISO/IEC 9899:1999 standard.

Although you need to #include <stdint.h> to use them, they aren't really a part of the standard C library, and are provided by the compiler itself.  So, even when you are writing freestanding code, say an operating system kernel, or bare metal code for a microcontroller, this is still available.  It is very much part of the core language itself, ever since C99.  (Although, *all* C compilers, even ANSI C/C89 ones, can provide them; and even if one does not, anyone can trivially add it themselves, so basically all C compilers, even oddball specialized ones, can be expected to provide these.)

And SiliconWizard is exactly right, C should have had these from the get go.

IanB is also right in that most C programmers are still unaware of them (well, everyone knows about intN_t and uintN_t and size_t, but very few regularly use int_fastN_t and uint_fastN_t -- not even myself), but I think that is a human problem -- the tutorials don't describe or recommend their use, and the huge amount of existing C code doesn't use them much either; some human C programmers never encounter them at all.

In real life, it is much more common to use custom type names in library code.  For example, glib uses g_uint, Windows APIs use DWORD et cetera.

If I were to write my own OS kernel or a Hardware Abstraction Library for programming a microcontroller or embedded system on bare metal in C or the appropriate subset of C++, I would probably use unsigned char for all strings, size_t for sizes, inative_t and unative_t for native register size signed and unsigned integer types (8, 16, 32, or 64 bit two's complement binary), inativeN_t and unativeN_t instead of int_fastN_t and uint_fastN_t, and intN_t and uintN_t for fixed-size signed and unsigned integer types.

If the Arduino environment had done that, it would be much easier to write code that works near-optimally on different hardware architectures (like 8-bit AVRs and 32-bit ARMs).
 

Offline Simon

  • Global Moderator
  • *****
  • Posts: 14986
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics
Re: text / strings in C
« Reply #99 on: April 15, 2020, 09:16:05 am »
Well, I would guess that using a specific size variable will make code more portable, not that that is an issue with a bare metal program, as it is not meant to be portable.
 

