Author Topic: sizeof Function in C Is not Working as Expected With AVR Microcontroller  (Read 940 times)

0 Members and 1 Guest are viewing this topic.

Offline Kalcifer

  • Regular Contributor
  • *
  • Posts: 65
  • Country: ca
Observe the following example code (Microcontroller used: ATtiny84A):

Code: [Select]
void toggle_output(unsigned char *string)
{
    for (unsigned char i = 0; i < sizeof(string) - 3; i++)
    {         
        PORTB ^= (1 << PORTB0);     
    }
}   

toggle_output("test");

I would expect the pin, which is attached to port B0, to toggle 4 times - once for each character in `"test"`; however, what is happening is that I am ending up with an endless loop. The pin is endlessly toggling when it should only toggle 4 times.

I know that the pin is functioning normally, as I am able to set the output statically high or statically low. And I know that If I give the for loop a fixed number of times to loop, it will do so the expected amount of times. For example:

Code: [Select]
void toggle_output(void)
{
     for (unsigned char i = 0; i < 5; i++)
     {
         PORTB ^= (1 << PORTB0);
     }
 }

toggle_output();

The above code will cause the pin to toggle 5 times as expected. So for some reason the `sizeof` operator is yielding some enormous number, or the microcontroller is, for some reason, adding to it. What is even stranger, is if I write an equivalent C program on my computer that just prints a number for each character in a string, it does so as expected:

Code: [Select]
void number_for_character (unsigned char *string)
{
     for (i = 0; i < sizeof(string) - 3; i++)
     {
         printf("1");
     }
}

number_for_character("test");


Code: [Select]
OUTPUT: 1111
What is going on here? Why isn't the `sizeof` operator working with the ATtiny84A?
« Last Edit: May 05, 2021, 02:50:07 am by Kalcifer »
 

Offline ledtester

  • Super Contributor
  • ***
  • Posts: 1591
  • Country: us
Maybe you want to use strlen(string) instead.

sizeof() applied to a char * will return the width in bytes of a pointer - not the length of the string.

Related SO question:

https://stackoverflow.com/questions/12353973/sizeof-char-array-in-c-c

Quote
what is happening is that I am ending up with an endless loop.

I bet sizeof(string) is equal to 2, so the condition i < sizeof(string)-3 will never be true.

Quote
What is even stranger, is if I write an equivalent C program on my computer ...

On your PC try calling varying the argument to number_for_character(...) -- e.g.:

https://onlinegdb.com/BJaQstJOu
« Last Edit: May 05, 2021, 03:05:20 am by ledtester »
 

Offline ataradov

  • Super Contributor
  • ***
  • Posts: 7826
  • Country: us
    • Personal site
This. And also, sizeof is not a function. It is an operator. Its value is computed at compile time and will be fixed. It always evaluated to a constant value and does nothing in run-time.

Your PC output makes no sense either. On a 64-bit PC, the size of the pointer is 8 bytes. So it should print 5 "1"s, not 4.
Alex
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 2225
  • Country: nz
  • Formerly SiFive, Samsung R&D
Your variable "string" is a pointer. On AVR sizeof(a pointer) is 2. Always. At compile time. When you subtract 3 from it (why???) you get -1, which as you are comparing to an unsigned number is instead 65535.

On a 32 bit PC sizeof(string) will be 4. On a 64 bit PC sizeof(string) will be 8. Always.

I still don't know why you're subtracting 3 from it.

Try using strlen().
 

Offline technix

  • Super Contributor
  • ***
  • Posts: 3503
  • Country: cn
  • From Shanghai With Love
    • My Untitled Blog
sizeof() is not a function, it is an operator. If you want the length of the string, you need strlen(3). (I usually nominate C standard functions using their UNIX manual page name, so strlen(3) means "man 3 strlen".)
 

Offline rglory

  • Contributor
  • Posts: 15
  • Country: us
You can use this instead:

Code: [Select]
void toggle_output( const char *string)
{
    while( *string++ )
        PORTB ^= (1 << PORTB0);     
}
then you do not need `strlen()` and it is not a good idea to use `strlen()` in condition of `for` loop
 

Offline technix

  • Super Contributor
  • ***
  • Posts: 3503
  • Country: cn
  • From Shanghai With Love
    • My Untitled Blog
You can use this instead:

Code: [Select]
void toggle_output( const char *string)
{
    while( *string++ )
        PORTB ^= (1 << PORTB0);     
}
then you do not need `strlen()` and it is not a good idea to use `strlen()` in condition of `for` loop
This is internally how a basic strlen is implemented:
Code: [Select]
size_t strlen(const char *s)
{
    size_t count = 0;
    while (*s++) count++;
    return count;
}

While it is not a good idea to call strlen(3) in loop condition, you can call it before entering the loop and keep the contents in a local variable. However if the string can change while the loop is running (uncommon for AVR since it lacked DMA, but common on chips with DMA and multi-core chips when you don't want to or can not use interrupts) you will need one strlen call every loop as an easy and dirty way to poll if the string has changed.
« Last Edit: May 05, 2021, 05:38:18 am by technix »
 

Offline rglory

  • Contributor
  • Posts: 15
  • Country: us
While it is not a good idea to call strlen(3) in loop condition, you can call it before entering the loop and keep the contents in a local variable.
You certainly can, but that makes code longer, more error prone and possibly less efficient.

Quote
However if the string can change while the loop is running (uncommon for AVR since it lacked DMA, but common on chips with DMA and multi-core chips when you don't want to or can not use interrupts) you will need one strlen call every loop as an easy and dirty way to poll if the string has changed.
If string can change then OP should say so and that would be completely different question.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 2225
  • Country: nz
  • Formerly SiFive, Samsung R&D
While it is not a good idea to call strlen(3) in loop condition, you can call it before entering the loop and keep the contents in a local variable.
You certainly can, but that makes code longer, more error prone and possibly less efficient.

In most cases one extra instruction (ok, 2 on an 8 bit CPU), and 100% certainly more efficient unless the string is zero length.

More error prone? Not for an experienced programmer (who seldom makes errors) and not for a beginner (who makes errors all the time).

It would also give a chance to debug print the loop count you have just obtained and instead of asking why the loop doesn't loop properly, ask why the string length isn't what you thought it was BEFORE you start the loop.

This is an excellent advantage.
 

Offline rglory

  • Contributor
  • Posts: 15
  • Country: us
More error prone? Not for an experienced programmer (who seldom makes errors)
Experienced programmer seldom makes errors because establishes good habits, and writing short readable code is one of them.

Quote
and not for a beginner (who makes errors all the time).
This is simply wrong statement.
First of all as OP is quite a beginner and nobody mentioned that he should not use `strlen()` in for loop condition, he would write code like this:

Code: [Select]
for( unsigned char i = 0; i < strlen( string ); ++i ) ...
which is not only badly ineffective but has potential problem. Ok now he heard your advice - use `strlen()` before:
Code: [Select]
const unsigned char len = strlen( string );
for( unsigned char i = 0; i <  len; ++i ) ...
Better, but still has hidden problem.

So longer code has potential for more mistakes.
 

Offline technix

  • Super Contributor
  • ***
  • Posts: 3503
  • Country: cn
  • From Shanghai With Love
    • My Untitled Blog
Code: [Select]
for( unsigned char i = 0; i < strlen( string ); ++i ) ...
which is not only badly ineffective but has potential problem. Ok now he heard your advice - use `strlen()` before:
Code: [Select]
const unsigned char len = strlen( string );
for( unsigned char i = 0; i <  len; ++i ) ...
Better, but still has hidden problem.

So longer code has potential for more mistakes.
Well if you are iterating through a string maybe your code would make sense, actually I'd prefer this though:
Code: [Select]
for (const char *ch = string; *ch; ch++) do_something_with_the_character(*ch); // This code uses a C99 language feature.

However if you are doing other things with the string, it is always faster to use the system-provided strlen(3) function, as it almost always uses processor-specific optimizations and is written in assembly - at least a C function that is just a large block of inline assembly.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 2225
  • Country: nz
  • Formerly SiFive, Samsung R&D
Quote
and not for a beginner (who makes errors all the time).
This is simply wrong statement.
First of all as OP is quite a beginner and nobody mentioned that he should not use `strlen()` in for loop condition, he would write code like this:

Because it's not true. It is perfectly correct to use strlen() on each loop iteration. The code will work fine. It's not the most efficient code possible but that's fine -- we let people program in Python or shell script if they want to and that is often adequate to their purpose.

Once you find your code is too slow, then look for optimisations.

But in this case I personally as an experienced programmer will almost always assign any complex expression I'm tempted to use in a for() to a local int or long or size_t variable just before the loop. It makes the for() itself clearer and, as already mentioned, gives you somewhere to debugprint the value.

Quote
Code: [Select]
for( unsigned char i = 0; i < strlen( string ); ++i ) ...
which is not only badly ineffective but has potential problem. Ok now he heard your advice - use `strlen()` before:
Code: [Select]
const unsigned char len = strlen( string );
for( unsigned char i = 0; i <  len; ++i ) ...
Better, but still has hidden problem.

So longer code has potential for more mistakes.

Well, if you're completely incompetent then yes.

Code: [Select]
STRLEN(3)                BSD Library Functions Manual                STRLEN(3)

NAME
     strlen, strnlen -- find length of string

LIBRARY
     Standard C Library (libc, -lc)

SYNOPSIS
     #include <string.h>

     size_t
     strlen(const char *s);

It says size_t. Use a size_t, not a char. Always RTFM unless you've got TFM memorized.
 
The following users thanked this post: newbrain

Offline DavidAlfa

  • Super Contributor
  • ***
  • Posts: 1007
  • Country: es
Sizeof works for typedefs, constants, known arrays.
Not any string you throw into. Definitely strlen.
Hantek DSO2x1x            Drive        FAQ
Stm32 Soldering FW      Forum      Github      Donate      I need calibration reports!
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 539
  • Country: ru
Does TFM also mention memory class (rom/ram/eeprom) of strlen parameter? OP is on AVR platform ;)
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 2225
  • Country: nz
  • Formerly SiFive, Samsung R&D
Does TFM also mention memory class (rom/ram/eeprom) of strlen parameter? OP is on AVR platform ;)

Unless I've forgotten everything, if he hasn't explicitly marked it, strings will get copied into RAM on program startup, and the function will receive a normal char *. And if he has marked it to stay in flash then it can't be passed to a char *.
 

Offline rglory

  • Contributor
  • Posts: 15
  • Country: us
Because it's not true. It is perfectly correct to use strlen() on each loop iteration.
Hmm where did I say otherwise?

Quote
The code will work fine. It's not the most efficient code possible but that's fine -- we let people program in Python or shell script if they want to and that is often adequate to their purpose.
Looks like you are arguing just for arguing. My point is - it is not necessary to write longer code, which creates more opportunities for mistakes and less optimal at the same time when alternative is readable, simple and efficient.

Quote
But in this case I personally as an experienced programmer will almost always assign any complex expression I'm tempted to use in a for() to a local int or long or size_t variable just before the loop. It makes the for() itself clearer and, as already mentioned, gives you somewhere to debugprint the value.
Then we have different experience. I do not work for embedded, this is a hobby for me, but even on programming for PC I try not to waste resources just for fun. Yes when efficient code is less readable, but will give me couple percent I would prefer readability, but in embedded and alternative is shorter and even more readable...

Quote
Well, if you're completely incompetent then yes.
Yea yea we leave in the world where people always read and remember documentation, use proper data types and purple unicorns walking on streets. Totally agree.
 

Offline newbrain

  • Super Contributor
  • ***
  • Posts: 1141
  • Country: se
Quote
But in this case I personally as an experienced programmer will almost always assign any complex expression I'm tempted to use in a for() to a local int or long or size_t variable just before the loop. It makes the for() itself clearer and, as already mentioned, gives you somewhere to debugprint the value.
Then we have different experience. I do not work for embedded, this is a hobby for me, but even on programming for PC I try not to waste resources just for fun. Yes when efficient code is less readable, but will give me couple percent I would prefer readability, but in embedded and alternative is shorter and even more readable...
"I try not to waste resources":
This obviously means you would do like brucehoult was suggesting?

No resources are wasted by having a variable holding the loop limit, not even stack (at anything but -O0).
Nandemo wa shiranai wa yo, shitteru koto dake.
 

Offline Doctorandus_P

  • Super Contributor
  • ***
  • Posts: 1600
  • Country: nl
Writing, learning and debugging for generic algorithms is much easier done on a PC.
a C compiler for a PC does not know what a "PORTB" is, but you can easily declare such register variables as normal variables, and can even put such definitions in a separate file, so your code can be compiled for either a PC or a microcontroller.

It is quite common and normal to design and debug algorithms in C on a PC, and only when the function is debugged run it on a microcontroller.

It is one of the strong points of C and C++. It is highly standardized, so different compilers will treat your code very similarly.

There are differences, but they're well documented. On an AVR an int is 16 bits, while on a PC it will be 32 (maybe even 64, I don't use regular ints in C for a long time)
 

Offline rglory

  • Contributor
  • Posts: 15
  • Country: us
No resources are wasted by having a variable holding the loop limit, not even stack (at anything but -O0).
Yea right, the fact that you are doing the same loop twice does not count...  :o
CPU time is resource as well, is it not?
 

Offline newbrain

  • Super Contributor
  • ***
  • Posts: 1141
  • Country: se
No resources are wasted by having a variable holding the loop limit, not even stack (at anything but -O0).
Yea right, the fact that you are doing the same loop twice does not count...  :o
CPU time is resource as well, is it not?
:-//
I have some problem in understanding the statement above.

I have two functions, identical but for the point at hand: in one the limit is set in a separate variable, in the other it's inlined in the for statement.
The generated assembler (-Og and up) shows that:
  • The version with the variable is more space efficient, as the variable is held in a register and never spilled to memory, and the code is shorter.
  • The version with the variable is more time efficient, as the calculation is performed only once.
  • All of the above is moot from -O1 upwards, the code for the two functions becomes identical.
My point is: clarity trumps "premature optimizations" in (100.0 - DBL_EPSILON) percent of the cases, even if the code is slightly longer.
Something you seemed to advocate against, in the paragraph I quoted, but I might have of course misunderstood.
Nandemo wa shiranai wa yo, shitteru koto dake.
 
The following users thanked this post: agehall

Offline rglory

  • Contributor
  • Posts: 15
  • Country: us
I have some problem in understanding the statement above.
I mean:
Code: [Select]
void toggle_output(const char *string)
{
   const size_t len = strlen(string);
    for (size_t i = 0; i < len; ++i )
    {         
        PORTB ^= (1 << PORTB0);     
    }
}   
vs:
Code: [Select]
void toggle_output(const char *string)
{
    while( *string++)
        PORTB ^= (1 << PORTB0);     
}   
First function executes the same loop twice. I did not compare that 2 variants when strlen() is in for loop and when it is outside. Variant with strlen() inside for loop condition is a disaster and that's why I suggested against it and that started this conversation - I had educated guess that OP would simply replace `sizeof()` with `strlen()` in his code.
« Last Edit: May 05, 2021, 02:57:09 pm by rglory »
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 6857
  • Country: fr
Observe the following example code (Microcontroller used: ATtiny84A):

Code: [Select]
void toggle_output(unsigned char *string)
{
    for (unsigned char i = 0; i < sizeof(string) - 3; i++)
...

Ouch.

What is going on here? Why isn't the `sizeof` operator working with the ATtiny84A?

It is.
 


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf