Author Topic: Annoying character array parsing with AVR-GCC  (Read 2815 times)

Offline slateraptor

  • Frequent Contributor
  • **
  • Posts: 833
  • Country: us
Annoying character array parsing with AVR-GCC
« on: March 26, 2012, 04:49:38 am »
A quick example using well-defined escape sequences:

Code:
const char* ArrayExample1 = "\r\nC";

Character-wise, this yields: 0x0D 0x0A 0x43 0x00...as expected.

Switching it up a bit:

Code:
const char* ArrayExample2 = "\x0D\x0AC";

OK, so I've replaced the well-defined escape sequences with their 8-bit hex equivalents, and we'd expect the same contents as in the previous example, right? Wrong. Actual content: 0x0D 0xAC 0x00

A third example:

Code:
const char* ArrayExample3 = "\x0D\x0AG";

This yields 0x0D 0x0A 0x47 0x00, as expected. Evidently, if the most significant nibble of an 8-bit hex sequence is 0 and an A-F character immediately follows the sequence, the character array won't be parsed correctly in AVR-GCC; the leading zero appears to be dropped by the compiler during parsing. Meh. :-\
« Last Edit: March 26, 2012, 04:53:51 am by slateraptor »
 

Offline IanB

  • Super Contributor
  • ***
  • Posts: 9618
  • Country: us
Re: Annoying character array parsing with AVR-GCC
« Reply #1 on: March 26, 2012, 05:13:17 am »
This is actually behaving correctly according to the C language standard and is not a bug. The compiler does not stop at two characters after the \x; it keeps consuming characters for as long as they are valid hexadecimal digits and stops only at the first character that is not one. In your example, \x0AC is a valid hexadecimal digit string, so the parser captures all of it as a single escape with the value 0xAC.

The correct way to do what you want is to force a break in parsing where you need it, like this:

Code:
const char* ArrayExample2A = "\x0D\x0A" "C";
Note that in your third example, the letter G is not a valid hex digit so it breaks the parsing of the preceding \x0A at that point.
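
For comparison, here is a minimal sketch that puts the greedy form next to two ways of ending the escape early. The names greedy, split, octal, dump_bytes and show_escape_examples are made up for illustration and are not from the original post. It assumes a hosted toolchain with stdio; on an AVR target you would inspect the arrays in a debugger or send them over a UART instead of calling printf. The octal variant relies on the rule that an octal escape takes at most three digits, so it cannot swallow the following character.

```c
#include <stdio.h>
#include <stddef.h>

/* One hex escape "\x0AC" with value 0xAC: contents are 0x0D 0xAC 0x00. */
static const char greedy[] = "\x0D\x0AC";

/* Escapes are converted before adjacent literals are concatenated,
   so the second literal starts a fresh 'C': 0x0D 0x0A 0x43 0x00. */
static const char split[] = "\x0D\x0A" "C";

/* Octal escapes stop after three digits: \015 = 0x0D, \012 = 0x0A,
   followed by a literal 'C'. Same contents as split. */
static const char octal[] = "\015\012C";

/* Hypothetical helper: print n bytes of s as hex, including the terminator. */
static void dump_bytes(const char *label, const char *s, size_t n)
{
    printf("%-7s", label);
    for (size_t i = 0; i < n; i++) {
        /* Cast through unsigned char so values above 0x7F are not sign-extended. */
        printf(" 0x%02X", (unsigned)(unsigned char)s[i]);
    }
    printf("\n");
}

/* Hypothetical entry point: call this from main() to see all three layouts. */
void show_escape_examples(void)
{
    dump_bytes("greedy", greedy, sizeof greedy);
    dump_bytes("split",  split,  sizeof split);
    dump_bytes("octal",  octal,  sizeof octal);
}
```

Using sizeof on the arrays rather than strlen keeps the terminating 0x00 in the dump, which makes the three-byte versus four-byte difference easy to spot.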
I'm not an EE--what am I doing here?
 

Offline slateraptor

  • Frequent Contributor
  • **
  • Posts: 833
  • Country: us
Re: Annoying character array parsing with AVR-GCC
« Reply #2 on: March 26, 2012, 05:39:52 am »
Tricky tricky. K&R references syntax for the hex escape sequence as \xhh, which led me to my original conclusion...I should have known better when they used subtle examples like \xb and \x7.

Thanks for the clarification.
 

Offline IanB

  • Super Contributor
  • ***
  • Posts: 9618
  • Country: us
Re: Annoying character array parsing with AVR-GCC
« Reply #3 on: March 26, 2012, 05:48:58 am »
In section A2.5.2 "Character Constants" of my copy of K&R (2nd Ed), it says:

The escape \xhh consists of the backslash, followed by x, followed by hexadecimal digits... . There is no limit on the number of digits, but the behavior is undefined if the resulting character value exceeds that of the largest character.

Sometimes you have to watch the small print...
 

Offline slateraptor

  • Frequent Contributor
  • **
  • Posts: 833
  • Country: us
Re: Annoying character array parsing with AVR-GCC
« Reply #4 on: March 27, 2012, 05:18:42 am »
Fortunately, this nuisance caveat wasn't a show-stopper and was easy enough to isolate. Just thought I'd share my inner thoughts as others working on embedded projects here might have something to gain from the observation, or method of identifying the culprit.

P.S. The print in that section is rather normal. That the subtle detail was put away in an appendix rather than included in a main chapter is another story. :P
 

