Author Topic: Parsing UART packets / state machines / packed structs (Read 8285 times)

uer166 · « **on:** December 03, 2022, 10:35:21 pm »

I am in need to parse some UART packets in a format I have no control over. The format is something like:
|0x55|0xAA|LENGTH|ADDRESS|0x01|CMD|DATA0....DATAn|CHKSUM|.

The easiest way seems to be a switch/case state machine, to basically go byte by byte per invocation. It's to be called at a rate higher than the incoming bytes of course to not miss anything, or alternatively use the hardware 8-byte FIFO to call this every 8 bytes, reducing ISR rate. After a packet data is read and validated, it is then memcpy-d into a packed struct based on the command byte which defines the data format, therefore automatically getting the right bits in place. Something like:

Code: [Select]

/* Data structure declarations */
volatile struct BMS_batstat bms_batstat;

/* UART state machine state for reading/parsing packets */
static uint8_t length = 0;
static uint16_t chksum = 0;
static uint8_t cmdreq = 0;
static uint8_t databuf[255] = {0};
static uint8_t databuf_i = 0;

void BMS_UART_ParseData(void)
{
	if(cmdreq == CMDREQ_BATSTAT)
	{
		memcpy(&bms_batstat, databuf, sizeof(bms_batstat));
	}
	else
	{
		// unknown packet, TODO
	}
}

void BMS_UART_StateMachine_Rx(uint8_t rx)
{
	switch(bms_uart_state)
	{
	case UART_wait_0x55:
		if(rx == 0x55) { bms_uart_state = UART_wait_0xAA; } else { bms_uart_state = UART_wait_0x55; }
		break;

	case UART_wait_0xAA:
		if(rx == 0xAA) { bms_uart_state = UART_read_length; } else { bms_uart_state = UART_wait_0x55; }
		break;

	case UART_read_length:
		length = rx;
		databuf_i = 0;
		bms_uart_state = UART_wait_0x25;
		break;

	case UART_wait_0x25:
		if(rx == 0x25) { bms_uart_state = UART_wait_0x01; } else { bms_uart_state = UART_wait_0x55; }
		break;

	case UART_wait_0x01:
		if(rx == 0x01) {bms_uart_state = UART_read_cmdreq; } else { bms_uart_state = UART_wait_0x55; }
		break;

	case UART_read_cmdreq:
		cmdreq = rx;
		bms_uart_state = UART_read_data;
		break;

	case UART_read_data:
		databuf[databuf_i] = rx;
		databuf_i++;
		if(databuf_i == (length - 2)) {bms_uart_state = UART_read_checksumLSB;} /* Data length without including extras */
		break;

	case UART_read_checksumLSB:
		chksum = (uint16_t)rx;
		bms_uart_state = UART_read_checksumMSB;

		break;

	case UART_read_checksumMSB:
		chksum |= ((uint16_t)rx) << 8;

		uint16_t chksum_data = 0;
		chksum_data = length + 0x25 + 0x01 + cmdreq; /* Need to add extras to checksum */

		/* Data length without including extras */
		for(uint8_t i = 0; i < (length - 2); i++)
		{
			chksum_data += databuf[i];
		}

		if(chksum == (0xFFFF - chksum_data))
		{
			BMS_UART_ParseData();
		}

		bms_uart_state = UART_wait_0x55;
		break;

	default:
		asm("bkpt");
		break;
	}
}

I've seen some implementations of such a state machine with function pointers, separating each state into a function, although I didn't think it was particularly readable or safe. Are there more ways to do this in a more obvious or maybe safe manner? The only complication I see here is the need to debug these variables in the foreground in a shell-like environment, and the packed structs are not atomic, therefore I need to unpack/copy all these variable into atomic types such as float or uint32_t so that there is no data consistency issues.

krish2487 · « **Reply #1 on:** December 04, 2022, 08:02:25 am »

which mcu??
One approach I can think of is to use DMA to copy all the incoming data to a buffer and then invoke the parser once the buffer is full..
Then the parsing can happen at a much slower rate than byte invocation..
I dont know about other MCUs but STM32 have a UART IDLE line detection which is essentially an interrupt that fires if there is no data on the UART line for a clock cycle or two. This is nice because you dont need to know the length of the data you need to store.. just wait for the interrupt to fire, set a flag inside the ISR to trigger the parser and do the parsing in the main loop once you return from the ISR..

Siwastaja · « **Reply #2 on:** December 04, 2022, 09:22:56 am »

Some pseudocode (not tested), just to give idea how I have done this in recent times:

Code: [Select]

// .h
typedef struct PACK
{
   ...
} msg1_t;

typedef struct PACK
{
   ...
} msg_2_t;

typedef union
{
    struct PACK
    {
        uint16_t magic;
        uint8_t len;
        uint8_t msgid;
        union  // all payload types
        {
            msg1_t msg1;
            msg2_t msg2;
            uint8_t pay_u8[MAX_PAYLOAD_LEN];
        };
    };

    uint8_t whole_u8[MAX_PAYLOAD_LEN+4];
} msg_t;

extern msg_t* latest_msg;

// .c
static msg_t msg_a, msg_b;
msg_t* latest_msg;
static msg_t* accum;

init():
  accum = msg_a;
  latest_msg = msg_b;


accumulation in UART character ISR:
{
  accum.whole_u8[idx++] = c;
  update_crc(c);
  if(idx > offsetof(accum.len)+sizeof(accu .len) // len field already received
     && idx >= accum.len + HEADER_LENS_WHATEVER) // complete packet received
  {
     if(processing_running) error(57); // previous processing func did not finish in time
     if(!verify_crc())
        // ignore message and log error
     else
     {
       SWAP(accum, latest_msg);
       processing_running = 1;
       NVIC->STIR = PROCESSING_IRQn; // with lower priority than current interrupt handler
     }
  }
}

static volatile int processing_running;

processing_handler()
{
    // access latest_msg as you wish.
    // if msg_id==1, access latest_msg->msg1, and so on.
    // switch-case is fine, or a table which contains msgids and function pointers for the handlers; loop through the table here and call the correct handler.

   // last thing to do:
   processing_running = 0;
}

No memory copying here. Not saying it's usually a big problem, but why do it when it's not needed? You still have the same constraints: memory footprint of two buffers, and needing to get processing the data finished before the end of the next packet.

DavidAlfa · « **Reply #3 on:** December 04, 2022, 10:20:44 am »

For normal 115.200baud, interrupt-based routine shouldn't be a problem.
Add some sort of timeout handling to avoid stalling!

Code: [Select]

typedef struct{
    uint8_t  len;
    uint8_t  addr;
    uint8_t  cmd;
    uint8_t  buf[255];
    uint16_t chksum;
} uart_data_t;

typedef enum{
    uart_magic_55   = 0,
    uart_magic_AA   = 1,    
    uart_len        = 2,    
    uart_adr        = 3,    
    uart_flag       = 4,
    uart_cmd        = 5,    
    uart_data       = 6,    
    uart_chksum     = 7,    
    uart_done       = 8,
    
    uart_chksum_bytes  = 2,
}uart_states_t;

static volatile uart_states_t uart_state;
volatile uart_data_t uart_data;

// |0x55|0xAA|LENGTH|ADDRESS|0x01|CMD|DATA0....DATAn|CHKSUM|.

void reset_state(void){                         // Called by the user after parsing the packet, or by the timeout routine.
    uart_state = uart_magic_55;
}

void RX_ISR(void){
    static uint16_t count;
    uint8_t rx = RX_BF;
    
    reset_timeout();   // Reset the timeout as long as data keeps coming.

    switch(uart_state){
    
        case uart_magic_55:
            if(rx==0x55){
                enable_timeout();  // Enable timeout ISR or whatever
                uart_state++;
            }
            break;
            
        case uart_magic_AA:
            if(rx==0xAA){
                uart_state++;
            }
            else{
                uart_state = uart_magic_55;
            }
            break;

        case uart_len:
            uart_data.len = rx;
            uart_state++;
            break;
            
        case uart_adr:
            uart_data.adr = rx;
            uart_state++;
            break;
        
        case uart_flag:
            if(rx == 1){
                uart_state++;
            }
            else{            
                uart_state = uart_magic_55;
            }
            break;
        
        case uart_cmd:
            uart_data.cmd = rx;
            count = 0;
            uart_state++;
            break;

        case uart_data:
            uart_data.buffer[count++] = rx;
            if (uart_data.len == count){
                count = 0;
                uart_data.checksum = 0;
                uart_state++;
            }
            break;
            
        case uart_chksum:
            uart_data.checksum |= rx<<(8*count++);
            if(count == uart_chksum_bytes){
                disable_timeout();  // Disable timeout ISR, received full frame, wait for processing
                uart_state++;
            }
            break;
        
        case uart_done:                // Do nothing, wait for processing
        default:
            break;
}

Doctorandus_P · « **Reply #4 on:** December 04, 2022, 10:41:06 am »

I agree with the above.
You can make a union with a bunch of structures in it (Including a byte array to write the "raw" data from the uart to.

C compilers do keep the data in the same order, but they may add padding if your processor has a 16bit or larger word size.
To be able to use the same header file for both my 8-bit uC and my gazillion bit monster PC I had to add an:

#ifdef __linux__ #pragma pack(push) #pragma pack(1) // Alignment for my big 64 bit monster. #endif // Union definition goes here. #ifdef __linux__ #pragma pack(pop) // Restore alignment to normal #endif

to make GCC happy on both platforms. The PC needed the pack to prevent padding, while the compiler for the 8-bitter choked on the pack.

uer166 · « **Reply #5 on:** December 04, 2022, 07:48:33 pm »

Quote from: Siwastaja on December 04, 2022, 09:22:56 am

Some pseudocode (not tested), just to give idea how I have done this in recent times:

Code: [Select]
// .h typedef struct PACK { ... } msg1_t; typedef struct PACK { ... } msg_2_t; typedef union { struct PACK { uint16_t magic; uint8_t len; uint8_t msgid; union // all payload types { msg1_t msg1; msg2_t msg2; uint8_t pay_u8[MAX_PAYLOAD_LEN]; }; }; uint8_t whole_u8[MAX_PAYLOAD_LEN+4]; } msg_t; extern msg_t* latest_msg; // .c static msg_t msg_a, msg_b; msg_t* latest_msg; static msg_t* accum; init(): accum = msg_a; latest_msg = msg_b; accumulation in UART character ISR: { accum.whole_u8[idx++] = c; update_crc(c); if(idx > offsetof(accum.len)+sizeof(accu .len) // len field already received && idx >= accum.len + HEADER_LENS_WHATEVER) // complete packet received { if(processing_running) error(57); // previous processing func did not finish in time if(!verify_crc()) // ignore message and log error else { SWAP(accum, latest_msg); processing_running = 1; NVIC->STIR = PROCESSING_IRQn; // with lower priority than current interrupt handler } } } static volatile int processing_running; processing_handler() { // access latest_msg as you wish. // if msg_id==1, access latest_msg->msg1, and so on. // switch-case is fine, or a table which contains msgids and function pointers for the handlers; loop through the table here and call the correct handler. // last thing to do: processing_running = 0; }
No memory copying here. Not saying it's usually a big problem, but why do it when it's not needed? You still have the same constraints: memory footprint of two buffers, and needing to get processing the data finished before the end of the next packet.

Decent way to do it, I forgot about unions. The reason I copy data is because of data consistency: I have a single timer interrupt system that runs at 20kHz, and ideally everything is done in that ISR. If I were to copy directly to the struct the data would be inconsistent, so need to either double buffer or some kind of a valid flag. Honestly I may switch to those if it proves to be an issue, although a memcpy call is a one-liner..

SiliconWizard · « **Reply #6 on:** December 04, 2022, 07:55:24 pm »

Using unions is fine for this. Just beware of alignment issues.

As to using switch constructs or a table of function pointers, it's up to you. I don't have strong feelings either way. Both can work plenty fine if done right.
I've most often used switch/case, and, unless each case is very simple and short, call a different function inside each case to deal with this specific "state" or "command".
Might look clearer and less error-prone than declaring an array of function pointers and indexing it, but the latter can work too.

I recommend factoring stuff in helper functions in any case as I said above. Avoid gigantic switch constructs with each case handling a lot of code. It becomes quickly unreadable. Call functions inside each case, at least if whatever has to be done for a given case exceeds a couple statements.

nctnico · « **Reply #7 on:** December 04, 2022, 08:13:24 pm »

I'd avoid mapping packets onto structs. A lot can go wrong due to alignment issues. And your code won't be byte order agnostic.

IMHO the statemachine approach is a good one. It's intended function is very clear and thus easy to follow / alter if necessary. In the end you don't need that many states to handle a message.

Create a function for each command that takes the data (and length) as a parameter and go from there. If there are many commands with little or no gaps in between, then a table with function pointers can be efficient.

PCB.Wiz · « **Reply #8 on:** December 04, 2022, 10:13:09 pm »

Quote from: uer166 on December 03, 2022, 10:35:21 pm

I am in need to parse some UART packets in a format I have no control over. The format is something like:
|0x55|0xAA|LENGTH|ADDRESS|0x01|CMD|DATA0....DATAn|CHKSUM|.
...
I've seen some implementations of such a state machine with function pointers, separating each state into a function, although I didn't think it was particularly readable or safe. Are there more ways to do this in a more obvious or maybe safe manner?

Function pointers can be useful if you have a 'tangled state machine', with multiple paths, that a case statement can find hard to manage.
For most simple state machines, CASE is fine

You may also need to add error handling, as a system might start mid-packet.
Can the data payload ever contain the 0x55.0xAA trigger ? If not, you can use that pattern for hard sync.

Another point of consideration, is designs often use two data buffers, so one is fully stable, whilst the other one is being built from new data.
If the system is half duplex with a ACK on done, you might get away with a single buffer.

artag · « **Reply #9 on:** December 05, 2022, 12:40:26 am »

Be wary of using the content of he packet before verifying the checksum.

This most often occurs when you use the length value to find the end of packet and hence the checksum. There is some defence against this in the examples posted above where the packet can be timed out if length is read with errors and would otherwise result in waiting forever for a packet which is not as long as you are expecting.

I appreciate that you said you can't change the format and that you're stuck with this, but it's a very poor - albeit common - design. If you lose a character or misread the length and you don't have a sufficient gap between packets to recover with a timeout, all you can do is attempt to resynchronise using every possible 55 aa, and since it's not clear if that can occur in the data you can end up retrying forever (so don't just throw away a packet with a bad checksum .. reparse it from any other 55 aa that might be in the discarded data).

A much better practice is to reserve some characters for start-of-packet and end-of-packet (or just for end-of-packet) and use escaping (aka byte-stuffing) to keep those characters out of the data and formatting scheme.

uer166 · « **Reply #10 on:** December 05, 2022, 12:48:44 am »

Quote from: artag on December 05, 2022, 12:40:26 am

If you lose a character or misread the length and you don't have a sufficient gap between packets to recover with a timeout, all you can do is attempt to resynchronise using every possible 55 aa, and since it's not clear if that can occur in the data you can end up retrying forever (so don't just throw away a packet with a bad checksum .. reparse it from any other 55 aa that might be in the discarded data).

That if a fabulous point, here is my thought process:

The length byte may be corrupt, and may be either too short or too long up to 255 (this is why data buffer is that size)
In the worst case it may be say 255, which means that the next 255 bytes will make the state machine stuck in that loop reading the bytes
After the 255 bytes have been read, the CRC calc will of course fails, and the code then goes back to waiting for 0x55, 0xAA, which it will see at the start of the next valid packet and it'll re-sync, assuming that 0x55/0xAA wasn't part of data.
If it was part of data, I'm basically relying on it being unlikely to be a repeating sequence, although I admit that an edge case may be constructed here.
To fix last bullet point, I am relying on a timeout feature: in all states, I will be adding a "go back to waiting for sync bytes" if the lines has been idle for a long time

I'm personally not a fan of reserved characters and stuffing due to overhead being variable. One way I thought of solving it is via use of "special" characters like a LIN break, that aren't a valid byte at all. That should work well assuming hardware can support it. This us a M365 scooter BMS though and it's a reverse engineered protocol. Food for thought regardless though.

ledtester · « **Reply #11 on:** December 05, 2022, 01:02:45 am »

Quote from: DavidAlfa on December 04, 2022, 10:20:44 am

For normal 115.200baud, interrupt-based routine shouldn't be a problem.
Add some sort of timeout handling to avoid stalling!

...code...

One little critique about the code...

if you are in the expect_AA state and get a 0x55 you might want to go back to the expect_AA state.

artag · « **Reply #12 on:** December 05, 2022, 01:18:04 am »

Quote from: uer166 on December 05, 2022, 12:48:44 am

If it was part of data, I'm basically relying on it being unlikely to be a repeating sequence, although I admit that an edge case may be constructed here.
To fix last bullet point, I am relying on a timeout feature: in all states, I will be adding a "go back to waiting for sync bytes" if the lines has been idle for a long time

You're essentially then saying that there's a gap between packets which is greater than a defined timeout value. Which is fine, but a) your preexisting protocol might not have that, and b) it's overhead (unusable transmission time) just like reserved characters.

From a BMS a repeating value is very easily common. Something like a 16-bit temperature value ? And, amazingly, it always seems to be those 'very unlikely' values that happen once in a blue moon, cause some hideous failure, and can't be repeated to prove it.

You can improve that strategy slightly by doing everything you said, but in addition, don't toss the input data as you process it - keep it, even if it's only in the circular recieve buffer. Just mark where you read 55 aa. Then, after detecting a bad checksum, go back to the byte after the initial 55 aa and start looking for another pair. When you find one, repeat the verify.

Imagine if there is a regular 55 aa in the data due to some slow-changing physical value, and that you have have a corrupt length byte that results in you looking for the checksum a couple of bytes too late. You've now absorbed the NEXT 55 aa thinking it was the last couple of bytes of data. The next one you find really is data and perhaps the data following it - which can also be slowly changing - is misinterpreted as a length byte. It's not hard to see how you could grab a whole run of bad packets all synced up on the 5 aa in the data.

I've worked on an actual implementation of a logging system accepting data in this form from an ECU. There was an adaptable system to select one of several ECU manufacturers (these were on specialist vehicles, not common car ECUs, some designed before CAN was so common). It did generally recover by using the above rather painful resync procedure. I was appalled to discover how common such a poor implementation was, and I always use byte-stuffing on my own designs. It may be wasteful but it never does worse than halve the capacity (which is also a reasonable overhead for a gap between packets) and you can at least calculate and design for the worst-case.

uer166 · « **Reply #13 on:** December 05, 2022, 03:43:18 am »

Quote from: artag on December 05, 2022, 01:18:04 am

Quote from: uer166 on December 05, 2022, 12:48:44 am
If it was part of data, I'm basically relying on it being unlikely to be a repeating sequence, although I admit that an edge case may be constructed here.
To fix last bullet point, I am relying on a timeout feature: in all states, I will be adding a "go back to waiting for sync bytes" if the lines has been idle for a long time
You're essentially then saying that there's a gap between packets which is greater than a defined timeout value.

Yup, in fact I am in control for the request packets. I.e. I can expect silence until the time where I send out a formatted "give me cell voltages" or whatever. I can see your point about efficiency loss here being similar to overhead, maybe something like Consistent Overhead Byte Stuffing would help.

Siwastaja · « **Reply #14 on:** December 05, 2022, 07:20:36 am »

Quote from: uer166 on December 04, 2022, 07:48:33 pm

Decent way to do it, I forgot about unions. The reason I copy data is because of data consistency: I have a single timer interrupt system that runs at 20kHz, and ideally everything is done in that ISR. If I were to copy directly to the struct the data would be inconsistent, so need to either double buffer

My example was a double buffer solution. It is not much longer than memcpy version (a few extra lines swapping and initializing the pointers). The advantage is, you get rid of the fundamental problem "do you have time to call memcpy"? Let's say bytes arrive at frequency of 1/tb, and packets at 1/tp, so you have tb to process one byte and tp to process one packet. Now the memcpy solution has to copy the whole packet in the same time, tb, unless you can guarantee by design that there is an extra delay between packets.

The double buffer version has same memory footprint as the memcpy version - two buffers - but it's completely free of that issue. You have full tp to process the packet. All you need to do is to process the packet either in main thread (by setting a flag), or as I prefer, a lower priority software interrupt. During the processing, as new bytes arrive, the UART ISR pre-empts the processing and writes to the other buffer.

memcpy idea has no advantages IMHO. Timing-wise, it's even worse than the simplest "single buffer - validity flag - packet delay" solution, yet it requires the same (or actually longer) packet delay. If you can provide enough delay between packets and are sure you can process the packet during that, then just go for single buffer with validity flag.

Siwastaja · « **Reply #15 on:** December 05, 2022, 07:30:44 am »

Quote from: artag on December 05, 2022, 12:40:26 am

Be wary of using the content of he packet before verifying the checksum.

Yes. If messages are of variable lengths, there is a risk of misunderstanding the length field. Especially if the length field is wide and unbounded, you may read a lot larger number than any of the packets actually are. Then the state machine will ignore a lot of messages before it finally realizes the error (by CRC calculation not matching) and can start to resynchronize.

Of course, if you have decent signal integrity, such errors are rare. If you can accept the consequence of losing a few messages before self-recovering, then doing no more than constraining the length field to something sensible (without messing up the future-proofing, which is why the length field exists in the first place) is needed.

For quicker recovery during data corruption, one can add another CRC for just the header. As header is fixed size, getting CRC error there allows the code to start resynchronizing right there and only miss the corrupted packet and not more.

"CRC for header" also makes the synchronization start sequence effectively longer. If you just have say uint16 0xacdc as your magic number, there is a risk of missynchronization into data that accidentally contains this 16-bit sequence. But if you have 0xacdc + uint8 msgid + uint8 len + crc8, the data would need to have this 40-bit sequence in it for accidental synchronization.

If one is in control of both sender and receiver and works with them in sync, there is no need to overthink the protocol. If this is a BMS, maybe one packet type with all information fields (cell voltage, temperature, balancer status) is the best solution, so you get rid of both len and msgid. If you have different types of messages, len has no other use but to allow mixing different versions of devices so that the older version ignores new types of messages. If you do not need this functionality, you can just create a const table of lengths per msgid by using sizeof, and just keep the same header file over both parties, and make sure they use the same version.

Siwastaja · « **Reply #16 on:** December 05, 2022, 08:39:23 am »

Quote from: nctnico on December 04, 2022, 08:13:24 pm

I'd avoid mapping packets onto structs. A lot can go wrong due to alignment issues.

And, nctnico is wrong about this again.

I have never seen any issue with mapping packets onto structs. Actually, the whole internet (BSD, linux) works on C code written that way. Endianness is handled with simple swapping macros which are conditionally defined based on the architecture. Alignment is not an issue at all as I will explain below.

On the contrary, large parsers, when written by a human, almost always contain mistakes. It's not just one or two times someone got wrong the amount of shift, forgot parantheses, stepped into integer promotion rule traps or got some type cast wrong, typed < instead of <<, or, most usually, just forgot to update the damn parser when the interface changed. If the type changed but variable name did not, you don't even get a warning!

Then, alignment. When using normal struct, compiler pads the variables so they are self-aligning. This becomes a problem when different types of architectures with different padding rules communicate with each other. Solution: packed attribute.

With packed attribute, padding is removed and fields of the struct may not be self-aligned, except by luck. But contrary to what nctnico repeatedly claims, being corrected, but nevertheless continuing to supply the same wrong information, this is not a problem. nctnico's mistake happens when he thinks packed attribute only controls the padding when the struct is created and that's it. This is not true. packed attribute is part of the type qualification, so compiler knows the struct is packed everywhere it is used, and can always create correct access instructions. And trust me, compilers are nearly bug-free and will create correct code much more likely than nctnico writing low-level bit manipulation code by hand, trying to outsmart the compiler.

One rule results from this: casting away the packed attribute is dangerous. This happens if you take a pointer of a packed struct member and use that. But this is also why compilers warn about this. Seemingly gcc does this even with minimal warnings enabled (no -Wall):

Code: [Select]

typedef struct __attribute__((packed))
{
        char c1;
        int  i;
        char c2;
} s_t;

int main()
{
        extern void f1(s_t * s);
        extern void f2(int * i);

        s_t s = {0};

        f1(&s);   // correct use
        f2(&s.i); // possible alignment error
        return 0;
}

hrst@jorma:~$ gcc t3.c
t3.c: In function ‘main’:
t3.c:17:5: warning: taking address of packed member of ‘struct <anonymous>’ may result in an unaligned pointer value [-Waddress-of-packed-member]
   17 |  f2(&s.i);
      |     ^~~~

This is not a reason not to use packed structs. With similar reasoning, any qualifier like volatile is dangerous because you can cast it away (and compiler will warn you, too).

TLDR:
* use data types to convey the intent and let the compiler generate the memory access code for you.
* do not try to outsmart the compiler by manipulating bits, unless you really have to. While popular decades ago, it is prone to error.
* C is higher level language than many think. Use this to your advantage.
* packed struct does automatically exactly what nctnico is doing manually. This is a compromise in performance compared to normal struct, as compiler cannot and will not make unsafe assumptions about alignment.

Nominal Animal · « **Reply #17 on:** December 05, 2022, 09:06:43 am »

I think it is best to draw the states out –– say, each state corresponding to a case in the UART interrupt handler –– and examine carefully what should happen at each stage. Attached is an image that Graphviz generates from the following:

Code: [Select]

digraph {
    b0 [ label=<<table border="0" cellborder="1"><tr><td port="top" colspan="2"><b>Wait55</b></td></tr><tr><td port="v55">0x55</td><td port="other">*</td></tr></table>>, shape=none ];
    b0:v55:s -> b1:n;
    b0:other:s -> error:n;

    error [ label=<<table border="0" cellborder="1"><tr><td port="top" colspan="2"><b>Error</b></td></tr><tr><td port="v55">0x55</td><td port="other">*</td></tr></table>>, shape=none ];
    error:v55:s -> b1:n;
    error:other:s -> error:ne;

    b1 [ label=<<table border="0" cellborder="1"><tr><td port="top" colspan="3"><b>WaitAA</b></td></tr><tr><td port="vAA">0xAA</td><td port="v55">0x55</td><td port="other">*</td></tr></table>>, shape=none ];
    b1:v55:s -> b1:ne;
    b1:vAA:s -> b2:n;
    b1:other:s -> error:n;

    b2 [ label=<<table border="0" cellborder="1"><tr><td port="top"><b>WaitLen</b></td></tr><tr><td port="other">Len</td></tr></table>>, shape=none ];
    b2:other:s -> b3:n;

    b3 [ label=<<table border="0" cellborder="1"><tr><td port="top"><b>WaitAddr</b></td></tr><tr><td port="other">Addr</td></tr></table>>, shape=none ];
    b3:other:s -> b4:n;

    b4 [ label=<<table border="0" cellborder="1"><tr><td port="top" colspan="2"><b>Wait01</b></td></tr><tr><td port="v01">0x01</td><td port="other">*</td></tr></table>>, shape=none ];
    b4:v01:s -> b5:n;
    b4:other:s -> error:n;

    b5 [ label=<<table border="0" cellborder="1"><tr><td port="top"><b>WaitCMD</b></td></tr><tr><td port="other">CMD</td></tr></table>>, shape=none ];
    b5:other:s -> b6:n;

    b6 [ label=<<table border="0" cellborder="1"><tr><td port="top" colspan="3"><b>WaitData</b></td></tr><tr><td colspan="2">Checksum</td><td port="data">Data</td></tr><tr><td port="good">Good</td><td port="bad">Bad</td></tr></table>>, shape=none ];
    b6:data:s -> b6:n;
    b6:good:s -> b0:nw;
    b6:bad:s -> error:n;
}

The label code is a bit ugly, because I wanted to do nice-looking state blocks. It is just simplified HTML within < and >, though.

Sherlock Holmes · « **Reply #18 on:** December 05, 2022, 03:26:29 pm »

Quote from: uer166 on December 03, 2022, 10:35:21 pm

I am in need to parse some UART packets in a format I have no control over. The format is something like:
|0x55|0xAA|LENGTH|ADDRESS|0x01|CMD|DATA0....DATAn|CHKSUM|.

The easiest way seems to be a switch/case state machine, to basically go byte by byte per invocation. It's to be called at a rate higher than the incoming bytes of course to not miss anything, or alternatively use the hardware 8-byte FIFO to call this every 8 bytes, reducing ISR rate. After a packet data is read and validated, it is then memcpy-d into a packed struct based on the command byte which defines the data format, therefore automatically getting the right bits in place. Something like:

Code: [Select]
/* Data structure declarations */ volatile struct BMS_batstat bms_batstat; /* UART state machine state for reading/parsing packets */ static uint8_t length = 0; static uint16_t chksum = 0; static uint8_t cmdreq = 0; static uint8_t databuf[255] = {0}; static uint8_t databuf_i = 0; void BMS_UART_ParseData(void) { if(cmdreq == CMDREQ_BATSTAT) { memcpy(&bms_batstat, databuf, sizeof(bms_batstat)); } else { // unknown packet, TODO } } void BMS_UART_StateMachine_Rx(uint8_t rx) { switch(bms_uart_state) { case UART_wait_0x55: if(rx == 0x55) { bms_uart_state = UART_wait_0xAA; } else { bms_uart_state = UART_wait_0x55; } break; case UART_wait_0xAA: if(rx == 0xAA) { bms_uart_state = UART_read_length; } else { bms_uart_state = UART_wait_0x55; } break; case UART_read_length: length = rx; databuf_i = 0; bms_uart_state = UART_wait_0x25; break; case UART_wait_0x25: if(rx == 0x25) { bms_uart_state = UART_wait_0x01; } else { bms_uart_state = UART_wait_0x55; } break; case UART_wait_0x01: if(rx == 0x01) {bms_uart_state = UART_read_cmdreq; } else { bms_uart_state = UART_wait_0x55; } break; case UART_read_cmdreq: cmdreq = rx; bms_uart_state = UART_read_data; break; case UART_read_data: databuf[databuf_i] = rx; databuf_i++; if(databuf_i == (length - 2)) {bms_uart_state = UART_read_checksumLSB;} /* Data length without including extras */ break; case UART_read_checksumLSB: chksum = (uint16_t)rx; bms_uart_state = UART_read_checksumMSB; break; case UART_read_checksumMSB: chksum |= ((uint16_t)rx) << 8; uint16_t chksum_data = 0; chksum_data = length + 0x25 + 0x01 + cmdreq; /* Need to add extras to checksum */ /* Data length without including extras */ for(uint8_t i = 0; i < (length - 2); i++) { chksum_data += databuf[i]; } if(chksum == (0xFFFF - chksum_data)) { BMS_UART_ParseData(); } bms_uart_state = UART_wait_0x55; break; default: asm("bkpt"); break; } }
I've seen some implementations of such a state machine with function pointers, separating each state into a function, although I didn't think it was particularly readable or safe. Are there more ways to do this in a more obvious or maybe safe manner? The only complication I see here is the need to debug these variables in the foreground in a shell-like environment, and the packed structs are not atomic, therefore I need to unpack/copy all these variable into atomic types such as float or uint32_t so that there is no data consistency issues.

Hi,

An FSM based on pointers is an excellent design choice, there's a small cost consideration as a function call is involved but that might be negligible.

Create a 1D array of function pointers, invoked like this

Code: [Select]


STATE = INITIAL; // e.g. 0, some enum element...

while (whatever)
   {
   CHAR = get_next_byte();
   STATE = process_event[STATE](CHAR);
   }

One can use a 2D map but not in cases like yours, you have fixed set of states but not a fixed set of incoming bytes.

Each handler is then pretty simple, it performs some simple, single, well defined action and returns a new (or unchanged) state.

Once the array is initialized you're good to go and debugging this is much easier than explicit switch based designs.

Also if the data is corrupt, you can sometimes handle that within a handler and consuming bytes until some safe recognizable sentinel is seen and continue on, yes you've lost some data but at least you can recover to a degree and log the loss.

nctnico · « **Reply #19 on:** December 05, 2022, 03:48:19 pm »

Quote from: Siwastaja on December 05, 2022, 08:39:23 am

Quote from: nctnico on December 04, 2022, 08:13:24 pm
I'd avoid mapping packets onto structs. A lot can go wrong due to alignment issues.

And, nctnico is wrong about this again.

I have never seen any issue with mapping packets onto structs. Actually, the whole internet (BSD, linux) works on C code written that way. Endianness is handled with simple swapping macros which are conditionally defined based on the architecture. Alignment is not an issue at all as I will explain below.

That is because most big CPUs don't care about aligment. Go into the microcontroller world and things are different. Just count yourself lucky you never had to spend days looking for an alignment problem that caused silent data corruption.

Another disadvantage of using structs is that you need to do work for fields you don't need. In many protocols you don't need the contents of all fields. And how about when the protocol changes where fields suddenly are placed at a different offset?

A set of simple functions that get a certain type from a certain offset (which is also how protocols are typically specified so field offset verification is easy) in the right byte order is much easier and future proof to work with. In the end not mapping structs onto buffers will make your life easier.

Siwastaja · « **Reply #20 on:** December 05, 2022, 04:10:49 pm »

Quote from: nctnico on December 05, 2022, 03:48:19 pm

That is because most big CPUs don't care about aligment. Go into the microcontroller world and things are different. Just count yourself lucky you never had to spend days looking for an alignment problem that caused silent data corruption.

It's not luck - it's understanding the language, the tools and documentation. As expected, you completely ignored all the facts again.

The fact that using correct, robust, test&tried language constructs does not cause problems in practice is completely unsurprising to everyone else except someone who has just decided they are bad, without ever being able to explain why, except bogus, made-up reasons like "alignment", which is shown to be wrong time after time.

Again, we are talking about a construct where compiler guarantees correct alignment, but somehow you don't like it "because of alignment".

And yes, I have debugged alignment problems (e.g. with DMA) but never with a packed struct - exactly because the packed attribute removes the assumptions about self-alignment

.

ledtester · « **Reply #21 on:** December 05, 2022, 04:12:53 pm »

I have a feeling that the BMS only sends packets every so often.

Therefore there will be a long pause between packets and you can use this pause as a way of synchronizing the start of a packet.

nctnico · « **Reply #22 on:** December 05, 2022, 04:16:44 pm »

Quote from: Siwastaja on December 05, 2022, 04:10:49 pm

Quote from: nctnico on December 05, 2022, 03:48:19 pm
That is because most big CPUs don't care about aligment. Go into the microcontroller world and things are different. Just count yourself lucky you never had to spend days looking for an alignment problem that caused silent data corruption.

It's not luck - it's understanding the language, the tools and documentation. As expected, you completely ignored all the facts again.

The fact that using correct, robust, test&tried language constructs does not cause problems in practice is completely unsurprising to everyone else except someone who has just decided they are bad, without ever being able to explain why, except bogus, made-up reasons like "alignment", which is shown to be wrong time after time.

I already explained precisely why mapping structs on buffer is bad: not all platforms support it AND it doesn't solve byte order (many protocols use big endian byte order while many CPUs nowadays are little endian). It is not much simpler than that.

Using getters / setters to deal with a buffer make sure alignment never comes around to bite you AND it gets you the right byte order without needing to do post-processing / keep track on whether you already swapped the bytes or need to that later on. Everything is neatly packed into a single -call it layer if you like- that deals with formatting the fields in the buffer properly.

I don't care about compiler / language constructs; they don't have to work for all situations. And to be more precise: struct packing is not really a C language construct either. You need to use a compiler specific attribute or pragma to tell the compiler the struct is somehow 'special'.

Siwastaja · « **Reply #23 on:** December 05, 2022, 04:24:25 pm »

Quote from: nctnico on December 05, 2022, 04:16:44 pm

I already explained precisely why mapping structs on buffer is bad: not all platforms support it

What do you mean by "mapping"? Pointer casting?

Why not do it properly (as per C99, plus packing compiler extension):
union
{
struct PACKED
{
...
};
uint8_t u8[...];
};

I already showed this, but of course you ignored it.

Avoiding pointer manipulation is a good idea when it can be avoided. Type-punning via unions is extremely robust, compiler HAS to make it work (for example, if you have a uin32_t form in the union, compiler aligns the whole object by 4, and you can fill the data efficiently from a 32-bit data register or DMA). But the struct, being packed, is still accessed with correct instructions and never unaligned.

Plus it's super readable, you document directly into the datatype all the forms the objects will be used in. No messy code; no casting to different type of pointer then dereferencing it. No bit manipulation gymnastics. Just well defined language construct, plus:

Quote

And to be more precise: struct packing is not really a C language construct either. You need to use a compiler specific attribute

Which is implemented by every practical compiler available, because so much code depends on
1) having it available,
2) having it operate correctly (not generate unaligned access, that's the whole point)

C programming often uses compiler extensions, especially on embedded (MCU). Some are quite special snowflakes; others like packed is realistically always available. Get used to it and stop complaining about it and start writing better code.

Oh god I wish C standard folks would have added the packed keyword to the language. Not that it's a big problem, because every compiler implements it anyway and it's not a problem with practical people who can wrap it in a #define macro for portable code. But I would like to see all the needless complaining go away.

uer166 · « **Reply #24 on:** December 05, 2022, 06:11:30 pm »

Quote from: Siwastaja on December 05, 2022, 07:30:44 am

"CRC for header" also makes the synchronization start sequence effectively longer. If you just have say uint16 0xacdc as your magic number, there is a risk of missynchronization into data that accidentally contains this 16-bit sequence. But if you have 0xacdc + uint8 msgid + uint8 len + crc8, the data would need to have this 40-bit sequence in it for accidental synchronization.

The above opened my eyes a little bit, since the first few bytes:
|0x55|0xAA|LENGTH|ADDRESS|0x01|CMD|

Are actually predictable and are all the same given a specific CMD. the CMD defines the length, and the ADDRESS is always a fixed number. This means I can easily sync on the first 6 bytes, which gives a nice 48-bit sequence. This would avoid the need for a 255-byte buffer and any re-sync should be pretty quick. This is kinda like having a CRC since the length and address is always a unique fixed pair in the messages that I care about.

uer166 · « **Reply #25 on:** December 05, 2022, 06:13:16 pm »

Quote from: Sherlock Holmes on December 05, 2022, 04:43:40 pm

Quote from: Siwastaja on December 05, 2022, 04:37:47 pm
Quote from: Sherlock Holmes on December 05, 2022, 04:29:46 pm
You cannot counter someone's argument by simply insinuating they are incompetent, ignoring "facts" and so on.

Unlike you, nctnico is highly competent and I deeply respect his competence. He just has a few weird "blind spots" where his opinions, which are just matter of taste which I could accept as opinions, like disliking structs in networking and preferring bit manipulation reconstruction, make him close his eyes from objective facts. I'm sure he still does excellent job.

I have offered all the relevant information, unlike you; in other words, please get out of our argument, your comments are completely unhelpful. We have enough trolls on this forum already. Thank you in advance.

I see through this Sir, when you speak it is to be regarded as "facts" as "objectively true" but the words of another (who disagrees with you) are merely "opinions" as the words of one who has "closed" their eyes, has "blind spots".

You are a brave man indeed, hiding behind a keyboard and unhesitatingly insulting me and my presumed lack of competence. Is that how you speak to coworkers who dare to disagree with you? Is that really the way you'd communicate in person? I doubt it, I bet you're as nice as pie under those circumstances, nobody who bullies as you do here, would last long in the workplace.

If you have nothing technical to add to the discussion, feel free to avoid the thread. I don't mind technical discourse but this is not useful.

Siwastaja · « **Reply #26 on:** December 05, 2022, 06:45:09 pm »

Quote from: uer166 on December 05, 2022, 06:11:30 pm

The above opened my eyes a little bit, since the first few bytes:
|0x55|0xAA|LENGTH|ADDRESS|0x01|CMD|

Are actually predictable and are all the same given a specific CMD. the CMD defines the length, and the ADDRESS is always a fixed number. This means I can easily sync on the first 6 bytes, which gives a nice 48-bit sequence. This would avoid the need for a 255-byte buffer and any re-sync should be pretty quick. This is kinda like having a CRC since the length and address is always a unique fixed pair in the messages that I care about.

One can use something like this:

Code: [Select]

typedef struct
{
   uint8_t id;
   uint8_t len;
   void (*handler)(void* msg);
} msg_meta_t;

#define MSG(id, prefix) {.id=(id), .len=sizeof (prefix ## _msg_t), .handler = prefix ## _handler}

static const msg_meta_t MSGS[] =
{
   MSG(42, measure_voltage),
   MSG(57, self_destruct)
};

Or, even better, make the id contiguous from 0 (not a sparse table) and use the meta table index directly as msgid.

But with such sparse table (current project of mine uses this approach because message type space is 16 bits wide and we may have many hundreds of message types eventually, where each product only implements subset of):

When receiving msgid, loop through the meta table:

Code: [Select]

for(int i=0; i<NUM_ELEM(MSGS); i++)

Now you get the length from the table. If you still prefer to send it over wire - not a bad idea, now you can do a sanity check if these two match. But if you trust yourself to keep both parties up-to-date, no need to transfer the len field.

When done, and CRC checks out, you just call the handler through the function pointer, giving them the other two of the buffers. (In this example, we used void* pointer that we cast back to the desired type. I know it is safe to do so because the original buffer type is an union with the type in it, but if this concerns you, feel free to make that meta function pointer another union, with all possible argument types, then you can pass the actual correct type without casting it.)

I prefer to utilize preprocessor copy-and-paste to
1) reduce typing the same things over and over again, reducing human errors when changing things
2) force to use unified naming scheme, like
self_destruct_handler()
which uses
self_destruct_msg_t
as payload input type.

This is not super-fancy and not super-high level, but it's some kind of sweet spot for a simple language like C.

NorthGuy · « **Reply #27 on:** December 06, 2022, 03:10:27 am »

Quote from: uer166 on December 05, 2022, 03:43:18 am

Yup, in fact I am in control for the request packets. I.e. I can expect silence until the time where I send out a formatted "give me cell voltages" or whatever. I can see your point about efficiency loss here being similar to overhead, maybe something like Consistent Overhead Byte Stuffing would help.

Syncing to 0xaa55, given that 0xaa55 may occur in the middle of the packet, complicates things, so try to avoid it.

If you ask for cell voltages, you know what are you going to receive. The length is probably fixed too. So, you do very simple thing - you send your request, and then you use DMA (or interrupt if you don't have DMA) to collect data into a buffer until you receive the needed number of bytes or until the operation times out. Then you parse the received data - verify the preamble, length, message type, checksum etc. You don't need BMS data very often, so you have plenty of time to do so. No need for a state machine or anything of that sort. Don't complicate things without a need.

NorthGuy · « **Reply #28 on:** December 06, 2022, 03:17:56 am »

Quote from: nctnico on December 05, 2022, 04:16:44 pm

I already explained precisely why mapping structs on buffer is bad: not all platforms support it AND it doesn't solve byte order (many protocols use big endian byte order while many CPUs nowadays are little endian). It is not much simpler than that.

What do you mean by "not all platforms support"? If platform supports pointers, then you certainly can interpret such pointers as pointers to structs. Do you have any example of a platform which wouldn't support it?

If a field contains a big endian (e.g. a port number in IP protocol) you can easily invert it, or you can keep all your port numbers in big endiam format. Why is that a big deal?

SiliconWizard · « **Reply #29 on:** December 06, 2022, 05:26:32 am »

Any platform I've ever used in C from 8-bit to 64-bit had compilers allowing struct packing. The only problem to be aware of is again alignment, which may be a problem on some platforms for accessing certain ill-aligned struct members directly. That's the case in particular for the 16-bit PIC MCUs for accessing 16-bit or 32-bit struct members. They need to be aligned.

As to endianness, yeah it's not a big deal. If you make your own protocols don't bother with big-endian as it's pretty rare nowadays on processors. Buf if you have to deal with it, you can just swap bytes "in place" as NorthGuy said, for the members that require it. Still usually eons simpler, faster and more readable than hand-decoding packets byte per byte.

The real potential issue is the alignment, and that's something you have to consider for the target you're programming for. And for the protocol you're designing as well, unless you have no choice and have to use an existing one. Otherwise, defining your data packets to have "fields" that are all naturally aligned will pretty much cover all targets, and it's not rocket science.

Nominal Animal · « **Reply #30 on:** December 06, 2022, 05:35:58 am »

If you want to use function pointers, then a pattern where each character received is handled by a function which returns the pointer to the function that handles the next character, should work:

Code: [Select]

#include <stdint.h>
#include <stddef.h>

// Known states:
static void *wait_55(uint8_t rx);
static void *wait_AA(uint8_t rx);
static void *wait_len(uint8_t rx);
static void *wait_addr(uint8_t rx);
static void *wait_01(uint8_t rx);
static void *wait_cmd(uint8_t rx);
static void *wait_data(uint8_t rx);
static void *wait_check(uint8_t rx);
static void *wait_skip(uint8_t rx);
static void *wait_skipcheck(uint8_t rx);
static void *in_error(uint8_t rx);

// Handler for the next incoming byte
void *      (*BMS_UART_State)(uint8_t) = wait_55;

// Handler for the currently received command (unused if skipped)
void *      (*BMS_UART_Handler)(void) = NULL;

uint_fast16_t BMS_UART_Checksum = 0;
uint_fast16_t BMS_UART_Datasum = 0;
uint8_t      BMS_UART_Data[255];  // Or whatever is the longest supported command
uint8_t      BMS_UART_Have = 0;
uint8_t      BMS_UART_Length = 0;
uint8_t      BMS_UART_Address = 0;
uint8_t      BMS_UART_Command = 0;

uint8_t      BMS_UART_Error = 0;
enum {
    BMS_UART_ERROR_NO_AA = 1,
    BMS_UART_ERROR_NO_01,
    BMS_UART_ERROR_CHECKSUM,
};

// State machine, in all its glory
void BMS_UART_StateMachine_Rx(uint8_t rx)
{
    BMS_UART_State = BMS_UART_State(rx);
}

// Implementation of individual states
static void *wait_55(uint8_t rx)
{
    if (rx == 0x55)
        return wait_AA;
    else
        return wait_55;
}

static void *wait_AA(uint8_t rx)
{
    if (rx == 0xAA)
        return wait_len;
    else {
        BMS_UART_Error = BMS_UART_ERROR_NO_AA;
        return in_error;
    }
}

static void *wait_len(uint8_t rx)
{
    BMS_UART_Length = rx;
    return wait_addr;
}

static void *wait_addr(uint8_t rx)
{
    BMS_UART_Address = rx;
    return wait_01;
}

static void *wait_01(uint8_t rx)
{
    if (rx == 0x01) {
        return wait_cmd;
    } else {
        BMS_UART_Error = BMS_UART_ERROR_NO_01;
        return in_error;
    }
}

static void *wait_cmd(uint8_t rx)
{
    BMS_UART_Command = rx;
    BMS_UART_Datasum = BMS_UART_Length + 0x26 + BMS_UART_Command;
    BMS_UART_Checksum = 0;
    BMS_UART_Have = 0;

    if (BMS_UART_Length == 0xCA && BMS_UART_Address == 0xFE && BMS_UART_Command == 0x64) {
        // Command 0x64, address 0xFE, length 0xCA is handled by function NULL:
        BMS_UART_Handler = NULL;
        return wait_data;

    } else
    if (BMS_UART_Length == 0xB0 && BMS_UART_Address == 0x0B && BMS_UART_Command == 0x1E) {
        // Command 0x1E, address 0x0B, length 0xB0 is handled by function NULL:
        BMS_UART_Handler = NULL;
        return wait_data;

    } else {
        // Everything else is skipped.
        return wait_skip;
    }
}

static void *wait_skip(uint8_t rx)
{
    if (BMS_UART_Have < BMS_UART_Length) {
        BMS_UART_Datasum += rx;
        BMS_UART_Have++;
        return wait_skip;
    } else {
        BMS_UART_Checksum = rx;
        BMS_UART_Datasum = 0xFFFF - BMS_UART_Datasum;
        return wait_skipcheck;
    }
}

static void *wait_skipcheck(uint8_t rx)
{
    BMS_UART_Checksum = (BMS_UART_Checksum << 8) | rx;
    if (BMS_UART_Checksum == BMS_UART_Datasum) {
        // No error, but we ignored this command.
        return wait_55;
    } else {
        BMS_UART_Error = BMS_UART_ERROR_CHECKSUM;
        return in_error;
    }
}

static void *wait_data(uint8_t rx)
{
    if (BMS_UART_Have < BMS_UART_Length) {
        BMS_UART_Datasum += rx;
        BMS_UART_Data[BMS_UART_Have++] = rx;
        return wait_data;
    } else {
        BMS_UART_Checksum = rx;
        BMS_UART_Datasum = 0xFFFF - BMS_UART_Datasum;
        return wait_check;
    }
}

static void *wait_check(uint8_t rx)
{
    BMS_UART_Checksum = (BMS_UART_Checksum << 8) | rx;
    if (BMS_UART_Checksum == BMS_UART_Datasum) {
        if (BMS_UART_Handler)
            BMS_UART_Handler();
        return wait_data;
    } else {
        BMS_UART_Error = BMS_UART_ERROR_CHECKSUM;
        return in_error;
    }
}

// Error state
static void *in_error(uint8_t rx)
{
    if (rx == 0x55) {
        return wait_AA;
    } else {
        return in_error;
    }
}

Take a note at the wait_cmd() function. In it, you detect if the command, address, and length is something you are interested in, and if so, assign a handler function to be called when it is received with a matching checksum. Otherwise, the data is ignored.

I'm not sure I like this over the switch (state) { ... } idiom, though.

The key feature of this state machine is that it detects errors, and has a dedicated error state (from which it will continue when a valid 0x55 0xAA sequence is received). And, of course, that it does not store all data, only data from commands you want to support, while still handling all commands of the stated form correctly (and even checking their checksum).
Obviously, you do not need to use function pointers to implement this exact state machine.

Siwastaja · « **Reply #31 on:** December 06, 2022, 08:48:14 am »

Quote from: SiliconWizard on December 06, 2022, 05:26:32 am

Any platform I've ever used in C from 8-bit to 64-bit had compilers allowing struct packing. The only problem to be aware of is again alignment, which may be a problem on some platforms for accessing certain ill-aligned struct members directly. That's the case in particular for the 16-bit PIC MCUs for accessing 16-bit or 32-bit struct members. They need to be aligned.

But, once again, with the packed attribute, compiler uses correct instructions to access unaligned 16- or 32-bit member; e.g. 8-bit loads and shifting + orring to combine the result into a register. Just like what you do manually in C by writing val = (uint32_t)in[1]<<24 | ... and so on. This is obviously slower than aligned loads, and the reason not to pack structs "just because".

It only becomes a problem if pointer to a member is used, but then the compiler will warn you, as I found out, even with the minimal default warnings on GCC. Casting the pointer to the whole packed struct is fine because compiler knows the struct is packed and does not assume any alignment.

But, when unsure, use unions instead of pointer casting. Then the language itself guarantees alignment, and nothing can go wrong. By thinking about your datatypes and all their intended uses, you get almost completely rid of casting buffers into something else. Especially if you don't have to deal with libraries.

westfw · « **Reply #32 on:** December 06, 2022, 09:43:22 am »

Quote

the language itself guarantees alignment

While I'm a big fan of mapping packets unto structures, it is possible to end up with the start of your packet at an unaligned address due to encapsulation or options or whatever coming before it. In which case you may need to use accessor functions/macros that are not more complicated than using similar functions all the time, but (IMO) more readable.
"src = GETLONG(pak->ipsrcaddr)" vs "src = get_ip_srcaddr(pak)"

OTOH, there's IP's "Fragment Offset", which is substantially inconvenient on little-endian CPUs. :-)

Nominal Animal · « **Reply #33 on:** December 06, 2022, 01:13:14 pm »

Quote from: westfw on December 06, 2022, 09:43:22 am

"src = GETLONG(pak->ipsrcaddr)" vs "src = get_ip_srcaddr(pak)"

Why not

Code: [Select]

// Generic accessors
static inline uint_fast16_t  get_u16be(const unsigned char *src)
{
    return ((uint_fast16_t)(src[0]) << 8) | src[1];
}

static inline uint_fast32_t  get_u32be(const unsigned char *src)
{
    return ((uint_fast32_t)(src[0]) << 24)
         | ((uint_fast32_t)(src[1]) << 16)
         | ((uint_fast32_t)(src[2]) << 8)
         | src[3];
}

// Dedicated accessors
static inline uint32_t  get_ipv4_length(const void *gram)
{
    return get_u16be((unsigned char *)gram + 2);
}

static inline uint32_t  get_ipv4_srcaddr(const void *gram)
{
    return get_u32be((unsigned char *)gram + 24);
}

static inline uint32_t  get_ipv4_dstaddr(const void *gram)
{
    return get_u32be((unsigned char *)gram + 28);

}

where the inline is just a reminder to us humans that these are accessor functions intended to be put into a header file?

The generic accessor is named according to the data type, whereas the dedicated accessor is named after the logical field, and internally "encodes" the size, type, and offset of that field.

On x86-64, where unaligned accesses are supported, get_ipv4_srcaddr() compiles to just one mov and a bswap. On ARM Thumb, to one ldr and a rev. On RISC-V, it generates a bit more code, but no more than any unaligned 32-bit load would.

There is only a small conceptual jump into memory-safe accessor primitives from here, too.

nctnico · « **Reply #34 on:** December 06, 2022, 02:04:12 pm »

Quote from: SiliconWizard on December 06, 2022, 05:26:32 am

Any platform I've ever used in C from 8-bit to 64-bit had compilers allowing struct packing. The only problem to be aware of is again alignment, which may be a problem on some platforms for accessing certain ill-aligned struct members directly. That's the case in particular for the 16-bit PIC MCUs for accessing 16-bit or 32-bit struct members. They need to be aligned.

And ARM platforms have a habit of needing alignment as well. I have run into the situation where an Atmel SOC would silently corrupt unaligned data. And this was due to mapping a type directly onto the buffer.

Quote

As to endianness, yeah it's not a big deal. If you make your own protocols don't bother with big-endian as it's pretty rare nowadays on processors. Buf if you have to deal with it, you can just swap bytes "in place" as NorthGuy said, for the members that require it. Still usually eons simpler, faster and more readable than hand-decoding packets byte per byte.

I never suggested hand decoding packet byte-per-byte. Use get / set routines for specific field types that work with offsets. In my experience that is much more practicle and less error prone than putting a struct together because -again- fields in a protocol are typically specified with an offset and size. These get / set routines can take care of byte swapping which then wraps up everything that deals with hardware at the lowest level possible. IMHO is it utterly nuts that the BSD socket API requires the application layer to deal with endianess. That should have been handled at a much lower level.

Also keep in mind some protocols may have messages that have different fields depending on the message type so one struct typically isn't enough. Another advantage of not mapping structs is that you can use a window that moves over the incoming data without needing to care about alignment. Especially with protocols that come in over continuous streams (like UART and TCP/IP) you don't know the start & end of a packet; it makes more sense to use the receive buffer to examine the data 'in situ' rather than copying something that could be a packet but may be garbage.

This may be drivel to some but I have implemented a crapload of protocols over the years and know very well what works and what doesn't.

NorthGuy · « **Reply #35 on:** December 06, 2022, 03:08:04 pm »

Quote from: nctnico on December 06, 2022, 02:04:12 pm

Another advantage of not mapping structs is that you can use a window that moves over the incoming data without needing to care about alignment. Especially with protocols that come in over continuous streams (like UART and TCP/IP) you don't know the start & end of a packet; it makes more sense to use the receive buffer to examine the data 'in situ' rather than copying something that could be a packet but may be garbage.

That is because you have used an inappropriate buffering method. If you use a circular buffer then everything gets misaligned and the buffer may wrap at any point, so the data can only be accessed byte by byte. Accessor functions create an illusion of structural approach, but do not solve the problem - they only add bloat to hide the ugliness.

Instead, create a number of aligned buffers (one, two, or many) where you can collect the "would-be-packets" from UART, then, when the packet is confirmed, pass the pointer to the buffer further for processing. Once the contents of the packet is identified pass a pointer the specific data structure within the packet to analyze the contents.

Siwastaja · « **Reply #36 on:** December 06, 2022, 03:15:19 pm »

Quote from: nctnico on December 06, 2022, 02:04:12 pm

Quote from: SiliconWizard on December 06, 2022, 05:26:32 am
Any platform I've ever used in C from 8-bit to 64-bit had compilers allowing struct packing. The only problem to be aware of is again alignment, which may be a problem on some platforms for accessing certain ill-aligned struct members directly. That's the case in particular for the 16-bit PIC MCUs for accessing 16-bit or 32-bit struct members. They need to be aligned.
And ARM platforms have a habit of needing alignment as well. I have run into the situation where an Atmel SOC would silently corrupt unaligned data. And this was due to mapping a type directly onto the buffer.

... which (the type) clearly was not a packed struct.

The title here says: "packed structs".

Case closed. Can we now go forward or do we need to continue this broken record game?

Quote

Also keep in mind some protocols may have messages that have different fields depending on the message type so one struct typically isn't enough

Which is why I suggested unions.

Quote

it makes more sense to use the receive buffer to examine the data 'in situ' rather than copying something that could be a packet but may be garbage.

Which is why I suggested instead of copying, using an union so that the same buffer type can be accessed as the struct.

Quote

This may be drivel to some but I have implemented a crapload of protocols over the years and know very well what works and what doesn't.

Clearly you have. Sometimes it is a good idea to look at new, modern ideas that improve the code. But if you refuse to even read the messages you reply to, and instead build a strawman against any idea except of your own, then we can do nothing to help.

nctnico · « **Reply #37 on:** December 06, 2022, 03:37:13 pm »

Combining a struct in a union won't help if the buffer itself isn't aligned (for example when moving a 'window' over a buffer). And mapping a type or struct onto a buffer is the same thing. It will break when alignment is required. It is that simple. And this has nothing to do with being 'modern' or something like that (C compilers have been supporting packed structs since the dawn of time) but it is all about creating robust, portable code that works as expected no matter what. Even in the hands of less experienced engineers.

Actually, you have to think about whether packed structs have been designed to be used on dealing with external protocols. Saving memory at the expense of speeds seems like a very reasonable explaination to me. Especially since you can also pack bitfields.

Siwastaja · « **Reply #38 on:** December 06, 2022, 07:09:31 pm »

Portability no matter what? Sure, that's a valid reason, but it's not the only one.

If you want portability so that the MCU could one day be replaced with a big endian MCU, without code needing to be modified at all, then sure, write wrappers with both endianness options from day one. You need to do all the same work, just proactively.

(I have actually encountered one big endian machine lately, namely Teltonika RUT955 embedded PC. Interfacing it with ARM microcontrollers required taking care of endianness on one of the ends - the Teltonika. I still used structs, and no, there was nothing magical involved.)

These are design choices to make. In my MCU projects, these are not factors. Code which I write for Cortex-M4 for example, relies on some other Cortex-M features such as interrupt handlers being regular functions. I also rely on it staying little endian - and I also rely on using a compiler with packed struct available. All ARM CM capable compilers are such, have always been. I think it's a safe assumption no one wants to suddenly compile my commercial projects on some specialized scientific or toy compiler.

Life is such choices. And it has absolutely nothing to do with "robustness". A train can be robustly designed and built even if it needs certain width of tracks and cannot fly. These are limitations, and not some arbitrary limitations, but well thought out ones. Every project has those.

And I happen to be 110% sure the uer166 guy:
* will always use a compiler with packed attribute
* will only use little endian CPUs in his BMS communication

Worst case? When the unimaginable happens and uer166 directly connects his BMS to Teltonika router, then add endianness wrappers. Same work, but Just-In-Time.

NorthGuy · « **Reply #39 on:** December 06, 2022, 07:25:01 pm »

You usually don't need to use packed structs. Most of the protocols will use struct fields which are naturally aligned. There may be padding added to ensure this, which you explicitly add to your struct where needed. Thus, once you ensure that the beginning of the struct is aligned, the rest of the fields get aligned automatically.

Siwastaja · « **Reply #40 on:** December 06, 2022, 07:35:57 pm »

Quote from: NorthGuy on December 06, 2022, 07:25:01 pm

You usually don't need to use packed structs. Most of the protocols will use struct fields which are naturally aligned. There may be padding added to ensure this, which you explicitly add to your struct where needed. Thus, once you ensure that the beginning of the struct is aligned, the rest of the fields get aligned automatically.

Well designed protocols do this for performance. You need to remember the ages old good trick, which is still useful: when the order of fields does not matter for you (and usually it does not), just order the fields from biggest to smallest (e.g. floats go first, chars last), for least amount of padding.

Then, for protocols, you can add padding yourself. The advantage over compiler-generated padding is also that these padding fields become natural "reserved" fields which allows you to extend the protocol later.

But I'm sure this is somehow wrong, too, definitely not professional and robust

Sherlock Holmes · « **Reply #41 on:** December 06, 2022, 07:54:26 pm »

Quote from: Siwastaja on December 06, 2022, 07:35:57 pm

Quote from: NorthGuy on December 06, 2022, 07:25:01 pm
You usually don't need to use packed structs. Most of the protocols will use struct fields which are naturally aligned. There may be padding added to ensure this, which you explicitly add to your struct where needed. Thus, once you ensure that the beginning of the struct is aligned, the rest of the fields get aligned automatically.
Well designed protocols do this for performance. You need to remember the ages old good trick, which is still useful: when the order of fields does not matter for you (and usually it does not), just order the fields from biggest to smallest (e.g. floats go first, chars last), for least amount of padding.

Then, for protocols, you can add padding yourself. The advantage over compiler-generated padding is also that these padding fields become natural "reserved" fields which allows you to extend the protocol later.

But I'm sure this is somehow wrong, too, definitely not professional and robust

Perhaps such field ordering could be done with greater flexibility, I agree that the "largest" first is a sensible strategy sometimes but .Net does that automatically, one has to take special steps to order them oneself.

SiliconWizard · « **Reply #42 on:** December 06, 2022, 08:27:33 pm »

Quote from: Siwastaja on December 06, 2022, 08:48:14 am

Quote from: SiliconWizard on December 06, 2022, 05:26:32 am
Any platform I've ever used in C from 8-bit to 64-bit had compilers allowing struct packing. The only problem to be aware of is again alignment, which may be a problem on some platforms for accessing certain ill-aligned struct members directly. That's the case in particular for the 16-bit PIC MCUs for accessing 16-bit or 32-bit struct members. They need to be aligned.
But, once again, with the packed attribute, compiler uses correct instructions to access unaligned 16- or 32-bit member; e.g. 8-bit loads and shifting + orring to combine the result into a register. Just like what you do manually in C by writing val = (uint32_t)in[1]<<24 | ... and so on. This is obviously slower than aligned loads, and the reason not to pack structs "just because".

I have certainly run into compilers (xc16 here, don't remember the version, but it's based on GCC although the port is handled by Microchip and never made it to mainline GCC, so it's possibly very not on par with what GCC is supposed to be doing) that wouldn't automatically handle unaligned access in the way you describe, and accessing unaligned struct members would crash the CPU.

Given that the "packed" attribute AFAIK is compiler-specific and not standard C, you're of course not guaranteed any of this anyway. Let me know if this has made it to the standard though. Maybe C23, as I think they finally introduced attributes?

That said, I checked on godbolt using Clang and GCC and various targets and your statement seems to hold, so I'm gonna assume that it should be safe if you use a reasonably recent version with a reasonably well supported target. I'm again not too sure about 16-bit PICs here, but xc16 is kind of a nasty beast.

Nominal Animal · « **Reply #43 on:** December 07, 2022, 04:38:23 am »

I use both (packed) structures and (character) buffer accessors.

The structures work really well with fixed structures, like say PNG files. When the actual layout depends on the content, the buffer accessors work much better than trying to use structures. IPv4 datagrams are on the borderline, because the first 20 bytes of the header are fixed, and only options may vary; so I might use a packed structure (as all C compilers I use do support those) for the first 20 bytes, but accessors for the options.

I do not understand the argument here, because in my opinion, the two do not overlap that much.

I use packed structures with fixed data structures when I have ensured they work and documented the need for them to work, and buffer accessors for varying data structures and when packed structures may not be available.

Even when using structures with associated data, I tend to use the struct fields through accessors when the field deals with things like data length to ensure memory safety: that through normal use of the interface, one will not accidentally access past the known/received data, because the content happens to be buggy or malicious.

This checking is important. The recent CVE-2022-41674 is a good example. Code did not check that the MaxBSSID field in a WiFi packet had a sensible value (between 1 and 8, inclusive), so a buffer overrun ensued. Oopsie.

Thus, even when using a packed struct, you should never trust its contents to be safe; and to make sure, we use helper accessors to do the checking for us. (Well, not for pure data fields, those that just state some fact, but do not affect the layout or access to other data.)

In other words, the one wrong use case of packed structs is to avoid using accessor functions. The two are not mutually exclusive.

For worry-free binary data stream protocols, I use prototype binary representations of known values as both the identifier as well as byte order indicator, so that the data generator does not need to care about the byte order at all. On the host computer, the datastream can be trivially 'normalized' to local-endian, or kept as-is. To detect the byte order, one simply tests the byte order permutations (including 'none'/'as-is') that yield a recognized value. For example, for IEEE 754 Binary32 'float' type, one can use 0x3F2C1880 = 0.67224884f.

For structures, on a computer, the "packed" datastream can easily be expanded into a stream/array of native structures. It is an extra copy, yes, but it turns out that due to the cache behaviour of current machines, doing it as soon as the data is read or received, is actually very, very cheap in terms of CPU or wall time used, because on typical computers we tend to be memory bandwidth limited, and not at all CPU limited. Even in Python (when interfacing to microcontrollers) this makes sense –– for example, one should definitely convert an incoming bytes stream into NumPy.arrays via struct.unpack() for manipulation. (On the prototype part, just try the different byte order prefixes on the struct formats, until the values match the expected, and then instantiate that using struct.Struct for mass unpacking.)

I really like the prototype approach, because not only does a single value (of each type used) ensure byte order, but they also act like protocol and format identifiers, 'magic pattern', in the Unix file command magic sense.

Siwastaja · « **Reply #44 on:** December 07, 2022, 08:32:20 am »

Quote from: Nominal Animal on December 07, 2022, 04:38:23 am

In other words, the one wrong use case of packed structs is to avoid using accessor functions.

One could also claim that having one accessor function per member for validation is not always the best strategy. Sometimes validity is in combination of variables, for example: assert(a<10 && b<10 && (a+b)<15).

I like to do the sanity check of the len field in the protocol level handler (which is agnostic to the payloads), but when len and CRC check out and thus non-corrupted data is received (leaving logical mistakes / bugs at the sender), I use a handler function which gets the whole payload type struct, and verifies the sanity of data fields "as it goes", in the same way one usually checks function input arguments.

In fact, I don't think there is any fundamental difference in verifying the input values in a "normal" function in a program, or input values coming in a struct that was received through UART.

But as is obvious to most of sensible programmers, different paradigms work for different problems.

Nominal Animal · « **Reply #45 on:** December 07, 2022, 09:42:40 am »

Quote from: Siwastaja on December 07, 2022, 08:32:20 am

Quote from: Nominal Animal on December 07, 2022, 04:38:23 am
In other words, the one wrong use case of packed structs is to avoid using accessor functions.
One could also claim that having one accessor function per member for validation is not always the best strategy. Sometimes validity is in combination of variables, for example: assert(a<10 && b<10 && (a+b)<15).

I agree. It is equally valid to just verify all the fields at some point.

In my own code, when the structures are accessed randomly –– so, a tree or other structured data thingy –– I sometimes unpack and validate the structures at reception/reading, because that tends to have the best cache footprint and I/O thoroughput. Idea being, since we touch the data anyway, we could make it cheaper to access.

Similarly, when receiving data via an UART, I prefer methods that update checksums and calculate command hashes on the fly, online.

Quote from: Siwastaja on December 07, 2022, 08:32:20 am

In fact, I don't think there is any fundamental difference in verifying the input values in a "normal" function in a program, or input values coming in a struct that was received through UART.

So much of the code I write is centered around working with structured data, that I find the idea that verifying input could be considered somehow special or different, completely bonkers.

I like to think of all data suspicious, because that way any oddity or issue in processing that cannot be compensated becomes an issue for the user to deal with. I absolutely hate programs that silently corrupt data, because "those errors occur so rarely it is not worth it to check for them", because with my luck, it is my most precious data that gets silently corrupted. No, I want programs to do their best, but also tell me if something odd or unexpected happened.

In an embedded appliance, especially microcontrollers, it can be a difficult user interface question to decide how to handle communications errors. But, if you start with "I'll worry about that later", then you basically decide to handle them randomly, because that kind of functionality is a design issue, and not something that can be bolted on top later on. It has never been succesfully fixed later on, only semi-patched with ugly kludges.
Security is like that: it is either designed in, or it does not exist.

SiliconWizard · « **Reply #46 on:** December 07, 2022, 08:09:04 pm »

Key advice.
Validate your inputs.

A gigantic number of bugs, crashes (that can lead to death) and security exploits have come, come and will come from not validating inputs.

Even a very large number of memory-related bugs are linked to not properly validated inputs. That's the case for - I'd dare say - most buffer overflows.

Yet it looks like most quality-oriented discussions and programming language features these days obsess over memory access and management, and almost nothing about input validation (and more generally invariant checks, and "contracts".) That's probably because we tend to focus on adding safety nets rather than on prevention, and that doesn't just apply to software.

westfw · « **Reply #47 on:** December 07, 2022, 11:56:13 pm »

Quote

Validate your inputs.

Yeah. Once upon a time, I wrote one of the first implementations of IP option support. It was pretty neat; you could put "record route" options in ping packets and get interesting data, and etc.

It was "done" just before an "InterOp" trade show. So we took it to the show and loosed it upon the "show network" (you could get away with things like that in those days.) Broadcast pings with options. AFAIK, everything we sent out was correct. But other systems all over crashed and burned. I hadn't done much to validate things coming in, so some of the other system tried to respond, and sent out ... really horribly formatted attempts. We crashed too!

It was a very educational experience for a lot of people/companies! Me included, I think.

Nominal Animal · « **Reply #48 on:** December 08, 2022, 06:49:04 am »

All that I describe below has quite naturally lead me to where I am now, and to say:

Validate your inputs.
Make sure you understand the limitations of the processing or calculations.
Present the results in an effective, useful form.
Let the user know if anything unexpected happens.
Document your intent and purpose for your code in detail.

I started doing web backend stuff very early in the era, before the turn of the century.
I had web clients that did not care about accept-charset, and simply sent data using whatever encoding they liked.
I had to use hidden input fields to detect the character set (codepage 437, codepage 1252, ISO 8859-1, ISO 8859-15, MacRoman, early UTF-8).

Since the first Microsoft web servers were vulnerable to path exploits –– ../../../../../../system32/cmd.exe "escaping" the web root, and referring to the command-line interpreter ––, I saw lots of these in my server logs. I also found out that many users had difficulties remembering the difference between uppercase and lowercase, and complaining about case sensitivity in URL path components.
So, early on, I did my own URL path and query string verification too: removed any illegal sequences, replaced common non-ASCII characters with their closest ASCII equivalents, lowercased it all, and removed all but alphanumerics (including dashes not at the beginning or end) between slashes.
It, too, had a twofold reason: one was security, the other helped users if they accidentally used say /pähkähullu/ instead of /pahkahullu/.

Not verifying the data simply would have never worked at all.

Later on, when I got back to Uni and started doing computational physics (ie. computer simulations, mostly molecular dynamics simulations), I soon learned the most important rule in simulation at the visceral level: you first check if it makes any sense. If the total energy in an isolated system changes, your simulation is faulty, and so on.

My favourite early failure in simulation was with Morse copper (Morse potential fitted roughly to copper), where I investigated the limit at which a system composed of perfect cubic lattice, one half of which was at absolute zero, and the other half had very high random (Boltzmann-distributed) kinetic vectors, would end up melted; with the system being periodic and infinite. My error was to join the two parts so that atoms at the interface were much closer than normal in the lattice. The effects I observed from this specific type of "superheating" were really odd. I saw everything from acoustic waves with wavelength much longer than my system, to a molten droplet encased in solid Morse copper, apparently shielded by the Leidenfrost effect.

That taught me that even tiny errors in the inputs could make the outputs completely unphysical and irrelevant (albeit interesting).

Around the same time, I also started working on sonification and visualization, the former helping me focus on what is important and useful (in that case, when monitoring a running simulation), and the latter on how to control the information conveyed, to convey the most important data to the viewer in an intuitive manner. Showing atoms as gleaming marbles is pretty, but the intuitive associations and extrapolations such images generate are completely wrong.
For example, "atoms colliding" sounds like a trivial, well defined phenomenon. Except it isn't, when you actually investigate say ion implantation in solid matter, and try to find when "atoms collide". You can say that the ion collides, yes, but as to the solid matter, the atoms there are in continuous interaction and movement (due to finite temperature; even at room temperature the movement is significant), so the macro-scale concept of "collision" has no useful meaning or relevance: they all "collide" with each other, all the time, exchanging energy.

That taught me intuition is useful, but easily mistaken; and most importantly, aimable/directable. So, I learned to be suspicious of my own assumptions and intuition, and shifted my learning more towards understanding (in an intuitive manner) the underlying concepts correctly, rather than just applying existing knowledge to solve a specific problem.

(I've left out my endeavours related to the demoscene, art, multimedia, et cetera, so the above is a simplified storyline, picking only the major plot-relevant points here. I do consider myself lucky for having such a winding path, because every step in a different domain has taught me something useful and applicable here.)

All this has taught me to consider the inputs with suspicion; consider the limits and errors inherent in the processing; and how to present the results in an effective and useful manner. And when privileged information is at stake, the privileges needed at each step, and how to stop information leaking (including backwards in the above chain). It all starts at the conceptual level, at the core design level.

Of course, I'm no "ideas man, leaving the details for the lesser mortals to implement". At the core design level, I constantly bump into details and problems I haven't worked on before, so I create practical test cases –– unit tests, in the form of complete executable programs or scripts; sometimes brute-force searching for the 'optimal' solution in the solution phase space –– to find out. I nowadays write these for even ideas I have, just to work out if that idea would work in practice. I quickly learned to file them away well: I use a separate directory for each test case, with a README, sources, and an executable, that when run without any command-line parameters tells exactly what it does. This way I can do a quick find or grep to find an old test case based on keywords I may recall, or at least limit the subset to check (by running the test cases) to a dozen or less, and a quick review of the README followed by the source code tells me all pertinent information.

And that is also the source of why I always remind new programmers to write comments that describe their intent, and not what the code does. We/I can always read the code to see what it does, but lacking crosstime telepathy, I cannot divine whether that was the intent of the programmer or not.
I myself had to learn to do this later on, after already being adept (paid to write code) in several programming languages, so it is very hard for me: even now, I still slip into the useless type of comments, unless I'm careful.

nctnico · « **Reply #49 on:** December 08, 2022, 09:47:32 am »

Quote from: Nominal Animal on December 08, 2022, 06:49:04 am

And that is also the source of why I always remind new programmers to write comments that describe their intent, and not what the code does. We/I can always read the code to see what it does, but lacking crosstime telepathy, I cannot divine whether that was the intent of the programmer or not.

I agree here. I deeply loath 'documentation' made using Doxygen because the output just lacks the one important thing: why is the code the way it is and what are the relations of functions? On top of that, any decent IDE can produce an overview of functions realtime; I don't need Doxycrap for that.

Siwastaja · « **Reply #50 on:** December 08, 2022, 11:34:16 am »

Quote from: nctnico on December 08, 2022, 09:47:32 am

I deeply loath 'documentation' made using Doxygen because the output just lacks the one important thing: why is the code the way it is and what are the relations of functions?

Exactly, Doxygen is cancerous. The purpose is to let the programmers be lazy, and give the impression of having documentation to the higher-ups who just look at it and say "looks nice, 'have nice documentation' checkbox marked".

Just worked with Eclipse PAHO embedded mqtt client, and the Doxygen documentation is just... none:

Code: [Select]

/** MQTT Subscribe - send an MQTT subscribe packet and wait for suback before returning.
 *  @param client - the client object to use
 *  @param topicFilter - the topic filter to subscribe to
 *  @param message - the message to send
 *  @return success code
 */
DLLExport int MQTTSubscribe(MQTTClient* client, const char* topicFilter, enum QoS, messageHandler);

This is nearly useless, literally the only useful information not conveyed already by the names is "wait for suback before returning".

For example, crucial piece of information in any C library where pointers are passed to functions which modify the internal state of the library is, who is responsible of the lifetime (memory allocation) of the object? Many other similar looking functions internally do memcpy on the pointer to some internal buffer, but this particular piece of crap just copies topicFilter pointer to be used later, so you can't generate the filter string in stack and just call the function, or everything breaks down in mysterious ways. Both are valid strategies but you have to tell the user which it is, and it would have been literally 15 seconds of typing to tell me that.

Further, you can see they copy-pasted the description from some other function, so not only the documentation is lacking, it's wrong, too. Bright side is, as this documentation is useless anyway, no one reads it so no one notices the mistake, and managers are happy about the nice-looking "API documentation".

We programmers - we just look at the code and figure out what is does, how it does it - and finally, try to guess why. And when we can't figure out why PAHO, for example, checks if return codes are < 0 (errors) and then replace them with fixed -1 to hide the original error code and just tell the application about a generic error, we come into conclusion there is no sensible reason to do so, and fix the crap, removing those extra steps and make error codes propagate again.

SiliconWizard · « **Reply #51 on:** December 08, 2022, 06:32:56 pm »

People often use Doxygen or similar because they were asked to produce documentation, and documentation is notoriously hard and annoying to produce. So they're looking for the easy way out just to make their bosses happy.

And yes, a few comments where it really matters, and well written code with clear structure and decent type/variable/function naming is worth 10x times that, but since it actually requires competent people to be able to judge, it doesn't work that well in the corporate world.

NorthGuy · « **Reply #52 on:** December 08, 2022, 08:05:55 pm »

Documentation may be good or bad, but it doesn't matter how it's written. It's possible to write good documentation with Doxygen, as it is possible to do the same without Doxygen.

A software engeineer is the best person to write documentation, but most don't want to do that. Furthermore, the employers often don't want to pay them for that. Therefore documentation is usually written by technical writers who do not have a clue what they're writing about. This is not much better than phony Doxygen comments.

ledtester · « **Reply #53 on:** December 09, 2022, 07:12:39 am »

On the topic of documentation, here is one of the main developers of the Haskell compiler talking about a note system that has proven to be a boon to maintaining the codebase over its lifetime (which now exceeds 20 years):

Past and Present of Haskell – Interview with Simon Peyton Jones
https://youtu.be/4RuLzL_q0zs?t=44m9s

(segment starts at 44:09)

DiTBho · « **Reply #54 on:** December 09, 2022, 08:09:02 am »

Quote from: SiliconWizard on December 08, 2022, 06:32:56 pm

People often use Doxygen or similar because they were asked to produce documentation, and documentation is notoriously hard and annoying to produce. So they're looking for the easy way out just to make their bosses happy.

"Stood" is the best tool-way to produce both code skeleton and doc by design.
Kind of compromise.

abyrvalg · « **Reply #55 on:** December 10, 2022, 10:51:29 am »

uer166, the cmd in this protocol is that byte you’ve described as a fixed 01 actually. And the “cmd” from your description is a “param” (for cmd 01 it’s a register address), check my description here: https://github.com/etransport/ninebot-docs/wiki/protocol
This protocol is quite relaxed, lost packets are not a problem at all since the data changes slowly and is cached on both sides. The original M365 ECU maintains a full 256-byte cache of BMS’s registers and treats valid cmd 01 replies as cache syncs. There is no “request-response” pairing at all, the ECU is a real time device, it just sends read requests periodically and a separate superloop “task” handles responses in it’s time slots, caching new data at offsets specified by “param” field of the reply. All other ECU tasks use cached data.

What are you trying to achieve? Porting VESC to M365?

uer166 · « **Reply #56 on:** December 11, 2022, 04:04:36 am »

Quote from: abyrvalg on December 10, 2022, 10:51:29 am

What are you trying to achieve? Porting VESC to M365?

This is a new experimental dual BLDC (so I can have 2 hub motors) of my own design. I need to know the pack status and voltages, but no attempt is made to keep any compatibility of the original M365 bits with the exception of pack. With this much feedback there's lots to think about, although TBH I already have it working well enough for my needs. No decisions (e.g. graceful torque de-rating) would be made here based on the BMS data so the stakes aren't too high.

Perkele · « **Reply #57 on:** December 11, 2022, 12:37:41 pm »

What you're describing is a fault of project lead, team lead or other person in charge to enforce rules. Most of people are slackers, they will invest minimum effort into uninteresting things. Documentation is boring.

Doxygen is only a tool, it all depends on how you use it. To be a bit controversial, I find it useful.
"Details" command is really useful, one might say the most important one, although rarely used in most of documentation.
For languages which use exceptions, a dedicated command to list and describe them is also really useful.

If making a documentation for a third party, you can make an introduction/main page with detailed desription on subpages. I find searchable and clickable HTML documents done this way more useful than a static document made in a text editor. "Just read the code, bro" might not be valid for compiled closed-source libraries. Yes, some of us still use those.

When trying to understand large codebases, graphing tool can help to figure-out the critical parts, or to compare the code with its design documentatio or class diagrams.

And at least in some editors there should be tools available to keep doxygen comments for classes and fuctions in sync and warn you if there is a mismatch. It really helps when an IDE gives has a mass rename of "refactoring" tool without a preview when executing it.

Nominal Animal · « **Reply #58 on:** December 11, 2022, 12:47:24 pm »

Quote from: Perkele on December 11, 2022, 12:37:41 pm

Doxygen is only a tool, it all depends on how you use it. To be a bit controversial, I find it useful.

Doxygen, and things like Python docstrings (visible via pydoc3), are very useful for documenting an API, for those using the module or library.

They do not help document the intent behind the code, needed for efficiently and effectively maintaining the module or library itself. In fact, they kinda-sorta steer away from it (documenting the programmer intent), which makes them less than useful for the code maintenance purposes.

Sherlock Holmes · « **Reply #59 on:** December 11, 2022, 06:09:03 pm »

Quote from: uer166 on December 05, 2022, 06:13:16 pm

Quote from: Sherlock Holmes on December 05, 2022, 04:43:40 pm
Quote from: Siwastaja on December 05, 2022, 04:37:47 pm
Quote from: Sherlock Holmes on December 05, 2022, 04:29:46 pm
You cannot counter someone's argument by simply insinuating they are incompetent, ignoring "facts" and so on.

Unlike you, nctnico is highly competent and I deeply respect his competence. He just has a few weird "blind spots" where his opinions, which are just matter of taste which I could accept as opinions, like disliking structs in networking and preferring bit manipulation reconstruction, make him close his eyes from objective facts. I'm sure he still does excellent job.

I have offered all the relevant information, unlike you; in other words, please get out of our argument, your comments are completely unhelpful. We have enough trolls on this forum already. Thank you in advance.

I see through this Sir, when you speak it is to be regarded as "facts" as "objectively true" but the words of another (who disagrees with you) are merely "opinions" as the words of one who has "closed" their eyes, has "blind spots".

You are a brave man indeed, hiding behind a keyboard and unhesitatingly insulting me and my presumed lack of competence. Is that how you speak to coworkers who dare to disagree with you? Is that really the way you'd communicate in person? I doubt it, I bet you're as nice as pie under those circumstances, nobody who bullies as you do here, would last long in the workplace.

If you have nothing technical to add to the discussion, feel free to avoid the thread. I don't mind technical discourse but this is not useful.

I did have something to add and said that here.

My other post was not directed at you but at another who's post was - IMHO - rude, demeaning and condescending.

abyrvalg · « **Reply #60 on:** December 11, 2022, 08:16:59 pm »

uer166, note one important thing original ECU does based on BMS data: regenerative brake current limiting.
M365 BMS (and all other Xiaomi/Ninebot BMSes) doesn't have an overcharge protection on the discharge port (a sudden cutoff during regen braking would be no fun). Instead of that the BMS reports an overvoltage status to the ECU (bit 9 of reg 0x30) and the ECU decreases the regen current gradually until the OV condition disappears.


EEVblog Main Site	EEVblog on Youtube	EEVblog on Twitter	EEVblog on Facebook	EEVblog on Odysee

Author Topic: Parsing UART packets / state machines / packed structs (Read 8285 times)

Share me