Author Topic: UART Hardware fifo  (Read 6181 times)


Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
UART Hardware fifo
« on: October 05, 2022, 09:26:12 am »
I am devising my first UART comms protocol between two MCUs.

The data is JSON messages with no real timing or length constraints.  They arrive when they arrive and vary in length, but are around 1024-2048 bytes, and I'm using 4096-byte buffers in my code.

The protocol uses the age-old technique of a fixed-length header containing a "start byte" and then the length of the body... followed by the body.

I can use an interrupt, maybe with input compare on the UART (?), for that first byte.  In that handler I would then read the next 2 bytes to get my 16-bit length.

My first question is...  it takes some CPU cycles to get me into the ISR, where I have only read 1 byte.  How long do I have to read the next 2 bytes before they get evicted from the hardware buffer?  I assume this will be MCU-specific and possibly UART-peripheral specific.

I could do a blocking read of the full header, but if I get garbage on the line, or for some other reason the "start byte" is not the first byte I receive, I don't want to consume any more from the buffer or I will need to shuffle it myself.

My ideal anyway is to trigger on the start byte.  (Is it wise to try to use the low ASCII control characters (below 32) over UART?  I have hardware flow control disabled and the SOH character looks to be exactly what I want, but I have a feeling those codes are used for UART flow control.  As that is OFF, I should be okay to use the SOH character as my start byte?  I ask because the start of the SOH character in binary looks very much like a UART start bit.)

Once I have the start byte, I can set a state machine flag and resubmit the transfer request for 2 bytes.
In the interrupt for that, after a sanity check, I can start a DMA transfer into the first 4k buffer.  In the TransferComplete ISR I can flip the DMA buffer to the second buffer and set the state machine flags that the full buffer is ready to process.
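As a concrete reference for that header, here is a minimal sketch in C, assuming the '~' (0x7E) start byte and LSB-first length that end up being used later in the thread (the names are illustrative):

Code: [Select]
#include <stdint.h>

/* Wire layout: [0x7E start byte][length LSB][length MSB][payload ...] */
#define FRAME_START_BYTE 0x7Eu   /* '~' */

/* Reassemble the 16-bit payload length from the two bytes following the start byte. */
static inline uint16_t frame_length(const uint8_t len_bytes[2])
{
    return (uint16_t)len_bytes[0] | (uint16_t)(len_bytes[1] << 8);
}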

Processing will involve parsing a JSON message, doing some basic processing (max, min, avg, etc.) and ... rendering a 240x320 color LCD screen.  The DMA transfer to draw the screen will then commence.

I suppose I just have to test it.  I could also work out the expected bit rate and make sure I have plenty of overhead.  On testing, I can make the buffers expandable, maybe give it 8 4k buffers and rotate them like a circular array queue.  I can log over the other UART how many buffers it needs to "hold the line".

I expect I am massively over-reacting and the STM32F411RE I am using for this will not even break a sweat and will probably be fine with the default DMA double buffering.
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Offline Psi

  • Super Contributor
  • ***
  • Posts: 9930
  • Country: nz
Re: UART Hardware fifo
« Reply #1 on: October 05, 2022, 11:29:56 am »
It depends on the baud rate and the clock speed of the MCU; that governs how many cycles you have to play with. But with an STM32F4 you're not going to have any problems reading the UART. Even a 10MHz 8-bit AVR with no hardware buffer can read your typical UART data stream fast enough.

You just have an interrupt that copies the incoming byte (or buffer, if your MCU has a hardware buffer) into an array.  The interrupt also does some quick checks to identify the start/end of the packet and sets some flags that you can check in main().  Once the packet is done you swap the array pointer to a new array so reception can continue while you process the previous one in main().
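A minimal sketch of that scheme in C, with hypothetical names (uart_read_byte(), PKT_START, PKT_END) standing in for whatever the target MCU and protocol actually use:

Code: [Select]
#include <stdbool.h>
#include <stdint.h>

#define PKT_START '~'                    /* assumed framing bytes, purely illustrative */
#define PKT_END   '\n'
#define PKT_MAX   256

extern uint8_t uart_read_byte(void);     /* hypothetical: read the UART data register */

static uint8_t  buf_a[PKT_MAX], buf_b[PKT_MAX];
static uint8_t *rx_buf   = buf_a;        /* ISR accumulates into this one     */
static uint8_t *done_buf = buf_b;        /* main() processes this one         */
static volatile uint16_t rx_len, done_len;
static volatile bool     packet_ready;   /* flag polled in main()             */

void uart_rx_isr(void)                   /* called once per received byte     */
{
    uint8_t b = uart_read_byte();

    if (b == PKT_START) {                /* quick framing checks only, no parsing here */
        rx_len = 0;
    } else if (b == PKT_END) {
        uint8_t *t = rx_buf;             /* swap buffers so reception continues        */
        rx_buf   = done_buf;             /* while main() processes the finished packet */
        done_buf = t;
        done_len = rx_len;
        packet_ready = true;
        rx_len = 0;
    } else if (rx_len < PKT_MAX) {
        rx_buf[rx_len++] = b;            /* accumulate payload byte                    */
    }
}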

I've never had a problem where UART data reading was so taxing that I had to resort to DMA, but yeah, you can if you need to.

I've not looked into the STM32F4 but some of them have an interrupt for "buffer 1/2 full",
which is useful for reading out the buffer without the risk of it overflowing.
« Last Edit: October 05, 2022, 11:42:27 am by Psi »
Greek letter 'Psi' (not Pounds per Square Inch)
 

Offline voltsandjolts

  • Supporter
  • ****
  • Posts: 2297
  • Country: gb
Re: UART Hardware fifo
« Reply #2 on: October 05, 2022, 11:52:46 am »
You might want to consider using COBS to form the data packets.
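For reference, a minimal COBS encoder sketch in C (the decoder is similarly small). After COBS encoding the payload contains no zero bytes, so a single 0x00 can delimit each packet on the wire:

Code: [Select]
#include <stddef.h>
#include <stdint.h>

/* Encode src[0..len) into dst; dst needs room for len + len/254 + 1 bytes.
 * Returns the encoded length (excluding the trailing 0x00 delimiter). */
size_t cobs_encode(const uint8_t *src, size_t len, uint8_t *dst)
{
    size_t out = 1;          /* index of the next output byte          */
    size_t code_pos = 0;     /* where the current code byte will go    */
    uint8_t code = 1;        /* distance to the next zero or block end */

    for (size_t i = 0; i < len; i++) {
        if (src[i] == 0) {
            dst[code_pos] = code;   /* close the current block */
            code_pos = out++;
            code = 1;
        } else {
            dst[out++] = src[i];
            if (++code == 0xFF) {   /* maximum block length reached */
                dst[code_pos] = code;
                code_pos = out++;
                code = 1;
            }
        }
    }
    dst[code_pos] = code;
    return out;
}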
 
The following users thanked this post: paulca

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #3 on: October 05, 2022, 12:07:32 pm »
I've not looked into the STM32F4 but some of them have an interrupt for "buffer 1/2 full",
which is useful for reading out the buffer without the risk of it overflowing.

It has got the half-full interrupt, and it also has hardware double buffering; you just have to check in the interrupt which buffer is active and which is locked, as the hardware has already swapped them.  While in the ISR you have the opportunity to re-address the inactive DMA buffer register, so you can stack buffers if two are not enough.

I don't think it's the processing of the UART that is my concern.  It's how that interacts with the LCD display.

The simple approach is to let the incoming message-completion "event" from the ISR process the message, update state and re-render the full screen, which will result in a huge number of floating-point calculations for screen positioning etc., depending on how well that driver code is optimized to avoid needing the FPU (the STM32F411 has a hardware FPU though).  I might be processing that screen render for "quite a while" in clock-cycle terms, many tens or hundreds of thousands of cycles, before I can release the screen transfer to DMA for SPI.

I've started with 115200 baud and the SPI is about 1.2 Mbit/s.  I'm just writing the messages onto the UART with the ESP8266 as they arrive from the network, then raw-reading them in 255-byte buffers and ... I don't see any data loss.  I am only dumping strings to the LCD display and it's optimizing a lot of them out as they are off-screen.

The other approach is to decouple the data frames from the screen updates and not make the screen realtime, but fire a timer interrupt aiming to achieve a frame rate of maybe 2 fps (there are no animations).  The message processing can do its filtering and enriching, update the state and go back to sleep.  When the screen refresh timer fires, I can mutex-lock the state, copy it, unlock it, render the screen and transfer it over SPI.

That limits the contention window to just the memcpy that snapshots the state.  It will of course need to wait if a message processor is updating the state, and that presents a problem, as spinning on a mutex in an ISR is not wise, or even possible.
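A sketch of that snapshot idea using a very short interrupt-off critical section instead of a mutex (a common bare-metal pattern, not something settled in this thread); state_t, shared_state and render_screen() are hypothetical names, and __get_PRIMASK()/__disable_irq()/__set_PRIMASK() are standard CMSIS intrinsics:

Code: [Select]
#include <string.h>
#include "stm32f4xx.h"                     /* pulls in the CMSIS core intrinsics  */

typedef struct { float min, max, avg; } state_t;   /* hypothetical shared state   */
volatile state_t shared_state;                     /* written by message handling */

extern void render_screen(const state_t *s);       /* hypothetical renderer       */

void render_timer_tick(void)
{
    state_t snapshot;

    uint32_t primask = __get_PRIMASK();    /* remember current interrupt state    */
    __disable_irq();                       /* keep this window as short as the... */
    memcpy(&snapshot, (const void *)&shared_state, sizeof snapshot);  /* ...memcpy */
    __set_PRIMASK(primask);                /* restore; re-enables only if it was  */

    render_screen(&snapshot);              /* the long work runs outside the lock */
}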

State machines then.  I hate state machines, but I realise my only other choice is an RTOS, which would make things much simpler, after the weeks of learning curve to get started with it.  (It's on the list.)

"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #4 on: October 05, 2022, 12:18:12 pm »
Looking at the positives.

The F411 renders the bouncing, spinning ball animation at over 20 FPS.  If it can render those frames in under 50ms, then that's not a long time in terms of UART baud.

I should also, in production builds, be able to time the render fairly accurately and change timings to suit where and when is best to do it.

Messages are very bursty, with 4 or 5 arriving near-simultaneously, but message loss is not a big deal if it happens occasionally, such as on a failed parity check or a buffer over/underrun, as long as I can catch those events and "carry on".
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Offline tellurium

  • Regular Contributor
  • *
  • Posts: 226
  • Country: ua
Re: UART Hardware fifo
« Reply #5 on: October 05, 2022, 12:53:43 pm »
The alternative approach is to pre-allocate a buffer for the UART JSON messages, say 4k max.
Then the UART reader can read byte-by-byte, appending to that buffer.  Either polling or an interrupt would work just fine.
When a marker (a newline, for example) appears, the JSON message is fully buffered - you can parse it, handle it, and clear the receive buffer.

A newline is a good marker because it cannot appear unescaped inside JSON strings.
Also, the whole comms will be human readable.
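A minimal sketch of that loop in C; uart_getc() and handle_json() are hypothetical placeholders:

Code: [Select]
#include <stddef.h>
#include <stdint.h>

#define RX_MAX 4096

static char   rx_buf[RX_MAX];
static size_t rx_len;

extern int  uart_getc(void);                 /* next byte, or -1 if none available */
extern void handle_json(const char *msg, size_t len);

void uart_poll(void)                         /* call from the main loop or a byte ISR */
{
    int c = uart_getc();
    if (c < 0)
        return;

    if (c == '\n') {                         /* newline marks the end of one message  */
        rx_buf[rx_len] = '\0';
        handle_json(rx_buf, rx_len);
        rx_len = 0;                          /* clear the buffer for the next message */
    } else if (rx_len < RX_MAX - 1) {
        rx_buf[rx_len++] = (char)c;
    } else {
        rx_len = 0;                          /* overflow: drop and resynchronise      */
    }
}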
Open source embedded network library https://mongoose.ws
TCP/IP stack + TLS1.3 + HTTP/WebSocket/MQTT in a single file
 

Online Siwastaja

  • Super Contributor
  • ***
  • Posts: 8168
  • Country: fi
Re: UART Hardware fifo
« Reply #6 on: October 05, 2022, 01:44:37 pm »
Assuming the baud rate is significantly slower than the CPU clock, just processing byte-by-byte in the ISR tends to be the simplest solution. Just try not to do actual JSON parsing in such a one-byte handler.

For example, assuming one megabaud, one start bit and one stop bit, ISRs can occur every 10us. At a CPU clock of 50MHz, that is 500 CPU clock cycles. Given approximately 20 cycles for ISR entry and exit, the remaining 480 cycles are plenty for simple processing like setting a counter, writing to an array, checking whether enough bytes have been written, and so on.

What you need to look at is: if you have other interrupt sources with the same or higher priority enabled, what is the worst-case execution time for those? They can delay your UART ISR from getting started, so you need to verify the total time is still below the interval between two bytes arriving from the peripheral.

DMA is usually not very helpful because you still need to parse the first few bytes to get the transfer length, and your timing must work out for the header. If it does, the same ISR approach will then work out for the complete message. DMA can of course leave more average CPU time for other tasks.

One stupidly trivial thing I have sometimes done is, whenever I need a periodic timer handler (for example, a 1ms or 100us tick) anyway, to pick a UART byte rate slower than that, so I don't need a UART interrupt at all; I just check whether a byte is available in the periodic timer handler. Because of how the UART works, it physically cannot deliver bytes faster than this.
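A sketch of that tick-handler polling for the STM32F4 parts discussed in this thread (newer STM32 series use ISR/RDR instead of SR/DR); byte_received() is a hypothetical per-byte handler, and the byte rate really must be slower than the tick rate for this to be lossless:

Code: [Select]
#include "stm32f4xx.h"

extern void byte_received(uint8_t b);    /* hypothetical per-byte handler              */

void tick_1ms_handler(void)              /* must fire more often than bytes can arrive */
{
    if (USART2->SR & USART_SR_RXNE) {    /* a byte is waiting in the data register     */
        uint8_t b = (uint8_t)USART2->DR; /* reading DR clears RXNE                     */
        byte_received(b);
    }
}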
 

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #7 on: October 05, 2022, 02:09:03 pm »
Well, testing the protocol (sending side, ESP8266), I came face to face with the Arduino framework's "Hey kids, it's OK, we don't tell you we only write/print uint8_t, but, just to keep you from seeing those pesky compiler warnings, we will just cast anything you send to uint8_t for you.  You're welcome."

        inline size_t write(unsigned int t) { return write((uint8_t)t); }

Nice.  It took quite a few WTF moments before I found that.  I suppose it was better than what Arduino framework did with write( (uint8_t*)&a16bitInt, 2) ... LOL  Nope!

At least now that I have found it I can split my 16-bit word myself.  It's just so annoying that they hide it all under polymorphic, overloaded, inherited "Print" class code.
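The fix is just splitting the 16-bit length into explicit bytes before writing, LSB first to match the receiving end; a trivial sketch:

Code: [Select]
#include <stdint.h>

/* Split a 16-bit value into two explicit bytes, low byte first,
 * so nothing relies on the framework's silent uint8_t cast. */
static inline void put_u16_le(uint16_t v, uint8_t out[2])
{
    out[0] = (uint8_t)(v & 0xFFu);   /* LSB first, matching the receiver */
    out[1] = (uint8_t)(v >> 8);
}

On the Arduino side the two bytes can then go out as two single-byte write() calls, or one write(out, 2).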


The funny thing is..
write( (uint8_t*)&a16bitInt, 2)
On the STM compiler side gave me a warning about ... basically being a dick.
On the Arduino framework side, it went, "Sure, whatever dude." and then did something completely nuts.
« Last Edit: October 05, 2022, 02:11:08 pm by paulca »
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #8 on: October 05, 2022, 02:20:18 pm »
Well for now I have muted the ESP8266 networking code completely.  I constructed a known static message and am sending that once a second.  Basically a test harness.

I am seeing issues with just "block reads" straight away.  It looks like the UART buffer alignment is screwed up or something.

So I read 1 byte.  Check if it's  "~" and loop until it is.
Then I read 2 bytes and examine them in the debugger and ... I find they contain \n~, which is the last byte of the previous message and the previously read single byte.  They're not even returned at the same memory address; they are individual, separate buffers.  It's as if the memcpy out of the UART buffers is 32-bit aligned and I'll keep getting repeats until I read at least 4 bytes?

I'll keep digging.
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6239
  • Country: fi
    • My home page and email address
Re: UART Hardware fifo
« Reply #9 on: October 05, 2022, 03:37:58 pm »
I'd use a stream format, where each message starts with a reserved start byte, followed by the data (which can contain any 8-bit code points except the reserved ones), followed by a reserved checksum byte, followed by a DJB2 XOR checksum using, say, 7-bit codes in the 128-255 range (if not reserved), followed by a final reserved end byte.  Data between an end byte and a start byte is ignored, and can be used as a heartbeat/keepalive signal.

This lets one parse the data while it is being received, or buffer it.  With reserved start and end bytes, a single linear pass over a circular buffer will tell you exactly where the messages are in the buffer; you only need to know the index of the most recently received byte to determine in which order they were received.  Many MCUs have a comparator interrupt, so that when the reserved end byte is received, an interrupt is fired; this is useful for marking the input buffer (and setting RTS when the buffer is close enough to full, so that existing data is not overwritten).  For binary data, you'll need a fourth reserved byte to use as an escape code.  With at most three-byte escape sequences, your messages have maximum overhead of 33%, and that only in very specific and rare cases, and a very simple buffer parser/message copier can handle both de-escaping and checksumming.
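One possible reading of the checksum part (purely illustrative): a DJB2-XOR hash over the payload, transmitted as bytes that each carry 7 bits of the hash and sit in the 128..255 range so they can never collide with the reserved low framing bytes:

Code: [Select]
#include <stddef.h>
#include <stdint.h>

static uint32_t djb2_xor(const uint8_t *data, size_t len)
{
    uint32_t h = 5381u;
    while (len--)
        h = (h * 33u) ^ *data++;     /* classic DJB2 with XOR mixing */
    return h;
}

/* Emit the 32-bit hash as 5 bytes, 7 bits per byte, high bit always set. */
static void checksum_bytes(uint32_t h, uint8_t out[5])
{
    for (int i = 0; i < 5; i++) {
        out[i] = 0x80u | (uint8_t)(h & 0x7Fu);
        h >>= 7;
    }
}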

I probably would extend the format so that it begins with a one to four character identifier, with the other end responding with an acknowledgement or resend request, when a transmission error occurs.  It depends on the needs.

Yes, indeed: I do prefer to overengineer things.  Here, I want the underlying transport to be reliable; at minimum, detect when a transmission error occurs, so my code won't operate on faulty inputs.
 

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #10 on: October 05, 2022, 04:37:25 pm »
Well.  I brushed off the old USB logic analyser just so I could be 100% sure I am sending the data correctly.

I am.

7E 2E 00 54 45 53

~   (0x7E)  start byte
46  (0x2E)  length LSB
0   (0x00)  length MSB
T   (0x54)
E   (0x45)
S   (0x53)

That's my start byte '~' and a 16-bit length, LSB first (0x002E = 46), then the start of the message:  TES(T_TOPIC/+/#/TOPIC)

That's not what the F411 reads though.
It reads:

~\n~

OR

~\0~
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Online PCB.Wiz

  • Super Contributor
  • ***
  • Posts: 1535
  • Country: au
Re: UART Hardware fifo
« Reply #11 on: October 05, 2022, 10:39:07 pm »
My first question is...  it takes some CPU cycles to get me into the ISR, where I have only read 1 byte.  How long do I have to read the next 2 bytes before they get evicted from the hardware buffer?  I assume this will be MCU-specific and possibly UART-peripheral specific.

Data only gets 'evicted' from the hardware when there is no room left for more data in the FIFOs (i.e. the emptying rate is too slow).
At extreme speeds (several megabaud) I have written code to re-check for another byte just before exiting the ISR, which helps nudge up the sustained peak speed, but that's a rare case.

I've started with 115200 baud and the SPI is about 1.2 Mbit/s.  I'm just writing the messages onto the UART with the ESP8266 as they arrive from the network, then raw-reading them in 255-byte buffers and ... I don't see any data loss.  I am only dumping strings to the LCD display and it's optimizing a lot of them out as they are off-screen.
115200 baud is quite slow, that's roughly one byte every 87us.
« Last Edit: October 06, 2022, 04:37:05 am by PCB.Wiz »
 

Offline coppice

  • Super Contributor
  • ***
  • Posts: 8637
  • Country: gb
Re: UART Hardware fifo
« Reply #12 on: October 05, 2022, 11:25:15 pm »
115200 baud is quite slow, it's just over a byte a millisecond.
It's just over 10 bytes per millisecond.
 

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #13 on: October 06, 2022, 09:18:54 am »
I gave up on it last night and ended up starting a new project with just a UART read in it.

The first calls to UART receive gave me my '~' start byte and then the garbage ~\n~.

Then I realised I hadn't actually connected the UART pins.  So there was zero signal!  So I can only conclude that the HAL libraries are bizarrely borked, or I'm doing something hideously wrong.

Why, after a fresh boot, when I call receive to get 1 byte, and then 4 bytes, does it return immediately with data that had been in the chip for hours while it wasn't running!

I must be missing something basic, like a start up call or something, I'll have another go today.

I may also add a buffer drainer at the start to read 1024 bytes and throw them away before starting to read.
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #14 on: October 06, 2022, 11:38:17 am »
So.  It's time to trawl the reference manual. 

If I call the HAL_UART_Receive function for 1 byte, I get garbage, which jumps around but ultimately just repeats the same garbage.

If I call the HAL_UART_Receive function for 4 bytes, I get a slightly corrupted stream of data.

If I call the HAL_UART_Receive function for 50 bytes, I get a nice sane data stream.

It's possible that the overhead of using the high-level HAL function for 1-byte reads (maybe even 4-byte reads) is too much, and it's dropping data and getting out of sync?

The fact that a 1-byte read seems to loop the same garbage makes me suspicious.  The code at the bottom of the receive function spins on the ready-to-read flag and ultimately reads a 32-bit register and truncates it to an 8-bit uint.  I couldn't quite determine what causes the "DR" register to advance and by how much; I assume the "ready flag" spinner waits for another byte.  I suppose the 32-bit register is just because on ARM all registers are 32 bits, probably including peripheral data registers for 8-bit data.

But something somewhere is broken.

I'm going to retry this with the interrupt-based approach to see if I can get single-byte reads working.  Then I'll try banging the registers myself to see what is actually going on.  I expect I'll be staring at bus architecture diagrams and datasheets for a while.
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #15 on: October 06, 2022, 06:23:27 pm »
The problem came down to taking too long writing the last buffer out to the debug port: by the time I was returning to receive on the first UART again it had overrun its buffer, and that resulted in the state machine coming apart a little and deadlocking in some cases.  I think you are meant to handle the overrun error in the error callback by emptying the buffer and resetting it back to the ready state.

Anyway, I gave up on it and implemented things with interrupts instead.  I had far fewer problems, unsurprisingly.

Well this part of the protocol works now at least.

Code: [Select]
/* Assumed file-scope definitions (not shown in this snippet):
 *   volatile protocol_state_t protocol_state = INIT;  // the state enum
 *   uint8_t  buffer[4096];                            // shared RX buffer
 *   uint16_t pktLength;                               // file scope so PAYLOAD_READ_OK can still see it
 */

void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
  if (huart == &message_uart) {
      switch (protocol_state) {
      case INIT:
          break;                               // no-op
      case START_BYTE_WAIT:
          if (buffer[0] == START_BYTE) {
              protocol_state = START_BYTE_READ_OK;
          } else {
              // Not the start byte: re-arm for another single byte.
              HAL_UART_Receive_IT(&message_uart, buffer, 1);
          }
          break;
      case START_BYTE_READ_OK:
          break;                               // no-op
      case LENGTH_READ_WAIT:
          protocol_state = LENGTH_READ_OK;     // we think
          break;
      case LENGTH_READ_OK:
          break;                               // no-op
      case PAYLOAD_READ_WAIT:
          protocol_state = PAYLOAD_READ_OK;
          break;
      case PAYLOAD_READ_OK:
          break;                               // no-op
      default:
          debugPrint("PROTOCOL ERROR 2");
          break;
      }
  }
}

/**
  * @brief  The application entry point.
  * @retval int
  */
int main(void)
{
  HAL_Init();
  SystemClock_Config();
  MX_GPIO_Init();
  MX_USART6_UART_Init();
  MX_USART2_UART_Init();

  while (1)
  {
      switch (protocol_state) {
      case INIT:
          // Arm a single-byte receive and wait for the start byte.
          protocol_state = START_BYTE_WAIT;
          HAL_UART_Receive_IT(&message_uart, buffer, 1);
          break;

      case START_BYTE_WAIT:
          break;                               // no-op, the ISR advances the state

      case START_BYTE_READ_OK:
          // Start byte seen: read the two length bytes.
          protocol_state = LENGTH_READ_WAIT;
          HAL_UART_Receive_IT(&message_uart, buffer, 2);
          break;

      case LENGTH_READ_WAIT:
          break;                               // no-op

      case LENGTH_READ_OK:
          // 16-bit length, LSB first (matches the analyser capture above).  TODO test!
          protocol_state = PAYLOAD_READ_WAIT;
          pktLength = (uint16_t)buffer[0] | (uint16_t)(buffer[1] << 8);
          HAL_UART_Receive_IT(&message_uart, buffer, pktLength);
          break;

      case PAYLOAD_READ_WAIT:
          break;                               // no-op

      case PAYLOAD_READ_OK:
          debugPrintB(buffer, pktLength);
          protocol_state = INIT;
          break;

      default:
          debugPrint("PROTOCOL ERROR 1");
          break;
      }
  }
}
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4199
  • Country: us
Re: UART Hardware fifo
« Reply #16 on: October 06, 2022, 10:13:15 pm »
Yeah, I really don't understand HAL UART receive functions that take a buffer address and a count.  That might make sense for higher-level APIs, but at the bottom level it seems like either a gratuitous extra copy or, without a more complex buffer-chaining scheme of some kind, an invitation to losing data.  A traditional interrupt-driven circular buffer seems more appropriate, or a considerably more complex DMA-based scheme with chaining...
(And it's not just ST; the (bad) idea seems popular with many vendors, including the ARM CMSIS-Driver spec.)
 
The following users thanked this post: newbrain

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #17 on: October 06, 2022, 10:28:37 pm »
Well, with those hurdles out of the way I went ahead and created a kind of ring-of-buffers system.

Not the way you are thinking though, I mean a ring of buffers, not a ring buffer.

Look, it's a Queue.

Basically I have an array of buffers. 

The message-reading state machine requests a new buffer in the INIT state.  If it doesn't get one it gets NULL, and it does not advance the state.  (This is extremely unlikely, but would probably result in data loss.)

After it's gone through its full life cycle and back to INIT, it requests another write buffer.

On requesting a new write buffer the count of ready buffers is incremented.

A second state machine sitting in INIT will be trying to get a pointer to the next ready buffer, and while that's NULL it will not advance.

As soon as it gets a (now called "locked") buffer, it is free to do with it as it chooses.

It currently writes it to the debug serial, and the debug serial TX-complete callback returns it back to INIT.
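A minimal sketch of that ring-of-buffers queue, with illustrative names; on a single core the ready_count updates still need care (one writer per direction, or a brief interrupt-off section) because the two state machines can preempt each other:

Code: [Select]
#include <stddef.h>
#include <stdint.h>

#define N_BUFS   4
#define BUF_SIZE 4096

static uint8_t  pool[N_BUFS][BUF_SIZE];
static volatile uint8_t write_idx;     /* next buffer handed to the UART reader  */
static volatile uint8_t read_idx;      /* next buffer handed to the processor    */
static volatile uint8_t ready_count;   /* filled buffers waiting to be processed */

/* Reader state machine (INIT): get an empty buffer, or NULL if the pool is full. */
uint8_t *acquire_write_buffer(void)
{
    return (ready_count < N_BUFS) ? pool[write_idx] : NULL;
}

/* Reader: message complete, publish the buffer to the processor. */
void commit_write_buffer(void)
{
    write_idx = (write_idx + 1u) % N_BUFS;
    ready_count++;
}

/* Processor state machine (INIT): oldest ready ("locked") buffer, or NULL. */
uint8_t *acquire_ready_buffer(void)
{
    return ready_count ? pool[read_idx] : NULL;
}

/* Processor (e.g. the debug TX-complete callback): recycle the buffer. */
void release_ready_buffer(void)
{
    read_idx = (read_idx + 1u) % N_BUFS;
    ready_count--;
}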

The next step is for that debug serial print to be replaced with actually updating the model and rendering the memory buffer for the screen if appropriate, then launching the DMA transfer for the LCD.

...

When I say "Queue", it's going to get a lot more funky, because I need the queue to overwrite the oldest data if it becomes full.  However, it has to bypass the "locked buffer" in use.  If I just skip it, then I lose chronology.  That probably isn't a problem in the use case, but it's janky unless I solve it.  I could end up with a linked list of buffers and reorder them so that the reader always gets the "next" oldest data.  Basically a FIFO with one of the buffers allowed to be out of sync temporarily.

You might ask, why not just have 2 buffers, one being used, one being overwritten until I'm ready.  Trouble is I'm not just receiving messages on one "topic"; a full render might include a dozen different metrics, all in separate topics and separate messages at separate times.

I'm having fun now :D
« Last Edit: October 06, 2022, 10:31:34 pm by paulca »
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Online Siwastaja

  • Super Contributor
  • ***
  • Posts: 8168
  • Country: fi
Re: UART Hardware fifo
« Reply #18 on: October 07, 2022, 09:47:07 am »
I like the following:

Have two buffers: buf_a, buf_b. Have two pointers: accumulation = &buf_a, processing = &buf_b. A higher-priority ISR writes into *accumulation. Once it detects the message is finished, it swaps the pointers and uses NVIC->STIR to trigger a lower-priority interrupt, which then accesses *processing to do parsing / whatever. All you need to do is make sure the processing function finishes before the next full dataset is collected.

Of course, use a volatile bool processing_running which you set before triggering the processing function and clear at the end of it. Then, before writing to STIR, you can check whether it has returned to zero as it should have; if not, report an error. In any case, this is not a substitute for actually proving the worst-case timing to yourself.

When you have different types of messages, they obviously take different times to process, and also different times to accumulate. The worst sequence is a slow-to-process message immediately followed by the shortest possible message (which might even be malformed; the key metric is: what is the shortest message the accumulation ISR will trigger processing for?). Make sure that works out, and that everything else works out too.
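A sketch of that handoff on an STM32F4/CMSIS target; PROCESS_IRQn stands for whatever otherwise-unused interrupt gets repurposed (and enabled in the NVIC at a low priority) as the processing vector, and parse_message()/report_overrun_error() are illustrative:

Code: [Select]
#include <stdbool.h>
#include "stm32f4xx.h"

#define PROCESS_IRQn  EXTI4_IRQn        /* example: an otherwise unused, low-priority IRQ */

extern void parse_message(const uint8_t *msg);   /* hypothetical parser          */
extern void report_overrun_error(void);          /* hypothetical error reporter  */

static uint8_t  buf_a[4096], buf_b[4096];
static uint8_t *accumulation = buf_a;   /* high-priority UART ISR writes here    */
static uint8_t *processing   = buf_b;   /* low-priority handler parses this      */
static volatile bool processing_running;

void message_complete(void)             /* called from the UART RX ISR           */
{
    uint8_t *t = accumulation;          /* swap the two buffers                  */
    accumulation = processing;
    processing   = t;

    if (processing_running)             /* previous message not parsed yet:      */
        report_overrun_error();         /* worst-case timing has been violated   */

    processing_running = true;
    NVIC->STIR = (uint32_t)PROCESS_IRQn;   /* software-trigger the low-prio ISR  */
}

void EXTI4_IRQHandler(void)             /* the repurposed low-priority handler   */
{
    parse_message(processing);
    processing_running = false;
}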

The same idea can easily be extended from buffer swapping into a FIFO, but the returns are diminishing. With double buffering, you decouple the processing timing from the receiving timing so that the processing does not need to happen within one BYTE interval, but within one MESSAGE interval, which can be, by design, orders of magnitude longer. By introducing a FIFO, you still have to process each message within one message interval; it's just that if the processing time varies, it's now more about average than worst-case timing. But even if you have a 16-element FIFO, it's entirely possible to get 16 worst-case messages in succession, in which case it's no better than simpler double buffering; just more prone to a false sense of security, and harder to prove.

A FIFO is useful when you can easily prove, by design, that there are bursts of messages but they are always followed by some silence. And because a FIFO is Just A Few Lines Of Code and offers a familiar and simple interface (one process pushes, the other pops), it's not a bad idea, even if it's just a few elements long.
« Last Edit: October 07, 2022, 09:54:26 am by Siwastaja »
 

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #19 on: October 07, 2022, 10:25:40 am »
Interesting, thanks.  You raise a thorny point.  I am not attempting any atomic protection on the state machine enums or buffer pointers.  A statement as simple as buffer++ could be interrupted part way through.

However, I am also aware that I only have a single core, so I can't have truly simultaneous variable/register access, and I'm just working to avoid contention zones.  The first "wow, good catch!" was when I attempted to re-use my pkt_length variables between the reader and processor state machines.  That would have been messy.

I lay awake last night trying to work out how to actually process this data, and how exactly I am supposed to render a screen that's 320x240x24 ... a frame buffer would be about 75K even for 8-bit colour!

When I thought about both: firstly, 90% of the data coming in should be got rid of ASAP.  Out of the entire JSON blob, I only want two key:value pairs, ironically called "key" and "value".

I had to slap my Java self, who blurted out "HashMap".  No.  At some point I have to drop the generic approach and actually state, somewhere, what data I expect and what I am going to do with it, in concrete form.

So a key like, say, "livingroom_media_power_power" (don't ask about the repeat) does not need me to start allocating variable-length strings and storing the MQTT version of the key.  I can string-match it and map it to an int via a lookup table, stored right along with its display text and even a pointer to its render function.
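A sketch of that lookup table, with illustrative names (the keys are the ones mentioned in this thread):

Code: [Select]
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct display_item display_item_t;
typedef void (*render_fn)(const display_item_t *item);

struct display_item {
    const char *key;          /* MQTT key as received in the JSON       */
    const char *label;        /* text shown on the LCD                  */
    float       value;        /* most recent value                      */
    bool        dirty;        /* needs re-rendering on the next tick    */
    render_fn   render;       /* draws just this element's screen area  */
};

extern void render_power(const display_item_t *item);    /* hypothetical */
extern void render_energy(const display_item_t *item);   /* hypothetical */

static display_item_t items[] = {
    { "livingroom_media_power_power", "Media power", 0.0f, false, render_power  },
    { "mainsTotal",                   "Mains total", 0.0f, false, render_energy },
};

/* Called by the message processor with the extracted key/value pair. */
void update_item(const char *key, float value)
{
    for (size_t i = 0; i < sizeof items / sizeof items[0]; i++) {
        if (strcmp(items[i].key, key) == 0) {
            items[i].value = value;
            items[i].dirty = true;     /* the render "thread" picks it up later */
            return;
        }
    }
}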

Which brings me to how to render such a large display with so little memory and processing power.  The answer is... don't.

Don't render the whole screen.  Stack it and selectively render or re-render portions as they need to be updated.

This brings the happy path of a message down to...

(I'm going to call them threads, as they kind of are jury-rigged poor-man's threads.)
The Message_UART "thread" finishes the payload and advances the state machine, which hands the buffer off and takes a new write buffer.
The processing "thread" accepts the freshly written buffer, extracts and resolves the key in a lookup table, updates the struct for its display element with the new value, etc., sets the "dirty" flag to true and moves on.

At this point the state is back to a fixed size with pre-allocated memory only.  Phew.

A low-priority 100ms(?) timer fires the render "thread", another state machine, which will render ONE dirty display struct per iteration until no more remain.

It's really fun doing this "poor man's threads" thing; it's a bit like programming multi-threaded code in Python!

Yes, I know, it will be all well and good, singing and dancing until I start to get odd concurrency errors and my mood will diminish in hours.
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #20 on: October 07, 2022, 10:37:27 am »
On performance testing...  and testing how it behaves when you hit it with garbage ....

That happens as part and parcel of my test rig.  Anytime I restart the test harness (including the ESP8266 MQTT client), the MQTT server continues to queue MQTT messages on my topics until the heartbeats time out after a minute.

So, yes..  LOL, as soon as the STM32 boots, the ESP8266 starts sending it a continuous stream of all the queued messages and the UART is pinned.  I have not rigorously tested the output on the debug port, but it remains text and the newlines remain in the right places.  My buffer-switch LEDs go dim.

Reprogramming the ESP8266 tests how the STM32 code behaves when hammered with garbage, because the ESP8266's Serial0 is also used, at 1M baud, for flashing the 8266.  The STM32 is not amused, but nothing comes through on the payload buffers, just a few firings of the error handlers, and it picks up straight away on the first real message.  I do need more protection to stop garbage that happens to contain a '~' from generating a payload of random length.  I do check the length, empty the buffer, truncate it and then drop it at the end of the state machine (don't advance the buffer).

"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Online Siwastaja

  • Super Contributor
  • ***
  • Posts: 8168
  • Country: fi
Re: UART Hardware fifo
« Reply #21 on: October 07, 2022, 10:41:01 am »
IMHO, the elephant in the room is: do you really need such a heavyweight format as JSON on this UART link? Can't whatever is communicating with the MCU talk a simpler ASCII protocol, or, even better, binary?
 

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #22 on: October 07, 2022, 10:55:51 am »
IMHO, the elephant in the room is: do you really need such a heavyweight format as JSON on this UART link? Can't whatever is communicating with the MCU talk a simpler ASCII protocol, or, even better, binary?

Absolutely, and I am currently deliberately trying to stuff it down the UART pipe.

I don't really want the ESP8266 to need to know what the STM32 is doing.  I'm trying to keep it as a re-usable component, which you configure via WiFi as normal, and which accepts MQTT topic subscriptions over UART and does the needful.

It is rolling around in my head, that a simple "JSON filter/selector" config could list top level elements to keep while stripping everything else.  So a message like:

{"key": "mainsTotal", "name": "Mains Total", "shortName": "MT", "value": 9197730.8, "type": "float", "units": "Wh", "timestamp": 1665095673.7407892}

With a selector config of:

key,value

Gets sent on the wire as just:

{"key": "mainsTotal", "value": 9197730.8}

If I was going to (and I might) make the MQTT module more aligned with my own systems I would have it validate the timestamp for "staleness" for a start and maybe include the "name" and maybe "units".

There already is a Fork of WifiManager w/ PubSubClient out there.  I could technically fork that and add the UART command/message channels and release it back to the wild.  Then I can branch it for my own more custom work.
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #23 on: October 07, 2022, 11:04:47 am »
On binary and even bare-ASCII protocols: I know they are common in the embedded space for size/overhead considerations.

However, when I started my automation system I did indeed start with a bare ascii protocol.  Messages like:
KEY|NAME|UNITS|VALUE^

I moved to JSON after I spent a few evenings ruminating over how to design it better and make it more flexible.  I wanted to encapsulate the messages and define them, and force structure on their use, etc.

After too much debate over the need to have common definitions of what are effectively ASCII structs, and having to keep updating the various and expanding number of components involved in the "data fabric", JSON seemed worth the cost, even on MCUs.  Granted, I was using ESP8266s and ESP32s, which are in no way shy on memory or power.
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Offline paulcaTopic starter

  • Super Contributor
  • ***
  • Posts: 4031
  • Country: gb
Re: UART Hardware fifo
« Reply #24 on: October 07, 2022, 11:07:43 am »
A bigger elephant in the room is...

An ESP32, with its dual cores and RTOS, would make this all work on a single MCU without any UART, and it would probably be capable of frame-buffer rendering the display continuously on the second core.

The reason for doing it this way is that I want to learn the STM32, and I also want to learn inter-MCU communications.  If I'm going to expand to (hopefully) CODEC / DSP boards, I had best get comfortable with ARM cores and high-speed peripherals.
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

