UART Hardware fifo


paulca:
I am devising my first UART comms protocol between 2 MCUs.

Data is JSON messages which have no real timing or fixed length. They arrive when they arrive and are of mixed length, typically around 1024-2048 bytes, and I'm using 4096-byte buffers in my code.

The protocol uses the age old technique of fixed length header containing a "start byte", then the length of the body... followed by the body.

I can use an interrupt, maybe with input compare on the UART (?) for that first byte. In that handler I would then read the next 2 bytes to get my 16 bit length.

My first question is... it takes some CPU cycles to get me into the ISR, where I have only read 1 byte. How long do I have to read the next 2 bytes before they get evicted from the hardware buffer? I assume this will be MCU specific and possibly UART peripheral specific.

I could do a blocking read of the full header, but if I get garbage on the line or in the buffers, or for some other reason the "start byte" is not the first byte I receive, I don't want to consume any more from the buffer or I will need to shuffle it myself.

My ideal anyway is to trigger on the start byte. Is it wise to try and use the low ASCII control characters (below 0x20) over UART? I have hardware flow control disabled and the SOH character looks to be exactly what I want, but I have a feeling those codes are used to control the UART hardware flow control. As that is OFF, I should be okay to use the SOH character as my start byte? I ask as the start of the SOH character in binary looks very much like a UART start bit.

Once I have the start byte, I can set a state machine flag and resubmit the transfer request for 2 bytes.
In the interrupt for that, after a sanity check, I can start a DMA transfer into the first 4k buffer. In the TransferComplete ISR I can flip the DMA buffer to the second buffer and set the state machine flags that the full buffer is ready to process.
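As a rough sketch of the header state machine described above, the parser below is fed one byte at a time and resynchronises on the start byte. Several things here are assumptions rather than part of the plan above: the start byte is SOH (0x01), the 16-bit length is sent high byte first, and the names frame_parser_t and frame_feed are made up for illustration. It copies the body byte by byte to stay portable; in the real design the body would be handed to DMA instead.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_SOH      0x01u   /* assumed start byte (ASCII SOH) */
#define FRAME_MAX_BODY 4096u   /* matches the 4k buffers mentioned above */

typedef enum { WAIT_SOH, WAIT_LEN_HI, WAIT_LEN_LO, IN_BODY } frame_state_t;

typedef struct {
    frame_state_t state;     /* zero-initialised struct starts in WAIT_SOH */
    uint16_t      expected;  /* body length taken from the header */
    uint16_t      received;  /* body bytes stored so far */
    uint8_t       body[FRAME_MAX_BODY];
} frame_parser_t;

/* Feed one received byte. Returns 1 when a complete body is available in
 * p->body (p->expected bytes long), otherwise 0. */
int frame_feed(frame_parser_t *p, uint8_t byte)
{
    switch (p->state) {
    case WAIT_SOH:
        if (byte == FRAME_SOH) {
            p->state = WAIT_LEN_HI;
        }
        return 0;                        /* anything else is discarded */
    case WAIT_LEN_HI:
        p->expected = (uint16_t)(byte << 8);   /* assumed big-endian length */
        p->state = WAIT_LEN_LO;
        return 0;
    case WAIT_LEN_LO:
        p->expected |= byte;
        p->received = 0;
        /* Sanity check: reject empty or oversized lengths and resync. */
        if (p->expected == 0 || p->expected > FRAME_MAX_BODY) {
            p->state = WAIT_SOH;
        } else {
            p->state = IN_BODY;
        }
        return 0;
    case IN_BODY:
        p->body[p->received++] = byte;
        if (p->received == p->expected) {
            p->state = WAIT_SOH;         /* ready for the next frame */
            return 1;
        }
        return 0;
    }
    return 0;
}
```

Note that a length-prefixed frame cannot tell a real SOH from an SOH inside a corrupted stream until the length check fails, so a bad header can still cost one frame's worth of bytes before the parser resyncs.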

Processing will involve parsing a JSON message, doing some basic processing (max, min, avg, etc.) and then rendering a 240x320 color LCD screen. The DMA transfer to draw the screen will then commence.

I suppose I just have to test it. I could also work out the expected bit rate and make sure I have plenty of overhead. On testing, I can make the buffers expandable, maybe give it 8 4k buffers and rotate them like a circular array queue. I can log over the other UART how many buffers it needs to "hold the line".

I expect I am massively over-reacting and the STM32F411RE I am using for this will not even break a sweat and will probably be fine with the default DMA double buffering.

Psi:
It depends what the baud rate is and the clock speed of the MCU. That will govern how many cycles you have to play with. But with an STM32F4 you're not going to have any problems reading the UART. Even a 10 MHz 8-bit AVR with no hardware buffer can read your typical UART data stream fast enough.

You just have an interrupt that copies the incoming byte (or buffer if your MCU has a hardware buffer) into an array. The interrupt also does some quick checks to identify the start/end of the packet and set some flags that you can check in main(). Once the packet is done you swap the array pointer to a new array so reading can continue while you process the previous one in main().
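A minimal sketch of that pattern, with a few assumptions that are not from the thread: the end of a packet is marked by a newline byte, the buffers are 4096 bytes, and the function names (uart_rx_byte, rx_poll, rx_release) are hypothetical. On real hardware uart_rx_byte() would be called from the UART RX interrupt with the byte read from the data register.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 4096u
#define END_MARKER  '\n'     /* assumed end-of-packet byte, for illustration */

static uint8_t buf_a[RX_BUF_SIZE], buf_b[RX_BUF_SIZE];
static uint8_t *fill = buf_a;         /* buffer the ISR is currently filling */
static size_t fill_len;
static int overrun;                   /* current packet was too long */
static uint8_t * volatile ready;      /* finished packet for main(), or NULL */
static volatile size_t ready_len;

/* Call from the UART RX interrupt with each received byte. */
void uart_rx_byte(uint8_t b)
{
    if (b != END_MARKER) {
        if (fill_len < RX_BUF_SIZE) {
            fill[fill_len++] = b;
        } else {
            overrun = 1;              /* too long: drop it at the end marker */
        }
        return;
    }
    if (!overrun && ready == NULL) {
        /* Publish the finished packet and swap to the other buffer. */
        ready_len = fill_len;
        ready = fill;
        fill = (fill == buf_a) ? buf_b : buf_a;
    }
    /* If main() still owns the other buffer, this packet is overwritten. */
    fill_len = 0;
    overrun = 0;
}

/* Call from main(): returns the finished packet, or NULL if none is ready. */
const uint8_t *rx_poll(size_t *len)
{
    const uint8_t *p = ready;
    if (p != NULL) {
        *len = ready_len;
    }
    return p;
}

/* Call from main() once the packet has been processed. */
void rx_release(void)
{
    ready = NULL;
}
```

With only two buffers, a packet that completes while main() is still busy with the previous one is dropped, which is the trade-off of this simple scheme.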

I've never had a problem where UART data reading was so taxing that I had to resort to DMA, but yeah, you can if you need to.

I've not looked into the STM32F4 but some of them have an interrupt for "buffer 1/2 full",
which is useful to read out the buffer without the risk of it overflowing.

voltsandjolts:
You might want to consider using COBS to form the data packets.
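For anyone unfamiliar with it: COBS (Consistent Overhead Byte Stuffing) rewrites the payload so it contains no 0x00 bytes, which lets 0x00 act as an unambiguous frame delimiter. A minimal encoder sketch follows; the function name cobs_encode is just for illustration, and the caller is assumed to provide an output buffer of at least len + len/254 + 1 bytes and to send the 0x00 delimiter itself after the encoded data.

```c
#include <stddef.h>
#include <stdint.h>

/* COBS-encode len bytes from in[] into out[]. Returns the encoded length.
 * The output never contains 0x00. */
size_t cobs_encode(const uint8_t *in, size_t len, uint8_t *out)
{
    size_t code_idx = 0;   /* where the current block's code byte goes */
    size_t w = 1;          /* next write position in out[] */
    uint8_t code = 1;      /* offset from the code byte to the next zero */

    for (size_t i = 0; i < len; i++) {
        if (in[i] != 0) {
            out[w++] = in[i];
            code++;
        }
        /* Close the block on a zero byte or when it reaches 254 data bytes. */
        if (in[i] == 0 || code == 0xFF) {
            out[code_idx] = code;
            code = 1;
            code_idx = w++;
        }
    }
    out[code_idx] = code;  /* close the final block */
    return w;
}
```

The receiver can then resynchronise after any corruption simply by waiting for the next 0x00, without needing a start byte or a length field to be trustworthy.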

paulca:

--- Quote from: Psi on October 05, 2022, 11:29:56 am ---I've not looked into the STM32F4 but some of them have an interrupt for "buffer 1/2 full".
Which is useful to read out the buffer without the risk of it overflowing.

--- End quote ---

It has got the half-full interrupt, it also has hardware double buffering, just requires you check which buffer is active and which is locked in the interrupt as it's already swapped them. While in the ISR you have the opportunity to readdress the inactive DMA buffer register. So you can stack buffers if 2 are not enough.

I don't think it's the processing of the UART that is my concern. It's how that interacts with the LCD display.

Take the simple approach of letting the incoming message completion "event" from the ISR process the message, update state and re-render the full screen. That will result in a huge number of floating point calculations for screen positioning etc., depending on how well the driver code is optimized to avoid needing the FPU (the STM32F411 does have a hardware FPU, though). I might be processing that screen render for "quite a while" in clock cycle terms, many tens or hundreds of thousands of cycles, before I can release the screen transfer to DMA for SPI.

I've started with 115200 baud and the SPI is about 1.2 Mbit/s. I'm just writing messages onto the UART with an ESP8266 as they arrive from the network, then raw reading them in 255-byte buffers and ... I don't see any data loss. I am only dumping strings to the LCD display and it's optimizing a lot of them out as they are off screen.

The other approach is to decouple the data frames from the screen updates and not make the screen realtime, but fire a timer interrupt aiming to achieve a frame rate of maybe 2 fps (there are no animations). The message processing can do its filtering and enriching, update state and go back to sleep. When the screen refresh timer fires, I can mutex lock the state, copy it, unlock it, render the screen and transfer it over SPI.

That limits the contended critical section to just the memcpy that snapshots the state. It will of course need to wait if a message processor is updating the state. So that presents a problem, as spinning on a mutex in an ISR is not wise or possible.
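One way around the spinning problem, sketched here under assumptions: on a single core, where the refresh timer ISR can preempt the main-loop update but not the other way round, the ISR can simply try to take the snapshot and skip that refresh if it landed in the middle of an update. The struct fields and function names are hypothetical. The atomic_signal_fence() calls are standard C11 and only stop the compiler reordering the copy around the counter updates.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef struct { float min, max, avg; } display_state_t; /* hypothetical fields */

static display_state_t live_state;
static volatile uint32_t update_seq;   /* odd while an update is in progress */

/* Called from main-loop message processing. */
void state_update(const display_state_t *s)
{
    update_seq++;                                /* odd: update started */
    atomic_signal_fence(memory_order_seq_cst);   /* compiler barrier */
    live_state = *s;
    atomic_signal_fence(memory_order_seq_cst);   /* compiler barrier */
    update_seq++;                                /* even: update finished */
}

/* Called from the refresh timer ISR. Never blocks.
 * Returns 1 and fills *out on success, 0 if an update was interrupted. */
int state_try_snapshot(display_state_t *out)
{
    if (update_seq & 1u) {
        return 0;          /* caught mid-update: skip this refresh */
    }
    *out = live_state;     /* safe: main loop cannot run while the ISR does */
    return 1;
}
```

At 2 fps a skipped refresh just means the screen updates on the next timer tick, and the heavy rendering can still be deferred out of the ISR by setting a flag for main().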

State machines then. I hate state machines, but I realise my only other choice is to use an RTOS, which would make things much simpler after the weeks of learning curve to get started with it. (It's on the list.)

paulca:
Looking at the positives.

The F411 renders the bouncing, spinning ball animation at over 20 FPS. If it can render those frames in under 50 ms, then that's not a long time in terms of UART baud.

I should also, in production builds, be able to time and fix the render time fairly accurately and change timings to suit where and when is best to do it.

Messages are very bursty, with 4 or 5 arriving near simultaneously, but message loss is not a big deal if it happens occasionally, such as a failed parity or buffer over/under run. As long as I can catch those events and "carry on".
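To put a rough number on that: at the 115200 baud mentioned earlier, and assuming 10 bits on the wire per byte (8N1 framing, which is an assumption; it would be 11 with a parity bit), a 50 ms render window is about 576 bytes of incoming data, and a 2048-byte message takes roughly 178 ms to arrive. The helper names below are made up for illustration.

```c
#include <stdint.h>

/* Number of whole bytes that can arrive during window_ms.
 * bits_per_byte is the on-wire frame size: 10 for 8N1, 11 with parity. */
uint32_t uart_bytes_in_window(uint32_t baud, uint32_t bits_per_byte,
                              uint32_t window_ms)
{
    return (uint32_t)(((uint64_t)baud * window_ms) /
                      ((uint64_t)bits_per_byte * 1000u));
}

/* Time in microseconds (rounded down) to receive nbytes. */
uint32_t uart_frame_time_us(uint32_t baud, uint32_t bits_per_byte,
                            uint32_t nbytes)
{
    return (uint32_t)(((uint64_t)nbytes * bits_per_byte * 1000000u) / baud);
}
```

For example, uart_bytes_in_window(115200, 10, 50) gives 576, so a single 4k DMA buffer comfortably covers one render at this baud rate.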
