Author Topic: FreeRTOS and TLS - task priority problem  (Read 2795 times)

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3701
  • Country: gb
  • Doing electronics since the 1960s...
FreeRTOS and TLS - task priority problem
« on: October 21, 2023, 07:59:55 pm »
I have dealt with this issue in various other parts of this project but here I am not seeing an obvious solution and wonder if anyone can think of something.

I have TLS running, once a minute. It runs at a low priority, but grabs the CPU totally for up to 3 seconds.

I have a data copy process running between two serial ports (each having interrupt driven tx and rx buffers, size 1k). This runs at a high priority because  I don't want data loss due to the (baud rate limited) data input filling up the rx buffer faster than I can empty it. This runs in a loop with osDelay(10) so every 10ms it yields to the RTOS and allows other tasks to run (tasks of all priorities). The rest of the product has been written to yield to RTOS when waiting, so everything runs nicely.

The problem is that the 3 sec TLS activity inevitably causes serial data loss unless I have an interrupt driven rx buffer big enough for 3 secs at max baud rate, which is too much RAM. This happens because the osDelay(10) lets in any task for as long as it wants.

Apart from diving into TLS and inserting osDelay(1) in there somewhere, I can't see an obvious fix. TLS is a huge chunk of code...

I have a global flag TLS_active which can be used to stop doing something which you can expect will get buggered up by TLS.

Pre-emption is enabled in FreeRTOS, and this rather complex product does all work great.

Thank you for any suggestions.
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19522
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: FreeRTOS and TLS - task priority problem
« Reply #1 on: October 21, 2023, 08:37:56 pm »
Priorities often cause more problems than they solve.

I'm a fan of the half-sync-half-async design pattern.

Generally I prefer any i/o to cause an interrupt. The ISR captures the information in the event and puts it in a FIFO. The background task is a busy loop which waits until there is event info in that FIFO, grabs it and processes it to completion. That completion might involve putting another event in that FIFO.
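To make the ISR-to-FIFO-to-loop idea concrete, here is a minimal sketch of a single-producer, single-consumer event FIFO. The names (`evt_fifo_t`, `evt_fifo_put`, `evt_fifo_get`) and the depth are made up for illustration, and it assumes exactly one ISR writes and one loop reads on a single-core MCU; it is not taken from any particular project.

```c
#include <stdbool.h>
#include <stdint.h>

#define EVT_FIFO_SIZE 8u            /* hypothetical depth; must be a power of two */

typedef struct {
    volatile uint32_t head;         /* advanced only by the producer (the ISR) */
    volatile uint32_t tail;         /* advanced only by the consumer (the loop) */
    uint8_t slots[EVT_FIFO_SIZE];
} evt_fifo_t;

/* Number of events waiting; unsigned wrap-around keeps the difference valid. */
uint32_t evt_fifo_count(const evt_fifo_t *f)
{
    return f->head - f->tail;
}

/* ISR side: returns false (event dropped) if the FIFO is already full. */
bool evt_fifo_put(evt_fifo_t *f, uint8_t evt)
{
    if (evt_fifo_count(f) == EVT_FIFO_SIZE) {
        return false;
    }
    f->slots[f->head % EVT_FIFO_SIZE] = evt;
    f->head++;                      /* publish only after the slot is written */
    return true;
}

/* Background-loop side: returns false if nothing is waiting. */
bool evt_fifo_get(evt_fifo_t *f, uint8_t *evt)
{
    if (evt_fifo_count(f) == 0u) {
        return false;
    }
    *evt = f->slots[f->tail % EVT_FIFO_SIZE];
    f->tail++;                      /* free the slot only after it is read */
    return true;
}
```

Because each index has a single writer, no interrupt masking is needed in this arrangement. With more than one producer or consumer that no longer holds and the updates would need protecting.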
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online mikerj

  • Super Contributor
  • ***
  • Posts: 3240
  • Country: gb
Re: FreeRTOS and TLS - task priority problem
« Reply #2 on: October 21, 2023, 08:40:38 pm »
Quote
I have TLS running, once a minute. It runs at a low priority, but grabs the CPU totally for up to 3 seconds.

Quote
Pre-emption is enabled in FreeRTOS, and this rather complex product does all work great.

This doesn't make a lot of sense, why is your low priority task preventing higher priority tasks from running?  Are you disabling the scheduler in the TLS task e.g. do you have an extremely long critical section?

Quote
This runs in a loop with osDelay(10) so every 10ms it yields to the RTOS and allows other tasks to run (tasks of all priorities)

This doesn't sound quite right either, osDelay(10) will yield for 10ms, not necessarily every 10ms unless the execution time of the task is negligible.
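The difference between "delay for 10 ms" and "run every 10 ms" can be shown with a small tick-counting model. This is only a sketch: the function names are invented, time is counted in abstract ticks, and the second function models the absolute-deadline style of delay (what FreeRTOS offers as vTaskDelayUntil) under the assumption that the work always finishes before the next deadline.

```c
#include <stdint.h>

/* Tick at which pass number n starts when each pass does work_ticks of
 * work and then sleeps for a fixed delay_ticks (the osDelay(10) pattern).
 * The real period is work + delay, so the loop drifts. */
uint32_t start_tick_relative(uint32_t n, uint32_t work_ticks, uint32_t delay_ticks)
{
    uint32_t now = 0;
    for (uint32_t i = 0; i < n; i++) {
        now += work_ticks;    /* loop body runs */
        now += delay_ticks;   /* then sleeps for a fixed duration */
    }
    return now;
}

/* Same loop, but sleeping until an absolute deadline that advances by
 * period_ticks on every pass, so the work time does not accumulate. */
uint32_t start_tick_absolute(uint32_t n, uint32_t work_ticks, uint32_t period_ticks)
{
    uint32_t now = 0;
    uint32_t deadline = 0;
    for (uint32_t i = 0; i < n; i++) {
        now += work_ticks;
        deadline += period_ticks;
        if (now < deadline) {
            now = deadline;   /* sleep until the deadline */
        }
    }
    return now;
}
```

With 1 tick of work and a 10-tick delay, 100 passes of the relative version take 1100 ticks while the absolute version takes 1000. When the work time is negligible the two converge.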

Does your serial port data copy process have to run at an RTOS priority level i.e. is it calling RTOS APIs?  If you do need to call RTOS functions could you run the copy process with higher priority interrupts (higher priority than configMAX_SYSCALL_INTERRUPT_PRIORITY) and then trigger a lower priority interrupt to call any RTOS functions you may need?
 
The following users thanked this post: newbrain

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 3727
  • Country: us
Re: FreeRTOS and TLS - task priority problem
« Reply #3 on: October 21, 2023, 09:00:41 pm »
This:

Quote
This happens because the osDelay(10) lets in any task for as long as it wants.

And this:

Quote
Pre-emption is enabled in FreeRTOS,

Are contradictory.

You need to make sure that on exit from the serial port ISR, the high priority task that is waiting on the interrupt fifo is marked runnable, and that the scheduler schedules it next rather than returning to the TLS code.

If there is no high priority task waiting on the serial fifo, then there needs to be one.
 
The following users thanked this post: newbrain

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3701
  • Country: gb
  • Doing electronics since the 1960s...
Re: FreeRTOS and TLS - task priority problem
« Reply #4 on: October 22, 2023, 08:01:51 am »
Quote
Priorities often cause more problems than they solve.

I agree totally; relying on a complicated priority hierarchy is a great way to get into a mess eventually. For a long time on this project I had it running great with everything in IDLE priority (and interrupt-driven I/O, obviously) so basically the RTOS was switching round-robin at 1kHz, but eventually this had to be partly abandoned because stuff like ETH and USB needed faster servicing.

Quote
why is your low priority task preventing higher priority tasks from running? 

A very good question... TLS performs a lot of crypto stuff (RSA or EC) and it holds the CPU during this time, but the RTOS should still be slicing stuff at 1kHz.

Quote
Are you disabling the scheduler in the TLS task e.g. do you have an extremely long critical section?

No; it is just a chunk of code which wasn't written for an RTOS etc.

Quote
This doesn't sound quite right either, osDelay(10) will yield for 10ms, not necessarily every 10ms unless the execution time of the task is negligible.

Yes; what I have is

while(1)
{
 ...
 ...
 ...
 osDelay(10);
}

and this produces a 100Hz rate around the loop. The code in the loop takes a relatively negligible time. It is checking for data in the rx fifo, space in the tx fifo, and if these are met it moves data no bigger than the size of a local buffer (256).

TLS runs at the priority of the calling task, which is quite low. I have tried this data copy loop at the highest possible and it still gets blocked in the osDelay() call.

I agree it doesn't make much sense. It is exactly like TLS was blocking RTOS switching, but it isn't.

Quote
You need to make sure that on exit from the serial port ISR, the high priority task that is waiting on the interrupt fifo is marked runnable, and that the scheduler schedules it next rather than returning to the TLS code.

Now that's interesting. That loop (reproduced below) doesn't need to run at all unless some data has arrived in the rx fifo i.e. serial_get_ipqcount()>0. That is another approach which would avoid the need to use the osDelay() call to enable the rest of the box to run. The gotcha is that solid rx data will prevent other stuff running... that old problem :)

I have never interacted with FreeRTOS from within an ISR, and there seem to be lots of problems with it. Only part of the FreeRTOS API (the functions whose names end in FromISR) can be called from ISRs.

Code:

void vCopyser1(void *pvParameters)
{
    uint8_t buffer[256];
    uint16_t tomove;
    uint16_t bytes1;
    uint16_t bytes2;

    // Initial startup delay.
    osDelay(1000);

    while(1)
    {
        // Get data from 1st port into temp buffer and output it to the 2nd port

        bytes1 = serial_get_ipqcount(copyser1_port1);
        bytes2 = serial_get_opqspace(copyser1_port2);

        // Run this only if the data can be copied without blocking anywhere

        if ( (bytes2>=bytes1) && (bytes1>0) )
        {
            tomove = MIN(bytes1,sizeof(buffer));
            serial_receive(copyser1_port1, buffer, tomove);
            serial_transmit(copyser1_port2, buffer, tomove);
        }

        // And in the other direction

        bytes1 = serial_get_ipqcount(copyser1_port2);
        bytes2 = serial_get_opqspace(copyser1_port1);

        // Run this only if the data can be copied without blocking anywhere

        if ( (bytes2>=bytes1) && (bytes1>0) )
        {
            tomove = MIN(bytes1,sizeof(buffer));
            serial_receive(copyser1_port2, buffer, tomove);
            serial_transmit(copyser1_port1, buffer, tomove);
        }

        osDelay(10); // give time for other RTOS tasks to run
    }
}

« Last Edit: October 22, 2023, 08:36:30 am by peter-h »
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19522
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: FreeRTOS and TLS - task priority problem
« Reply #5 on: October 22, 2023, 09:17:26 am »
Quote
Priorities often cause more problems than they solve.

I agree totally; relying on a complicated priority hierarchy is a great way to get into a mess eventually. For a long time on this project I had it running great with everything in IDLE priority (and interrupt-driven I/O, obviously) so basically the RTOS was switching round-robin at 1kHz, but eventually this had to be partly abandoned because stuff like ETH and USB needed faster servicing.

Yes, and the key word there is "eventually". I'd also add "unpredictably and intermittently", and "at the most inconvenient time".
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3701
  • Country: gb
  • Doing electronics since the 1960s...
Re: FreeRTOS and TLS - task priority problem
« Reply #6 on: October 22, 2023, 12:44:32 pm »
OK how could one troubleshoot this? At some point the RTOS switches to a task which hangs up this loop for 3 seconds.

Unless pre-emption is somehow getting temporarily disabled, that task must be running at some higher priority, no?

I know what is running (the TLS PK crypto) but I don't know why it apparently stops task switching.

A bizarre data point is that another RTOS task (one which just flashes some LEDs), running at an even lower priority, continues to run!
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19522
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: FreeRTOS and TLS - task priority problem
« Reply #7 on: October 22, 2023, 12:55:48 pm »
Quote
OK how could one troubleshoot this?

I'm not familiar with FreeRTOS, so this is mere speculation...

Consider priority inheritance issues.

Can you inspect the RTOS's internals at "suitable" intervals to see what it thinks is happening. I'd look for tasks, priorities, task state (runnable, etc).

Can you debug the RTOS, e.g. use a processor's debugging features to tag when the RTOS scheduler is entered/exited.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3701
  • Country: gb
  • Doing electronics since the 1960s...
Re: FreeRTOS and TLS - task priority problem
« Reply #8 on: October 22, 2023, 01:55:23 pm »
OK I put in some GPIO waggling and I find the loop continues to run during the 3 secs. What is happening is that there is a brief pause when TLS starts the processing and that causes the data loss.

That makes a lot more sense.

I will report on what I find.
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19522
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: FreeRTOS and TLS - task priority problem
« Reply #9 on: October 22, 2023, 02:01:13 pm »
Quote
OK I put in some GPIO waggling and ...

Always a useful technique, provided the waggle doesn't invoke any RTOS activity :)

I also like appending something to a (sufficiently large) buffer, with the same proviso. The something could be a byte or a timestamp, the buffer could be global (with mutual exclusion) or thread-local.
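As a rough illustration of the "append something to a buffer" technique, here is a sketch of a tiny circular trace log. All names (`trace_log`, `trace_recent`, `TRACE_DEPTH`) are hypothetical, and the timestamp is simply passed in; on a real target it might come from a free-running timer or cycle counter, which is an assumption here.

```c
#include <stdint.h>

#define TRACE_DEPTH 64u   /* hypothetical depth; power of two so the index can be masked */

typedef struct {
    uint32_t timestamp;   /* e.g. a free-running timer or cycle count */
    uint8_t  id;          /* which point in the code was reached */
} trace_entry_t;

static trace_entry_t trace_buf[TRACE_DEPTH];
static volatile uint32_t trace_next;   /* total number of entries ever logged */

/* Record one event: no RTOS calls and no blocking, just a couple of stores. */
void trace_log(uint8_t id, uint32_t timestamp)
{
    uint32_t slot = trace_next & (TRACE_DEPTH - 1u);
    trace_buf[slot].timestamp = timestamp;
    trace_buf[slot].id = id;
    trace_next = trace_next + 1u;
}

/* Total events logged so far (only the newest TRACE_DEPTH are retained). */
uint32_t trace_count(void)
{
    return trace_next;
}

/* Read an entry back by age, 0 being the newest. Meant for inspection
 * after the event of interest, e.g. from a debugger or a dump routine. */
trace_entry_t trace_recent(uint32_t age)
{
    return trace_buf[(trace_next - 1u - age) & (TRACE_DEPTH - 1u)];
}
```

The index update is not atomic, so if several contexts (tasks and ISRs) log into the same buffer it needs the mutual exclusion mentioned above, or each context gets its own buffer.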
« Last Edit: October 22, 2023, 02:04:28 pm by tggzzz »
 

Online mikerj

  • Super Contributor
  • ***
  • Posts: 3240
  • Country: gb
Re: FreeRTOS and TLS - task priority problem
« Reply #10 on: October 22, 2023, 04:18:26 pm »
Quote
OK I put in some GPIO waggling and I find the loop continues to run during the 3 secs. What is happening is that there is a brief pause when TLS starts the processing and that causes the data loss.

That makes a lot more sense.

I will report on what I find.

Is your TLS code hogging a mutex that other tasks require?  If you get stuck then it may be worth trying the 10 day free trial of Tracealyzer; it provides a lot of information on the internal workings of the RTOS, though it's most useful if you have a high speed streaming connection such as Segger RTT.  https://percepio.com/tracealyzer/download-tracealyzer/
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3701
  • Country: gb
  • Doing electronics since the 1960s...
Re: FreeRTOS and TLS - task priority problem
« Reply #11 on: October 22, 2023, 04:59:13 pm »
No mutex involved but I suspect it goes something like this:

The uart rx buffer is 1k.

I am feeding in data at 38k i.e. 4k bytes/sec. So if that rx buffer isn't [substantially] emptied every 250ms, you obviously lose data.
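The arithmetic behind those figures can be written down as a few helper functions. This is a sketch with invented names, and it assumes "38k" means 38400 baud with 10 bits on the wire per byte (8N1), which the post does not state explicitly.

```c
#include <stdint.h>

/* Bytes per second on an async serial link. bits_per_char is 10 for 8N1
 * (start bit + 8 data bits + stop bit). */
uint32_t uart_bytes_per_sec(uint32_t baud, uint32_t bits_per_char)
{
    return baud / bits_per_char;
}

/* Milliseconds until an initially empty buffer of buf_bytes fills at line
 * rate with nothing draining it (rounded down). */
uint32_t ms_until_full(uint32_t buf_bytes, uint32_t baud, uint32_t bits_per_char)
{
    return (uint32_t)(((uint64_t)buf_bytes * bits_per_char * 1000u) / baud);
}

/* Buffer size needed to absorb a stall of stall_ms at line rate. */
uint32_t bytes_for_stall(uint32_t stall_ms, uint32_t baud, uint32_t bits_per_char)
{
    return (uint32_t)(((uint64_t)stall_ms * baud) / ((uint64_t)bits_per_char * 1000u));
}
```

Under those assumptions the link carries 3840 bytes/sec, a 1024-byte buffer fills in about 266 ms (in line with the rounded 250 ms figure), and riding out a full 3-second stall would take 11520 bytes of buffer.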

I now suspect the issue is deeper because that loop (which BTW runs in a max of 10us in the case of worst-case data transfers) never actually gets held up. The data loss, 15 bytes out of a 500 byte packet, occurs at the onset of the 3 sec TLS operation. Just bizarre.

And the rx ISR-driven buffer never overflows (data is never discarded) - I checked that with a breakpoint.
« Last Edit: October 22, 2023, 05:35:36 pm by peter-h »
 

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 3727
  • Country: us
Re: FreeRTOS and TLS - task priority problem
« Reply #12 on: October 22, 2023, 05:45:45 pm »
That certainly sounds like either a mutex or the code is disabling interrupts.

First thing to do is to see which one is happening.  Check if the ISR is ever having to drop data because of a full FIFO.  Perhaps that is what you already did with the GPIO toggle.  If so that means the serial interrupt isn't being blocked.  Perhaps the scheduler interrupt is being blocked or perhaps not.

Then I would add a fifo full check in the ISR and then set a breakpoint there. When the breakpoint trips, go look at the interrupt return address and see where exactly in the TLS code it is preventing preemption.

Edit: I reread and saw that you already did this.  If the serial ISR is not getting called then something is disabling interrupts.  In that case, do you have another interrupt handler that is really slow and is being triggered by the TLS code?  Possibly something transferring a large block of data to some other peripheral?

As a matter of design I really don't like mixing time critical polling with preemption with interrupts the way it sounds like you are doing with extensive use of sleep calls to invoke the scheduler.  Polling is ok for simple applications where you can really bound the time it takes to poll every possible event, but quickly gets out of hand.  It's still fine for non time critical tasks, but in this sort of application I really prefer to have tasks wait on an event - such as a mutex, data in a fifo, or whatever.  Then the task list contains whether each task is runnable or not, and each event type contains a list of waiting tasks.  The interrupt handler, or mutex unlock function or any code that writes to a queue looks at all the tasks waiting on the event and marks them runnable.  Then the scheduler is responsible for making sure the highest priority runnable task executes.

You can take this approach with preemption or with explicit sched_yield() calls, depending on your needs.  It takes a bit of careful organization to set it up and make sure that you don't have race conditions between events being runnable and tasks waiting, but the effort is usually worth it.

Anything that requires polling is now done by a background task that waits on a timer.  It simply polls whatever is needed and then marks the appropriate tasks runnable.
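The bookkeeping described above (tasks blocked on events, an event marking its waiters runnable, the scheduler picking the highest-priority runnable task) can be modelled in a few lines. This is a toy model with invented names, not how FreeRTOS is implemented internally.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint8_t priority;      /* larger number = higher priority */
    bool runnable;
    uint32_t waiting_on;   /* bitmask of events this task is blocked on */
} task_t;

/* Called by an ISR, a mutex unlock or a queue write: wake every task
 * that is waiting on the given event. */
void event_signal(task_t *tasks, size_t n, uint32_t event_bit)
{
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].waiting_on & event_bit) {
            tasks[i].waiting_on = 0;
            tasks[i].runnable = true;
        }
    }
}

/* The scheduler's job: return the highest-priority runnable task,
 * or NULL if nothing is runnable. */
task_t *pick_next(task_t *tasks, size_t n)
{
    task_t *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].runnable &&
            (best == NULL || tasks[i].priority > best->priority)) {
            best = &tasks[i];
        }
    }
    return best;
}
```

In this model a low-priority task doing crypto keeps the CPU only while the high-priority copy task is blocked. As soon as the rx event is signalled, `pick_next` returns the copy task.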
« Last Edit: October 22, 2023, 05:49:29 pm by ejeffrey »
 

Online mikerj

  • Super Contributor
  • ***
  • Posts: 3240
  • Country: gb
Re: FreeRTOS and TLS - task priority problem
« Reply #13 on: October 22, 2023, 07:12:34 pm »
Quote
I now suspect the issue is deeper because that loop (which BTW runs in a max of 10us in the case of worst-case data transfers) never actually gets held up. The data loss, 15 bytes out of a 500 byte packet, occurs at the onset of the 3 sec TLS operation. Just bizarre.

Memory corruption?  Is the task stack for your TLS task big enough, do you have stack overflow checking enabled?
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14490
  • Country: fr
Re: FreeRTOS and TLS - task priority problem
« Reply #14 on: October 22, 2023, 07:29:49 pm »
Quote
I have TLS running, once a minute. It runs at a low priority, but grabs the CPU totally for up to 3 seconds.

Pre-emption is enabled in FreeRTOS, and this rather complex product does all work great.

This doesn't make a lot of sense, why is your low priority task preventing higher priority tasks from running?  Are you disabling the scheduler in the TLS task e.g. do you have an extremely long critical section?

Yes, this would be the main design issue here. You get unintended priority inversion because you have a low-priority task becoming non-preemptable once it runs.
The solution would be either to make the TLS task preemptable, or implement hardware handshaking for the serial communication (so it can wait if the CPU is not ready to handle it), but the latter may not be feasible if that was not designed in right from the start.

Is there really no way of making the TLS task preemptable?
 

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 3727
  • Country: us
Re: FreeRTOS and TLS - task priority problem
« Reply #15 on: October 22, 2023, 08:12:28 pm »
Quote
And the rx ISR-driven buffer never overflows (data is never discarded) - I checked that with a breakpoint.

Does the serial port hardware have a HW fifo overflow flag?  If so you can check that in the ISR to prove whether or not interrupts are being blocked.

If interrupts are never blocked and the ISR never finds its software queue full and has to discard data, then I think the answer is data corruption, like others mentioned possibly a stack overflow.
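A sketch of what that check might look like, separating "lost in hardware because the ISR ran too late" from "lost in software because the ring buffer was full". The status register value is passed in as a parameter so the logic can be tried off-target; the bit positions are assumed from the STM32F4 USART_SR layout (ORE = bit 3, RXNE = bit 5), and the names are invented.

```c
#include <stdint.h>

#define SR_ORE  (1u << 3)   /* overrun error (assumed STM32F4 USART_SR bit) */
#define SR_RXNE (1u << 5)   /* receive data register not empty (assumed) */

typedef struct {
    uint32_t bytes_received;
    uint32_t hw_overruns;    /* a byte was lost in hardware: ISR serviced too late */
    uint32_t sw_drops;       /* a byte was discarded: software rx buffer was full */
} uart_rx_stats_t;

/* Accounting for one pass of a receive ISR. sr is the status register as
 * read on entry, buffer_space is the free space in the software rx buffer.
 * Returns 1 if the received byte should be stored, 0 otherwise. */
int uart_rx_account(uart_rx_stats_t *s, uint32_t sr, uint32_t buffer_space)
{
    if (sr & SR_ORE) {
        s->hw_overruns++;    /* proves the ISR was held off for too long */
    }
    if (!(sr & SR_RXNE)) {
        return 0;            /* no data in this pass */
    }
    s->bytes_received++;
    if (buffer_space == 0u) {
        s->sw_drops++;       /* ISR ran on time but had nowhere to put the byte */
        return 0;
    }
    return 1;
}
```

If both counters stay at zero while bytes still go missing downstream, that points away from blocked interrupts and buffer overflow and towards corruption somewhere else.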

In the first post you said:

Quote
serial data loss unless I have an interrupt driven rx buffer big enough for 3 secs at max baud rate, which is too much RAM

Have you actually tried that: making the buffer impractically large?  Does that actually work?
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3701
  • Country: gb
  • Doing electronics since the 1960s...
Re: FreeRTOS and TLS - task priority problem
« Reply #16 on: October 22, 2023, 08:50:29 pm »
I have narrowed it down quite a bit.

The data loss happens only when certain debugs from the TLS code are enabled. These come out via USB VCP. So now I am dealing with something more complex: the USB VCP code. But that actually runs pretty well in other contexts. Only when invoked from TLS does this surface.

And USB VCP is unbelievably complex.

I will do a bit more work on this but might just document that that specific debug should not be used together with that serial port copy function. It is an extremely unlikely scenario anyway.

The issue disappears if I put a 1-2ms delay after the USB VCP debug! This is not wholly surprising since sending data to the USB host (a PC) is supposed to work with flow control (see past threads on USB VCP flow control) but this is such a complex area and almost nobody understands it. In this project, flow control has been implemented and tested, but it is hard to test because of the high data rates involved. The serial port (uart) data loss occurs only when a large amount of data (a few k) is being sent out via USB VCP.

So almost certainly nothing to do with the RTOS priorities ;)

Quote
Does the serial port hardware have a HW fifo overflow flag?

I put in a GPIO wiggle in the rx ISR and verified that when the data loss occurs, the right number of interrupts have occurred.

Quote
Have you actually tried that: making the buffer impractically large?  Does that actually work?

I can't really do that; not enough RAM left. But I did check that even with tiny packets, say 10 bytes, data is still lost. With 10 byte packets, sometimes the whole 10 get lost.

But it could well be something to do with the USB interrupt (the STM 32F4 USB CDC code is entirely interrupt driven) affecting the UART interrupt. The latter is pretty simple but the former is, as I said above, so complex...

It could be something simple but after a day spent on an extremely narrow-applicability scenario...

Hard to debug too, with TLS running infrequently.

Thank you all for your suggestions :)
« Last Edit: October 22, 2023, 08:56:26 pm by peter-h »
 

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 3727
  • Country: us
Re: FreeRTOS and TLS - task priority problem
« Reply #17 on: October 22, 2023, 10:54:09 pm »
Can you make the serial interrupt higher priority than the USB interrupt?
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3701
  • Country: gb
  • Doing electronics since the 1960s...
Re: FreeRTOS and TLS - task priority problem
« Reply #18 on: October 23, 2023, 07:12:22 am »
It already is: UARTs / USB / Systick.

There is a further complication which may be a part of this: the USB CDC implementation does flow control (previous threads) but the discussed USB debugs bypass this because they need to come out as fast as possible. The flow control is optional (boolean switch parameter on the data transmit function call) and when false there is a 2ms delay instead. This delay was missing on the debugs which bypass the USB flow control entirely. Now that delay has been added.

A year or so ago I spent many days digging around for any information on this but there isn't really any. There is/was one infamous guy on the ST forum who seems to know this area but he never posted a sufficiently self contained bit of info to be actually usable. The USB flow control itself does work when enabled but it is hard to test because so much depends on the PC application which is running; most can process data very fast and the "handshake" is not possible to monitor unless you really know the protocol.

On an overnight test I now have zero errors.
« Last Edit: October 23, 2023, 07:31:15 am by peter-h »
 

Online eutectique

  • Frequent Contributor
  • **
  • Posts: 393
  • Country: be
Re: FreeRTOS and TLS - task priority problem
« Reply #19 on: October 23, 2023, 10:57:10 am »
Quote
Hard to debug too, with TLS running infrequently.

Can you put together a test scenario where TLS is triggered every, say, 5 seconds?
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3701
  • Country: gb
  • Doing electronics since the 1960s...
Re: FreeRTOS and TLS - task priority problem
« Reply #20 on: October 23, 2023, 01:09:19 pm »
Yes I could, but it is hard to debug this "in theory impossible" interaction between USB VCP, and the "supposedly simple" UART RX/TX interrupt data flow.

I narrowed the serial data loss to a specific debug call emanating from HTTPS code (which invokes TLS). This call places data into a buffer from which the USB interrupt picks up data. More precisely, an RTOS task copies this buffer to a buffer from which the USB interrupt picks up data (this extra stage was done to enable USB CDC flow control to work, together with having the "opqspace" and "ipqcount" functions which the raw USB code doesn't give you).

And placing a 1-2ms delay after that debug call (probably this merely ensures that more debug data doesn't follow immediately) solves the data loss completely.

I recall messing around with this a while ago, when implementing the USB CDC flow control. The "no flow control" route (which is what is provided by ST in their USB code, ex Cube MX I think) works only by luck: USB CDC is very fast and just about any host application will consume the data fast enough to not need flow control. Before I actually implemented flow control I characterised just how much of a packet rate is needed to lose data, and it was ~10kHz packet rate / a packet length > 800 bytes, so no wonder probably nobody complained. But you can hit this limit when some other stuff happens and then USB screws up. And I think it perhaps screws up the serial UART interrupts...
 

Online gpr

  • Contributor
  • Posts: 21
  • Country: gb
Re: FreeRTOS and TLS - task priority problem
« Reply #21 on: October 23, 2023, 05:52:31 pm »
Quote

while(1)
{
 ...
 ...
 ...
 osDelay(10);
}

and this produces a 100Hz rate around the loop. The code in the loop takes a relatively negligible time. It is checking for data in the rx fifo, space in the tx fifo, and if these are met it moves data no bigger than the size of a local buffer (256).

I would not call this loop "interrupt-driven"; it is just polling with a delay, and it looks like the interrupts just update flags which you then check in the loop.
A better approach here could be: the loop above waits on some object, and the serial interrupt handler wakes up that task by posting a notification. For example, using direct task notifications: https://www.freertos.org/xTaskNotifyFromISR.html
As a result, you will not waste time polling, there is no need to yield manually (other tasks can run while you are waiting for the notification), and there is no delay before incoming serial data is handled.
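A rough sketch of that pattern follows, using the give/take flavour of direct task notifications (vTaskNotifyGiveFromISR and ulTaskNotifyTake). So that the control flow can be compiled and exercised on a PC, the block begins with host stand-ins for the FreeRTOS types and calls; these stand-ins are an assumption of this sketch and would be replaced on target by including FreeRTOS.h and task.h. The handle `copy_task`, the function names and the counters are all hypothetical.

```c
#include <stdint.h>

/* ---- Host stand-ins (assumption: on target, use FreeRTOS.h / task.h) ---- */
typedef long BaseType_t;
typedef uint32_t TickType_t;
typedef void *TaskHandle_t;
#define pdFALSE ((BaseType_t)0)
#define pdTRUE  ((BaseType_t)1)
#define portYIELD_FROM_ISR(x) ((void)(x))

static uint32_t pending_notifications;   /* models the task's notification count */

static void vTaskNotifyGiveFromISR(TaskHandle_t task, BaseType_t *woken)
{
    (void)task;
    pending_notifications++;
    *woken = pdTRUE;   /* on target: set only if a higher-priority task was unblocked */
}

static uint32_t ulTaskNotifyTake(BaseType_t clear_on_exit, TickType_t ticks_to_wait)
{
    uint32_t value = pending_notifications;
    (void)ticks_to_wait;   /* the stand-in never blocks; 0 means "timed out" */
    if (value != 0u) {
        pending_notifications = clear_on_exit ? 0u : value - 1u;
    }
    return value;
}
/* ---- End of stand-ins ---- */

static TaskHandle_t copy_task;   /* hypothetical handle, saved when the task is created */
uint32_t woken_by_data;          /* passes triggered by received data */
uint32_t woken_by_timeout;       /* passes triggered by the timeout */

/* Tail of the UART RX ISR, after the byte has gone into the rx buffer. */
void uart_rx_isr_notify(void)
{
    BaseType_t higher_prio_woken = pdFALSE;
    vTaskNotifyGiveFromISR(copy_task, &higher_prio_woken);
    portYIELD_FROM_ISR(higher_prio_woken);   /* switch to the copy task on ISR exit if needed */
}

/* One pass of the copy task's loop body. */
void copy_task_pass(void)
{
    /* Block until data arrives, or for at most 10 ticks (10 ms at a 1 kHz tick). */
    if (ulTaskNotifyTake(pdTRUE, 10) != 0u) {
        woken_by_data++;       /* data arrived: attempt the copy immediately */
    } else {
        woken_by_timeout++;    /* nothing arrived: retry any copy that lacked tx space */
    }
}
```

The timeout keeps the existing "try again every 10 ms" behaviour for the case where data arrived but the destination had no room, while received data wakes the task straight away rather than at the next poll.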


But this is unrelated to your findings about disabled interrupts or something, which is another issue.
 

Online newbrain

  • Super Contributor
  • ***
  • Posts: 1719
  • Country: se
Re: FreeRTOS and TLS - task priority problem
« Reply #22 on: October 24, 2023, 06:09:31 am »
gpr, you are the nth person, me included, to suggest that.
In a 200 kloc FreeRTOS project I have practically zero delays, only delayUntil for periodic tasks (e.g. polling), direct-to-task notifications from peripheral ISRs (using I2S, I2C, SPI, many DMA channels, GPIO, QSPI, timers) and queues for inter-task message passing.

If you follow peter-h's posting history (not really asking you to do that, it's huge), it seems they do not want to follow this advice, and prefer what is, IMO, a more fragile way of architecting their system.
At least they cleared up some initial misconceptions, but the long litany of problems seems to indicate some base system weakness.
Nandemo wa shiranai wa yo, shitteru koto dake.
 
The following users thanked this post: mikerj, JPortici

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3701
  • Country: gb
  • Doing electronics since the 1960s...
Re: FreeRTOS and TLS - task priority problem
« Reply #23 on: October 24, 2023, 07:39:09 am »
It is fine to have a loop which runs in 10us and have an osDelay(10) (10ms) at the bottom of it.

Such a program will take up 1/1000 (0.1%) of the CPU.

It is of course more "academically purist" to not have a loop at all and trigger the whole thing from some event. In this case it would be triggered by any interrupt from any UART. I have four UARTs (plus a 5th port which is USB VCP) and each UART can generate three interrupts. To expand this philosophy fully I would have 12 RTOS tasks servicing these, plus a couple more servicing the USB VCP. This is crazy. And it is far from assured that it will be more efficient; an ISR runs in a few us and most of the time there is nothing else to do because I have a 1k buffer for tx and 1k for rx. So the RTOS task trigger would need to be based on there being some data to process, plus a timeout to handle the case where data arrived but there wasn't (yet) room in the destination buffer. Hey, isn't that exactly what I am doing already; just with a different and much simpler structure? ;)

The above posted loop checks for any rx buffer data and checks for sufficient destination tx buffer space, and if this is met, it does a data copy. And all that takes just 0.1% of CPU time!

The system is rock solid but there is some issue with a combination of the STM USB VCP code, with flow control disabled, and outputting a big chunk of data to VCP (a few k) and (probably) UART interrupts. It's been solved by a few ms delay after that few-k block is output, which doesn't exactly surprise me given the way USB VCP works. I've been there before in a similar context. The real mystery is why UART data is affected.

The obvious mechanism would be an ints-disabled code section. Well, there is stuff like this

Code:
// If USBD_CDC_ReceivePacket was skipped (in CDC_Receive_FS) and there is useful
// room in the circular buffer for more data, then run it now. USB CDC packets are 0-64 bytes
// so we make sure there is 100 bytes space before re-enabling more data reception.
// Must run with ints disabled.

__disable_irq();

if ( (CDC_rx_blocked) && ( (CDC_RX_BUFFER_SIZE - CDC_get_ipqcount_2()) > 100) )
{
    // Enable reception of next packet from Host
    USBD_CDC_ReceivePacket(&hUsbDeviceFS);
    CDC_rx_blocked=false;
}

__enable_irq();

All these DI/EI sections have been verified either by code inspection or with a GPIO wiggle and a scope and they are not causing the ~2-3ms ints-disabled period that would be needed to cause this data loss. I've just re-checked some of these sections with a scope and nothing is taking more than a few us, right when the data loss occurs.

I don't get why somebody dislikes that I post a fair bit. At least I post solutions when I get them; the vast majority of people posting problems online never come back with whatever solution they found. So a typical coder's life consists of many hours of useless googling and every so often finding something useful.
« Last Edit: October 24, 2023, 08:38:27 am by peter-h »
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 825
  • Country: es
Re: FreeRTOS and TLS - task priority problem
« Reply #24 on: October 24, 2023, 08:34:21 am »
To summarize a bit: you have this situation:
- UART ISR buffering data
- a "UART data mover" task polling that buffer and moving data elsewhere (the one with osDelay(10))
- USB ISR feeding the USB
- a "USB data mover" task feeding USB ISR
- ISR priorities: UART>USB>SysTick
- UART ISR is firing fine (proven by GPIO waggling)
- problem arises when you create a decent input data amount for the USB ISR directly, bypassing the "USB data mover"

I’d suspect too much time being claimed by the USB ISR in bursts. The UART>USB ISR priority doesn’t help in this case, since the "total" UART priority is a combination of the ISR’s and the "data mover" task’s priorities (and the latter is no higher than that of the SysTick which schedules it), whereas the "total" USB priority becomes equal to its ISR’s priority when you bypass the data mover task and feed the ISR directly. Try waggling another GPIO from the USB ISR to check that. The solution in this case would be to obey the rules and feed debug data to the data mover task, not to the ISR.
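If a scope is not to hand, the same check can be done in software by timestamping ISR entry and exit and keeping the worst case. A minimal sketch with invented names follows; the `now` values are assumed to come from some free-running 32-bit counter (on a Cortex-M4 the DWT cycle counter would be one candidate, but that is an assumption about the target).

```c
#include <stdint.h>

typedef struct {
    uint32_t entered_at;   /* counter value at the most recent ISR entry */
    uint32_t worst;        /* longest single ISR run seen, in counter ticks */
    uint32_t total;        /* accumulated time spent inside the ISR */
} isr_timing_t;

/* Call first thing in the ISR. */
void isr_timing_enter(isr_timing_t *t, uint32_t now)
{
    t->entered_at = now;
}

/* Call last thing in the ISR. */
void isr_timing_exit(isr_timing_t *t, uint32_t now)
{
    uint32_t took = now - t->entered_at;   /* unsigned subtraction survives counter wrap */
    t->total += took;
    if (took > t->worst) {
        t->worst = took;
    }
}
```

Reading `worst` and `total` for the USB ISR right after a burst of debug output would show whether the ISR is claiming the CPU for milliseconds at a time.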
 

