Quote: Taking 4 µs on a 32 bit processor running at 120 MHz would indeed be awful. 480 clock cycles. Hard to see how you could even do that.
Here it is in all its ugliness!
micros()->(crit)elapsed_time->(crit)slicetime->(crit)ticker_read_us->initialize/(core_crit)/update_present_time->math
unsigned long micros() {
1000448a: f002 fc80 bl 10006d8e <mbed::TimerBase::elapsed_time() const>
10006d8e <mbed::TimerBase::elapsed_time() const>:
10006d98: f002 f820 bl 10008ddc <mbed::CriticalSectionLock::CriticalSectionLock()>
10006da0: f7ff ffda bl 10006d58 <mbed::TimerBase::slicetime() const>
10006db6: f002 f817 bl 10008de8 <mbed::CriticalSectionLock::~CriticalSectionLock()>
10006d58 <mbed::TimerBase::slicetime() const>:
10006d60: f002 f83c bl 10008ddc <mbed::CriticalSectionLock::CriticalSectionLock()>
10006d74: f001 fff0 bl 10008d58 <ticker_read_us>
10006d86: f002 f82f bl 10008de8 <mbed::CriticalSectionLock::~CriticalSectionLock()>
10008d58 <ticker_read_us>:
10008d5c: f7ff ff2e bl 10008bbc <initialize>
10008d60: f000 fc24 bl 100095ac <core_util_critical_section_enter>
10008d66: f7ff fe41 bl 100089ec <update_present_time>
10008d70: f000 fc32 bl 100095d8 <core_util_critical_section_exit>
100089ec <update_present_time>:
100089fa: 6803 ldr r3, [r0, #0]
100089fc: 685b ldr r3, [r3, #4]
100089fe: 4798 blx r3
10008a1a: f7f7 feb1 bl 10000780 <__aeabi_llsl>
10008a40: f7f7 fe92 bl 10000768 <__aeabi_llsr>
10008a4a: f7f7 fe99 bl 10000780 <__aeabi_llsl>
10008a6c: f7f7 ff34 bl 100008d8 <__aeabi_lmul>
10008a7c: f7f7 ff0c bl 10000898 <__aeabi_uldivmod>
10008ddc <mbed::CriticalSectionLock::CriticalSectionLock()>:
10008de0: f000 fbe4 bl 100095ac <core_util_critical_section_enter>
10008de8 <mbed::CriticalSectionLock::~CriticalSectionLock()>:
10008dec: f000 fbf4 bl 100095d8 <core_util_critical_section_exit>
100095ac <core_util_critical_section_enter>:
100095ae: f7ff f979 bl 100088a4 <hal_critical_section_enter>
100095c0: f7ff ff46 bl 10009450 <mbed_assert_internal>
100095d8 <core_util_critical_section_exit>:
100095ea: f7ff f96f bl 100088cc <hal_critical_section_exit>
100088a4 <hal_critical_section_enter>:
100088a6: f3ef 8010 mrs r0, PRIMASK
100088aa: b672 cpsid i
100088cc <hal_critical_section_exit>:
100088ce: f3ef 8210 mrs r2, PRIMASK
100088de: f000 fdb7 bl 10009450 <mbed_assert_internal>
100088ee: b662 cpsie i
As has been known since the 60s, mutexes and semaphores are the fundamental mechanisms necessary for RTOSs.
For tasks at the same priority level, a single timer and a binary min-heap to hold the next firing time works well.
That's the same solution I am using in hardware for managing some simple devices.
Not the best, but it's the simplest solution to implement and test, and it does the job!
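For readers who have not seen the single-timer-plus-min-heap scheme mentioned above, here is a minimal sketch in C. The names (`timer_heap`, `heap_push`, `heap_pop`, `task_id`) are invented for illustration and are not taken from any poster's code. It assumes a fixed maximum number of pending timers and ignores tick-counter wraparound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TIMERS 16

typedef struct {
    uint32_t fire_at;  /* absolute tick at which the task should run */
    int      task_id;  /* hypothetical task identifier */
} timer_entry;

typedef struct {
    timer_entry e[MAX_TIMERS];
    size_t      n;     /* number of entries currently in the heap */
} timer_heap;

static void swap_entries(timer_entry *a, timer_entry *b) {
    timer_entry t = *a;
    *a = *b;
    *b = t;
}

/* Insert an entry; returns 0 on success, -1 if the heap is full. */
int heap_push(timer_heap *h, uint32_t fire_at, int task_id) {
    if (h->n == MAX_TIMERS) return -1;
    size_t i = h->n++;
    h->e[i].fire_at = fire_at;
    h->e[i].task_id = task_id;
    /* Sift up until the parent fires no later than the new entry. */
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (h->e[parent].fire_at <= h->e[i].fire_at) break;
        swap_entries(&h->e[parent], &h->e[i]);
        i = parent;
    }
    return 0;
}

/* Remove and return the earliest entry. The heap must not be empty. */
timer_entry heap_pop(timer_heap *h) {
    assert(h->n > 0);
    timer_entry top = h->e[0];
    h->e[0] = h->e[--h->n];
    /* Sift down: swap with the earlier child until order is restored. */
    size_t i = 0;
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < h->n && h->e[l].fire_at < h->e[m].fire_at) m = l;
        if (r < h->n && h->e[r].fire_at < h->e[m].fire_at) m = r;
        if (m == i) break;
        swap_entries(&h->e[i], &h->e[m]);
        i = m;
    }
    return top;
}
```

In this sketch the timer interrupt would pop every entry whose `fire_at` has passed, run or queue the corresponding task, and then program the hardware timer with `e[0].fire_at` of whatever remains.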
OS without mutexes and semaphores is possible
Why is it not the best?
This may require preemption, which can lead to priority inversion: a high-priority task ends up waiting, indirectly, on a lower-priority task (typically because the low-priority task holds a resource the high-priority task needs and is itself preempted).
So, you understand why last week I didn't implement preemption, and why I didn't implement different priorities.
This is a hella good thread, guys. Learning a lot.
One time I created two work FIFOs with different priorities. Interrupts placed tasks to be done in the proper FIFOs.
When a task finished, I checked the high-priority FIFO for items to process first and, if there were none, then checked the normal-priority FIFO.
It seemed to work fairly well.
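A rough sketch of that two-FIFO dispatcher, as I read the description. All names (`work_fifo`, `fifo_put`, `dispatch_one`, the two demo jobs) are made up for illustration. It assumes one producer (an ISR) and one consumer (the main loop) per FIFO on a single core. With several ISRs at different priorities feeding the same FIFO, `fifo_put` would need to be protected.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_SIZE 8  /* usable capacity is FIFO_SIZE - 1 entries */

typedef void (*work_fn)(void);

typedef struct {
    work_fn         items[FIFO_SIZE];
    volatile size_t head;  /* advanced only by the consumer (main loop) */
    volatile size_t tail;  /* advanced only by the producer (ISR) */
} work_fifo;

static work_fifo high_prio, normal_prio;

/* Called from an ISR: queue a job. Returns -1 if the FIFO is full. */
int fifo_put(work_fifo *f, work_fn fn) {
    assert(fn != NULL);
    size_t next = (f->tail + 1) % FIFO_SIZE;
    if (next == f->head) return -1;
    f->items[f->tail] = fn;
    f->tail = next;
    return 0;
}

/* Called from the main loop: take the oldest job, or NULL if empty. */
work_fn fifo_get(work_fifo *f) {
    if (f->head == f->tail) return NULL;
    work_fn fn = f->items[f->head];
    f->head = (f->head + 1) % FIFO_SIZE;
    return fn;
}

/* Run one job to completion, high priority first.
   Returns 1 if a job ran, 0 if both FIFOs were empty. */
int dispatch_one(void) {
    work_fn fn = fifo_get(&high_prio);
    if (fn == NULL) fn = fifo_get(&normal_prio);
    if (fn == NULL) return 0;
    fn();
    return 1;
}

/* Demo jobs that record the order in which they ran. */
static char   trace[8];
static size_t trace_len;
static void urgent_job(void)  { trace[trace_len++] = 'H'; }
static void routine_job(void) { trace[trace_len++] = 'N'; }
```

Note that a running job is never interrupted by a higher-priority job. The priority only decides what runs next, so worst-case latency for a high-priority item is the longest single job.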
Without mutexes? Sure, never share any memory or other resources between tasks. Without semaphores? Uh, sure, if you have other means of synchronizing tasks, or if you don't even need to synchronize them (which is pretty rare.)
Both can be "solved" using message passing only. Not that this is necessarily the most efficient in all use cases, but it works.
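To make the "message passing only" idea concrete, here is a small sketch in which a counter is owned by exactly one task and everyone else changes it by sending messages, so the counter itself needs no mutex. The types and functions (`mailbox`, `counter_task`, `MSG_ADD`, `MSG_RESET`) are hypothetical. The mailbox as written is only safe on a cooperative system. Under preemption it would have to be replaced by whatever queue primitive the platform provides.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { MSG_ADD, MSG_RESET } msg_kind;

typedef struct {
    msg_kind kind;
    int      value;  /* used by MSG_ADD, ignored by MSG_RESET */
} message;

#define MAILBOX_SIZE 8

typedef struct {
    message slots[MAILBOX_SIZE];
    size_t  head, tail, count;
} mailbox;

/* Queue a message; returns -1 if the mailbox is full. */
int mailbox_send(mailbox *mb, message m) {
    if (mb->count == MAILBOX_SIZE) return -1;
    mb->slots[mb->tail] = m;
    mb->tail = (mb->tail + 1) % MAILBOX_SIZE;
    mb->count++;
    return 0;
}

/* Fetch the oldest message; returns 1 if one was copied to *out. */
int mailbox_receive(mailbox *mb, message *out) {
    assert(out != NULL);
    if (mb->count == 0) return 0;
    *out = mb->slots[mb->head];
    mb->head = (mb->head + 1) % MAILBOX_SIZE;
    mb->count--;
    return 1;
}

/* The total is private to this task; nobody else touches it. */
typedef struct {
    mailbox inbox;
    int     total;
} counter_task;

/* One scheduling step of the owner task: drain and apply all messages. */
void counter_task_step(counter_task *t) {
    message m;
    while (mailbox_receive(&t->inbox, &m)) {
        switch (m.kind) {
        case MSG_ADD:   t->total += m.value; break;
        case MSG_RESET: t->total = 0;        break;
        }
    }
}
```

The cost is visible too: every update is a copy into and out of the mailbox, which is the efficiency trade-off mentioned above.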
It would be useful to:
- NOT compile with -O0. There is far too much stuff that should simply be inlined, and too much memory traffic.
BUT WHY ON EARTH DO MANUFACTURERS INSIST ON WRITING HAL CODE SO INEFFICIENTLY?
I made a whole thread about message passing a while ago. As I remember, it was an interesting discussion but got quite some "resistance" and a few misconceptions.
I am increasingly resorting to message passing schemes these days. They are much, much easier to get right.
Whether they yield better performance than other approaches all depends on a number of factors though, and without at least some hardware support, message passing can be inefficient (in terms of throughput/latency.)
Interestingly, for security reasons, there's a trend with large applications to distribute the work over a number of *processes* (rather than just threads). In which case, communicating between them requires some form of IPC, so that's often close to message passing.
On general-purpose OSs, processes tend to be a bit heavy though. So there would surely be some benefit for intermediate execution units, some kind of lightweight processes. Some languages (usually through a runtime) and some particular OSs do offer that, but that's still relatively rare.
All you have to do is specify an API in terms of the message contents and sequences of messages. That's how telecoms systems work, and they are arguably the biggest computing systems mankind uses.
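As a toy illustration of specifying an API as message contents plus allowed sequences, the sketch below defines three message types and a receiver that rejects anything arriving out of order. The protocol (SETUP, then any number of DATA, then RELEASE) and all identifiers are invented for this example and are not drawn from any real telecom standard.

```c
#include <assert.h>

/* Message contents: a type and, for DATA, a payload length. */
typedef enum { MSG_SETUP, MSG_DATA, MSG_RELEASE } msg_type;

/* Allowed sequence: SETUP, then zero or more DATA, then RELEASE. */
typedef enum { ST_IDLE, ST_CONNECTED } session_state;

typedef struct {
    session_state state;
    unsigned      bytes_received;
} session;

/* Apply one message to the session.
   Returns 0 if accepted, -1 if it violates the sequence. */
int session_handle(session *s, msg_type type, unsigned payload_len) {
    assert(s);
    switch (s->state) {
    case ST_IDLE:
        if (type == MSG_SETUP) {
            s->state = ST_CONNECTED;
            s->bytes_received = 0;
            return 0;
        }
        return -1;
    case ST_CONNECTED:
        if (type == MSG_DATA) {
            s->bytes_received += payload_len;
            return 0;
        }
        if (type == MSG_RELEASE) {
            s->state = ST_IDLE;
            return 0;
        }
        return -1;
    }
    return -1;
}
```

Because the whole contract is the enum plus the state table, either side can be tested in isolation by feeding it message sequences.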
void show_connections() {
    for (conn = connection_ll_head; conn; conn = conn->next) { // Traverse our linked list
        printf("Name: %s, dest: %s, prot %s\n", conn->name, conn->dest, conn->protocol);
        printf(" DataIn: %d, DataOut %d", conn->incount, conn->outcount);
    }
}
It doesn't look very dangerous, does it? But it is! Worse, "how to fix" is unclear - presumably actually maintaining the list of connections is more important than displaying it, so you don't really want to lock either the list or the individual connection data structures.
It's dangerous even on a cooperative multi-tasking system, because printf() is a thing that is likely to block.
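One possible mitigation, offered only as a sketch and not as the fix the posters had in mind, is to copy the fields into a local snapshot first and print from the copy, so that nothing slow or blocking runs while the list is being walked. The struct layout, `NAME_LEN`, `MAX_SNAPSHOT`, and the `list_guard_enter`/`list_guard_exit` hooks are all assumptions. The hooks are no-ops here and stand in for whatever keeps the list stable during the copy on a given system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN     16
#define MAX_SNAPSHOT 8

/* Assumed layout; the original snippet does not show the real struct. */
typedef struct connection {
    struct connection *next;
    char name[NAME_LEN];
    int  incount, outcount;
} connection;

typedef struct {
    char name[NAME_LEN];
    int  incount, outcount;
} conn_snapshot;

static connection *connection_ll_head;

/* Hypothetical hooks: no-ops here. On a preemptive system they might
   briefly mask interrupts or take a short lock. */
static void list_guard_enter(void) {}
static void list_guard_exit(void) {}

/* Copy up to max connections into out; returns how many were copied.
   Connections beyond max are silently left out of the snapshot. */
size_t snapshot_connections(conn_snapshot *out, size_t max) {
    assert(out != NULL);
    size_t n = 0;
    list_guard_enter();
    for (connection *c = connection_ll_head; c && n < max; c = c->next, n++) {
        memcpy(out[n].name, c->name, NAME_LEN);
        out[n].incount  = c->incount;
        out[n].outcount = c->outcount;
    }
    list_guard_exit();
    return n;
}

/* printf() may block, but it now only touches the private copy. */
void show_connections_snapshot(void) {
    conn_snapshot snap[MAX_SNAPSHOT];
    size_t n = snapshot_connections(snap, MAX_SNAPSHOT);
    for (size_t i = 0; i < n; i++) {
        printf("Name: %s\n", snap[i].name);
        printf(" DataIn: %d, DataOut %d\n", snap[i].incount, snap[i].outcount);
    }
}
```

On a cooperative system the copy loop contains no blocking call, so the guard can stay empty. On a preemptive one the guarded window is a bounded copy rather than a run of printf() calls. The price is stack space for the snapshot and output that may already be stale by the time it is printed.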
I'd like to see a simple project, e.g. blinky with UART based control (e.g. to change blink intervals, blink counts, or pwm), implemented in 3 paradigms:
1. superloop
2. os
3. ISRs with priorities
Then I noticed that the board was heating up significantly... That was surprising. And after some time, an LED stopped blinking - apparently, it blew up! WTF? I can see only two things to blame: (1) my code put too much stress on the board, or (2) the LED did not have a resistor to limit the current. I can hardly believe (2), but I cannot see anything wrong with the code either; it looks innocent.
So there I have it: the effort ended at the very start.