EEVblog Electronics Community Forum
Products => Computers => Programming => Topic started by: Karel on October 31, 2022, 09:34:38 am
-
Regarding STM32 MCU programming (ARM), in many code examples and libraries, when setting a hardware register of a peripheral,
the register is read back in order to make sure the write has actually reached the hardware before the next instructions execute.
What is the reason for reading back the register instead of inserting a "__DSB()" (Data Synchronization Barrier)?
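To illustrate, these are the two patterns side by side (a minimal sketch; TIM2 is just an example, any peripheral register will do):

Code: [Select]
#include "stm32f4xx.h"  /* CMSIS device header; assuming an STM32F4 part */

/* The pattern seen in many examples: write, then read the same register back. */
static void enable_with_readback(void)
{
    TIM2->CR1 |= TIM_CR1_CEN;  /* write the register */
    (void)TIM2->CR1;           /* read-back: the read cannot return until the
                                  peripheral has accepted the write */
}

/* The alternative I'm asking about: write, then a barrier. */
static void enable_with_dsb(void)
{
    TIM2->CR1 |= TIM_CR1_CEN;  /* write the register */
    __DSB();                   /* CMSIS intrinsic for the DSB instruction */
}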
-
The ARM instruction ensures that the previous instructions have completed.
It does not take into account other delays that may be present in the peripheral. For example, some peripherals operate in a different clock domain and will have additional clock delays before the write actually reaches the register.
Reading back the register will ensure these have completed.
-
The ARM instruction ensures that the previous instructions have completed.
It does not take into account other delays that may be present in the peripheral. For example, some peripherals operate in a different clock domain and will have additional clock delays before the write actually reaches the register.
Reading back the register will ensure these have completed.
The DSB instruction will do that as well:
DSB acts as a special data synchronization memory barrier. Instructions that come after the DSB,
in program order, do not execute until the DSB instruction completes. The DSB instruction
completes when all explicit memory accesses before it complete.
https://documentation-service.arm.com/static/5f2ac76d60a93e65927bbdc5 page 208
-
Now the question is, when is this actually useful? Doing it every time for no actual reason is pointless. The cases where you need to ensure the write has completed before the next instruction executes are not that common.
-
.., when is this actually useful?
For example, when you want to enable some hardware and you need to initialize some (other) hardware first
because it depends on it.
ST's HAL is full of examples where they read back a register immediately after writing to that register.
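The clock-enable macros are the canonical case. This is roughly what __HAL_RCC_GPIOA_CLK_ENABLE() expands to in the F4 HAL (paraphrased; see stm32f4xx_hal_rcc.h in your own HAL version for the exact macro):

Code: [Select]
#include "stm32f4xx.h"

static void gpioa_clock_enable(void)
{
    __IO uint32_t tmpreg;
    SET_BIT(RCC->AHB1ENR, RCC_AHB1ENR_GPIOAEN);            /* enable the GPIOA clock */
    /* "Delay after an RCC peripheral clock enabling" */
    tmpreg = READ_BIT(RCC->AHB1ENR, RCC_AHB1ENR_GPIOAEN);  /* read-back */
    (void)tmpreg;
    /* only now is it safe to touch the GPIOA registers */
}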
-
The DSB instruction will do that as well:
I do not see how the DSB instruction has any relation to the INTERNAL data paths within a peripheral.
The DSB instruction works at the processor bus level.
-
The DSB instruction will do that as well:
I do not see how the DSB instruction has any relation to the INTERNAL data paths within a peripheral.
The DSB instruction works at the processor bus level.
Good point. Maybe there's an (embedded) arm specialist in the room who can shine a light on this...
-
STM32F peripherals run off their own clock, which is a fraction of the CPU clock, and they seem to be asynchronous, so there is synchronisation involved to prevent metastability. The CPU, having written some peripheral register, has no way of knowing what the peripheral is doing internally. This is because the various "arm32" manufacturers have licensed the ARM core as an IP block and they put their own peripherals (which they probably also bought in as IP blocks) around that.
-
This is in the errata
(https://peter-ftp.co.uk/screenshots/20221102225034314.jpg)
https://www.st.com/content/ccc/resource/technical/document/errata_sheet/0a/98/58/84/86/b6/47/a2/DM00037591.pdf/files/DM00037591.pdf/jcr:content/translations/en.DM00037591.pdf
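The workaround as I read it, sketched with SPI1 as an arbitrary example (assuming this is the usual "delay after an RCC peripheral clock enabling" erratum):

Code: [Select]
#include "stm32f4xx.h"

static void spi1_clock_enable(void)
{
    RCC->APB2ENR |= RCC_APB2ENR_SPI1EN;  /* enable the SPI1 bus clock */
    __DSB();                             /* workaround 1: barrier */
    /* or: (void)RCC->APB2ENR; */        /* workaround 2: dummy read-back */
    SPI1->CR1 |= SPI_CR1_MSTR;           /* first access to the peripheral */
}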
-
.., when is this actually useful?
For example, when you want to enable some hardware and you need to initialize some (other) hardware first
because it depends on it.
I get that, but how often is it required in practice? Note that it would be a potential issue only if you access registers of peripherals that are NOT on the same clock domain (different APBs). If they share the same APB, the sequence is guaranteed. Isn't it?
ST's HAL is full of examples where they read back a register immediately after writing to that register.
Not going to comment too much on HAL's example code. ::)
But generally speaking, they want to make it as generic as possible to cover all situations (which is the whole approach of the HAL anyway), so I would treat that code accordingly. Do things because you know what they do and what they are useful for, not just because you saw them somewhere and so they have got to be the right way of doing things. Magical thinking, you know the deal.
-
What is interesting is that they (screenshot above) do recommend DSB as a solution even for peripherals running at a much slower clock.
This tells us that there is a "back sync" action from the peripheral.
Well, the process of a 180MHz CPU writing to a 40MHz peripheral (which could be much slower, e.g. 10MHz, but look out: e.g. the ethernet stuff breaks, with an undocumented and recently discovered bug, if the divisor is above a certain value) is probably implemented with loads of wait states - same as it always was, all the way back to 1975. So DSB will work because the wait states will stretch whichever instruction triggered them.
HAL code is good for reference and for getting started. The whole business of Cube MX generating chunks of code is pretty horrible but gets people started much faster than reading the 2000-page RM would. But it becomes clear that much of the HAL code was written by inexperienced programmers. For example, look at HAL_SPI_TransmitReceive() and how they cover every possible SPI config option. You would never write real code that way. Occasionally this bloatware bites you on the bum.
https://www.eevblog.com/forum/microcontrollers/32f417-what-could-stop-this-memory-spi3-dma-transfer-running/
-
Yep, that's why I wrote all my libs for clock & pll, uart, spi, i2c, dma, adc, ethernet (not TCP/IP), etc. myself.
The problem starts when I want to use usb...
-
You will spend a large % of your life doing USB (CDC, MSC) and ETH (TCP/IP, TLS, etc.) yourself. Ask me how I know :) Libraries like MbedTLS and LwIP have tens of man-years in them. In my product, the total size is 400k, of which MbedTLS is 200k! Unless your product is trivial, or you are building on top of something else, you cannot develop it all yourself anymore.
Anyway this is digressing.
-
I get that, but how often is it required in practice? Note that it would be a potential issue only if you access registers of peripherals that are NOT on the same clock domain (different APBs). If they share the same APB, the sequence is guaranteed. Isn't it?
Maybe, but I wouldn't necessarily depend on it. Some peripherals might have internal clocks that differ from their bus clock. DSB ensures that the writes are issued in a well-defined order; the register read ensures that the peripheral has completed processing the write before any further commands are executed. Those are two different behaviors and intents. I'm not sure when it is needed: ideally the peripheral documentation would cover this sort of thing, but often it doesn't. It's not something I would change without a good reason.
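A sketch of the difference, with TIM3 as a made-up example (not taken from any particular RM):

Code: [Select]
#include "stm32f4xx.h"

static void timer_reconfigure(uint32_t arr)
{
    TIM3->CR1 &= ~TIM_CR1_CEN;  /* stop the counter */
    __DSB();                    /* orders this write before anything below,
                                   but says nothing about the timer's
                                   internal state */
    (void)TIM3->CR1;            /* read-back: the peripheral has taken the
                                   write before we go on */
    TIM3->ARR = arr;            /* reconfigure */
    TIM3->CR1 |= TIM_CR1_CEN;   /* restart */
}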