ASM programming is FASCINATING!
ebastler:

--- Quote from: Nusa on July 30, 2020, 09:26:54 am ---To the OP that's just having fun learning, I recommend the following book:

--- End quote ---

We seem to have lost the OP a while ago. More specifically, we lost him right after his original post. Maybe the fascination didn't last.  ::)
Siwastaja:

--- Quote from: Berni on July 30, 2020, 03:09:47 pm ---Yes, above around 100MHz the flash tends to start having latency issues keeping up with the CPU. It depends a lot on the implementation, so it varies between vendors. One of the common vendors, ST, uses cache-accelerated flash for running an ARM Cortex M4 core at 180MHz. All of the larger chips from ST using the ARM Cortex M7 and similar then have a full-on instruction and data cache right in the CPU. The fastest H7 family doesn't even run the internal RAM at full CPU speed anymore, so if you want to actually make proper use of all of those 480MHz you need to use the cache.

--- End quote ---

No, these STM32H7 parts (and even the lower-end F7 series) do have core-coupled RAMs that run at the CPU frequency, so there is no need to run with caches. Cache becomes very useful when you have large applications you want to run off an SD card or something.

The whole idea of running code from RAM benefits significantly from using a dedicated RAM coupled just to the core, so that instruction fetches do not appear on the bus. From a chip design viewpoint, this is a relatively small extra cost compared to having a single larger RAM; it is the SRAM kilobytes themselves that are expensive. (Of course, dividing the total kilobytes among several different interfaces makes the usage of the RAM less flexible if you need large tables or dynamic allocation.)

In my experience, the ITCM / "boost" RAM / scratchpad / whatever-they-call-it sections tend to be so large (for example, 64kB for the H7 parts I have worked with) that you can fit most of your code there, not just the most obviously timing-critical pieces. This makes the decision of what to put there much easier: place it in ITCM unless you are sure the performance doesn't matter. For me, all sorts of init functions tend to remain in flash and run from there.
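
A minimal sketch of the "ITCM by default, init code in flash" split described above, assuming a GCC-style toolchain. The section name `.itcm_text` and the macro `ITCM_CODE` are hypothetical: the name has to match whatever your linker script defines, and because ITCM is RAM, the startup code also has to copy that section from flash into ITCM before any of these functions are called.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section name; it must match an output section in your
   linker script that is placed in ITCM. On non-ELF or non-GCC toolchains
   the macro expands to nothing and the code simply stays in default memory. */
#if defined(__GNUC__) && defined(__ELF__)
#define ITCM_CODE __attribute__((section(".itcm_text"), noinline))
#else
#define ITCM_CODE
#endif

/* Runs often, so it is tagged for ITCM by default. */
ITCM_CODE uint32_t sum_bytes(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0u;
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}

/* Runs once at startup, so it is left untagged and executes from flash. */
void init_table(uint8_t *table, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        table[i] = (uint8_t)i;
    }
}
```

With a macro like this, moving a function between flash and ITCM is a one-word change, which fits the "put it in ITCM unless you know it doesn't matter" approach.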
MK14:

--- Quote from: Siwastaja on July 30, 2020, 05:53:10 pm ---This isn't actually true IMHO, all of this is easily controlled using the volatile qualifier in C.

--- End quote ---

Nice post overall, but 'volatile' is not always enough. The thing is that some hardware setup has strict timing 'rules', imposed by the datasheet and/or hardware design.
E.g. an A to D converter which insists that, after enabling it, you wait X microseconds or machine cycles before using it, so that it can stabilise/initialise itself properly.
Or watchdog timers which can only be enabled or disabled within a certain time (e.g. a number of clock cycles) after initially coming out of reset, to stop errant software from being able to interfere with the watchdog timer later.

In assembler, these timings are fairly easily nailed down by counting the clock cycles (which isn't necessarily that precise these days, as already discussed in this thread).
But a highly optimising compiler may interfere too much with the timings (although one option might be to disable or reduce the optimisation setting for the hardware initialisation part of the code). Bit-banging and hardware timings can also be tricky when compiler optimisations are enabled.

In some extreme cases (I remember worrying about this a long time ago; I can't remember whether I had to do it in the end), even the initialisation performed by the C compiler (e.g. clearing sections of memory, compiler dependent) can take so long (in fast hardware terms, like milliseconds) that you can't meet the specifications of the product without putting assembler startup code BEFORE the C code has initialised itself, even before the first C statement can be executed.
E.g. to check for some startup options (defined by the status of some external I/O port pins), then do things and/or disable/enable other I/O things to meet those requirements.

I can't remember exactly, but off the top of my head it was something worryingly long, like 500 milliseconds, to finish the C compiler initialisations (on a relatively slow processor; it was too long ago and I could be misremembering, but it was longish).
It was to switch some stuff on or off to protect the hardware immediately after switch-on (coming out of reset). Otherwise, the hardware could damage itself (arguably poor design, not my design anyway).

Anyway, I agree that there are various things you can do, so that if you are really determined you can solve most/all of these issues and implement such schemes in C code.

E.g. using hardware timers and/or interrupts to get the timing right, independently of the instruction execution times.
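
As a rough sketch of the hardware-timer approach: the wait below depends only on a free-running tick counter, not on how many instructions the compiler emits for the loop. All names here (`tick_source_fn`, `wait_ticks`, `fake_now`) are made up for illustration; on a real part the tick source would read a timer register, while `fake_now` is only a host-side stand-in so the code can be run on a PC.

```c
#include <stdint.h>

/* Hypothetical abstraction: returns the current value of a free-running
   32-bit tick counter. On hardware this would read a timer register. */
typedef uint32_t (*tick_source_fn)(void);

/* Busy-wait until at least `ticks` ticks have elapsed.
   The unsigned subtraction stays correct across counter wraparound. */
void wait_ticks(tick_source_fn now, uint32_t ticks)
{
    uint32_t start = now();
    while ((uint32_t)(now() - start) < ticks) {
        /* spin: elapsed time is measured by the timer, not by this loop */
    }
}

/* Host-side stand-in for a hardware timer: advances one tick per read.
   It starts close to the wrap point to exercise the wraparound case. */
uint32_t fake_ticks = 0xFFFFFFF0u;

uint32_t fake_now(void)
{
    return fake_ticks++;
}
```

For something like the A to D example, you would enable the converter, call `wait_ticks()` with the datasheet's settling time converted to ticks, and only then start a conversion. The wait can come out longer than requested (interrupts, loop overhead) but not shorter, which is what a "wait at least X" rule needs; it does not help with "exactly N cycles" requirements.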
Siwastaja:
MK14,

Indeed, setting up the C "runtime" for running standard-compliant C does take time. More specifically, because the C standard guarantees that global (or static, inside of functions) variables, tables etc. without an explicit initializer are all initialized to zero, there must be some piece of code before main() that zeroes all these variables. Similarly, there must be a piece of code which copies the initial values of all initialized globals and statics from the flash memory. That's basically what the magical "startup code" is - completely trivial.
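
To show how little is involved, here is a minimal sketch of those two steps in plain C. The function names are hypothetical, and the regions are passed in as parameters so the code can run on a PC; in real startup code the addresses and sizes come from symbols defined in the linker script, whose names vary between toolchains.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the initial values of initialized globals/statics from their
   image in flash to their run-time location in RAM (.data). */
void copy_data(uint32_t *dst, const uint32_t *src, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        dst[i] = src[i];
    }
}

/* Zero the region that holds globals/statics without an explicit
   initializer (.bss), as the C standard requires before main(). */
void zero_bss(uint32_t *bss, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        bss[i] = 0u;
    }
}
```

A reset handler would call these two functions and then call main(). The time this takes scales with the size of .data and .bss, so it is those sizes, not the choice of language, that determine how long the startup lasts.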

However, you can decide not to use the compiler-generated startup code and do that yourself - either in asm or C. This should be possible in most compilers.

Clearing the bss and copying data in ASM isn't going to magically be any faster than the same in C: the implementation is trivial, and the C compiler likely generates a well-optimized copy and zeroing loop even at the lower optimization levels.

You can, of course, decide to perform certain important tasks before bss is zeroed or data copied (I do that all the time; you basically just move the important main() things out of main to the reset vector handler, or whatever you call it), or decide not to conform to the C standard and not clear .bss, and so on. But you don't need any ASM to do any of this; the same result can be achieved in either C or ASM.

That really leaves us with those peripherals that require cycle-accurate control sequences. I find those are exceedingly rare, although yes, they do still pop up from time to time, and using some asm to access them has a better chance of being consistent than writing some dummy operations in C and hoping the generated code stays the same.
MK14:

--- Quote from: Siwastaja on July 30, 2020, 07:27:28 pm ---However, you can decide not to use the compiler-generated startup code and do that yourself - either in asm or C. This should be possible in most compilers.

--- End quote ---

That particular example was from a very old compiler (compared to the ones these days). It was for an 8-bit processor (8051/2 series), and those compilers were trickier and somewhat/potentially eccentric (the 8051 can be a funny sort of beast, e.g. the way its I/O ports work!). Some old 8-bit processors like that, with old compiler technology, were not especially brilliant.
It sticks in my mind because I remember the difficulties that needed to be sorted out.