ASM programming is FASCINATING!

tggzzz:

--- Quote from: MK14 on July 28, 2020, 06:57:15 pm ---
--- Quote from: MK14 on July 28, 2020, 12:32:28 pm ---
--- Quote from: Kleinstein on July 28, 2020, 12:09:40 pm ---It does offer the advantage of a well-defined run time, so those old-style waiting loops become accurate.

--- End quote ---

With the rather old processors/architectures, you are right. The odd modern one, here and there, can be an exception as well (e.g. XCore).

Disclaimer: The above paragraphs were written quickly and may have small (or bigger) technical inaccuracies. If you are really interested in this stuff, there is some good/interesting stuff (books and good internet stuff) out there.

--- End quote ---

No one else seems to have so far, so I will 'attack' my own post.

Actually, no: even the old processors were not necessarily cycle-by-cycle accurate.

Examples:
WAIT lines (the name depends on the CPU, e.g. READY), which insert wait states to help with slower memory, I/O and sometimes other things. Some schemes are somewhat non-deterministic (e.g. a video card that may or may not be accessing the video memory at the same instant).

If the looping software is waiting for a bit or value in an internal or external peripheral to appear or change, the timing can be asynchronous and/or random. With an A to D converter, for example, the precise timing may depend on how long the analogue and/or digital circuitry takes (potentially asynchronous to the master CPU clock).

Some instructions take a variable number of cycles depending on the exact operand values, and the documentation doesn't always bother to define what they are. E.g. some integer divide instructions: the manual might say the instruction takes from 2 to 15 cycles, depending on the values involved. It could be deterministic if you researched how the cycle count varies, but most people probably won't do that. There could be documentation somewhere that defines it (e.g. each 1 bit in a specified register adds a cycle, on top of a minimum of 2 cycles plus the addressing-mode penalty).

Not an exhaustive list. Also not the best of examples, because they are sort of deterministic; it is just that the exact times will vary in practice, depending on somewhat defined things (such as A to D response time variation).

--- End quote ---

All true, but most are "outside" the computer, and so "don't count". The "difficult" cases involve instructions like that division instruction, and non-fixed loops in general.
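To make the division case concrete, here is a rough sketch of the kind of cost model MK14 describes (a base count, plus one cycle per set bit, plus an addressing-mode penalty). The function names and the numbers are made up for illustration and do not describe any real CPU; the point is only that once the rule is known, best and worst cases fall out directly.

```python
def divide_cycles(operand: int, addr_mode_penalty: int = 0, base_cycles: int = 2) -> int:
    """Hypothetical cost model: base cycles + one cycle per set bit + addressing penalty."""
    if operand < 0:
        raise ValueError("operand must be non-negative")
    set_bits = bin(operand).count("1")  # number of 1 bits in the operand
    return base_cycles + set_bits + addr_mode_penalty


def cycle_bounds(width_bits: int, addr_mode_penalty: int = 0, base_cycles: int = 2):
    """Best and worst case for an operand register of the given width."""
    best = divide_cycles(0, addr_mode_penalty, base_cycles)  # no bits set
    worst = divide_cycles((1 << width_bits) - 1, addr_mode_penalty, base_cycles)  # all bits set
    return best, worst
```

For hard-coded timing you would budget for the worst case returned by cycle_bounds, or pad the faster paths so every division costs the same.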

MK14:

--- Quote from: tggzzz on July 28, 2020, 08:22:29 pm ---All true, but most are "outside" the computer, and so "don't count". The "difficult" cases involve instructions like that division instruction, and non-fixed loops in general.

--- End quote ---

I agree.
Assuming no wait states (fixed-duration ones can be counted as extra cycles), interrupts or anything else that can affect the timing is active, the cycle times (only on certain CPUs, e.g. old ones) are predictable/deterministic.
My bringing in less deterministic external (sometimes internal) events was wrong, at least until you start worrying about the overall system in a complex real-time system, which is not what we are discussing.
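As a rough illustration of how an old-style waiting loop gets sized on such a CPU, including fixed wait states counted as extra cycles. The clock speed and cycle counts in the usage note are assumptions picked for the example, not figures for any particular part.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounding up, so the delay is never shorter than requested."""
    return -(-a // b)


def loop_iterations(delay_us: int, clock_hz: int, cycles_per_iter: int,
                    overhead_cycles: int = 0, wait_states_per_iter: int = 0) -> int:
    """Iterations needed for a busy-wait loop to last at least delay_us microseconds."""
    total_cycles = ceil_div(delay_us * clock_hz, 1_000_000)
    per_iter = cycles_per_iter + wait_states_per_iter  # fixed wait states are just extra cycles
    remaining = max(total_cycles - overhead_cycles, 0)  # setup/return cost is paid once
    return ceil_div(remaining, per_iter)
```

For example, a 1 ms delay at an assumed 1 MHz clock with a 5-cycle loop body needs loop_iterations(1000, 1_000_000, 5) iterations. None of this holds once interrupts or variable wait states are in play.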

tggzzz:

--- Quote from: MK14 on July 28, 2020, 09:01:19 pm ---
--- Quote from: tggzzz on July 28, 2020, 08:22:29 pm ---All true, but most are "outside" the computer, and so "don't count". The "difficult" cases involve instructions like that division instruction, and non-fixed loops in general.

--- End quote ---

I agree.
Assuming no wait states (fixed-duration ones can be counted as extra cycles), interrupts or anything else that can affect the timing is active, the cycle times (only on certain CPUs, e.g. old ones) are predictable/deterministic.
My bringing in less deterministic external (sometimes internal) events was wrong, at least until you start worrying about the overall system in a complex real-time system, which is not what we are discussing.

--- End quote ---

It isn't "wrong" per se, and is important, but it is arguably a different discussion.

T3sl4co1l:
The lesson on timing is, of course: if you have enough CPU power to do it, and deterministic timings, then you can hard code it; if not, then buffer it, and make sure you have enough CPU power to get through the worst-case paths to refreshing that buffer in time.  AVRs (most of them) don't have DMA, but most ARMs do (above the entry-level Cortex-M0 tier, say).  Those ARMs usually also have caches -- though they may be documented cryptically, e.g. as a "Flash memory accelerator"...
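A back-of-envelope sketch of the "buffer it" rule, under simplifying assumptions: the buffer drains at a constant sample rate, and the refill code can be held off for some known worst-case latency. The function names and the figures in the usage note are hypothetical.

```python
def min_buffer_samples(sample_rate_hz: int, worst_case_latency_us: int) -> int:
    """Samples drained while the refill code is held off, rounded up."""
    return -(-sample_rate_hz * worst_case_latency_us // 1_000_000)


def refill_keeps_up(sample_rate_hz: int, worst_case_cycles_per_sample: int,
                    clock_hz: int) -> bool:
    """True if producing samples along the worst-case path needs less than the whole CPU."""
    return sample_rate_hz * worst_case_cycles_per_sample < clock_hz
```

So a 48 kHz output that can be starved for up to 1 ms needs at least min_buffer_samples(48000, 1000) samples of buffering, and both checks have to pass: a deep buffer does not help if the worst-case refill path cannot keep up on average.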

This is what PCs do; although that PCs can deliver multimedia at all is still something of a miracle on top of that.  Most OSs don't guarantee program execution within, really, any period of time.  The CPU is one thing, but the OS is a huge pile of APIs, caches and priority queues, and the granularity is not very impressive.  They just happen to work most of the time, say, jumping back into your program every millisecond or so.

Indeed, modern application processors are so fast that you might not care at all about their nondeterministic execution time; an AVR or Cortex-M0 can execute one or two instructions in the time it takes the big CPU to execute a hundred -- and those instructions are vastly more powerful, operating on more data (including SIMD extensions) in ever richer ways.  In that fraction of a microsecond, the entire computation might be complete, whereas the deterministic CPUs are just sitting down to work.  Not to mention if multiple cores are employed (not that their outputs will be combined until much later, due to inter-CPU communication and cache coherency).  This is partly why a , if imperfectly (but you need to use a kernel mode driver to get around the OS's context switching).

Tim

MK14:

--- Quote from: T3sl4co1l on July 28, 2020, 09:24:18 pm ---The lesson on timing is, of course: if you have enough CPU power to do it, and deterministic timings, then you can hard code it; if not, then buffer it, and make sure you have enough CPU power to get through the worst-case paths to refreshing that buffer in time. 

--- End quote ---

In practice, there are various techniques for getting back near-enough deterministic behaviour from powerful modern CPUs, such as the latest PCs. You described one way.
For example, in game programming it is usual to exploit the fact that you can easily read the actual elapsed time, and use that when calculating time-dependent things.
So, although the actual time jitters about like crazy, the calculations move things in proportion to how much time has elapsed since the last event/redraw/movement of the game object you are currently processing.
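A minimal sketch of that elapsed-time approach, assuming a simple one-dimensional object moving at constant velocity. The function names are made up for illustration; time.monotonic is the standard library clock, and the clock is passed in as a parameter only so the loop can be exercised with fake timestamps.

```python
import time


def step(position: float, velocity: float, dt: float) -> float:
    """Advance in proportion to elapsed time, not by a fixed amount per frame."""
    return position + velocity * dt


def run(frames: int, velocity: float = 100.0, clock=time.monotonic,
        work=lambda: None) -> float:
    """Run a number of frames; each frame may take a different amount of time."""
    position = 0.0
    last = clock()
    for _ in range(frames):
        work()               # rendering, input, etc.; duration jitters
        now = clock()
        position = step(position, velocity, now - last)
        last = now
    return position
```

However unevenly the frames arrive, the object ends up where it should for the total time that has passed; only the smoothness of the motion suffers.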

I began to be impressed with your video clip, putting Doom onto relatively ancient, very low-capability hardware, until I heard them mention putting a Raspberry Pi in it. Not quite totally and utterly cheating, I guess. It is still a significant challenge to interface it into the cartridge, and to get the low-resolution, limited-colour display to act like a much more modern, somewhat high-resolution, many-colour one.
So all things considered, it wasn't too bad!
