I was just toying with the idea of implementing result caches for arithmetic operations that take several cycles, and was wondering if they could be worth considering.
Then while looking up if that idea had already been considered, I found a patent (damn! But it's expired): https://patents.google.com/patent/US5260898A/en
What do you guys think? Could it have any real-life benefit, in your opinion? Have any of you ever implemented it, and/or seen a real processor implementing it?
Somewhat often, the incoming (to-be-processed) data into a function (or similar) is only BYTE sized: TEXT files/arrays, and/or other smallish values (e.g. dates 1..31, ages 1..150, most car speeds in MPH).
So, consider an encryption routine (or similar).
Let's say (hypothetically, i.e. NOT correct in real life) you take each BYTE of an array or file and perform...
EncryptionHash += (((CHAR_BYTE * 68) + 12) / 112) & 0xFF (pseudocode).
On the one hand, this will take many cycles (let's assume DOUBLEs are used). E.g. the divide alone might take 50 CPU cycles, especially on an older CPU. (Ignoring that pipelining, superscalar execution, etc. may partly or largely hide that delay.)
On the other hand, there are only 256 possible (char) values to calculate, i.e. to fill the cache, even if the text array/file is millions of bytes long. Hence, with no arithmetic cache, it might take 75 cycles per byte x 2,000,000 text bytes to encrypt = 150 million CPU cycles.
With the arithmetic cache, it might take 1 cycle per byte (if the cache is fast enough) x 2,000,000 = 2 million cycles.
That's a 75x speedup.
N.B. This ignores superscalar execution, pipelining, and other effects, which may significantly change these figures.
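The software analogue is easy to show. Here's a minimal C sketch, purely illustrative (build_lut and hash_buffer are made-up names, and the transform is the hypothetical one above): because the input is a single byte, the whole calculation can be precomputed into a 256-entry table, and the per-byte work collapses to one table load:

```c
#include <stdint.h>
#include <stddef.h>

static uint8_t lut[256];

/* Precompute the (hypothetical) transform for all 256 byte values.
 * The slow double divide now happens only 256 times, not per byte. */
static void build_lut(void)
{
    for (int v = 0; v < 256; v++)
        lut[v] = (uint8_t)((uint32_t)(((double)v * 68.0 + 12.0) / 112.0) & 0xFF);
}

static uint32_t hash_buffer(const uint8_t *buf, size_t len)
{
    uint32_t encryption_hash = 0;
    for (size_t i = 0; i < len; i++)
        encryption_hash += lut[buf[i]]; /* one load instead of mul+add+div */
    return encryption_hash;
}
```

The hardware version would presumably do this transparently, filling entries on misses instead of precomputing them all up front.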
This could apply to many actual programs: it could have been pseudorandom number generation, checksum calculation, hashing, database work, weather-prediction data (temperatures in deg C), etc. etc.
Even if the variable(s) are considerably bigger than shown above, the actual values may be limited to a tiny subset of the available range.
E.g. house numbers being hashed for a database. Although the variable might be an INT32 type, to allow for very big house numbers, in practice they will usually be less than 10,000. So such a hashing algorithm may only actually see ~5,000 distinct house numbers while processing tens of millions of addresses.
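That case maps naturally onto a small direct-mapped result cache, which is roughly what the hardware version would be too: index by the low bits of the operand, keep the full operand as a tag, and reuse the stored result on a hit. A sketch in C (expensive_hash() is a made-up stand-in for whatever costly per-record calculation is being cached):

```c
#include <stdint.h>
#include <stdbool.h>

#define CACHE_SIZE 1024 /* must be a power of two, for the mask below */

struct entry {
    int32_t  tag;    /* the operand that produced 'result' */
    uint32_t result;
    bool     valid;
};

static struct entry cache[CACHE_SIZE];

/* Placeholder for a genuinely slow calculation. */
static uint32_t expensive_hash(int32_t x)
{
    return ((uint32_t)x * 2654435761u >> 7) ^ (uint32_t)x;
}

static uint32_t cached_hash(int32_t x)
{
    struct entry *e = &cache[(uint32_t)x & (CACHE_SIZE - 1)];
    if (!(e->valid && e->tag == x)) { /* miss: compute and fill */
        e->tag    = x;
        e->result = expensive_hash(x);
        e->valid  = true;
    }
    return e->result; /* hit: the slow path is skipped entirely */
}
```

How well it pays off depends on how the values repeat: ~5,000 distinct values contending for 1,024 slots means some collisions, so a larger table (or a real hash map) helps when the working set outgrows the cache.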
This technique is sometimes used in software, where very time-consuming calculations are done but the same or similar data is often repeated: quick hashed lookup tables can be used for already-calculated values.
E.g. electronic circuit simulation, where the voltage at some nodes hasn't changed yet (or has returned to a previously calculated value, e.g. an oscillator), so the calculation(s) have already been done, despite the millions (approximately, depending on the algorithm used, resolution, etc.) of repeated small time steps being calculated.
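For the "hasn't changed yet" case, even a one-entry cache per node is enough. A C sketch, with solve_node() as a made-up stand-in for the real per-node solver (the oscillator case, where old values recur rather than persist, would want a small tagged table like the sketch above instead):

```c
#include <stdbool.h>

struct node {
    double last_v_in;  /* input voltage the last time we solved */
    double last_v_out; /* cached solution for that input */
    bool   solved_once;
};

/* Placeholder for the expensive per-node arithmetic. */
static double solve_node(double v_in)
{
    return v_in * 0.5; /* e.g. an ideal resistive divider */
}

static double step_node(struct node *n, double v_in)
{
    /* Exact comparison is intentional: we only reuse the result
     * when the input is bit-for-bit identical to last time. */
    if (!n->solved_once || v_in != n->last_v_in) {
        n->last_v_out  = solve_node(v_in); /* recompute on change */
        n->last_v_in   = v_in;
        n->solved_once = true;
    }
    return n->last_v_out; /* otherwise reuse the cached solution */
}
```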