Author Topic: 68030 prefetch thoughts?  (Read 288 times)


ZaneKaminski (topic starter)
68030 prefetch thoughts?
« on: July 20, 2024, 12:07:25 am »
Anyone ever implement a prefetcher for an MC68030 or similar generation system?

For fun, I'm working on a RAM controller for a 33 MHz MC68030-based system. The RAM controller is implemented in a modern FPGA and the RAM is 133 MHz SDR SDRAM connected to the 68030 data bus through some 74LVC245A level-shifting buffers. I have achieved 4-clock read access latency for "random reads" and 3-clock latency when there’s a hit in an open DRAM row. The row size is 2 kB and there are 8 SDRAM banks with independent sense amps. My controller always leaves the row open after a read/write and it interleaves each 2 kB chunk of RAM across the 8 banks so that you have to go 16 kB to get a bank conflict. In case of a bank conflict, the precharge and activate can both be completed in 4 clocks, same as if the bank was closed.
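To make the interleave concrete, here's a quick Python sketch of the address-to-bank/row mapping described above (2 kB rows striped across 8 banks, so two addresses only conflict when they're a multiple of 16 kB apart with different rows). The function name and exact bit split are my own illustration, not from any actual controller RTL:

```python
ROW_BYTES = 2 * 1024   # 2 kB SDRAM row
NUM_BANKS = 8          # 8 banks with independent sense amps

def decode(addr: int):
    """Split a physical address into (bank, row, offset) under the
    2 kB-per-bank interleave: consecutive 2 kB chunks rotate through
    the banks, so a bank repeats only every 16 kB."""
    offset = addr % ROW_BYTES
    bank = (addr // ROW_BYTES) % NUM_BANKS
    row = addr // (ROW_BYTES * NUM_BANKS)
    return bank, row, offset
```

With this mapping, address 0 and address 16 kB land in the same bank but different rows, which is exactly the 4-clock precharge+activate case.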

So with this arrangement, plus a 32 kB, 8-way, 3-clock latency L2 cache in the FPGA, I expect to be getting 3-clock accesses much of the time. The pin-to-pin latency of the FPGA through the level shifting buffers is too slow to do a 2-clock access in any case, so 3 clocks is the best I can do. In addition to the cache and fast-ish RAM, I’d like to implement a “prefetcher” to reduce the remaining number of 4-clock RAM accesses. Since a row hit is 3 clocks and I can’t go faster than that, what I really need is not so much a prefetcher, but a pre-precharge-and-activator. It needs to decide to close a particular row in one of the 8 banks and open a different one, thus converting a future 4-clock access into a 3-clock access (at least if the prediction is right).
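One obvious starting point is a sequential "next-row" pre-activator: after each demand access, speculatively precharge-and-activate the row holding the next sequential 2 kB chunk. A nice property of the interleave is that the sequentially-next row always lives in a *different* bank, so the speculative activate never disturbs the row just used. A toy Python model (all names and the open-row bookkeeping are my own sketch, not a real design):

```python
ROW_BYTES = 2 * 1024   # 2 kB SDRAM row
NUM_BANKS = 8

def decode(addr: int):
    """(bank, row) under the 2 kB-per-bank interleave."""
    return (addr // ROW_BYTES) % NUM_BANKS, addr // (ROW_BYTES * NUM_BANKS)

class NextRowPreactivator:
    """After each demand access, speculatively open the row that a
    sequential stream would touch next (one 2 kB chunk ahead)."""
    def __init__(self):
        self.open_row = [None] * NUM_BANKS  # currently-open row per bank

    def access(self, addr: int) -> bool:
        bank, row = decode(addr)
        hit = self.open_row[bank] == row    # row hit -> 3-clock access
        self.open_row[bank] = row           # demand access opens the row
        # Speculation: precharge+activate the next sequential chunk's row.
        # It is always in the next bank, so it can't evict the row we
        # just used -- only a row 16 kB away from the predicted one.
        nbank, nrow = decode(addr + ROW_BYTES)
        self.open_row[nbank] = nrow
        return hit
```

In this model a pure sequential stream misses only on its first access; every later row crossing finds the row already open. The real cost question is what the speculative activate evicts, which is where a smarter policy would help.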

Any thoughts on a good algorithm? A lot of prefetchers have been studied in academia and industry, but this particular situation is kind of odd. Modern systems prefetch something on the order of 64 or 128 bytes at a time and the latency is reduced by 10x when there's a correct prediction. Here I just want to open the right row, so the equivalent prefetch size is 2 kB and the latency is only reduced by 25%. Quite a different situation from what has been studied.
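The small upside also changes the break-even math. With my own toy cost model (an assumption, not measured): a correct pre-activate saves 1 clock (4 to 3), while a wrong one that closes a still-live row costs 1 extra clock on that row's next access (3 back to 4). The expected gain per prediction then depends on both the predictor's accuracy and the probability the evicted row would have been reused:

```python
def expected_gain(accuracy: float, victim_reuse_prob: float) -> float:
    """Toy cost model (my assumption, not from any measurement):
    a correct pre-activate saves 1 clock; an incorrect one costs
    1 clock only if the evicted row would have been hit again."""
    save = accuracy * 1.0
    cost = (1.0 - accuracy) * victim_reuse_prob * 1.0
    return save - cost
```

So even a 50%-accurate predictor is a net win whenever the evicted rows were unlikely to be reused, which is a much looser requirement than conventional cache-line prefetchers face.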
 

