Author Topic: FPGA Memory arbitration  (Read 4121 times)


Offline kd5pev (Topic starter)

  • Newbie
  • Posts: 3
FPGA Memory arbitration
« on: May 12, 2013, 12:38:38 am »
Hello.
I am working on a project where I need to interface a microcontroller (AT90USB1286) with an FPGA (Spartan-6 LX9).

I currently have the AVR connected to the FPGA via the AVR's XMEM interface.
Inside the FPGA, this is connected to a dual port 65536x8 blockram (64 KByte).

Now, I want the majority of this RAM to be left to the AVR for its own purposes -- heap, stack, etc.
But, I want the FPGA to be more than a glorified SRAM chip; I want there to be a few memory mapped peripherals made from FPGA fabric.

Attached to this post is a general picture of what I want to do: ~48Kbyte of RAM for the AVR, and then 4 peripherals of 4Kbyte each.
My question is: how do I arbitrate access to the shared RAM to the peripherals?
 

Offline Rufus

  • Super Contributor
  • ***
  • Posts: 2095
Re: FPGA Memory arbitration
« Reply #1 on: May 12, 2013, 01:09:50 am »
My question is: how do I arbitrate access to the shared RAM to the peripherals?

I don't think "arbitrate" is the right word, but then it's not exactly clear what you are trying to do.

For what I think you are trying to do you just decode the AVR address bus providing a select signal for the RAM for addresses up to 48k and select signals for the 4k blocks above that. The select signals gate writes to the associated RAM or block and gate data onto the AVR data bus for reads.
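
As a rough illustration of that decode, here is a small behavioural model in C (not HDL). The memory map is an assumption for the sketch: RAM at 0x0000-0xBFFF and four 4K windows from 0xC000 upward. All names (`decode`, `bus_write`, `bus_read`) are made up for the example, and the peripherals are modelled as plain byte arrays.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed map: 48K of RAM, then four 4K peripheral windows. */
#define RAM_SIZE     0xC000u
#define PERIPH_SIZE  0x1000u
#define NUM_PERIPH   4u

enum select { SEL_RAM, SEL_PERIPH1, SEL_PERIPH2, SEL_PERIPH3, SEL_PERIPH4 };

static uint8_t ram[RAM_SIZE];
static uint8_t periph[NUM_PERIPH][PERIPH_SIZE];  /* stand-ins for real logic */

/* Decode a 16-bit bus address into exactly one select line. */
enum select decode(uint16_t addr)
{
    if (addr < RAM_SIZE)
        return SEL_RAM;
    /* 0xC000..0xFFFF: each 4K step selects the next peripheral */
    return (enum select)(SEL_PERIPH1 + (addr - RAM_SIZE) / PERIPH_SIZE);
}

/* Offset of the address inside whichever block is selected. */
uint16_t block_offset(uint16_t addr)
{
    return (addr < RAM_SIZE) ? addr : (uint16_t)(addr & (PERIPH_SIZE - 1u));
}

/* The select gates the write: only the chosen block sees it. */
void bus_write(uint16_t addr, uint8_t data)
{
    enum select sel = decode(addr);
    if (sel == SEL_RAM) {
        ram[addr] = data;
    } else {
        unsigned idx = (unsigned)sel - (unsigned)SEL_PERIPH1;
        assert(idx < NUM_PERIPH);
        periph[idx][block_offset(addr)] = data;
    }
}

/* The select gates the read data: only the chosen block drives the bus. */
uint8_t bus_read(uint16_t addr)
{
    enum select sel = decode(addr);
    if (sel == SEL_RAM)
        return ram[addr];
    return periph[(unsigned)sel - (unsigned)SEL_PERIPH1][block_offset(addr)];
}
```

With this map, `decode(0xBFFF)` selects the RAM and `decode(0xC000)` selects peripheral 1. In the FPGA the same thing would be a comparison on the upper address bits feeding the write enables and a read-data mux.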
 

Offline kd5pev (Topic starter)

  • Newbie
  • Posts: 3
Re: FPGA Memory arbitration
« Reply #2 on: May 12, 2013, 01:18:42 am »
Thank you.
I think I was trying to make it more complex than necessary.
I tend to do that.

What I was thinking of doing was:
Code: [Select]

AVR <-> RAM <-> Black box <-> Peripheral 1
                          <-> Peripheral 2
                          <-> Peripheral 3
                          <-> Peripheral 4

What I got from your post was this:
Code: [Select]
AVR <-> Address decoder <-> RAM
                        <-> Peripheral 1
                        <-> Peripheral 2
                        <-> Peripheral 3
                        <-> Peripheral 4

Is that a correct interpretation?
If so, I think an address decoder would be much simpler to build than what I was originally going after.
 

Offline Rufus

  • Super Contributor
  • ***
  • Posts: 2095
Re: FPGA Memory arbitration
« Reply #3 on: May 12, 2013, 02:08:00 am »
AVR <-> Address decoder <-> RAM
                        <-> Peripheral 1
                        <-> Peripheral 2
                        <-> Peripheral 3
                        <-> Peripheral 4

Is that a correct interpretation?
If so, I think an address decoder would be much simpler to build than what I was originally going after.

In simplistic schematic terms yes. It is just the way microprocessors have always managed memory and peripherals on external busses.
 

Offline kd5pev (Topic starter)

  • Newbie
  • Posts: 3
Re: FPGA Memory arbitration
« Reply #4 on: May 12, 2013, 03:38:22 am »
In simplistic schematic terms yes. It is just the way microprocessors have always managed memory and peripherals on external busses.

Okay, I think I understand that now.

Now this is more of a hypothetical question at this point:
How do I handle two (or more) components that want access to the same RAM module?

Code: [Select]
Component A <-> |
                 <=> [ ?? ] <=> RAM
Component B <-> |
 

Offline marshallh

  • Supporter
  • ****
  • Posts: 1462
  • Country: us
    • retroactive
Re: FPGA Memory arbitration
« Reply #5 on: May 12, 2013, 04:57:55 am »
arbiter is the correct term. I did a design that had 4 separate ddr controllers, and 6 client modules that wanted to use them (only 2 had global access)

first thing is to have combinational muxes for the inputs to the actual ddr controller proper (this is just part of the arbiter)



here is the set of muxes for the outputs to the client module



and finally a very simple FSM for each controller that handles requests in a prioritized manner



all these are utilizing a block based approach. also you will notice there are synchronizers, each client was in a separate clock domain. CL device is at the top of the chain so it gets first priority, being the most timing critical.

This is probably more info than you wanted but there you go. You have X physical resource and you need to have a system for acknowledging requests (whether it's for 8 bits, a 128-bit word, or a block transfer) and then signaling completion and giving priority to some requests over the others.

Something this design lacks is pre-emption of transfers. In this case the granularity of transfers was small enough (512 bytes) that this could be overlooked. But if you all share a single RAM and you have something that MUST have data NOW, you can extend the FSM to save the transfer state and hang up the client while the most important one cleans up.
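
To make the request / grant / completion idea concrete, here is a minimal behavioural sketch in C of a fixed-priority arbiter without pre-emption. It is not the design described above: the client count, the names (`arbiter_tick`, `NO_GRANT`) and the single `done` signal are assumptions for illustration, and clock-domain synchronizers are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_CLIENTS 3      /* assumed count; client 0 has the highest priority */
#define NO_GRANT    (-1)

struct arbiter {
    int owner;             /* client currently granted, or NO_GRANT when idle */
};

void arbiter_init(struct arbiter *a)
{
    a->owner = NO_GRANT;
}

/* One clock tick. req[i] is client i's request line; done is asserted by
   the current owner when its transfer has finished. The return value is
   the client holding the grant after this tick, i.e. what would steer the
   muxes in front of the shared resource. */
int arbiter_tick(struct arbiter *a, const bool req[NUM_CLIENTS], bool done)
{
    assert(a->owner >= NO_GRANT && a->owner < NUM_CLIENTS);

    if (a->owner != NO_GRANT) {
        if (!done)
            return a->owner;   /* busy: no pre-emption, the owner keeps going */
        a->owner = NO_GRANT;   /* transfer complete, release the resource */
    }

    /* Idle: grant the highest-priority pending request, if there is one. */
    for (int i = 0; i < NUM_CLIENTS; i++) {
        if (req[i]) {
            a->owner = i;
            break;
        }
    }
    return a->owner;
}
```

Note that in this model a release and the next grant happen in the same tick; a real FSM may well spend a cycle in an idle state in between. Adding pre-emption would mean letting a higher-priority request interrupt the `!done` branch and remembering where the suspended transfer stopped.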
 

Offline free_electron

  • Super Contributor
  • ***
  • Posts: 8517
  • Country: us
    • SiliconValleyGarage
Re: FPGA Memory arbitration
« Reply #6 on: May 12, 2013, 02:06:38 pm »
You can do that perfectly fine. I use it all the time. Just instantiate dual port ram.

Port 1 goes to your processor. Port 2 goes to your peripherals. The drawback is that you need to make a 'trap', meaning a detector that sees that you just touched something in the top 4K and tells the peripherals there is new data.

I use dual (and sometimes 3) port memory as a passgate.

There are 2 blocks of 1 kbyte of DP RAM. The CPU has read/write on block 1 and read-only on block 2.
The PC (through a USB controller) has read/write on block 2 and read-only on block 1.

So, the PC can see what the CPU is doing, and the CPU can see what the PC is doing.
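
A small C model of that access rule, purely as a sketch: each side may write only its own block and may read both. The names are invented for the example, and the blocks are indexed from 0 here, so `block[0]` stands for "block 1" and `block[1]` for "block 2".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_SIZE 1024u

enum side { SIDE_CPU = 0, SIDE_PC = 1 };

/* block[0] is written by the CPU, block[1] by the PC; both sides read both. */
static uint8_t block[2][BLOCK_SIZE];

/* Returns true if the write was accepted. A write aimed at the other
   side's block is dropped, which models that port having no write enable. */
bool passgate_write(enum side who, unsigned blk, unsigned offset, uint8_t data)
{
    assert(blk < 2 && offset < BLOCK_SIZE);
    if (blk != (unsigned)who)
        return false;
    block[blk][offset] = data;
    return true;
}

/* Reads are allowed from either block, whoever is asking. */
uint8_t passgate_read(unsigned blk, unsigned offset)
{
    assert(blk < 2 && offset < BLOCK_SIZE);
    return block[blk][offset];
}
```

Because each location has exactly one writer, the two sides never fight over a byte, so no arbiter is needed for this arrangement.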

Let's say you want to make a 'coprocessor'

Byte 1 and byte 2 of block 2 are data in (PC can read/write, CPU can read).
Byte 1 and byte 2 of block 1 are data out (PC can read, CPU can read/write).
Byte 1024 of block 2 is the instruction (PC can read/write).
Byte 1024 of block 1 is the status (PC can read).

There is a 'trap' on both sides that detects a write to the top byte and raises an interrupt.

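So, the CPU code looks like this (DPRAM_BASE is a placeholder for wherever the dual-port RAM sits in the CPU's address space):

/* Shared locations, using the same offsets as above */
#define DPRAM   ((volatile unsigned char *)DPRAM_BASE)
#define d0      (DPRAM[1])
#define d1      (DPRAM[2])
#define opcode  (DPRAM[1024])
#define result  (*(volatile int *)&DPRAM[1025])

/* Handles int1, raised by the trap when the PC writes opcode */
void interrupthandler(void)
{
    switch (opcode) {
    case 0: result = d0 + d1; break;
    case 1: result = d0 - d1; break;
    case 2: result = d0 * d1; break;
    }
}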

And so on.
The PC writes data into d0, d1 and opcode. The trap detects the write to opcode and kicks the interrupt pin of the CPU. The CPU calculates in place. The writing of result triggers the return trap, kicking the interrupt to the PC.
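
The whole exchange can be tried out on a host with a short C simulation. This is only a sketch under several assumptions: offsets are 0-based (so the trapped "top byte" is offset 1023), the result is a single byte, the return trap fires on the status byte rather than on the result itself, and every name below is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_SIZE 1024u
#define TOP        (BLOCK_SIZE - 1u)    /* the trapped top byte of each block */

static uint8_t pc_block[BLOCK_SIZE];    /* "block 2": PC writes, CPU reads */
static uint8_t cpu_block[BLOCK_SIZE];   /* "block 1": CPU writes, PC reads */
static bool irq_to_cpu, irq_to_pc;      /* interrupt lines driven by the traps */

/* PC-side write port; the trap fires when the opcode byte is written. */
void pc_write(unsigned offset, uint8_t v)
{
    assert(offset < BLOCK_SIZE);
    pc_block[offset] = v;
    if (offset == TOP)
        irq_to_cpu = true;
}

/* CPU-side write port; the return trap fires when the status byte is written. */
void cpu_write(unsigned offset, uint8_t v)
{
    assert(offset < BLOCK_SIZE);
    cpu_block[offset] = v;
    if (offset == TOP)
        irq_to_pc = true;
}

/* CPU interrupt handler: read operands and opcode in place, write the
   result, then write the status byte to notify the PC. */
void cpu_interrupt_handler(void)
{
    uint8_t d0 = pc_block[0], d1 = pc_block[1], result = 0;

    irq_to_cpu = false;                 /* acknowledge the interrupt */
    switch (pc_block[TOP]) {
    case 0: result = (uint8_t)(d0 + d1); break;
    case 1: result = (uint8_t)(d0 - d1); break;
    case 2: result = (uint8_t)(d0 * d1); break;
    }
    cpu_write(0, result);
    cpu_write(TOP, 1);                  /* 1 is an arbitrary "done" marker */
}
```

A typical sequence is `pc_write(0, 6); pc_write(1, 7); pc_write(TOP, 2);`, after which `irq_to_cpu` is set; once `cpu_interrupt_handler()` has run, `irq_to_pc` is set and the product can be read from `cpu_block[0]`. Writing the operands before the opcode matters, because the trap fires on the opcode write.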

All data is written directly in place. No need for I/O routines, no need for moving data, no need for printf, scanf or any other time-consuming stuff. It's even faster than DMA (during DMA most CPU cores are stalled as the DMA controller blocks the bus to do the transport; some DMA controllers have segmented busses, but those are an exception in small microcontroller land).

I made a piece of software where I define incoming and outgoing data, and it produces the header files for the CPU and PC code with the correct mapping and variable names.

Works like a charm. I do this 3-way: PC, ARM, FPGA logic.
 

