Author Topic: Selection and general know how of building an Arcade ROM board emulator (FPGA)  (Read 2491 times)

Offline mclarksonTopic starter

  • Contributor
  • Posts: 10
  • Country: ca
I've spent the last month analyzing and dissecting the Sega System 32 arcade ROM boards.  These come in 3 flavors, and so far I've traced out every connection on the first 2 ROM boards released (I'm currently waiting on the 3rd one in the mail, but they all seem to be very similar).

A picture of a board with the busses outlined is attached, as well as an export of a schematic I've generated in EasyEDA, which is a combination of the two boards (they are all very similar, with identical ICs and busses between them).  Obviously I don't care much about the commercial aspect here, this is a hobby to me.  :)

The big-picture idea is that I would like to make a multi-purpose, reusable FPGA board (FPGA & memory), plus a separate larger board that would sit between the Sega motherboard and the FPGA board. That larger board would do all voltage conversions (5v to 3.3v and vice versa) and contain a controlling micro with an SD card for loading the FPGA's associated memory.
If I can do this, I would like to apply the same FPGA board to all previous Sega ROM board based systems (System 24, 18 and 16 variants).  This would save on complexity and cost, since anyone would just need a single FPGA board, plus the respective IO and micro board for each system that they own.

Given my inexperience, my question is this: what kind of FPGA should I select?  Here is what I am dealing with:
  • About 160 IO pins
  • 4 separate busses: two 16-bit output busses and two 8-bit output busses (see attached image)
  • I still need to determine the speed / access times of each of the busses, but from the specs on the system, the graphics card runs at 50 MHz, while the rest of the system runs at 16 MHz (CPU) and 8 MHz (Sound).  The factory ROMs have an access time of 45ns based on https://segaretro.org/Sega_System_32
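
As a rough sanity check on those numbers, the clock periods can be compared with the 45ns ROM access time. This is only a sketch: how many clocks each bus actually allows per read has not been measured yet, and the function and variable names are made up for illustration.

```python
import math

ROM_ACCESS_NS = 45.0  # factory ROM access time quoted above

# Clock rates quoted above; the real per-read timing of each bus is still unknown.
CLOCKS_MHZ = {"graphics": 50.0, "cpu": 16.0, "sound": 8.0}


def clock_period_ns(freq_mhz: float) -> float:
    """Return the period of a clock in nanoseconds."""
    return 1000.0 / freq_mhz


for name, mhz in CLOCKS_MHZ.items():
    period = clock_period_ns(mhz)
    # Whole clock periods needed to cover one 45 ns ROM access.
    clocks_needed = math.ceil(ROM_ACCESS_NS / period)
    print(f"{name}: {period:.1f} ns period, one ROM access spans {clocks_needed} clock(s)")
```

If these figures hold, a 45ns ROM cannot answer within a single 20ns graphics clock, so the graphics bus presumably spreads each read over several clocks. That is worth confirming with a logic analyzer before choosing memory.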

My initial thoughts on this is that this could be done one of many ways
  • I could use a Trion FPGA with ample IO pins, like a T120 with 256 pins, connect 3 SDRAM chips to its IO, and hope I can stay within the total GPIO pin count of the T120 and serve data out this way (using a single SDRAM chip for the slower CPU and sound busses)
  • Using the same T120 with DDR this time, I could possibly serve out all 4 busses with a single DDR chip and the built-in DDR controller.  However, I don't know how this would work. Is it possible to read 4 independent busses' worth of data within a single clock cycle and serve data out to all 4 busses?  I don't need to do any data bursting or anything from the memory
  • Using a couple of Trion T20s (cheaper ones this time), I would split things up into two busses per T20 and add multiple SDRAM chips (1 per bus) to serve data to their respective busses.  This is possibly the simplest solution, but it is more difficult to manage as it's now 2 FPGAs instead of 1.

Based on others experience, is this doable?  Is there one approach that should supersede any other?  Am I completely out to lunch?

Thanks in advance if anyone can share their thoughts and experience on this, and steer me in the proper direction.
 

Offline CountChocula

  • Supporter
  • ****
  • Posts: 208
  • Country: ca
  • I break things—sometimes on purpose.
Howdy! Other things you should probably consider are whether the FPGA you choose supports the complexity of your design (e.g.: number of gates, memory access, custom IP, etc.), the availability, and how easy it is to develop for and program (as this can add both $$$ and aggravation to the cost of the chip itself, which I'm assuming you've already considered). If you need multiple clocks and they run at frequencies that are not multiples of each other, you will also need an FPGA capable of supporting them. Some FPGAs have 5V-tolerant IOs, which might save you having to deal with level shifting.

My suggestion would be to start by spelling out all your requirements, and identifying those things you don't yet know (e.g.: how many gates you will need, or whether you will need any kind of custom IP), and then look for a possible candidate by doing a parametric search at one of the large distributors. Rather than going directly to building a custom PCB, I would start by buying an evaluation board (the availability of which could also well be one of your req's) and use that as a starting point. The good news is that, as long as you don't use any custom IP, the code tends to be pretty portable, so it's probably not the end of the world if your choice ends up being incorrect for one reason or another.

IMO, breaking things down into multiple FPGAs is risky. It can be done, and it's probably easier if you isolate whole functions, but it will likely also increase cost and complexity.

Oh—and don't forget to look for existing solutions you can borrow from. There are lots of emulators out there, so someone may have already solved your problem, at least in part.


—CC
Lab is where your DMM is.
 

Offline mclarksonTopic starter

  • Contributor
  • Posts: 10
  • Country: ca
Thanks for the quick reply.

As I'm new to this, I don't know the number of required gates yet. Since the function of this board is simply accessing memory, I would imagine my gate requirement is going to be minimal and almost any FPGA would fit the bill. There are some flip-flops (4 ICs total), latches (4 ICs total again) and a multiplexer (1) that I need to put into the FPGA, but memory access is the primary function.

Breaking it down into separate FPGAs isn't bad in this particular situation, as all busses are completely independent; it is purely loading the memory that would add more complexity here.  I didn't think much about this before, but I am actually now starting to lean this way.  A couple of the cheaper Trion T20s actually cost less than the T120 I was looking at.

I suppose the thing I am struggling with the most is memory access.  Is it possible to request 4 individual and independent 16-bit calls from a single sdram or ddr2l chip in a single clock cycle?

Oh yeah, and I haven't had much luck sourcing 5v tolerant boards, I think these are a thing of the past, especially with the number of IO that I need.  Also, I was going to get a t20 Dev board and play with that (not sure why I'm so sold on Trion  :), probably just the cost)

Thanks again
« Last Edit: September 23, 2023, 08:52:20 pm by mclarkson »
 

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8089
  • Country: ca
Unless you can both tell when the host is performing a read and insert a wait state when you need one, you may never be able to replace a static ROM board with your own DRAM controller.  The moment you need to perform a refresh cycle, which can come up at any time and take more than 100ns, the system may fail if it requests a read at the same moment.  I know even a bottom-end DDR device can be read very quickly, but jumping between address rows and banks, closing old banks, and that occasional refresh will break your system.  Older, simpler, slower systems had knowledge of the CPU bus cycle and reserved a refresh slot every 'X' bus phases or cycles.
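
To put rough numbers on the refresh problem, here is a small back-of-the-envelope sketch. The 45ns deadline and the 100ns refresh come from this thread; the 30ns DRAM read latency and the 7.8us refresh interval are assumptions picked for illustration, so check the datasheet of whatever part is actually chosen.

```python
ROM_DEADLINE_NS = 45.0        # factory ROM access time quoted earlier in the thread
REFRESH_NS = 100.0            # lower bound taken from "more than 100ns" above
DRAM_READ_NS = 30.0           # ASSUMPTION: illustrative random-read latency
REFRESH_INTERVAL_NS = 7800.0  # ASSUMPTION: one refresh roughly every 7.8 us


def worst_case_read_ns(read_ns: float, refresh_ns: float) -> float:
    """Latency when a read arrives just as a refresh cycle begins."""
    return refresh_ns + read_ns


def refresh_duty(refresh_ns: float, interval_ns: float) -> float:
    """Fraction of time the DRAM is busy refreshing."""
    return refresh_ns / interval_ns


worst = worst_case_read_ns(DRAM_READ_NS, REFRESH_NS)
print(f"worst-case read: {worst:.0f} ns (deadline {ROM_DEADLINE_NS:.0f} ns)")
print(f"deadline met: {worst <= ROM_DEADLINE_NS}")
print(f"time spent refreshing: {refresh_duty(REFRESH_NS, REFRESH_INTERVAL_NS):.2%}")
```

Under these assumptions the DRAM is refreshing only a small fraction of the time, but a ROM socket has no way to tell the host to wait, so a single collision is enough to return bad data.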

Using one very large static EPROM or parallel flash device on a modified existing board, and just setting the top address bits, may be the easiest way for you to continue.

I see the sound uses 8-megabit devices while the other sections use 1x16 and 2x8x1 and 2x8x2 (16 bits), 4 of each.  This is a lot of ROM memory.

Super large parallel flash memories do exist, but they are all 3.3v.  All you need to do is use 'LVT245's (powered at 4.5v) on the data bus out, using the OE on the 245 with the correct logic, to convert the 3.3v data to 5v.  All the address lines can be fed to the flash through a resistor divider or 'LVC/LVT245's (powered at 3.3v) in the other direction.  You will also need a 5v-to-3.3v regulator onboard.
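
For the resistor divider option on the address lines, the arithmetic is simple enough to sketch. The resistor values below are illustrative assumptions, not a recommendation; real values would have to be chosen against the input capacitance and the edge rates the flash needs.

```python
def divider_out_v(vin: float, r_top_ohm: float, r_bottom_ohm: float) -> float:
    """Unloaded output voltage of a two-resistor divider."""
    return vin * r_bottom_ohm / (r_top_ohm + r_bottom_ohm)


# ASSUMPTION: 1k over 2k, picked only because it lands close to 3.3 V.
vout = divider_out_v(5.0, 1000.0, 2000.0)
print(f"5 V address line divided down to {vout:.2f} V")
```

A divider slows the edges because its resistance forms an RC with the input capacitance, so on the faster busses the 'LVC/LVT245 route is probably the safer of the two.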

For example, 4 of your 1mx8 sound eproms could be replaced by a single 4mx8 eeprom.  For 4 different games, you will need a 16mx8 eeprom.  A single $6 device at digikey would fill all your sound needs, though they have a 70ns access time.  4 of them can populate your entire rom board.

Something like this: https://www.digikey.com/en/products/detail/issi-integrated-silicon-solution-inc/IS29GL128-70SLEB/11568727
  (Note they have 8/16bit data IO port versions.)
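
The "set the top address bits" idea for the sound ROMs can be sketched as an address mapping. The layout below (game select in the top two bits, ROM select in the next two) is an assumption for illustration; the real order depends on how the board's chip selects get decoded.

```python
ROM_SIZE = 1 << 20   # one original 1M x 8 sound EPROM
ROMS_PER_GAME = 4    # four sound EPROMs per game, as described above
GAMES = 4            # four games in one 16M x 8 device


def flash_address(game: int, rom: int, offset: int) -> int:
    """Map (game, ROM socket, offset within ROM) to a flat 16M x 8 flash address."""
    if not 0 <= game < GAMES:
        raise ValueError("game out of range")
    if not 0 <= rom < ROMS_PER_GAME:
        raise ValueError("rom out of range")
    if not 0 <= offset < ROM_SIZE:
        raise ValueError("offset out of range")
    # Hypothetical layout: [game:2 bits][rom:2 bits][offset:20 bits]
    return (game * ROMS_PER_GAME + rom) * ROM_SIZE + offset


print(hex(flash_address(0, 0, 0)))             # first byte of game 0, ROM 0
print(hex(flash_address(1, 0, 0)))             # start of game 1
print(hex(flash_address(3, 3, ROM_SIZE - 1)))  # last byte of the device
```

With this layout the game select bits would be held static by jumpers or a small latch, and the motherboard only ever drives the lower 22 address bits.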

One of those for sound, and half/quarter sized for the other side.
Getting an FPGA to emulate a rom is not a problem, but if the source is DRAM, you will need to add random wait states to the reading device's CPU to wait for a read.  The other problem is the programming: make sure you have some way to program them.
« Last Edit: September 23, 2023, 11:38:53 pm by BrianHG »
 
The following users thanked this post: Someone

Offline mclarksonTopic starter

  • Contributor
  • Posts: 10
  • Country: ca
Thank you BrianHG!

Super good information, and your solution is consistent with similar solutions that have been made for other systems.  I suppose I was just thinking "Why can't I do it this way?", and well, that's probably why (refreshes).

For some reason, I really like the idea of throwing an FPGA in as a mediator (maybe so I can do the flip-flops, latches and multiplexer in there) to make it generic and reusable for other systems, but ultimately I am going to have to determine whether saving 9 ICs is worth all the added complexity and cost of the FPGA.  One way or another I will have to do voltage conversion, so it won't remove the need for a ton of 245's.

Thanks again, I've got a lot of researching to do by the looks of it!
 

