GOWIN Semi FPGA - BRAM IP usage

SiliconWizard:
I don't know how many LUTs a Picoblaze would take. But from what I've seen, it's a pretty limited architecture that is terribly inefficient for most programming languages, unless you write assembly directly or use Forth.

As an alternative, I can suggest the LatticeMico8 : https://www.latticesemi.com/Products/DesignSoftwareAndIP/IntellectualProperty/IPCore/IPCores02/Mico8.aspx
It takes as few as ~200 LUTs on a MachXO2 (LUT4), so that should fit here. And IMO it's more usable than the Picoblaze if for instance you want to use C.
Source code is available (Verilog), along with a version of GCC that supports it as a target. The fact that it's in Verilog is no problem: you can absolutely interface it with VHDL code. I have never used it on FPGAs other than Lattice's, though, so I don't know how much effort "porting it" would take (if there is any significant porting to do).

These days, I would of course also have suggested a RISC-V core, but even the small PicoRV32 in the smallest configuration will take more than 400 LUTs.

Anyway, 400 LUTs - that's really not much. Even if you can fit a soft core (and the LatticeMico8 would), you'll need additional resources to interface it with the rest of your design, and maybe a couple of peripherals so it can do anything useful, so that'll be kind of hard to fit.

nctnico:

--- Quote from: SiliconWizard on December 01, 2021, 07:28:56 pm ---I don't know how many LUTs a Picoblaze would take. But from what I've seen, it's a pretty limited architecture that is terribly inefficient for most programming languages, unless you write assembly directly or use Forth.

--- End quote ---
Don't even bother using anything other than assembler on the Picoblaze! Part of the fun when using the Picoblaze is deciding whether to implement something in the Picoblaze software or in the FPGA. At some point I had to do something with samples using u-law compression. Instead of decompressing in software, I created a u-law compressor / decompressor in the FPGA, and all the Picoblaze had to do was move the data around.
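
For reference, the expansion half of such a block is only a few shifts and adds, which is why it maps so easily onto FPGA logic. A minimal Python sketch, assuming the standard G.711 u-law format (the post doesn't say which variant was used); the name `ulaw_decode` is made up for illustration:

```python
BIAS = 0x84  # G.711 u-law bias (132)

def ulaw_decode(code: int) -> int:
    """Expand one 8-bit G.711 u-law code to a signed linear sample."""
    u = ~code & 0xFF             # u-law bytes are transmitted inverted
    exponent = (u >> 4) & 0x07   # 3-bit segment number
    mantissa = u & 0x0F          # 4-bit step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if u & 0x80 else magnitude

# Expand a short buffer of codes, the kind of job the hardware block took over.
samples = [ulaw_decode(c) for c in (0xFF, 0x80, 0x00)]
```

In hardware the same thing is a small barrel shifter plus an adder, so the CPU never has to touch the arithmetic.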

Picoblaze needs only a handful of LUTs. That is the whole purpose of it. IIRC it was tailored specifically to fit the LUTs in Xilinx Virtex2 / Spartan devices. The whole thing was coded by instantiating device-specific primitives. The Gowin FPGA the OP uses is really, really small, and with what is left over it will be hard to fit a core with more features.

c64:
Another option, if you have some extra time: you can make your own CPU. You don't need many instructions, and it can easily fit in 100-200 LUTs. Registers can be stored in memory to save LUTs, like on the MC6800. But you can only code in assembly.
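
To make the "registers in memory" idea concrete, here is a small Python model of such a CPU. It is only a sketch under assumptions: the instruction set (`LDA`, `STA`, `ADD`, `JNZ`, `HLT`) and the convention that R0..R2 live at RAM addresses 0..2 are invented for illustration and do not describe any existing core. The only real flip-flop state is the accumulator and the program counter; everything else sits in block RAM.

```python
def run(program, ram):
    """Run a list of (opcode, operand) pairs against a bytearray RAM."""
    acc = 0  # the only data register held in logic
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "LDA":      # load a memory-resident register
            acc = ram[arg]
        elif op == "STA":    # store back to a memory-resident register
            ram[arg] = acc
        elif op == "ADD":    # 8-bit add with wraparound
            acc = (acc + ram[arg]) & 0xFF
        elif op == "JNZ":    # jump if accumulator is non-zero
            if acc != 0:
                pc = arg
        elif op == "HLT":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return acc

R0, R1, R2 = 0, 1, 2          # hypothetical "registers" at RAM addresses 0..2
ram = bytearray(16)
ram[R0] = 5                   # loop counter
ram[R2] = 0xFF                # constant -1 (two's complement)

# Sum 5 + 4 + 3 + 2 + 1 into R1.
program = [
    ("LDA", R1), ("ADD", R0), ("STA", R1),   # R1 += R0
    ("LDA", R0), ("ADD", R2), ("STA", R0),   # R0 -= 1
    ("JNZ", 0),
    ("HLT", 0),
]
run(program, ram)
```

The trade-off is speed: every register access costs a memory cycle, which is usually acceptable for a small control-oriented core.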

up8051:
After changes to the main program, I have about 650 free LUTs.

I would rather not use Verilog because I don't know it, and making corrections could get confusing.
I will try to use Mico8 (because there are sources in VHDL) or pauloBlaze.

Gowin has IP for picoRV but only for larger FPGAs.

gnuarm:
I have worked with my own soft core CPUs and looked at others. I prefer the MISC style processor (Minimal Instruction Set Computer), which is pretty much a simple stack machine (also called NOSC for No Operand Set Computer). I also program in assembly/Forth, since they are very similar on a stack machine and you don't need fancy debugging tools.
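
A rough idea of why a stack machine is so simple and why its assembly reads like Forth: instructions carry no operands, they just work on the top of the stack. The Python sketch below is an illustration only; the opcode names and the 16-bit word size are assumptions and do not correspond to any particular core.

```python
MASK = 0xFFFF  # assumed 16-bit data path

def run(program):
    """Execute a list of opcodes; 'lit' is followed by its literal value."""
    stack = []
    pc = 0
    while pc < len(program):
        op = program[pc]
        pc += 1
        if op == "lit":                 # push the next program word
            stack.append(program[pc] & MASK)
            pc += 1
        elif op == "add":               # ( a b -- a+b )
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK)
        elif op == "dup":               # ( a -- a a )
            stack.append(stack[-1])
        elif op == "swap":              # ( a b -- b a )
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "drop":              # ( a -- )
            stack.pop()
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack

# Forth-style "2 3 + dup +"
result = run(["lit", 2, "lit", 3, "add", "dup", "add"])
```

Because no instruction needs register fields or addressing modes, the decoder stays tiny, which is where the LUT savings come from.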

Here is a generic reference to a wide variety of soft core CPUs from the most simple to the most complex.  If you spend some time looking at this data (it's a LOT btw) you will find some things that blow your mind. 

https://opencores.org/projects/up_core_list/summary

https://opencores.org/projects/up_core_list/downloads

https://opencores.org/usercontent/doc/1523749899

There is a MicroBlaze that only uses 260 6LUTs in a kintex-7-3. Yeah, the list doesn't try to normalize the reporting across devices; it just tells you what type of LUTs were used: 4-input, 6-input, or "A", which I assume means Altera.

Even more interesting is the 320 6LUT RISC-V on a virtex-u-2 by Jan Gray that gets 1171.9 KIPS per LUT! It is 1 clock per instruction and runs at up to 375 MHz. Ohhh... it's proprietary, only in the list for reference... :(
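
That KIPS-per-LUT figure follows directly from the other numbers quoted (375 MHz, 1 clock per instruction, 320 LUTs). A quick Python check; the helper name `kips_per_lut` is made up here, and the same formula can be used to compare other entries in the list:

```python
def kips_per_lut(f_mhz: float, clocks_per_instruction: float, luts: int) -> float:
    """Thousands of instructions per second delivered per LUT."""
    instructions_per_second = f_mhz * 1e6 / clocks_per_instruction
    return instructions_per_second / 1e3 / luts

# 375 MHz at 1 clock per instruction in 320 LUTs
figure = kips_per_lut(375, 1, 320)  # 1171.875, i.e. the 1171.9 quoted above
```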

If you are interested in the stack processors (labeled "Forth" in the tables), check out the J1 and J1a from James Bowman. It is a simple design, 1 clock per instruction, up to 400 MHz. It is in Verilog, but I believe it has been translated to VHDL, and the translation is not hard even if you do it yourself. Rather than worry about language issues, just look at what registers and logic the Verilog is describing and write that in VHDL. It is most likely pretty straightforward code.

Anyway, lots and lots of data to analyze there.  The guy's name is Jim Brakefield.  He must really love this stuff to do all this work.  He's been keeping this list for at least 10 years, probably longer. 
