Author Topic: softcore for learning purposes  (Read 7453 times)

Offline prophoss (Topic starter)

  • Contributor
  • Posts: 46
  • Country: us
softcore for learning purposes
« on: October 09, 2020, 03:17:52 pm »
Hello all, I am in need of some advice. I am currently a student working toward a computer engineering degree. I have a dev board with an Altera Cyclone IV on it. I have done some blinking of LEDs and used the seven-segment display. I would like to begin to build a soft core of some kind. The kind of softcore doesn't matter as much as the doing. I am just trying to build up my chops with the hope of maybe doing this full-time one day.

Where to begin is the first problem, I guess. What I would like to ask of you fine people is two things: what kind of softcore to build, and how to put together a list of the pieces needed to build it. My assumption is that I will need to build several different "parts" that, when put together, will become the softcore. Could y'all give some good suggestions along these lines? I look forward to reading your responses.
Thanks, Brian
 

Offline chris_leyson

  • Super Contributor
  • ***
  • Posts: 1544
  • Country: wales
Re: softcore for learning purposes
« Reply #1 on: October 09, 2020, 03:52:53 pm »
Hi Brian, if you had a Xilinx dev board then I would say Ken Chapman's 8-bit Picoblaze microcontroller. I learnt a lot about the underlying Spartan-3 FPGA fabric by looking at the Picoblaze design and, just for fun, drew it up as a schematic. It was all designed using primitive logic functions so that no matter what version of the ISE synthesizer you used you would always get the same result. It's also probably one of the smallest, if not the smallest, 8-bit softcore processors. It's a straightforward fetch/execute architecture but it's a good starting point.
 
The following users thanked this post: NivagSwerdna

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9925
  • Country: us
Re: softcore for learning purposes
« Reply #2 on: October 09, 2020, 10:15:49 pm »
I think the LC3 project is the way to go.  It's a 16 bit RISC architecture with only word addressing.  Hold off on the LC3b version...

There is a book: "Introduction to Computing Systems from bits and gates to C and beyond" by Patt & Patel

https://www.amazon.com/Introduction-Computing-Systems-Gates-Beyond/dp/0070595003

What you really need to do is Google for the appendices:

https://www.cs.colostate.edu/~cs270/.Fall19/resources/PattPatelAppA.pdf
http://people.cs.georgetown.edu/~squier/Teaching/HardwareFundamentals/LC3-trunk/docs/LC3-uArch-PPappendC.pdf

Somewhere, you will find a form for creating the microcode if you choose that approach.  I didn't, I just coded a state machine from the state diagram.  I have often thought of recreating the project using microcode.

A lot of universities use this project so there is info all over the Internet.

One hint:  When you implement the registers, build 2 identical sets.  A register write goes to both sets, which lets you read two different registers at the same time, one from each set.  Take R0 = R0+R1: you read both registers, send them to the ALU and then write the R0 register of both banks.  All in one clock and no intermediate storage.
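
A rough software model of that register trick, written in C rather than VHDL. It is only a sketch: the names regfile_t, regfile_write and regfile_add are made up for illustration and are not taken from the actual project.

```c
#include <stdint.h>

#define NUM_REGS 8  /* LC-3 has eight general-purpose registers, R0..R7 */

/* Two identical banks; each one stands in for a memory with a single
 * read port. (Hypothetical model, not the project's VHDL.) */
typedef struct {
    uint16_t bank_a[NUM_REGS];  /* read port for the first ALU operand  */
    uint16_t bank_b[NUM_REGS];  /* read port for the second ALU operand */
} regfile_t;

/* A write always updates both banks, so they never diverge. */
void regfile_write(regfile_t *rf, unsigned dr, uint16_t value)
{
    rf->bank_a[dr % NUM_REGS] = value;
    rf->bank_b[dr % NUM_REGS] = value;
}

/* One "clock": read two registers (one per bank), add them,
 * and write the result back to both banks. */
void regfile_add(regfile_t *rf, unsigned dr, unsigned sr1, unsigned sr2)
{
    uint16_t op1 = rf->bank_a[sr1 % NUM_REGS];
    uint16_t op2 = rf->bank_b[sr2 % NUM_REGS];
    regfile_write(rf, dr, (uint16_t)(op1 + op2));
}
```

With R0 = 5 and R1 = 7, calling regfile_add(&rf, 0, 0, 1) leaves 12 in R0 of both banks. In hardware the two reads really do happen in parallel; the C version only mirrors the data flow.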

As I said, I built mine with a big state machine with state numbers matching the diagram.  In the state diagram, it describes exactly which signals need to be asserted and those signals are carried over into the hardware block diagram.
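
To give a feel for the "big state machine" approach, here is a tiny C model covering only instruction fetch and the ADD instruction. Treat it as a sketch under assumptions: the state numbers (18, 33, 35, 32, 1) are as I recall them from the Patt & Patel state diagram and should be checked against Appendix C, condition codes and interrupts are left out, and lc3_t, lc3_clock, MEM_WORDS and S_STOP are invented names.

```c
#include <stdint.h>

#define MEM_WORDS 256  /* hypothetical small memory; real LC-3 has 64K words */

/* State numbers as recalled from the textbook diagram (verify before use).
 * S_STOP is not in the diagram; it parks the model on unhandled opcodes. */
enum { S_ADD = 1, S_FETCH_MAR = 18, S_DECODE = 32,
       S_FETCH_MDR = 33, S_FETCH_IR = 35, S_STOP = -1 };

typedef struct {
    uint16_t mem[MEM_WORDS];
    uint16_t reg[8];
    uint16_t pc, ir, mar, mdr;
    int state;
} lc3_t;

/* Advance the model by one clock. */
void lc3_clock(lc3_t *c)
{
    switch (c->state) {
    case S_FETCH_MAR:                      /* MAR <- PC, PC <- PC + 1 */
        c->mar = c->pc;
        c->pc = (uint16_t)(c->pc + 1);
        c->state = S_FETCH_MDR;
        break;
    case S_FETCH_MDR:                      /* MDR <- M[MAR] */
        c->mdr = c->mem[c->mar % MEM_WORDS];
        c->state = S_FETCH_IR;
        break;
    case S_FETCH_IR:                       /* IR <- MDR */
        c->ir = c->mdr;
        c->state = S_DECODE;
        break;
    case S_DECODE:                         /* branch on IR[15:12] */
        c->state = ((c->ir >> 12) == 0x1) ? S_ADD : S_STOP;
        break;
    case S_ADD: {                          /* DR <- SR1 + (SR2 or imm5) */
        unsigned dr  = (c->ir >> 9) & 7;
        unsigned sr1 = (c->ir >> 6) & 7;
        uint16_t op2;
        if (c->ir & 0x20) {                /* bit 5 set: immediate mode */
            op2 = c->ir & 0x1F;
            if (op2 & 0x10) op2 |= 0xFFE0; /* sign-extend the 5-bit field */
        } else {
            op2 = c->reg[c->ir & 7];
        }
        c->reg[dr] = (uint16_t)(c->reg[sr1] + op2);
        c->state = S_FETCH_MAR;
        break;
    }
    default:                               /* S_STOP: stay put */
        break;
    }
}
```

Starting in state 18 with 0x1001 (ADD R0, R0, R1) at address 0, five calls to lc3_clock execute one instruction and return to state 18. Each case maps onto one bubble of the state diagram, which is what makes this style easy to check against the book.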

You can find the assembler with just a little effort.  Here's one:
https://github.com/davedennis/LC3-Assembler

I broke my project into a dozen files as shown in the attachment.  The LC3 file is my top level, everything else is just another entity in a different file.  Notice there are two instances of Registers.vhd - as I mentioned above.  The SingleShot.vhd is out of scope, it was used to single step the clock during debugging.

The project is a little over 1700 lines of code, much of which is boilerplate and whitespace.  The main file with the state machine is just about 1000 of those lines of code.  It would be cut substantially when using microcode - probably by at least half.

I used a more substantial board and some of the code is related to displaying various registers on the 7 segment display.  The DisplayMux and Display are certainly optional.

I used the Digilent A7 100T board:

https://store.digilentinc.com/nexys-a7-fpga-trainer-board-recommended-for-ece-curriculum/

It is truly overkill, I'm using less than 1% of the logic and less than 25% of the BlockRAM.  But I like having switches, LEDs, 7 segment displays and pushbuttons.


« Last Edit: October 09, 2020, 10:32:03 pm by rstofer »
 

Offline cruff

  • Regular Contributor
  • *
  • Posts: 75
  • Country: us
Re: softcore for learning purposes
« Reply #3 on: October 09, 2020, 11:42:07 pm »
If your focus isn't also developing all of the software tools to go with it, be sure to pick something with a preexisting development tool chain.
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2777
  • Country: ca
Re: softcore for learning purposes
« Reply #4 on: October 10, 2020, 12:03:17 am »
I'd suggest you forget about all this 8 and 16-bit old rubbish, and instead look at RISC-V. The RV32I instruction set contains only 40 instructions, a very simple multi-cycle implementation of it can be done in a day or two, and there are a ton of open source cores of various complexity and functionality. On top of that, there is a readily-available C/C++ gcc toolchain for the ISA, so you don't have to muck around with assembler for long if you don't want to.

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4401
  • Country: nz
Re: softcore for learning purposes
« Reply #5 on: October 10, 2020, 12:34:35 am »
I think the LC3 project is the way to go.  It's a 16 bit RISC architecture with only word addressing.  Hold off on the LC3b version...

I just had a look at LC3, which I hadn't previously come across.

It's a perfectly reasonable (if limited) RISC architecture, with the exception of LDI and STI -- load and store indirect which require a single instruction to access memory twice. This is not RISCy because it complicates the implementation. Ugh. Maybe they did it for pedagogical reasons. But ugh.

Arithmetic is limited to ADD, AND, and NOT. That's logically sufficient, but it means subtraction needs three instructions (a + ~b + 1), OR needs four (~(~a & ~b)), and XOR needs .. uh .. seven minimum I think (~(~a & ~b) & ~(a & b)). It would be better off with just ADD and NAND as primitives (AND, OR, XOR need 2, 3, 5 operations respectively). Or just NOR. Or just BIC.
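
The instruction counts above can be checked with a small C sketch that builds subtract, OR and XOR out of nothing but the three LC-3 ALU operations. The function names are made up for illustration.

```c
#include <stdint.h>

/* The only three ALU operations LC-3 provides. */
uint16_t lc3_add(uint16_t a, uint16_t b) { return (uint16_t)(a + b); }
uint16_t lc3_and(uint16_t a, uint16_t b) { return (uint16_t)(a & b); }
uint16_t lc3_not(uint16_t a)             { return (uint16_t)~a; }

/* a - b = a + ~b + 1: three operations (NOT, ADD, ADD). */
uint16_t lc3_sub(uint16_t a, uint16_t b)
{
    return lc3_add(lc3_add(a, lc3_not(b)), 1);
}

/* a | b = ~(~a & ~b): four operations (NOT, NOT, AND, NOT). */
uint16_t lc3_or(uint16_t a, uint16_t b)
{
    return lc3_not(lc3_and(lc3_not(a), lc3_not(b)));
}

/* a ^ b = (a | b) & ~(a & b): the four for OR plus AND, NOT, AND = seven. */
uint16_t lc3_xor(uint16_t a, uint16_t b)
{
    return lc3_and(lc3_or(a, b), lc3_not(lc3_and(a, b)));
}
```

For example, lc3_sub(10, 3) gives 7 and lc3_xor(0x0FF0, 0x00FF) gives 0x0F0F. Each helper call corresponds to one LC-3 instruction, so counting the calls gives the instruction count.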

The NOT encoding is strange, with the usual 6 bit immediate field required to be all 1s, where other instructions (JMP, JSRR) that don't use it require it to be all 0s. It's almost as if NOT is really an XORI instruction. Which would be very handy.

Weird that JMP and JSRR effectively have an immediate field to be added to the register, but require it to be 0. You could implement them using the standard reg+offset calculation to save logic, but you'd have to have the decoder check the offset is 0. Why not just allow non-zero?


All in all, I can't see why today you wouldn't implement RV32I or RV32E instead, maybe leaving out the byte and halfword loads and stores.

RV32I is about as simple to decode, has barely any more instructions with just four basic formats (ALU, ALU-IMM/LOAD/JALR, LUI/AUIPC/JAL, STORE/Bcc) with the extra instructions made up simply of what operation is enabled in the ALU.
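
As a rough illustration of how little there is to the decode, here is a C sketch of the field and immediate extraction for those formats (R, I, S/B, U/J in the spec's naming). The function and type names are invented for this example; the bit positions follow Chapter 2 of the RISC-V unprivileged spec as I understand it, so check them against the spec before relying on them.

```c
#include <stdint.h>

/* Fixed-position fields shared by all formats. */
typedef struct {
    uint32_t opcode, rd, funct3, rs1, rs2, funct7;
} rv32_fields_t;

/* Sign-extend the low `bits` bits of value (value must be masked already). */
static int32_t sign_extend(uint32_t value, unsigned bits)
{
    uint32_t sign = 1u << (bits - 1);
    return (int32_t)((value ^ sign) - sign);
}

rv32_fields_t rv32_fields(uint32_t insn)
{
    rv32_fields_t f;
    f.opcode = insn & 0x7F;
    f.rd     = (insn >> 7) & 0x1F;
    f.funct3 = (insn >> 12) & 0x7;
    f.rs1    = (insn >> 15) & 0x1F;
    f.rs2    = (insn >> 20) & 0x1F;
    f.funct7 = insn >> 25;
    return f;
}

/* ALU-IMM / LOAD / JALR: imm[11:0] = insn[31:20] */
int32_t rv32_imm_i(uint32_t insn) { return sign_extend(insn >> 20, 12); }

/* STORE: imm[11:5] = insn[31:25], imm[4:0] = insn[11:7] */
int32_t rv32_imm_s(uint32_t insn)
{
    return sign_extend(((insn >> 25) << 5) | ((insn >> 7) & 0x1F), 12);
}

/* Bcc: same bits as STORE, shuffled so the offset is a multiple of 2. */
int32_t rv32_imm_b(uint32_t insn)
{
    uint32_t imm = (((insn >> 31) & 0x1) << 12) | (((insn >> 7) & 0x1) << 11)
                 | (((insn >> 25) & 0x3F) << 5) | (((insn >> 8) & 0xF) << 1);
    return sign_extend(imm, 13);
}

/* LUI / AUIPC: imm[31:12] = insn[31:12], low 12 bits zero. */
int32_t rv32_imm_u(uint32_t insn) { return (int32_t)(insn & 0xFFFFF000u); }

/* JAL: imm[20|10:1|11|19:12] = insn[31|30:21|20|19:12] */
int32_t rv32_imm_j(uint32_t insn)
{
    uint32_t imm = (((insn >> 31) & 0x1) << 20) | (((insn >> 12) & 0xFF) << 12)
                 | (((insn >> 20) & 0x1) << 11) | (((insn >> 21) & 0x3FF) << 1);
    return sign_extend(imm, 21);
}
```

For instance, 0x00500093 (addi x1, x0, 5) decodes to opcode 0x13, rd 1 and an I-immediate of 5. In an HDL all of this is just wiring, since the register fields never move between formats.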

And you get the benefit of a huge and growing software infrastructure with toolchain, libraries, OSes and RTOSes.

It doesn't make any difference to your Verilog whether your registers and buses and ALU are 8 or 16 or 32 bits wide -- it's a few more LUTs and blockram/FFs of course, but no more lines of code.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4401
  • Country: nz
Re: softcore for learning purposes
« Reply #6 on: October 10, 2020, 12:53:04 am »
I think the LC3 project is the way to go.  It's a 16 bit RISC architecture with only word addressing.  Hold off on the LC3b version...

I just had a look at LC3, which I hadn't previously come across.

It's a perfectly reasonable (if limited) RISC architecture

Scrub that.

Absolutely no shift or rotate instructions. Add can do a left shift, but to do right shift you'd need ... a table in memory. But for the table size to be reasonable you'd need to limit it to 4 or 8 bits at a time. And with no right shift and no byte load/store you've got no way to extract and right-align anything to make an index into the table.

Completely useless.
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9925
  • Country: us
Re: softcore for learning purposes
« Reply #7 on: October 10, 2020, 01:08:53 am »
I'd suggest you forget about all this 8 and 16-bit old rubbish, and instead look at RISC-V. The RV32I instruction set contains only 40 instructions, a very simple multi-cycle implementation of it can be done in a day or two, and there are a ton of open source cores of various complexity and functionality. On top of that, there is a readily-available C/C++ gcc toolchain for the ISA, so you don't have to muck around with assembler for long if you don't want to.

Is there a documented project out in the wild where this has been done?  Maybe some kind of hardware block diagram, a prototype FSM, a state diagram or any other documentation that might make the project achievable by beginners?

This is a start: http://users.ece.cmu.edu/~jhoe/course/ece447/S18handouts/L02.pdf

For LC3 there is a C compiler based on LCC - I haven't tried it but I really should.

http://users.ece.utexas.edu/~ryerraballi/ConLC3.html

I would start by recreating a hex loader program similar to the old Intel 8080 Monitor.  This would require some firmware inside the FPGA that would load code over a serial port.  Trivial in assembly language and even easier in C.  An amazing number of systems were brought up with such a Monitor back in the day.
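
The core of such a loader is parsing one record at a time. Below is a minimal C sketch, assuming the code arrives as Intel HEX records (":LLAAAATT<data>CC"); the post above doesn't say which format the monitor would use, so that is an assumption, and ihex_parse_record is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one hex digit to its value, or -1 if it isn't a hex digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Convert two hex digits to a byte, or -1 on a bad or missing digit. */
static int hex_byte(const char *s)
{
    int hi = hex_nibble(s[0]);
    if (hi < 0) return -1;          /* also stops at the string terminator */
    int lo = hex_nibble(s[1]);
    if (lo < 0) return -1;
    return (hi << 4) | lo;
}

/* Parse one Intel HEX record. On success, fills data/addr/type and returns
 * the number of data bytes; returns -1 on a malformed record, a record
 * larger than cap, or a checksum mismatch. */
int ihex_parse_record(const char *line, uint8_t *data, size_t cap,
                      uint16_t *addr, uint8_t *type)
{
    if (line[0] != ':') return -1;
    const char *p = line + 1;

    int len = hex_byte(p);
    if (len < 0 || (size_t)len > cap) return -1;
    int a_hi = hex_byte(p + 2);
    if (a_hi < 0) return -1;
    int a_lo = hex_byte(p + 4);
    if (a_lo < 0) return -1;
    int t = hex_byte(p + 6);
    if (t < 0) return -1;

    unsigned sum = (unsigned)(len + a_hi + a_lo + t);
    p += 8;
    for (int i = 0; i < len; i++, p += 2) {
        int b = hex_byte(p);
        if (b < 0) return -1;
        data[i] = (uint8_t)b;
        sum += (unsigned)b;
    }

    int checksum = hex_byte(p);
    if (checksum < 0) return -1;
    /* All bytes including the checksum must sum to 0 modulo 256. */
    if (((sum + (unsigned)checksum) & 0xFF) != 0) return -1;

    *addr = (uint16_t)((a_hi << 8) | a_lo);
    *type = (uint8_t)t;
    return len;
}
```

A monitor would call this for each line received over the serial port, copy the data bytes to memory at the given address for type 00 records, and stop at the type 01 end-of-file record.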

It isn't clear from the OP that the author has much, if any, experience with VHDL but rather is a student working on a side project for interview points.  The project needs to be achievable in some reasonable period of time from that starting point.  The "Introduction to Computing Systems" book is intended as a two semester freshman level course or as a 1 semester course following some other introductory course.  Of course, the fun stuff is in the second half and the first half can be assumed for any EE student beyond the first semester of computing.

I would think LC3 is achievable and a simple Monitor is almost trivial.  Adding CP/M (or other OS) is just topping when it comes to an interview.


« Last Edit: October 10, 2020, 01:28:02 am by rstofer »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4401
  • Country: nz
Re: softcore for learning purposes
« Reply #8 on: October 10, 2020, 01:38:58 am »
I'd suggest you forget about all this 8 and 16-bit old rubbish, and instead look at RISC-V. The RV32I instruction set contains only 40 instructions, a very simple multi-cycle implementation of it can be done in a day or two, and there are a ton of open source cores of various complexity and functionality. On top of that, there is a readily-available C/C++ gcc toolchain for the ISA, so you don't have to muck around with assembler for long if you don't want to.

Is there a documented project out in the wild where this has been done?  Maybe some kind of hardware block diagram, a prototype FSM, a state diagram or any other documentation that might make the project achievable by beginners?

LC3 is close enough to RISC-V / MIPS that a microarchitecture for it could implement one of those instead by just widening the registers and buses and changing the instruction decoder and ALU a little.


Quote
For LC3 there is a C compiler based on LCC - I haven't tried it but I really should.

It can't be full C as there is no way to do a right shift.

Technically you could implement C with 16 bit "char", but a lot of software wouldn't like that. You can extract the low byte of a 16 bit word with AND, but there is no way to extract the high byte into the low byte.

Ok, I lie. You could use (using only operations directly supported):

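Code:
/* Return the high byte of n using only ADD, AND and a zero test.
   Assumes a 16-bit short. */
short hi_byte(short n){
  short res = 0, hi = 256, lo = 1;  /* hi walks bits 8..15, lo walks bits 0..7 */
  while (hi){                       /* hi becomes 0 once it is doubled past bit 15 */
    if (n & hi) res += lo;          /* copy bit k+8 of n into bit k of res */
    hi += hi;                       /* left shift by one, done as an add */
    lo += lo;
  }
  return res;
}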

Ugh.
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9925
  • Country: us
Re: softcore for learning purposes
« Reply #9 on: October 10, 2020, 01:44:30 am »
That RV32I project seems interesting.  I have found at least one link re: building the gcc toolchain.  I consider this the difficult end of the project and it is nice to find a link to a working solution.

https://github.com/cliffordwolf/picorv32#building-a-pure-rv32i-toolchain

Now to go find out just how complicated the CPU is...


 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9925
  • Country: us
Re: softcore for learning purposes
« Reply #10 on: October 10, 2020, 01:48:17 am »
LC3 is close enough to RISC-V / MIPS that a microarchitecture for it could implement one of those instead by just widening the registers and buses and changing the instruction decoder and ALU a little.

If I can find enough documentation of the RV32I ISA, I may just give it a try.  First, I'll build the toolchain...
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4401
  • Country: nz
Re: softcore for learning purposes
« Reply #11 on: October 10, 2020, 01:50:11 am »
That RV32I project seems interesting.  I have found at least one link re: building the gcc toolchain.  I consider this the difficult end of the project and it is nice to find a link to a working solution.

https://github.com/cliffordwolf/picorv32#building-a-pure-rv32i-toolchain

Now to go find out just how complicated the CPU is...

Here's a simple RV32I by our own @hamster_nz

https://github.com/hamsternz/Rudi-RV32I
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4401
  • Country: nz
Re: softcore for learning purposes
« Reply #12 on: October 10, 2020, 01:51:38 am »
LC3 is close enough to RISC-V / MIPS that a microarchitecture for it could implement one of those instead by just widening the registers and buses and changing the instruction decoder and ALU a little.

If I can find enough documentation of the RV32I ISA, I may just give it a try.  First, I'll build the toolchain...

https://github.com/riscv/riscv-isa-manual/releases/download/Ratified-IMAFDQC/riscv-spec-20191213.pdf

All you need is Chapter 2
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2777
  • Country: ca
Re: softcore for learning purposes
« Reply #13 on: October 10, 2020, 01:54:06 am »
If I can find enough documentation of the RV32I ISA, I may just give it a try.  First, I'll build the toolchain...
All ISA documentation is here: https://riscv.org/technical/specifications/
But there is no need to build anything, you can easily find already-made toolchain via Google.

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4401
  • Country: nz
Re: softcore for learning purposes
« Reply #14 on: October 10, 2020, 01:57:20 am »
True, but building a toolchain for Linux is as easy as copying and pasting a handful of lines from a web page. On a modern machine the git checkouts will take longer than the actual build.
« Last Edit: October 10, 2020, 02:00:49 am by brucehoult »
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9925
  • Country: us
Re: softcore for learning purposes
« Reply #15 on: October 10, 2020, 02:02:34 am »
Here is another resource which has a 3500 line Verilog implementation plus build instructions for the toolchain.  For giggles, I'm trying to build it on a Raspberry Pi 4.  I think it will take a while...

https://github.com/cliffordwolf/picorv32

ETA:  Tomorrow I'll build it on my workstation.  The Pi 4 is pretty quick, for a Pi, but if the build works, I'll just do it over.


« Last Edit: October 10, 2020, 02:20:59 am by rstofer »
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2777
  • Country: ca
Re: softcore for learning purposes
« Reply #16 on: October 10, 2020, 02:15:44 am »
My point is that if you want to study an ISA, you might as well study one that is actually modern and used, as opposed to ancient stuff that now only exists in museums. RV popularity seems to be growing faster and faster. I can't wait to get my hands on PolarFire SoC devices, which contain a quad RV64 CPU with PCIe and a lot of FPGA resources to implement other peripherals - as long as it's reasonably priced and free tools are available for it.

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4401
  • Country: nz
Re: softcore for learning purposes
« Reply #17 on: October 10, 2020, 02:31:01 am »
My point is that if you want to study an ISA, you might as well study one that is actually modern and used, as opposed to ancient stuff that now only exists in museums. RV popularity seems to be growing faster and faster

I certainly agree with all that.

Quote
I can't wait to get my hands on PolarFire SoC devices which contain quad RV64 CPU with PCIe and a lot of FPGA resources to implement other peripherals - as long as it's reasonably priced and free tools are available for it.

The Icicle board is $500 and was supposed to be shipping on September 30. Crowdsupply doesn't give tracking information, but hopefully real soon. It's a reputable company (Microchip) behind it.

The Icicle uses a pretty big 250k element FPGA, so hopefully other boards using smaller ones will be cheaper and available sometime soon.

The Silver license for Libero is free, you just have to renew it every 12 months. Apparently -- I don't have it yet.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4401
  • Country: nz
Re: softcore for learning purposes
« Reply #18 on: October 10, 2020, 02:52:27 am »
But there is no need to build anything, you can easily find already-made toolchain via Google.

Possibly not for a Raspberry Pi :-)  At least not for Raspbian. Ubuntu (which I'm running 64 bit on my Pi 4) has a package these days:

apt install g++-riscv64-linux-gnu

Not sure if that works on Pi.

However, that uses glibc, whereas you probably want a newlib (and static linking by default) toolchain for an embedded CPU.
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2777
  • Country: ca
Re: softcore for learning purposes
« Reply #19 on: October 10, 2020, 03:24:29 am »
The Icicle board is $500 and was supposed to be shipping on September 30. Crowdsupply doesn't give tracking information, but hopefully real soon. It's a reputable company (Microchip) behind it.
The Icicle uses a pretty big 250k element FPGA, so hopefully other boards using smaller ones will be cheaper and available sometime soon.
I prefer designing my own boards, so I need access to actual devices (and documentation of course).

The Silver license for Libero is free, you just have to renew it every 12 months. Apparently -- I don't have it yet.
Does it cover the entire family, or are there some shenanigans going on (which is typical, but unfortunate)? And what about commonly used IPs (like memory controller, DMA, infrastructure stuff like that)? I've been somewhat spoiled by the wide range of free IPs provided by Xilinx, so I'm not familiar with Microsemi's ecosystem. I mean, by now I know enough to be able to implement my own memory controllers (and would do so if Xilinx were to actually document all undocumented HW bits and pieces used in their controllers), but having something preverified and guaranteed to work makes hardware bring-up much less stressful, as you at least have a known-good bitstream.

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4401
  • Country: nz
Re: softcore for learning purposes
« Reply #20 on: October 10, 2020, 03:37:17 am »
The Icicle board is $500 and was supposed to be shipping on September 30. Crowdsupply doesn't give tracking information, but hopefully real soon. It's a reputable company (Microchip) behind it.
The Icicle uses a pretty big 250k element FPGA, so hopefully other boards using smaller ones will be cheaper and available sometime soon.
I prefer designing my own boards, so I need access to actual devices (and documentation of course).

No doubt they'll be in digikey and mouser alongside the plain PolarFire FPGAs fairly soon.

Quote
The Silver license for Libero is free, you just have to renew it every 12 months. Apparently -- I don't have it yet.
Does it cover the entire family, or are there some shenanigans going on (which is typical, but unfortunate)? And what about commonly used IPs (like memory controller, DMA, infrastructure stuff like that)?

No clue as I (and everyone else) haven't had our hands on one yet.

The Icicle board comes with pre-programmed FPGA and eMMC which allows it to boot Linux out of the box, with access to the 1 GB of RAM, the ethernet, uart etc.

I certainly hope they supply the source code for the standard FPGA programming, and that the free Silver level of Libero can build it, and that you can make modifications to it. But I don't know.
 

Offline FlyingDutch

  • Regular Contributor
  • *
  • Posts: 147
  • Country: pl
Re: softcore for learning purposes
« Reply #21 on: October 10, 2020, 06:53:46 am »
Hello,

If you have an Altera/Intel board with a Cyclone IV, just use the "Nios II" soft processor. The simpler version, "Nios II/e", is available for free in the "Quartus Prime Lite" synthesis software. This soft core is made by Intel and distributed as an IP core. It is a 32-bit soft CPU with many peripherals and modules. There are also free tools for developing projects with the Nios II CPU. See Tools -> "Platform Designer", which helps with using and connecting various peripherals, and the C SDK and compiler for this soft CPU, "Nios II Software Build Tools for Eclipse". So you get, for free, the tools to synthesize a Nios II CPU with different peripherals and a C/C++ compiler for writing programs for it. See the screenshot and links at the end of this message.

https://www.intel.pl/content/www/pl/pl/products/programmable/processor/nios-ii.html

https://en.wikipedia.org/wiki/Nios_II





Here is a link to the zipped project (Quartus 20.1):
https://www.dropbox.com/s/gjrq33uzycsg84r/NIOSIIeMin01.zip?dl=0

BTW: If you want to use it, you should first change the device from the MAX 10 FPGA to your model of Cyclone IV, and rebuild the project!

There are also many soft cores on the "OpenCores" web page:

https://opencores.org/

And if you need a quick start with the RISC-V architecture, just try "InstantSoC":

https://www.fpga-cores.com/instant-soc/

Best Regards
« Last Edit: October 10, 2020, 10:15:02 am by FlyingDutch »
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2812
  • Country: nz
Re: softcore for learning purposes
« Reply #22 on: October 10, 2020, 07:05:02 am »
I would strongly recommend making your own RISC-V design too.

Have a look at my working-but-simple RV32I CPU if you want to get an idea of the size of the project you are thinking of undertaking:

https://github.com/hamsternz/Rudi-RV32I

It is nothing spectacular, but you can write programs in C and have them run at 50M+ instructions per sec.

Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4401
  • Country: nz
Re: softcore for learning purposes
« Reply #23 on: October 10, 2020, 09:00:01 am »
Hello: I cannot add packed project (Quartus 20.1) because it has about 10 MB in size. What can I do in such situation?

You can upload it to a site designed to store large files, such as Dropbox, Google Drive, Apple iCloud Drive or Microsoft OneDrive, and then paste a URL here.
 
The following users thanked this post: FlyingDutch

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 20350
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: softcore for learning purposes
« Reply #24 on: October 10, 2020, 10:05:09 am »
... I would like to begin to build a soft core of some kind. ... What kind of softcore to build and how to create a list for creating said softcore. My assumption is that I will need to build several different "parts" that when put together will become the softcore. ...

Good for you; you will learn a lot and your experiences will be useful during job interviews.

You need to think about what you mean by "build" and "create":
  • Sometimes they mean "instantiate a block" and add things around it. Typically that is what happens in an industrial setting, but it is fundamentally boring.
  • Sometimes they mean "something new from primitive components like registers and gates". That is more interesting and you will learn a lot more.

In either case, it is then necessary to program them:
  • in a high-level language, in which case you should probably decide on the toolchain first, which implies a standard ISA
  • in assembler, which means you could either invent your own ISA from scratch, or copy an old 8-bit ISA

As an interviewer for an embedded role, I'd be most impressed by reimplementing something like a 6502/8080, and getting something running on that. It should also be possible to find a toolchain that could generate code for those.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

