Author Topic: softcore for learning purposes  (Read 7466 times)


Offline prophoss (Topic starter)

  • Contributor
  • Posts: 46
  • Country: us
softcore for learning purposes
« on: October 09, 2020, 03:17:52 pm »
Hello all, I am in need of some advice. I am currently a student working toward a computer engineering degree. I have a dev board with an Altera Cyclone IV on it. I have done some blinking of LEDs and used the seven-segment display. I would like to begin to build a soft core of some kind. The kind of softcore doesn't matter as much as the doing. I am just trying to build up my chops with the hope of maybe doing this full-time one day.

Where to begin is the first problem I guess. What I would like to ask of you fine people is two things. What kind of softcore to build and how to create a list for creating said softcore. My assumption is that I will need to build several different "parts" that when put together will become the softcore. Could y'all give some good suggestions along these lines? I look forward to reading your responses.
Thanks, Brian
 

Online chris_leyson

  • Super Contributor
  • ***
  • Posts: 1544
  • Country: wales
Re: softcore for learning purposes
« Reply #1 on: October 09, 2020, 03:52:53 pm »
Hi Brian, if you had a Xilinx dev board then I would say Ken Chapman's 8-bit PicoBlaze microcontroller. I learnt a lot about the underlying Spartan-3 FPGA fabric by looking at the PicoBlaze design and, just for fun, drew it up as a schematic. It was all designed using primitive logic functions so that no matter what version of the ISE synthesizer you used you would always get the same result. It's also probably one of the smallest, if not the smallest, 8-bit softcore processors. It's a straightforward fetch/execute architecture, but it's a good starting point.
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #2 on: October 09, 2020, 10:15:49 pm »
I think the LC3 project is the way to go.  It's a 16 bit RISC architecture with only word addressing.  Hold off on the LC3b version...

There is a book: "Introduction to Computing Systems: From Bits and Gates to C and Beyond" by Patt & Patel

https://www.amazon.com/Introduction-Computing-Systems-Gates-Beyond/dp/0070595003

What you really need to do is Google for the appendices:

https://www.cs.colostate.edu/~cs270/.Fall19/resources/PattPatelAppA.pdf
http://people.cs.georgetown.edu/~squier/Teaching/HardwareFundamentals/LC3-trunk/docs/LC3-uArch-PPappendC.pdf

Somewhere, you will find a form for creating the microcode if you choose that approach.  I didn't, I just coded a state machine from the state diagram.  I have often thought of recreating the project using microcode.

A lot of universities use this project so there is info all over the Internet.

One hint:  When you implement the registers, build 2 identical sets (banks).  When you do a register write, you write to both banks, but you can read two different registers at the same time, one from each bank.  Take R0 = R0+R1: you need to read both registers, send them to the ALU, and then write the R0 register of both banks.  All in one clock and no intermediate storage.
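To make the two-bank idea concrete, here is a minimal software model in C. It is only a sketch: the names regfile_t, regfile_write and regfile_add are made up for this illustration, and in the FPGA the two arrays would be two instances of the same register entity rather than C arrays.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 8  /* LC-3 has eight general-purpose registers, R0..R7 */

/* Two identical banks; each bank models a memory with a single read port. */
typedef struct {
    uint16_t bank_a[NUM_REGS];  /* feeds the first ALU operand  */
    uint16_t bank_b[NUM_REGS];  /* feeds the second ALU operand */
} regfile_t;

/* A write always goes to both banks, so their contents never diverge. */
static void regfile_write(regfile_t *rf, unsigned dr, uint16_t value)
{
    assert(dr < NUM_REGS);
    rf->bank_a[dr] = value;
    rf->bank_b[dr] = value;
}

/* One "clock": read two registers at once (one per bank), add, write back. */
static void regfile_add(regfile_t *rf, unsigned dr, unsigned sr1, unsigned sr2)
{
    assert(sr1 < NUM_REGS && sr2 < NUM_REGS);
    uint16_t a = rf->bank_a[sr1];
    uint16_t b = rf->bank_b[sr2];
    regfile_write(rf, dr, (uint16_t)(a + b));
}
```

Calling regfile_add(&rf, 0, 0, 1) models R0 = R0+R1: both operands are read in the same step, and the result lands in R0 of both banks.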

As I said, I built mine with a big state machine with state numbers matching the diagram.  In the state diagram, it describes exactly which signals need to be asserted and those signals are carried over into the hardware block diagram.
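For a feel of what "one big state machine" means, here is a small C model of the idea. It is a sketch under assumptions, not my VHDL: the state names are invented (they are not the state numbers from the book's diagram), only ADD, AND and NOT are handled, and any other opcode simply stops the model.

```c
#include <stdint.h>

/* Illustrative states only; the real LC-3 diagram has many more. */
typedef enum { S_FETCH, S_DECODE, S_EXEC_ADD, S_EXEC_AND, S_EXEC_NOT, S_HALT } state_t;

typedef struct {
    state_t         state;
    uint16_t        pc, ir;
    uint16_t        reg[8];
    const uint16_t *mem;    /* word-addressed program memory */
} cpu_t;

/* Sign-extend the 5-bit immediate used by ADD and AND. */
static uint16_t sext5(uint16_t v)
{
    return (v & 0x10) ? (uint16_t)(v | 0xFFE0) : (uint16_t)(v & 0x1F);
}

/* Second operand: imm5 when IR[5] is set, otherwise register SR2 (IR[2:0]). */
static uint16_t operand2(const cpu_t *c)
{
    return (c->ir & 0x20) ? sext5(c->ir & 0x1F) : c->reg[c->ir & 7];
}

/* Advance the machine by one state, the way one clock edge would. */
static void cpu_step(cpu_t *c)
{
    unsigned dr  = (c->ir >> 9) & 7;   /* destination register, IR[11:9] */
    unsigned sr1 = (c->ir >> 6) & 7;   /* first source register, IR[8:6] */

    switch (c->state) {
    case S_FETCH:
        c->ir = c->mem[c->pc];
        c->pc++;
        c->state = S_DECODE;
        break;
    case S_DECODE:
        switch (c->ir >> 12) {         /* opcode, IR[15:12] */
        case 0x1: c->state = S_EXEC_ADD; break;
        case 0x5: c->state = S_EXEC_AND; break;
        case 0x9: c->state = S_EXEC_NOT; break;
        default:  c->state = S_HALT;     break;  /* not modelled here */
        }
        break;
    case S_EXEC_ADD:
        c->reg[dr] = (uint16_t)(c->reg[sr1] + operand2(c));
        c->state = S_FETCH;
        break;
    case S_EXEC_AND:
        c->reg[dr] = (uint16_t)(c->reg[sr1] & operand2(c));
        c->state = S_FETCH;
        break;
    case S_EXEC_NOT:
        c->reg[dr] = (uint16_t)~c->reg[sr1];
        c->state = S_FETCH;
        break;
    case S_HALT:
        break;
    }
}
```

In hardware each case becomes a state that asserts the control signals listed in the state diagram; the C assignments here stand in for those signals.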

You can find the assembler with just a little effort.  Here's one:
https://github.com/davedennis/LC3-Assembler

I broke my project into a dozen files as shown in the attachment.  The LC3 file is my top level, everything else is just another entity in a different file.  Notice there are two instances of Registers.vhd - as I mentioned above.  The SingleShot.vhd is out of scope, it was used to single step the clock during debugging.

The project is a little over 1700 lines of code, much of which is boilerplate and whitespace.  The main file with the state machine is just about 1000 of those lines.  It would be cut substantially when using microcode - probably by at least half.

I used a more substantial board and some of the code is related to displaying various registers on the 7 segment display.  The DisplayMux and Display are certainly optional.

I used the Digilent A7 100T board:

https://store.digilentinc.com/nexys-a7-fpga-trainer-board-recommended-for-ece-curriculum/

It is truly overkill, I'm using less than 1% of the logic and less than 25% of the BlockRAM.  But I like having switches, LEDs, 7 segment displays and pushbuttons.


« Last Edit: October 09, 2020, 10:32:03 pm by rstofer »
 

Offline cruff

  • Regular Contributor
  • *
  • Posts: 75
  • Country: us
Re: softcore for learning purposes
« Reply #3 on: October 09, 2020, 11:42:07 pm »
If your focus isn't also developing all of the software tools to go with it, be sure to pick something with a preexisting development tool chain.
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: softcore for learning purposes
« Reply #4 on: October 10, 2020, 12:03:17 am »
I'd suggest you forget about all this 8- and 16-bit old rubbish and instead look at RISC-V. The RV32I instruction set contains only 40 instructions, a very simple multi-cycle implementation of it can be done in a day or two, and there are a ton of open-source cores of various complexity and functionality. On top of that, there is a readily available C/C++ gcc toolchain for the ISA, so you don't have to muck around with assembler for long if you don't want to.

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #5 on: October 10, 2020, 12:34:35 am »
Quote
I think the LC3 project is the way to go.  It's a 16 bit RISC architecture with only word addressing.  Hold off on the LC3b version...

I just had a look at LC3, which I hadn't previously come across.

It's a perfectly reasonable (if limited) RISC architecture, with the exception of LDI and STI -- load and store indirect which require a single instruction to access memory twice. This is not RISCy because it complicates the implementation. Ugh. Maybe they did it for pedagogical reasons. But ugh.

Arithmetic is limited to ADD, AND, and NOT. That's logically sufficient, but it means subtraction needs three instructions: a + ~b + 1. OR needs four: ~(~a & ~b). XOR needs .. uh .. seven minimum I think: ~(~a & ~b) & ~(a & b). It would be better off with just ADD and NAND as primitives (AND, OR, XOR need 2, 3, 5 operations respectively). Or just NOR. Or just BIC.
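Spelled out in C, with one statement per LC-3 instruction so the counts can be checked. This is just an illustration of the identities above; the function names are made up, and uint16_t stands in for a 16-bit LC-3 register.

```c
#include <stdint.h>

/* a - b = a + ~b + 1 : three instructions (NOT, ADD #1, ADD). */
static uint16_t lc3_sub(uint16_t a, uint16_t b)
{
    uint16_t t = (uint16_t)~b;       /* NOT */
    t = (uint16_t)(t + 1);           /* ADD with immediate 1 */
    return (uint16_t)(a + t);        /* ADD */
}

/* a | b = ~(~a & ~b) : four instructions (NOT, NOT, AND, NOT). */
static uint16_t lc3_or(uint16_t a, uint16_t b)
{
    uint16_t na = (uint16_t)~a;      /* NOT */
    uint16_t nb = (uint16_t)~b;      /* NOT */
    uint16_t t  = (uint16_t)(na & nb); /* AND */
    return (uint16_t)~t;             /* NOT */
}

/* a ^ b = (a | b) & ~(a & b) : the four for OR plus AND, NOT, AND = seven. */
static uint16_t lc3_xor(uint16_t a, uint16_t b)
{
    uint16_t either = lc3_or(a, b);          /* 4 instructions */
    uint16_t both   = (uint16_t)(a & b);     /* AND */
    uint16_t nboth  = (uint16_t)~both;       /* NOT */
    return (uint16_t)(either & nboth);       /* AND */
}
```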

The NOT encoding is strange, with the usual 6 bit immediate field required to be all 1s, where other instructions (JMP, JSRR) that don't use it require it to be all 0s. It's almost as if NOT is really an XORI instruction. Which would be very handy.

Weird that JMP and JSRR effectively have an immediate field to be added to the register, but require it to be 0. You could implement them using the standard reg+offset calculation to save logic, but you'd have to have the decoder check the offset is 0. Why not just allow non-zero?


All in all, I can't see why today you wouldn't implement RV32I or RV32E instead, maybe leaving out the byte and halfword loads and stores.

RV32I is about as simple to decode, has barely any more instructions with just four basic formats (ALU, ALU-IMM/LOAD/JALR, LUI/AUIPC/JAL, STORE/Bcc) with the extra instructions made up simply of what operation is enabled in the ALU.
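To show how little there is to the decode, here is a sketch in C that pulls out the fixed-position fields. It relies only on the field positions given in Chapter 2 of the RISC-V spec; the struct and function names are invented for this example, and only the I-type immediate is extracted.

```c
#include <stdint.h>

/* Fields that sit at the same bit positions in every format that has them. */
typedef struct {
    uint32_t opcode;   /* bits  6:0  */
    uint32_t rd;       /* bits 11:7  */
    uint32_t funct3;   /* bits 14:12 */
    uint32_t rs1;      /* bits 19:15 */
    uint32_t rs2;      /* bits 24:20 */
    uint32_t funct7;   /* bits 31:25 */
    int32_t  imm_i;    /* bits 31:20, sign-extended (ALU-IMM / LOAD / JALR) */
} rv32_fields_t;

static rv32_fields_t rv32_decode(uint32_t insn)
{
    rv32_fields_t f;
    f.opcode = insn & 0x7F;
    f.rd     = (insn >> 7)  & 0x1F;
    f.funct3 = (insn >> 12) & 0x07;
    f.rs1    = (insn >> 15) & 0x1F;
    f.rs2    = (insn >> 20) & 0x1F;
    f.funct7 = insn >> 25;

    /* The sign bit of every immediate is instruction bit 31. */
    f.imm_i = (int32_t)(insn >> 20);
    if (insn & 0x80000000u)
        f.imm_i -= 4096;
    return f;
}
```

For example, addi a0, zero, 5 encodes as 0x00500513, which decodes to opcode 0x13, rd 10, rs1 0 and an immediate of 5. In Verilog the same thing is a handful of wire assignments.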

And you get the benefit of a huge and growing software infrastructure with toolchain, libraries, OSes and RTOSes.

It doesn't make any difference to your Verilog whether your registers and buses and ALU are 8 or 16 or 32 bits wide -- it's a few more LUTs and blockram/FFs of course, but no more lines of code.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #6 on: October 10, 2020, 12:53:04 am »
Quote
I think the LC3 project is the way to go.  It's a 16 bit RISC architecture with only word addressing.  Hold off on the LC3b version...

Quote
I just had a look at LC3, which I hadn't previously come across.

It's a perfectly reasonable (if limited) RISC architecture

Scrub that.

Absolutely no shift or rotate instructions. Add can do a left shift, but to do right shift you'd need ... a table in memory. But for the table size to be reasonable you'd need to limit it to 4 or 8 bits at a time. And with no right shift and no byte load/store you've got no way to extract and right-align anything to make an index into the table.

Completely useless.
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #7 on: October 10, 2020, 01:08:53 am »
Quote
I'd suggest you forget about all this 8- and 16-bit old rubbish and instead look at RISC-V. The RV32I instruction set contains only 40 instructions, a very simple multi-cycle implementation of it can be done in a day or two, and there are a ton of open-source cores of various complexity and functionality. On top of that, there is a readily available C/C++ gcc toolchain for the ISA, so you don't have to muck around with assembler for long if you don't want to.

Is there a documented project out in the wild where this has been done?  Maybe some kind of hardware block diagram, a prototype FSM, a state diagram or any other documentation that might make the project achievable by beginners?

This is a start: http://users.ece.cmu.edu/~jhoe/course/ece447/S18handouts/L02.pdf

For LC3 there is a C compiler based on LCC - I haven't tried it but I really should.

http://users.ece.utexas.edu/~ryerraballi/ConLC3.html

I would start by recreating a hex loader program similar to the old Intel 8080 Monitor.  This would require some firmware inside the FPGA that would load code over a serial port.  Trivial in assembly language and even easier in C.  An amazing number of systems were brought up with such a Monitor back in the day.
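The heart of such a loader is parsing one record and checking its checksum. Here is a minimal sketch in C, assuming the Intel HEX record layout (':', byte count, 16-bit address, record type, data bytes, checksum, all as hex digit pairs). The function names are made up, and the serial-port side is left out entirely.

```c
#include <stdint.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Convert two hex digits to a byte, or -1 on a bad digit or end of string. */
static int hex_byte(const char *s)
{
    int hi = hex_nibble(s[0]);
    if (hi < 0) return -1;
    int lo = hex_nibble(s[1]);
    if (lo < 0) return -1;
    return (hi << 4) | lo;
}

/*
 * Parse one record of the form ":LLAAAATTDD...CC".
 * On success returns the number of data bytes (0..255) and fills in
 * *addr, *type and data[]; data must have room for 255 bytes.
 * Returns -1 on a malformed record or a checksum mismatch.
 */
static int ihex_parse(const char *line, uint16_t *addr, uint8_t *type, uint8_t *data)
{
    if (line[0] != ':') return -1;

    int len = hex_byte(line + 1);
    if (len < 0) return -1;

    uint8_t  sum = 0;
    uint16_t a   = 0;

    /* count + 2 address bytes + type + data + checksum = len + 5 bytes */
    for (int i = 0; i < len + 5; i++) {
        int b = hex_byte(line + 1 + 2 * i);
        if (b < 0) return -1;
        sum = (uint8_t)(sum + b);

        if (i == 1 || i == 2)           a = (uint16_t)((a << 8) | b);
        else if (i == 3)                *type = (uint8_t)b;
        else if (i >= 4 && i < 4 + len) data[i - 4] = (uint8_t)b;
    }

    /* All bytes including the checksum must sum to zero modulo 256. */
    if (sum != 0) return -1;

    *addr = a;
    return len;
}
```

A Monitor would call this for each line received, copy data[] to memory at addr for type 0 records, and stop at the end-of-file record (type 1).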

It isn't clear from the OP that the author has much, if any, experience with VHDL but rather is a student working on a side project for interview points.  The project needs to be achievable in some reasonable period of time from that starting point.  The "Introduction to Computing Systems" book is intended as a two semester freshman level course or as a 1 semester course following some other introductory course.  Of course, the fun stuff is in the second half and the first half can be assumed for any EE student beyond the first semester of computing.

I would think LC3 is achievable and a simple Monitor is almost trivial.  Adding CP/M (or other OS) is just topping when it comes to an interview.


« Last Edit: October 10, 2020, 01:28:02 am by rstofer »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #8 on: October 10, 2020, 01:38:58 am »
Quote
I'd suggest you forget about all this 8- and 16-bit old rubbish and instead look at RISC-V. The RV32I instruction set contains only 40 instructions, a very simple multi-cycle implementation of it can be done in a day or two, and there are a ton of open-source cores of various complexity and functionality. On top of that, there is a readily available C/C++ gcc toolchain for the ISA, so you don't have to muck around with assembler for long if you don't want to.

Quote
Is there a documented project out in the wild where this has been done?  Maybe some kind of hardware block diagram, a prototype FSM, a state diagram or any other documentation that might make the project achievable by beginners?

LC3 is close enough to RISC-V / MIPS that a microarchitecture for it could implement one of those instead by just widening the registers and buses and changing the instruction decoder and ALU a little.


Quote
For LC3 there is a C compiler based on LCC - I haven't tried it but I really should.

It can't be full C as there is no way to do a right shift.

Technically you could implement C with 16 bit "char", but a lot of software wouldn't like that. You can extract the low byte of a 16 bit word with AND, but there is no way to extract the high byte into the low byte.

Ok, I lie. You could use (using only operations directly supported):

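Code:
/* Returns the high byte of n in the low byte of the result, using only
   AND, ADD and a test against zero. Assumes a 16-bit short, as on LC-3. */
unsigned short hi_byte(unsigned short n){
  /* unsigned, so hi wraps cleanly to 0 after bit 15 instead of overflowing a signed short */
  unsigned short res = 0, hi = 256, lo = 1;
  while (hi){
    if (n & hi) res += lo;  /* copy bit 8+k of n into bit k of res */
    hi += hi;               /* adding a mask to itself shifts it left by one */
    lo += lo;
  }
  return res;
}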

Ugh.
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #9 on: October 10, 2020, 01:44:30 am »
That RV32I project seems interesting.  I have found at least one link re: building the gcc toolchain.  I consider this the difficult end of the project and it is nice to find a link to a working solution.

https://github.com/cliffordwolf/picorv32#building-a-pure-rv32i-toolchain

Now to go find out just how complicated the CPU is...


 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #10 on: October 10, 2020, 01:48:17 am »
Quote
LC3 is close enough to RISC-V / MIPS that a microarchitecture for it could implement one of those instead by just widening the registers and buses and changing the instruction decoder and ALU a little.

If I can find enough documentation of the RV32I ISA, I may just give it a try.  First, I'll build the toolchain...
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #11 on: October 10, 2020, 01:50:11 am »
Quote
That RV32I project seems interesting.  I have found at least one link re: building the gcc toolchain.  I consider this the difficult end of the project and it is nice to find a link to a working solution.

https://github.com/cliffordwolf/picorv32#building-a-pure-rv32i-toolchain

Now to go find out just how complicated the CPU is...

Here's a simple RV32I by our own @hamster_nz

https://github.com/hamsternz/Rudi-RV32I
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #12 on: October 10, 2020, 01:51:38 am »
Quote
LC3 is close enough to RISC-V / MIPS that a microarchitecture for it could implement one of those instead by just widening the registers and buses and changing the instruction decoder and ALU a little.

Quote
If I can find enough documentation of the RV32I ISA, I may just give it a try.  First, I'll build the toolchain...

https://github.com/riscv/riscv-isa-manual/releases/download/Ratified-IMAFDQC/riscv-spec-20191213.pdf

All you need is Chapter 2
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: softcore for learning purposes
« Reply #13 on: October 10, 2020, 01:54:06 am »
Quote
If I can find enough documentation of the RV32I ISA, I may just give it a try.  First, I'll build the toolchain...

All ISA documentation is here: https://riscv.org/technical/specifications/
But there is no need to build anything; you can easily find a prebuilt toolchain via Google.

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #14 on: October 10, 2020, 01:57:20 am »
True, but building a toolchain for Linux is as easy as copying and pasting a handful of lines from a web page. On a modern machine the git checkouts will take longer than the actual build.
« Last Edit: October 10, 2020, 02:00:49 am by brucehoult »
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #15 on: October 10, 2020, 02:02:34 am »
Here is another resource, which has a 3500-line Verilog implementation plus build instructions for the toolchain.  For giggles, I'm trying to build it on a Raspberry Pi 4.  I think it will take a while...

https://github.com/cliffordwolf/picorv32

ETA:  Tomorrow I'll build it on my workstation.  The Pi 4 is pretty quick, for a Pi, but if the build works, I'll just do it over.


« Last Edit: October 10, 2020, 02:20:59 am by rstofer »
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: softcore for learning purposes
« Reply #16 on: October 10, 2020, 02:15:44 am »
My point is that if you want to study an ISA, you might as well study one that is actually modern and used, as opposed to ancient stuff that now only exists in museums. RV popularity seems to be growing faster and faster. I can't wait to get my hands on PolarFire SoC devices, which contain a quad RV64 CPU with PCIe and a lot of FPGA resources to implement other peripherals - as long as it's reasonably priced and free tools are available for it.

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #17 on: October 10, 2020, 02:31:01 am »
Quote
My point is that if you want to study an ISA, you might as well study one that is actually modern and used, as opposed to ancient stuff that now only exists in museums. RV popularity seems to be growing faster and faster

I certainly agree with all that.

Quote
I can't wait to get my hands on PolarFire SoC devices which contain quad RV64 CPU with PCIe and a lot of FPGA resources to implement other peripherals - as long as it's reasonably priced and free tools are available for it.

The Icicle board is $500 and was supposed to be shipping on September 30. Crowd Supply doesn't give tracking information, but hopefully real soon. It's a reputable company (Microchip) behind it.

The Icicle uses a pretty big 250k element FPGA, so hopefully other boards using smaller ones will be cheaper and available sometime soon.

The Silver license for Libero is free, you just have to renew it every 12 months. Apparently -- I don't have it yet.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #18 on: October 10, 2020, 02:52:27 am »
Quote
But there is no need to build anything; you can easily find a prebuilt toolchain via Google.

Possibly not for a Raspberry Pi :-)  At least not for Raspbian. Ubuntu (which I'm running 64 bit on my Pi 4) has a package these days:

apt install g++-riscv64-linux-gnu

Not sure if that works on Pi.

However, that uses glibc, whereas you probably want a newlib (and static linking by default) toolchain for an embedded CPU.
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: softcore for learning purposes
« Reply #19 on: October 10, 2020, 03:24:29 am »
Quote
The Icicle board is $500 and was supposed to be shipping on September 30. Crowd Supply doesn't give tracking information, but hopefully real soon. It's a reputable company (Microchip) behind it.
The Icicle uses a pretty big 250k element FPGA, so hopefully other boards using smaller ones will be cheaper and available sometime soon.

I prefer designing my own boards, so I need access to actual devices (and documentation of course).

Quote
The Silver license for Libero is free, you just have to renew it every 12 months. Apparently -- I don't have it yet.

Does it cover the entire family, or are there some shenanigans going on (which is typical, but unfortunate)? And what about commonly used IPs (like memory controller, DMA, infrastructure stuff like that)? I've been somewhat spoiled by the wide range of free IPs provided by Xilinx, so I'm not familiar with Microsemi's ecosystem. I mean, by now I know enough to be able to implement my own memory controllers (and would do so if Xilinx were to actually document all the undocumented HW bits and pieces used in their controllers), but having something preverified and guaranteed to work makes hardware bring-up much less stressful, as you at least have a known-good bitstream.

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #20 on: October 10, 2020, 03:37:17 am »
Quote
The Icicle board is $500 and was supposed to be shipping on September 30. Crowd Supply doesn't give tracking information, but hopefully real soon. It's a reputable company (Microchip) behind it.
The Icicle uses a pretty big 250k element FPGA, so hopefully other boards using smaller ones will be cheaper and available sometime soon.

Quote
I prefer designing my own boards, so I need access to actual devices (and documentation of course).

No doubt they'll be in digikey and mouser alongside the plain PolarFire FPGAs fairly soon.

Quote
The Silver license for Libero is free, you just have to renew it every 12 months. Apparently -- I don't have it yet.
Does it cover the entire family, or are there some shenanigans going on (which is typical, but unfortunate)? And what about commonly used IPs (like memory controller, DMA, infrastructure stuff like that)?

No clue as I (and everyone else) haven't had our hands on one yet.

The Icicle board comes with pre-programmed FPGA and eMMC which allows it to boot Linux out of the box, with access to the 1 GB of RAM, the ethernet, uart etc.

I certainly hope they supply the source code for the standard FPGA programming, and that the free Silver level of Libero can build it, and that you can make modifications to it. But I don't know.
 

Offline FlyingDutch

  • Regular Contributor
  • *
  • Posts: 147
  • Country: pl
Re: softcore for learning purposes
« Reply #21 on: October 10, 2020, 06:53:46 am »
Hello,

If you have an Altera/Intel board with a Cyclone IV, just use the "NIOSII" soft processor. The simpler version, "NiosIIe", is available for free in the "QuartusLite" synthesis software. This soft core is made by Intel and distributed as an IP core. It is a 32-bit soft CPU with many peripherals and modules. There are also free tools for developing projects with the Nios II CPU. See Tools -> "Platform Designer", which helps with using various peripherals and connecting them, and the C SDK and compiler for this soft CPU, "Nios II Software Build Tools for Eclipse". So you get, for free, tools for synthesizing the Nios II CPU with different peripherals and a C/C++ compiler for writing programs for this CPU. See the screenshot and links at the end of this message.

https://www.intel.pl/content/www/pl/pl/products/programmable/processor/nios-ii.html

https://en.wikipedia.org/wiki/Nios_II





Here is link to zipped project (Quartus20.1):
https://www.dropbox.com/s/gjrq33uzycsg84r/NIOSIIeMin01.zip?dl=0

BTW: If you want to use it - you should first change device from Max10 FPGA to your model of Cyclone IV, and rebuild the project!

There are also many soft cores on the "OpenCores" web page:

https://opencores.org/

and if you need a quick start with the RISC-V architecture, just try "InstantSoC":

https://www.fpga-cores.com/instant-soc/

Best Regards
« Last Edit: October 10, 2020, 10:15:02 am by FlyingDutch »
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2812
  • Country: nz
Re: softcore for learning purposes
« Reply #22 on: October 10, 2020, 07:05:02 am »
Would strongly recommend making your own RISC V design too

Have a look at my working-but-simple RV32I CPU if you want to get an idea of the size of the project you are thinking of undertaking:

https://github.com/hamsternz/Rudi-RV32I

It is nothing spectacular, but you can write programs in C and have them run at 50M+ instructions per sec.

Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #23 on: October 10, 2020, 09:00:01 am »
Quote
Hello: I cannot add packed project (Quartus 20.1) because it has about 10 MB in size. What can I do in such situation?

You can upload it to a site designed to store large files, such as Dropbox, Google Drive, Apple iCloud Drive, or Microsoft OneDrive, and then paste a URL here.
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 20355
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: softcore for learning purposes
« Reply #24 on: October 10, 2020, 10:05:09 am »
Quote
... I would like to begin to build a soft core of some kind. ... What kind of softcore to build and how to create a list for creating said softcore. My assumption is that I will need to build several different "parts" that when put together will become the softcore. ...

Good for you; you will learn a lot and your experiences will be useful during job interviews.

You need to think about what you mean by "build" and "create":
  • Sometimes they mean "instantiate a block" and add things around it. Typically that is what happens in an industrial setting, but it is fundamentally boring.
  • Sometimes they mean "something new from primitive components like registers and gates". That is more interesting and you will learn a lot more.

In either case, it is then necessary to program them
  • in a high-level language, in which case you should probably decide on the toolchain first, which implies a standard ISA
  • in assembler, which means you could either invent your own ISA from scratch, or copy an old 8-bit ISA

As an interviewer for an embedded role, I'd be most impressed by someone reimplementing something like a 6502/8080 and getting something running on it. It should also be possible to find a toolchain that could generate code for those.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #25 on: October 10, 2020, 12:47:47 pm »
Quote
Would strongly recommend making your own RISC V design too

Have a look at my working-but-simple RV32I CPU if you want to get an idea of the size of the project you are thinking of undertaking:

https://github.com/hamsternz/Rudi-RV32I

It is nothing spectacular, but you can write programs in C and have them run at 50M+ instructions per sec.

I have been pawing through your github and the code is elegant!  I intend to rip it later today and at least try to stuff the bitfile in my BASYS 3 board.  In the meantime, I have been fooling around with the toolchain and it seems to me that I don't want any of the standard libraries.  It takes 20k bytes to do a printf("Hello World!\n");  I coded up a simple equivalent program using a fictitious puts function.  It's around 112 bytes.  Add a wee bit to flesh out the puts function plus some more code to initialize the UART if necessary.
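For reference, the sort of library-free output routine I have in mind looks roughly like this. It is a sketch with assumptions: the name uart_puts is made up, the transmit register is passed in as a pointer because I don't yet know the real address in this design, and real hardware would most likely need to poll a busy flag before each write.

```c
#include <stdint.h>

/*
 * Write a string one character at a time to a memory-mapped transmit register.
 * ASSUMPTION: storing a byte to *tx sends it. No status polling is done here.
 */
static void uart_puts(volatile uint32_t *tx, const char *s)
{
    while (*s) {
        *tx = (uint8_t)*s;   /* each store sends one character */
        s++;
    }
}
```

On the target, tx would be a constant cast from whatever address the UART is mapped at; on a PC it can point at an ordinary variable for testing.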

Does the following look reasonable?  I assume the output code file has to be stripped of ELF stuff and any debugging symbols and such.

Code:
# set up a few aliases to save typing - add them to .bash_aliases in the very near future

alias rcc='/opt/riscv32i/bin/riscv32-unknown-elf-gcc'
alias robjcopy='/opt/riscv32i/bin/riscv32-unknown-elf-objcopy'
alias robjdump='/opt/riscv32i/bin/riscv32-unknown-elf-objdump'

# compile the program and use a linker script to get the segments in the right place (when I know what 'right' is)
# leave out various libraries and try to avoid using them
# note: the optimization flag is -O3 with a capital O; a lowercase -o3 would name an output file "3"

rcc -nostartfiles -nostdlib -nodefaultlibs -O3 -T rputs.ld rputs.c -o rputs

# strip off all the ELF and symbols stuff to get a pure binary file
# (-R expects a section name, so it is dropped here; -S strips the symbols)

robjcopy -S -O binary rputs rputs.bin

I would convert the pure binary code to a hex file and slurp it up during the FPGA build process using more of your code.  Also excellent!
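The conversion itself is small. Here is a sketch in C of the per-word step, assuming the binary is little-endian (as RISC-V is) and that the memory wants one hex word per entry, most significant byte first. The function name is made up, and file handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Format one little-endian word of `width` bytes as lowercase hex,
 * most significant byte first. `out` must hold 2 * width + 1 chars.
 */
static void word_to_hex(const uint8_t *bytes, size_t width, char *out)
{
    static const char digits[] = "0123456789abcdef";
    assert(width > 0);

    for (size_t j = 0; j < width; j++) {
        uint8_t b = bytes[width - 1 - j];   /* walk from the last byte back */
        out[2 * j]     = digits[b >> 4];
        out[2 * j + 1] = digits[b & 0x0F];
    }
    out[2 * width] = '\0';
}
```

The four bytes 13 05 00 00 (the instruction addi a0, zero, 5 as stored in the .bin file) come out as the text 00000513.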

It's going to take a lot of work to understand what is going on with this CPU.  I'll probably head for the RISC-V documentation in the very near future.
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: softcore for learning purposes
« Reply #26 on: October 10, 2020, 02:17:37 pm »
I've written a simple program in C# (I work on Win10, so none of that python rubbish) which converts binary files into a format that can be stuffed into BRAMs (via the $readmemh command), as I couldn't find a way to get binutils to do it:
Code:
using System;
using System.IO;

namespace memfmt
{
    class Program
    {
        static void Main(string[] args)
        {
            var fileName = "test.bin";
            if (args.Length > 0)
                fileName = args[0];
            if (!File.Exists(fileName))
            {
                Console.WriteLine($"ERROR - file '{fileName}' is not found!");
                return;
            }
            var destFile = Path.GetFileNameWithoutExtension(fileName) + ".mem";
            if (args.Length > 1)
                destFile = args[1];
            var destWidth = 4;
            if (args.Length > 2 && int.TryParse(args[2], out int width))
            {
                destWidth = width;
            }
            using (var sw = new StreamWriter(destFile))
            {
                var currAddr = 0;
                var srcData = File.ReadAllBytes(fileName);
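                // Emit one hex word per token; trailing bytes that do not fill a whole word are skipped.
                for(var i = 0; i < srcData.Length / destWidth; i++)
                {
                    // The binary is little-endian, so print each word's bytes in reverse order (most significant first).
                    for (var j = destWidth - 1; j >= 0 ; j--)
                        sw.Write(srcData[i * destWidth + j].ToString("x2"));
                    sw.Write(" ");
                    // After every fourth word, end the line with a comment giving the address of the line's first word.
                    if (i % 4 == 3)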
                    {
                        sw.WriteLine("\t//0x" + (currAddr - destWidth * 3).ToString("x8"));
                    }
                    currAddr += destWidth;
                }
            }
        }
    }
}
This code supports both 32 and 64 bit memories, usage is: memfmt.exe <source_binary_file>.bin [<destination_memory_file>.mem [4|8]]
First argument - source binary file to be formatted, second - destination file (if omitted, it will default to <source_file_name_without_extension>.mem), third argument is a memory width, can be 4 or 8 for 32 or 64 bit, if omitted it will default to 4 (32bit).
I also put code and data into separate BRAMs as internally I have separate datapaths for code and data memories:
$(GCC_FOLDER)riscv-none-embed-objcopy.exe -O binary -R .text -R .rodata $(FIRMWARE).elf $(FIRMWARE).data.bin
$(GCC_FOLDER)riscv-none-embed-objcopy.exe -O binary -R .data -R .bss $(FIRMWARE).elf $(FIRMWARE).code.bin
« Last Edit: October 10, 2020, 02:19:56 pm by asmi »
 

Offline prophoss (Topic starter)

  • Contributor
  • Posts: 46
  • Country: us
Re: softcore for learning purposes
« Reply #27 on: October 10, 2020, 03:50:26 pm »
Oh my gosh, y'all have hijacked my thread! :-DD

Should have known this would happen. Well, of all the replies, I will start looking at RISC-V and maybe the 8080/6502. I don't want to just copy and paste, but see how to build each part. The older stuff probably leans more that way. As I dig into whatever I wind up doing, I will hopefully start to realize what questions really need to be asked. No matter what, I know it will take a while because I will be doing this between work and school and family. So small pieces will be helpful. Thanks for all the comments. Please feel free to add anything you feel might be relevant. Just remember I am a beginner, so keep the comments in layman's terms please.
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 20355
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: softcore for learning purposes
« Reply #28 on: October 10, 2020, 04:17:43 pm »
Quote
Oh my gosh, y'all have hijacked my thread! :-DD

Should have known this would happen. Well, of all the replies, I will start looking at RISC-V and maybe the 8080/6502. I don't want to just copy and paste, but see how to build each part. The older stuff probably leans more that way. As I dig into whatever I wind up doing, I will hopefully start to realize what questions really need to be asked. No matter what, I know it will take a while because I will be doing this between work and school and family. So small pieces will be helpful. Thanks for all the comments. Please feel free to add anything you feel might be relevant. Just remember I am a beginner, so keep the comments in layman's terms please.

You'll have fun whatever you do.

When interviewing new grads, it always looks good if they have done things they didn't need to do (because they liked it), chose realistic stretch goals, completed them, and know what they would do better with the benefit of hindsight.

The advantage of the 1970s processors is that they are simple (typically a few thousand transistors).
The advantage of something like RISC-V is that it is relevant now.

I have always been grateful that I started out when things were so simple that I could realistically understand everything about them, from transistors and gates, through registers and ISAs, to HLLs. Very few young people have an appreciation of all those things nowadays - and it shows :)
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #29 on: October 10, 2020, 05:56:58 pm »
Would strongly recommend making your own RISC V design too

Have a look at my working-but-simple RV32I CPU if you want to get an idea of the size of the project you are thinking of undertaking:

https://github.com/hamsternz/Rudi-RV32I

It is nothing spectacular, but you can write programs in C and have them run at 50M+ instructions per sec.

I have it working with a minor glitch.  Every time a key is pressed, a binary counter composed of the LEDs increases.  However, the characters are scrambled.  I tried 19200 (and 9600 and 115200) to no avail.

No doubt I have done something wrong in the setup.  I don't have a fast Ubuntu machine and Vivado won't install under Mint without a bunch of pushups.  I really wanted to try the script driven build but I wound up just creating a project with the Vivado IDE on Win10.

It will be a while before I know enough about the project to ask questions.
« Last Edit: October 10, 2020, 08:33:32 pm by rstofer »
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #30 on: October 10, 2020, 08:32:52 pm »
Let's revert to the OP's topic for a moment.  We'll get back to RV32I following this brief break...

If you are thinking about Z80, there is a core named 'T80' at OpenCores.org.  It is excellent!  I have used it to run CP/M and I currently have it on a Nexys 2 board running the PacMan arcade game.

Again, if you want to learn Z80, why not learn from the guy who created it?

https://www.amazon.com/Microprocessor-Design-Using-Verilog-HDL/dp/0963013351

The author uses Excel spreadsheets to lay out the timing of each instruction and ultimately write the Verilog to make it play.  The use of spreadsheets seems to be unique and a very clever approach to dealing with CISC architectures.

The 6502 is easier and there is at least one example at OpenCores.  I haven't used it.

For the ARM architecture, there is a book that provides SystemVerilog and VHDL for the project:

https://www.amazon.com/Digital-Design-Computer-Architecture-ARM/dp/0128000562

This book starts with number systems and works up through a complete CPU.  It deals with coding the various small modules that get combined to build a system.  Very detailed...  The author describes pipelining but due to complexity reasons, the CPU is a multi-cycle design.  Enough information is provided to work it out if you really want a pipelined processor.

There is an earlier book by the same authors that provides Verilog and VHDL for a MIPS CPU which, once again, starts from scratch and works through all of the component pieces on the way to a finished CPU and again, multi-cycle.

https://www.amazon.com/Digital-Design-Computer-Architecture-Harris/dp/0123944244

I'm not signing up for the "RISC-V is a great FIRST project".  Yes, the RISC-V is going to be a 'better' project just because of the gcc toolchain and the ISA.  But the question is whether it is a good learning project.  The reason I proposed the LC3 project is that it is complete in 1700 lines of code and there is enough documentation to truly understand what every single signal does.  Actually, there are only some 50-odd control signals.  And there's a book!  Books are good!  There's probably a reason that so many universities are using the LC3 project and I suspect it is because it can be built in a one-semester course with just a previous semester of logic design.  That, and there are other copies of the project all over the Internet.

I don't view the original question as "What's the best core?" but rather "What core is likely to be achievable by a beginner?".  Grabbing somebody else's work and trying to assimilate it just isn't as satisfying as building something yourself.

While I'm thinking about it, hamsternz's github is worth cloning just to learn how an expert writes code.

To my knowledge, there is no HDL code for the LC3.  The authors wanted to use microcode and that scheme doesn't require an FPGA (generally) because all the logic is in the microcode.  Pre-compiled, as it were.

So, for projects we have:  LC3, RISC-V, Z80, ARM and MIPS.  Pick one and have fun!  I'm not sure of RISC-V but the other 4 have books.

If you haven't checked out vhdlquiz.com, you should look at some of his free tutorials.  He has a couple of 'for pay' projects and both are excellent.  He also spends a LOT of time with simulation, a concept that wasn't available to me when I started.  My code went from keyboard to hardware in one giant stumble.

There are many sites with tutorials for SystemVerilog, Verilog or VHDL.  Pick one and get started.


 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: softcore for learning purposes
« Reply #31 on: October 10, 2020, 09:20:16 pm »
Yes, the RISC-V is going to be a 'better' project just because of the gcc toolchain and the ISA.  But the question is whether it is a good learning project.
Yes it is. Because it's very, very simple, and the instruction encoding was designed for easy implementation of decoding logic. So all you need for an RV32I implementation is a simple ALU that can do additions (subtraction is addition of the 2's complement of one of the arguments), shifts and bitwise operations. That's it. All of those operations can be synthesized using high-level HDL operators, and so do not require complicated logic in HDL.
Again, the simplest implementation is fetch-decode-execute-memory access-register writeback (decode can be combined with execution, and even with fetch, though for FPGAs I wouldn't recommend that as it will severely limit the frequency for no good reason), and so a 4-5 state FSM will get you going. Once you have that working, you can work on pipelining this core to get much higher throughput if you want.
Just take my word for it - take the specification, read the RV32I section only, then start from a clean sheet and try implementing an ALU which can do the following operations (below assuming Verilog/SystemVerilog, but I'm sure VHDL has equivalent operators too):
1. signed addition - operator "+", just make sure your arguments are signed.
2. signed subtraction - you can use addition with 2's complement, or operator "-". Again, make sure your arguments are treated as signed.
3. shift left - operator "<<" is all you need.
4. set less than - operator "<".
5. set less than unsigned - same as above, but cast operands to unsigned types.
6. XOR - operator "^".
7. shift right logical - operator ">>".
8. shift right arithmetic - operator ">>>" (the left operand must be signed, otherwise it behaves like ">>").
9. OR - operator "|".
10. AND - operator "&".
That's all you will need to implement a full set of RV32I. As you can see, there is zero black magic, and all of that code would be nearly identical if you'd write it in C.
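
For anyone who would rather poke at this in software before writing HDL, here is a minimal behavioural sketch of the ten operations above. It is Python rather than Verilog, and the names (alu, to_signed, the op strings) are made up for illustration, not taken from any existing core. The only RISC-V specific detail assumed is that shifts use the low 5 bits of the second operand.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op, a, b):
    """Return the 32-bit result of one RV32I-style ALU operation."""
    a &= MASK32
    b &= MASK32
    shamt = b & 0x1F  # shifts only look at the low 5 bits of b
    if op == "add":
        return (a + b) & MASK32
    if op == "sub":
        # subtraction as addition of the two's complement of b
        return (a + ((~b + 1) & MASK32)) & MASK32
    if op == "sll":
        return (a << shamt) & MASK32
    if op == "slt":
        return int(to_signed(a) < to_signed(b))
    if op == "sltu":
        return int(a < b)
    if op == "xor":
        return a ^ b
    if op == "srl":
        return a >> shamt
    if op == "sra":
        # arithmetic shift: shift the signed value, then wrap to 32 bits
        return (to_signed(a) >> shamt) & MASK32
    if op == "or":
        return a | b
    if op == "and":
        return a & b
    raise ValueError(f"unknown ALU op: {op}")
```

A model like this is also handy later as a reference when checking the HDL version in simulation.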

Once you have that, add a fetch unit which will load commands from the memory (again, very straightforward code like "command <= mem[current_instruction_pointer];").
For the decoding block just take a look at the encodings (there is a table of command encodings at the end of the spec PDF) and you will see that the logic is very simple - you will need to know how to access vector subranges and how to merge subranges into a single vector ("{}" in Verilog/SV). This will tell you the ALU operation you need to do (if any) and its operands (either directly, or indirectly via register indices). For control transfer commands, unconditional jumps are the easiest - you optionally write back a register (for jump-and-link commands), change the current_instruction_pointer and restart from cycle 1. For conditional jumps, I prefer to create a mini-ALU which performs condition checks ("==", "!=", "<", ">=" is all you need), and if the result is true - you do the exact same thing as with an unconditional jump.
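
To make the "subranges and concatenation" point concrete, here is a small Python sketch of the field extraction. The bit positions follow the RV32I base instruction formats in the spec; the helper names (bits, sign_extend, decode) and the dictionary keys are invented for this example. In Verilog each line would be a part-select or a "{...}" concatenation.

```python
def bits(word, hi, lo):
    """Return word[hi:lo] as an unsigned int, like a Verilog part-select."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend(value, width):
    """Interpret the low `width` bits of value as two's complement."""
    sign = 1 << (width - 1)
    return (value & (sign - 1)) - (value & sign)


def decode(inst):
    """Split a 32-bit RV32I instruction into register fields and immediates."""
    return {
        "opcode": bits(inst, 6, 0),
        "rd": bits(inst, 11, 7),
        "funct3": bits(inst, 14, 12),
        "rs1": bits(inst, 19, 15),
        "rs2": bits(inst, 24, 20),
        "funct7": bits(inst, 31, 25),
        # Each immediate is a concatenation of subranges of the instruction.
        "imm_i": sign_extend(bits(inst, 31, 20), 12),
        "imm_s": sign_extend((bits(inst, 31, 25) << 5) | bits(inst, 11, 7), 12),
        "imm_b": sign_extend(
            (bits(inst, 31, 31) << 12) | (bits(inst, 7, 7) << 11)
            | (bits(inst, 30, 25) << 5) | (bits(inst, 11, 8) << 1), 13),
        "imm_u": bits(inst, 31, 12) << 12,
        "imm_j": sign_extend(
            (bits(inst, 31, 31) << 20) | (bits(inst, 19, 12) << 12)
            | (bits(inst, 20, 20) << 11) | (bits(inst, 30, 21) << 1), 21),
    }
```

For example, decode(0xFFF00093), which is "addi x1, x0, -1", gives opcode 0x13, rd 1, rs1 0 and imm_i -1. All five immediates are computed for every instruction here; which one is actually used depends on the opcode.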
After that, implement memory access (again, very simple: "mem[addr] <= value;" for writes, or "reg_data <= mem[addr];" for reads; writes will require a clock cycle, which is why I placed memory access into a separate state). For non-memory related register writes, this section can be skipped entirely, though I'd recommend you implement it as a pass-through to make pipelining easier down the line.
And the final state is register writeback - "registers[reg_index] <= reg_data;".
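
The five states can be tried out in software first. Below is a rough Python sketch of such an FSM, one state per loop iteration (think "one clock"). It is deliberately tiny: it assumes only ADDI, LW and SW exist, ignores funct3, uses word-indexed Python lists as instruction and data memory, and every name in it (State, run, sext12, reg_data and so on) is hypothetical.

```python
from enum import Enum, auto

MASK32 = 0xFFFFFFFF
OP_LOAD, OP_IMM, OP_STORE = 0x03, 0x13, 0x23  # RV32I major opcodes (LW, ADDI, SW)


class State(Enum):
    FETCH = auto()
    DECODE = auto()
    EXECUTE = auto()
    MEMORY = auto()
    WRITEBACK = auto()


def sext12(value):
    """Sign-extend a 12-bit immediate."""
    return (value & 0x7FF) - (value & 0x800)


def run(program, data, max_cycles=1000):
    """Clock the FSM until it fetches past the program; return (regs, cycles)."""
    regs = [0] * 32
    pc = 0
    state = State.FETCH
    inst = opcode = rd = rs1 = rs2 = imm = alu_out = reg_data = 0
    for cycle in range(max_cycles):
        if state is State.FETCH:
            if pc // 4 >= len(program):
                return regs, cycle
            inst = program[pc // 4]
            state = State.DECODE
        elif state is State.DECODE:
            opcode = inst & 0x7F
            rd, rs1, rs2 = (inst >> 7) & 0x1F, (inst >> 15) & 0x1F, (inst >> 20) & 0x1F
            if opcode == OP_STORE:  # S-type: imm[4:0] sits where rd normally is
                imm = sext12(((inst >> 25) << 5) | rd)
            else:                   # I-type: immediate is inst[31:20]
                imm = sext12(inst >> 20)
            state = State.EXECUTE
        elif state is State.EXECUTE:
            # One adder serves both ADDI results and LW/SW addresses.
            alu_out = (regs[rs1] + imm) & MASK32
            state = State.MEMORY
        elif state is State.MEMORY:
            if opcode == OP_LOAD:
                reg_data = data[alu_out // 4]
            elif opcode == OP_STORE:
                data[alu_out // 4] = regs[rs2]
            else:
                reg_data = alu_out  # pass-through for non-memory instructions
            state = State.WRITEBACK
        else:  # State.WRITEBACK
            if opcode != OP_STORE and rd != 0:  # x0 stays hard-wired to zero
                regs[rd] = reg_data
            pc += 4
            state = State.FETCH
    raise RuntimeError("cycle limit reached")
```

Running it on the four words 0x00500093, 0x00102023, 0x00002103, 0xFFF10113 (addi x1,x0,5; sw x1,0(x0); lw x2,0(x0); addi x2,x2,-1) takes five iterations per instruction, which is exactly the throughput cost that pipelining later removes.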

So as you can see, there is nothing very complicated here. I highly recommend you actually do all of the above yourself first, without looking at others' code. It won't take much time, trust me.

Offline prophossTopic starter

  • Contributor
  • Posts: 46
  • Country: us
Re: softcore for learning purposes
« Reply #32 on: October 10, 2020, 09:24:06 pm »
I have been looking at a few different cores that I was interested in. To be honest I am not sure what I'm looking at most of the time. I looked at Hamster's code and finally found some .vhd stuff that I recognized, but most of it I had no clue what it was for. I looked at the 6502 and 8080 stuff and that seemed a bit more accessible, but I had the same issue when I went to the github repo. I looked at the books suggested by rstofer and they were very expensive, but worth it I am sure. Unfortunately not an option right now, but I will keep them in the back of my mind. The LC-3 seems very simple and practical, but where would I begin? All of the different cores seem to have memory, ALU, registers, and so on, but how much and what kind changes. I am trying to break this down into something I can put into a list that I can start checking off. The books seem the best idea for that but.... What approach would you take as you began putting any one of these together?
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: softcore for learning purposes
« Reply #33 on: October 10, 2020, 09:41:04 pm »
What approach would you take as you began putting any one of these together?
I think the best way to learn is to write it all yourself from the ground up. Make sure you're familiar with your HDL of choice, then follow the plan I outlined above. This will give you the basic core, and you can work on improving it later on.

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 20355
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: softcore for learning purposes
« Reply #34 on: October 10, 2020, 10:08:28 pm »
I have been looking at a few different cores that I was interested in. To be honest I am not sure what I'm looking at most of the time. I looked at Hamster's code and finally found some .vhd stuff that I recognized, but most of it I had no clue what it was for. I looked at the 6502 and 8080 stuff and that seemed a bit more accessible, but I had the same issue when I went to the github repo. I looked at the books suggested by rstofer and they were very expensive, but worth it I am sure. Unfortunately not an option right now, but I will keep them in the back of my mind. The LC-3 seems very simple and practical, but where would I begin? All of the different cores seem to have memory, ALU, registers, and so on, but how much and what kind changes. I am trying to break this down into something I can put into a list that I can start checking off. The books seem the best idea for that but.... What approach would you take as you began putting any one of these together?

Get the original datasheets which show the internal block diagram components and interconnections, plus the instruction set and instruction encoding. Ignore the timings.

Make sure you understand which functionality is purely combinatorial and which is multiplexer based, and which has registers. Understand finite state machines, and use one or more FSMs to control the registers and multiplexers etc.

Do not try to duplicate the gate-level implementation. Do understand each block's functionality, and code that behaviourally in the HDL. Then use the HDL to structurally compose the blocks.

Do have decent test suites, so that you can prove a block works (and continues to work after you make a small change).
« Last Edit: October 10, 2020, 10:14:43 pm by tggzzz »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #35 on: October 10, 2020, 10:45:56 pm »
So, for projects we have:  LC3, RISC-V, Z80, ARM and MIPS.  Pick one and have fun!  I'm not sure of RISC-V but the other 4 have books.

Er ... does Patterson and Hennessy "Computer Organization and Design: The Hardware Software Interface" not count somehow?

https://www.amazon.com/Computer-Organization-Design-RISC-V-Architecture/dp/0128122757
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #36 on: October 10, 2020, 10:55:43 pm »
Again, the simplest implementation is fetch-decode-execute-memory access-register writeback (decode can be combined with execution, and even with fetch, though for FPGAs I wouldn't recommend that as it will severely limit the frequency for no good reason), and so 4-5 states FSM will get you going. Once you have that working, you can work on pipelining this core to get much higher throughput if you want.

If you have instructions and data in separate RAM blocks (or ROM and RAM) then you can do RISC-V with everything in a single pipe stage: PC and register contents input at one end, ripple asynch through decode, operand select, ALU, memory read or write, present new PC and register contents at the output. It'll run at 10s of MHz at least.
 
The following users thanked this post: Someone

Online asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: softcore for learning purposes
« Reply #37 on: October 10, 2020, 11:29:38 pm »
If you have instructions and data in separate RAM blocks (or ROM and RAM) then you can do RISC-V with everything in a single pipe stage: PC and register contents input at one end, ripple asynch through decode, operand select, ALU, memory read or write, present new PC and register contents at the output. It'll run at 10s of MHz at least.
I recommend a multi-cycle implementation because it will be easier to pipeline: you will already have all of your internal state registers, so you will only need to add pipeline registers for control signals (and here you can do a half-step by pushing as many of your control signals into the pipeline as you can), and you will need to add some code to deal with data and control hazards. Here even a very naïve implementation, which forces a stall each time it encounters a hazard, will on average have much more throughput than a multicycle one. And you can progressively refine this implementation to optimize for whatever metric you like, be it frequency, power or resource usage.

Online chris_leyson

  • Super Contributor
  • ***
  • Posts: 1544
  • Country: wales
Re: softcore for learning purposes
« Reply #38 on: October 11, 2020, 10:40:49 am »
The reason I mentioned Picoblaze is because it is a very small and useful 8-bit core and it is very well documented. It's a good example of a very compact fetch/execute architecture. KCPSM3 uses about 120 CLBs on a Spartan-3 and the later KCPSM6 uses 60 or so CLBs on a Spartan-6. A CLB, or configurable logic block, is more or less the same as Altera's LE or logic element.

Quote
The 6502 is easier and there is at least one example at OpenCores.  I haven't used it.
I've tried one version of the 6502 from Open Cores and on a Spartan3 it used 3500 CLBs !! Over 90% of the logic is used for instruction decoding and that is a good example of how not to write a softcore.

I've tried my own version of the 6502 by using two block RAMs to expand each 8-bit instruction into a 36-bit control word. 80% or 90% of the instructions worked, so not quite there yet, but the design used fewer than 300 CLBs and two block RAMs, a 10X improvement over the Open Cores version. Maybe three block RAMs and a 54-bit control word would have worked. The 6502 instruction decoder used a 21x130 bit decode ROM. 6502 block diagram attached.

There is also the 8-bit Gumnut core described in "The Designer's Guide to VHDL" by Peter Ashenden and there are free PDF copies out there.

If you get the instruction decoding wrong you end up with the logic equivalent of bloatware, the Open Cores 6502 being just one example; there are others. ALUs, whether they are 8, 16 or 32 bit, take up very little logic, as does multiplexing for the data paths. Instruction decoding and the finite state machine that drives the core take a lot of work to get right. tggzzz outlined a good work flow: understand finite state machines and take a look at some block diagrams from older designs.
« Last Edit: October 11, 2020, 10:47:42 am by chris_leyson »
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #39 on: October 11, 2020, 04:20:51 pm »
So, for projects we have:  LC3, RISC-V, Z80, ARM and MIPS.  Pick one and have fun!  I'm not sure of RISC-V but the other 4 have books.

Er ... does Patterson and Hennessy "Computer Organization and Design: The Hardware Software Interface" not count somehow?

https://www.amazon.com/Computer-Organization-Design-RISC-V-Architecture/dp/0128122757

Of course it would but I can't recommend what I don't have.  Each of the books I linked above is on my shelf.

My copy of Patterson and Hennessy is "Computer Organization And Design" 3rd edition, published in 2005 (India printing), and it covers the MIPS design with just a wee bit of HDL.  It is a great reference on computer architecture but not on how to create a core with HDL.

I see they now have an ARM and a RISC-V version.  It would be interesting to see what they have to say about the RISC-V since that is the current topic.  Now the question is:  Wait for the announced 2d Edition (with no release date at Amazon) or buy what will instantly be obsolete?

I wish I had the briefest notion of when the 2d edition was being released.

ETA:  I went to the Morgan Kaufmann site and they're showing Jan 1, 2021 as the release date.  I don't know if that is a commitment or just pointing out that it won't happen in the next couple of months.

I can wait...
« Last Edit: October 11, 2020, 04:32:48 pm by rstofer »
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #40 on: October 11, 2020, 04:59:52 pm »
Quote
The 6502 is easier and there is at least one example at OpenCores.  I haven't used it.
I've tried one version of the 6502 from Open Cores and on a Spartan3 it used 3500 CLBs !! Over 90% of the logic is used for instruction decoding and that is a good example of how not to write a softcore.

It's the nature of CISC architectures to have most of the logic in decoding.  The resources are limited, there are generally only a very few registers, but the complex addressing schemes chew up a lot of logic.

For elegance in code, hamsternz's RISC project is excellent.  The entire core uses less than 6% of the LUTs on an Artix 7 35T which is a fairly small chip considering I tend toward the 100T variants.  The code has some alternative bits where resources are reduced by a different coding style, and the savings are substantial.  It also uses just 4% of the BlockRAM (memory size is defined to be quite small) so there is room to expand it.

The important thing is that the core can be used with a small, less expensive, board.  I haven't tried it but apparently it will work with a CMOD chip.  This allows the FPGA board to be plugged into a daughter card with the various peripherals.  Nice!
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #41 on: October 11, 2020, 10:25:08 pm »
If you want a really small RISC-V core, there is Olof Kindgren's "SeRV", a bit-serial implementation of RV32I. Most instructions take 32 clock cycles, but jumps, load/store, SLT/SLTU take 64 and shifts take between 32 and 64.
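
To see where the 32 clocks come from, here is a generic Python sketch of bit-serial addition: a single full adder plus a one-bit carry register, fed the operands LSB first. This only illustrates the general idea and is not taken from SERV's source; the function name and the clock counter are made up.

```python
def serial_add(a, b, width=32):
    """Add two width-bit values one bit per clock; return (sum, clocks)."""
    carry = 0      # the single carry flip-flop
    result = 0
    clocks = 0
    for i in range(width):              # LSB first, one bit per clock
        abit = (a >> i) & 1
        bbit = (b >> i) & 1
        total = abit + bbit + carry     # one full adder
        result |= (total & 1) << i      # sum bit shifts into the result
        carry = total >> 1              # carry is kept for the next clock
        clocks += 1
    return result, clocks
```

The datapath logic shrinks to roughly one bit's worth, and the price is one clock per bit of the word.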

https://github.com/olofk/serv

He keeps making it smaller, but as at August 9th Olof tweeted: "255 LUT and 225 FF on iCE40 with the standard config that support interrupts and a few CSR. The minimal RV32I only config is approximately 50 LUT and 20 FF less than that". As at May that was 167 LUT and 224 FF on Artix-7, but I think it will be a little smaller than that now as the same slide shows 266 LUT and 227 FF on iCE40.

https://diode.zone/videos/watch/0230a518-e207-4cf6-b5e2-69cc09411013
 

Offline 0db

  • Frequent Contributor
  • **
  • Posts: 336
  • Country: zm
Re: softcore for learning purposes
« Reply #42 on: October 13, 2020, 12:19:47 pm »
Wait for the announced 2d Edition (with no release date at Amazon) or buy what will instantly be obsolete?

Wait for it.
 

Offline 0db

  • Frequent Contributor
  • **
  • Posts: 336
  • Country: zm
Re: softcore for learning purposes
« Reply #43 on: October 13, 2020, 01:59:42 pm »
LC3 lacks some features. Why not add them? Document them, test them, improve them.
It seems a good approach.
There are too many people who only download stuff, without sending anything back.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #44 on: October 13, 2020, 05:43:50 pm »
LC3 lacks some features. Why not add them? Document them, test them, improve them.
It seems a good approach.
There are too many people who only download stuff, without sending anything back.

LC3 doesn't have opcode space to add more. There is only one free major opcode.

That's one of the main reasons the guys at Berkeley developed RISC-V instead of trying to enhance OpenRISC or MIPS or SPARC or ARM.

(The other version is licences weren't available at a reasonable cost or at all)
 

Offline 0db

  • Frequent Contributor
  • **
  • Posts: 336
  • Country: zm
Re: softcore for learning purposes
« Reply #45 on: October 13, 2020, 06:20:13 pm »
LC3 doesn't have opcode space to add more. There is only one free major opcode.

So make the opcode 32 bits long, or rearrange it without messing too much with the data path, and you have more space for new opcodes. This requires adapting the machine layer of LCC, but it will teach a lot.

That's one of the main reasons the guys at Berkeley developed RISC-V instead of trying to enhance OpenRISC or MIPS or SPARC or ARM.

  • MIPS? You cannot touch it without receiving a legal letter (especially with the "nano MIPS"). Nasty guys there. In fact, even now that big companies like SGI are finally dead, they prefer to keep all their deprecated CPUs secret and forget them: no matter how much you wish for it, you won't see anything about MIPS-5 (R12K, R14K, R16K). They literally have an underground room where they only care about stacking their super secret paper datasheets to keep them hidden from the public. Year after year their precious paper ages even more to the color of a yellowish funeral, while the ink becomes a bit more unreadable, but that's what they want rather than releasing the docs or paying someone to make a digital copy to be shared on the internet.
  • SPARC? Far too complex, and the register windows are ... a bad idea
  • ARM? ... stands for "Advanced RISC Machine", and the "advanced" term means too much complexity for a "single person" project
  • OpenRISC? ... well, it's like the HURD project. They started with something simple, then their ego made it so hyper-complex that two devs cannot talk about the same detail without feeling lost in space and time

There is also an alternative: ijvm, the didactic stack machine invented by Andrew Stuart Tanenbaum. It's not RISCy by any means, it's intended to teach, but it's very interesting, and simple enough for FPGA and home-made compilers.
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 20355
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: softcore for learning purposes
« Reply #46 on: October 13, 2020, 06:44:49 pm »
  • ARM? ... stands for "Advanced RISC Machine", and the "advanced" term means too much complexity for a "single person" project

The first ARM took two people (Roger Wilson, Steve Furber) 6 man years. But that included inventing the concept, the instruction set, the custom semiconductor implementation.

It is reasonable to consider an ARM 1 processor as a single person project.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #47 on: October 13, 2020, 07:40:28 pm »
LC3 doesn't have opcode space to add more. There is only one free major opcode.

So make the opcode 32 bits long, or rearrange it without messing too much with the data path, and you have more space for new opcodes. This requires adapting the machine layer of LCC, but it will teach a lot.

Sure. Or just start from scratch. But either way you also have to start from scratch with the software.

Quote
That's one of the main reasons the guys at Berkeley developed RISC-V instead of trying to enhance OpenRISC or MIPS or SPARC or ARM.

  • MIPS? You cannot touch it without receiving a legal letter (especially with the "nano MIPS"). Nasty guys there. In fact, even now that big companies like SGI are finally dead, they prefer to keep all their deprecated CPUs secret and forget them: no matter how much you wish for it, you won't see anything about MIPS-5 (R12K, R14K, R16K). They literally have an underground room where they only care about stacking their super secret paper datasheets to keep them hidden from the public. Year after year their precious paper ages even more to the color of a yellowish funeral, while the ink becomes a bit more unreadable, but that's what they want rather than releasing the docs or paying someone to make a digital copy to be shared on the internet.
  • SPARC? Far too complex, and the register windows are ... a bad idea
  • ARM? ... stands for "Advanced RISC Machine", and the "advanced" term means too much complexity for a "single person" project
  • OpenRISC? ... well, it's like the HURD project. They started with something simple, then their ego made it so hyper-complex that two devs cannot talk about the same detail without feeling lost in space and time

All correct, more or less. Which is why it's so good they actually did RISC-V, and did it pretty well too. There are some things I think they got wrong, and I've debated them with Andrew and Krste and they even in some cases agree they could have been done better, but nothing major enough to be worth an incompatible change at this point (or even three years ago when I raised them).

NanoMIPS, incidentally, looks really nice, but it was dead on arrival. There is one 32 bit chip using it, apparently exclusive to MediaTek. The gcc patches to support NanoMIPS were sent to the gcc mailing list the day before the entire compiler team was fired, and have never been merged into upstream gcc, and probably never will be.

Quote
There is also an alternative: ijvm, the didactic stack machine invented by Andrew Stuart Tanenbaum. It's not RISCy by any means, it's intended to teach, but it's very interesting, and simple enough for FPGA and home-made compilers.

Anything with INVOKEVIRTUAL and NEWARRAY opcodes is not going to be a great target for home made FPGA CPUs. If nothing else, you're going to need a garbage collector in hardware -- or a way for NEWARRAY to trap to a software garbage collector ... but the instruction set doesn't have primitives to allow you to implement one.

Also the only boolean operations provided are AND and OR, which is very limiting. I'm not clear from the spec whether those are boolean or bitwise i.e. && and || or & and | in C terms.  And there are no shifts.

As I demonstrated in a previous post in this thread about LC3, if you have ADD and AND and a test for equality to zero then you can fake up shifts and other boolean operations using a loop and some IFs, but ick.
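
For readers who don't want to dig back through the thread, this is roughly what that looks like, written as a Python model of a 16-bit machine. It is a fresh sketch for illustration and not a copy of the code in the earlier post; the helper names are invented, and only add, and_ and a test against zero are used on data values (the Python loop counter stands in for a register counted down with ADD and a branch).

```python
MASK16 = 0xFFFF


def add(a, b):
    """16-bit ADD, wrapping like the hardware would."""
    return (a + b) & MASK16


def and_(a, b):
    """16-bit bitwise AND."""
    return a & b


def shl1(x):
    """Shift left by one: just add the value to itself."""
    return add(x, x)


def shr1(x):
    """Logical shift right by one, built from ADD, AND and zero tests."""
    result = 0
    src_bit = 2   # mask that tests bit 1 of x
    dst_bit = 1   # mask that sets bit 0 of the result
    for _ in range(15):                     # bits 1..15 move down one place
        if and_(x, src_bit) != 0:
            result = add(result, dst_bit)
        dst_bit = src_bit
        src_bit = add(src_bit, src_bit)     # next mask by doubling
    return result
```

A single right shift costs a 15-iteration loop here, which is the "ick".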


An instruction set that I think *would* be very interesting to do as an exercise is the integer core of Transputer. The opcodes are very simple and it has the interesting property that you can make it have any register width and address space size you want -- 8 bits, 16 bits, 32 bits, 64 bits or anything in between. As with WASM larger values are built up incrementally, 4 bits at a time in the case of Transputer. Transputer uses a stack but it's a fixed 4 elements in size, and the compiler (which is available) manages things around that.
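
Here is a rough Python sketch of that operand-building scheme, as I understand it: every instruction is one byte, a 4-bit function code plus 4 bits of data, and prefix instructions accumulate larger operands in an operand register. The numeric function codes used below (pfix = 0x2, nfix = 0x6, ldc = 0x4) are from memory and should be checked against the instruction set manual before being relied on; the function name and the word_bits parameter are made up.

```python
PFIX, NFIX = 0x2, 0x6   # assumed function codes, verify against the manual


def build_operands(code, word_bits=32):
    """Return (function, operand) for each non-prefix instruction byte."""
    mask = (1 << word_bits) - 1
    oreg = 0                      # operand register
    out = []
    for byte in code:
        fn, data = byte >> 4, byte & 0xF
        oreg |= data              # low nibble always lands in the operand
        if fn == PFIX:
            oreg = (oreg << 4) & mask       # make room for the next nibble
        elif fn == NFIX:
            oreg = (~oreg << 4) & mask      # complement first: negative values
        else:
            out.append((fn, oreg))          # real instruction consumes operand
            oreg = 0
    return out
```

With these assumptions the bytes 0x21 0x22 0x43 load the constant 0x123, and 0x60 0x4F load -1. Changing word_bits is all it takes to model a 16-bit or 64-bit machine, which is the property that makes the encoding width-independent.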
« Last Edit: October 13, 2020, 07:45:10 pm by brucehoult »
 

Offline 0db

  • Frequent Contributor
  • **
  • Posts: 336
  • Country: zm
Re: softcore for learning purposes
« Reply #48 on: October 13, 2020, 08:30:34 pm »
Anything with INVOKEVIRTUAL and NEWARRAY opcodes is not going to be a great target for home made FPGA CPUs.

I played a lot with it during my examinations. I somehow liked it. It's not "picojava" it's "ijvm". Invokevirtual is actually a simple (CISCish) "jsr". Nothing special, nothing complex.

Also the only boolean operations provided are AND and OR, which is very limiting. I'm not clear from the spec whether those are boolean or bitwise i.e. && and || or & and | in C terms.  And there are no shifts.

Ijvm goes from the simplest and minimal "MIC1" to the complex "MIC5", which is technically "implementation" stuff, but starting from MIC5 there is also space for ISA revisions, and since there is space for new opcodes, you can also implement "bitwise" and "boolean" "operators" ... "shift", "rotate", "bit testing", everything, on the stack primitives.

Andrew Stuart Tanenbaum has documented it in his book, and if someone needs book support for a CPU, well, that can be interesting. It was all the hype more than ten years ago, when Java was the new great hit.
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #49 on: October 13, 2020, 08:38:35 pm »
LC3 doesn't have opcode space to add more. There is only one free major opcode.
Subtraction can be handled by taking the 2s complement and adding.  Slide 5-10 on page 3 here:

https://www.cis.upenn.edu/~milom/cse240-Fall05/handouts/Ch05.pdf

In the case of an immediate subtraction, just encode the 2s complement from the start and emit an ADD instruction.
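
A small Python sketch of both cases, assuming the standard LC-3 ADD-immediate layout (opcode 0001, DR in bits 11:9, SR1 in bits 8:6, bit 5 set, 5-bit signed immediate in bits 4:0). The function names are made up, and "sub" here is only a model of what a NOT, ADD #1, ADD sequence computes.

```python
MASK16 = 0xFFFF


def sub(a, b):
    """a - b on a 16-bit machine as a + NOT(b) + 1."""
    return (a + ((~b) & MASK16) + 1) & MASK16


def encode_add_imm(dr, sr1, imm):
    """Encode LC-3 'ADD DR, SR1, #imm'; imm5 is signed, -16..15."""
    if not -16 <= imm <= 15:
        raise ValueError("immediate does not fit in 5 bits")
    return (0b0001 << 12) | (dr << 9) | (sr1 << 6) | (1 << 5) | (imm & 0x1F)


def encode_sub_imm(dr, sr1, imm):
    """A pseudo 'SUB DR, SR1, #imm': the assembler negates and emits ADD."""
    return encode_add_imm(dr, sr1, -imm)
```

So a hypothetical "SUB R1, R1, #3" assembles to the same word as "ADD R1, R1, #-3", and no opcode is spent on it.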

I would use the last opcode to do the shifts.  The count could be either immediate or from a register; there is enough room in the instruction to put the immediate count in SR1 and one bit of SR2 while using the last 2 bits of SR2 to specify type and direction.  We could encode arithmetic and logical shifts of up to 16 bits.  Ugly!  I haven't drawn this out, so I don't actually know that it is feasible.

 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #50 on: October 13, 2020, 08:52:26 pm »
LC3 doesn't have opcode space to add more. There is only one free major opcode.
Subtraction can be handled by taking the 2s complement and adding.  Slide 5-10 on page 3 here:

https://www.cis.upenn.edu/~milom/cse240-Fall05/handouts/Ch05.pdf

Sure. I covered synthesizing SUB, OR, XOR from first principles back in...

https://www.eevblog.com/forum/fpga/softcore-for-learning-purposes/msg3270124/#msg3270124

I showed how to fake up shifts a few posts later.
 

