Author Topic: softcore for learning purposes  (Read 7465 times)


Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #25 on: October 10, 2020, 12:47:47 pm »
Would strongly recommend making your own RISC V design too

Have a look at my working-but-simple RV32I CPU if you want to get an idea of the size of the project you are thinking of undertaking:

https://github.com/hamsternz/Rudi-RV32I

It is nothing spectacular, but you can write programs in C and have them run at 50M+ instructions per sec.

I have been pawing through your github and the code is elegant!  I intend to rip it later today and at least try to stuff the bitfile in my BASYS 3 board.  In the meantime, I have been fooling around with the toolchain and it seems to me that I don't want any of the standard libraries.  It takes 20k bytes to do a printf("Hello World!\n");  I coded up a simple equivalent program using a fictitious puts function.  It's around 112 bytes.  Add a wee bit to flesh out the puts function plus some more code to initialize the UART if necessary.

Does the following look reasonable?  I assume the output code file has to be stripped of ELF stuff and any debugging symbols and such.

Code:
# set up a few aliases to save typing - add them to .bash_aliases in the very near future

alias rcc='/opt/riscv32i/bin/riscv32-unknown-elf-gcc'
alias robjcopy='/opt/riscv32i/bin/riscv32-unknown-elf-objcopy'
alias robjdump='/opt/riscv32i/bin/riscv32-unknown-elf-objdump'

# compile the program and use a linker script to get the segments in the right place (when I know what 'right' is)
# leave out various libraries and try to avoid using them
# note: the optimization flag is -O3 (capital O); a lowercase -o3 would name the output file "3"

rcc -nostartfiles -nostdlib -nodefaultlibs -O3 -T rputs.ld rputs.c -o rputs

# strip off all the ELF and symbols stuff to get a pure binary file
# (-R expects a section name, so it is left out; -S strips the symbols)

robjcopy -S -O binary rputs rputs.bin

I would convert the pure binary code to a hex file and slurp it up during the FPGA build process using more of your code.  Also excellent!

It's going to take a lot of work to understand what is going on with this CPU.  I'll probably head for the RISC-V documentation in the very near future.
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: softcore for learning purposes
« Reply #26 on: October 10, 2020, 02:17:37 pm »
I've written a simple program in C# (I work on Win10, so none of that python rubbish) which formats binary files to the format which can be stuffed into BRAMs (via $readmemh command) as I couldn't find a way to get binutils to do it:
Code:
using System;
using System.IO;

namespace memfmt
{
    class Program
    {
        static void Main(string[] args)
        {
            var fileName = "test.bin";
            if (args.Length > 0)
                fileName = args[0];
            if (!File.Exists(fileName))
            {
                Console.WriteLine($"ERROR - file '{fileName}' is not found!");
                return;
            }
            var destFile = Path.GetFileNameWithoutExtension(fileName) + ".mem";
            if (args.Length > 1)
                destFile = args[1];
            var destWidth = 4;
            if (args.Length > 2 && int.TryParse(args[2], out int width))
            {
                destWidth = width;
            }
            using (var sw = new StreamWriter(destFile))
            {
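                var currAddr = 0;
                var srcData = File.ReadAllBytes(fileName);
                // Round up so a trailing partial word is zero-padded rather than silently dropped
                var wordCount = (srcData.Length + destWidth - 1) / destWidth;
                for (var i = 0; i < wordCount; i++)
                {
                    // Emit each word most-significant byte first (the input is little-endian)
                    for (var j = destWidth - 1; j >= 0; j--)
                    {
                        var idx = i * destWidth + j;
                        sw.Write((idx < srcData.Length ? srcData[idx] : (byte)0).ToString("x2"));
                    }
                    sw.Write(" ");
                    if (i % 4 == 3)
                    {
                        // Four words per line, followed by the byte address of the first word
                        sw.WriteLine("\t//0x" + (currAddr - destWidth * 3).ToString("x8"));
                    }
                    currAddr += destWidth;
                }
                // Terminate a final line that holds fewer than four words
                if (wordCount % 4 != 0)
                    sw.WriteLine();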
            }
        }
    }
}
This code supports both 32 and 64 bit memories, usage is: memfmt.exe <source_binary_file>.bin [<destination_memory_file>.mem [4|8]]
First argument - source binary file to be formatted, second - destination file (if omitted, it will default to <source_file_name_without_extension>.mem), third argument is a memory width, can be 4 or 8 for 32 or 64 bit, if omitted it will default to 4 (32bit).
I also put code and data into separate BRAMs as internally I have separate datapaths for code and data memories:
$(GCC_FOLDER)riscv-none-embed-objcopy.exe -O binary -R .text -R .rodata $(FIRMWARE).elf $(FIRMWARE).data.bin
$(GCC_FOLDER)riscv-none-embed-objcopy.exe -O binary -R .data -R .bss $(FIRMWARE).elf $(FIRMWARE).code.bin
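For anyone not on Windows, roughly the same conversion can be sketched in a few lines of Python. This is only an illustration of the output format described above, not a port of the C# program: the function name bin_to_mem, the words_per_line parameter and the zero-padding of a trailing partial word are assumptions made for the sketch.

```python
def bin_to_mem(data: bytes, width: int = 4, words_per_line: int = 4) -> str:
    """Format raw little-endian bytes as hex words for Verilog's $readmemh.

    Hypothetical helper: each output word is `width` bytes, printed
    most-significant byte first. A trailing partial word is zero-padded
    (an assumption made for this sketch).
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Pad to a whole number of words so no trailing bytes are lost.
    padded = data + bytes(-len(data) % width)
    words = [
        padded[i:i + width][::-1].hex()  # reverse: little-endian bytes -> MSB-first text
        for i in range(0, len(padded), width)
    ]
    lines = []
    for n in range(0, len(words), words_per_line):
        addr = n * width  # byte address of the first word on this line
        lines.append(" ".join(words[n:n + words_per_line]) + f" // 0x{addr:08x}")
    return "\n".join(lines) + ("\n" if lines else "")
```

To use it, read the objcopy output with open(path, "rb").read(), pass the bytes to bin_to_mem, and write the returned text to the .mem file that $readmemh loads. The address comments are only there for humans; $readmemh skips them.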
« Last Edit: October 10, 2020, 02:19:56 pm by asmi »
 

Offline prophossTopic starter

  • Contributor
  • Posts: 46
  • Country: us
Re: softcore for learning purposes
« Reply #27 on: October 10, 2020, 03:50:26 pm »
Oh My gosh y'all have hijacked my thread! :-DD

Should have known this would happen. Well, of all the replies, I will start looking at RISC-V and maybe the 8080/6502. I don't want to just copy and paste, but see how to build each part. The older stuff probably leans more that way. As I dig into whatever I wind up doing I will hopefully start to realize what questions really need to be asked. No matter what, I know it will take a while because I will be doing this between work and school and family. So small pieces will be helpful. Thanks for all the comments. Please feel free to add anything you feel might be relevant. Just remember I am a beginner, so keep the comments in layman's terms please.
 

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 20355
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: softcore for learning purposes
« Reply #28 on: October 10, 2020, 04:17:43 pm »
Oh My gosh y'all have hijacked my thread! :-DD

Should have known this would happen. Well, of all the replies, I will start looking at RISC-V and maybe the 8080/6502. I don't want to just copy and paste, but see how to build each part. The older stuff probably leans more that way. As I dig into whatever I wind up doing I will hopefully start to realize what questions really need to be asked. No matter what, I know it will take a while because I will be doing this between work and school and family. So small pieces will be helpful. Thanks for all the comments. Please feel free to add anything you feel might be relevant. Just remember I am a beginner, so keep the comments in layman's terms please.

You'll have fun whatever you do.

When interviewing new grads, it always looks good if they have done things they didn't need to do (because they liked it), chose realistic stretch goals, completed it, and know what they would do better with the benefit of hindsight.

The advantage of the 1970s processors is that they are simple (typically a few thousand transistors).
The advantage of something like RISC-V is that it is relevant now.

I have always been grateful that I started out when things were so simple that I could realistically understand everything about them, from transistors and gates, through registers and ISAs, to HLLs. Very few young people have an appreciation of all those things nowadays - and it shows :)
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #29 on: October 10, 2020, 05:56:58 pm »
Would strongly recommend making your own RISC V design too

Have a look at my working-but-simple RV32I CPU if you want to get an idea of the size of the project you are thinking of undertaking:

https://github.com/hamsternz/Rudi-RV32I

It is nothing spectacular, but you can write programs in C and have them run at 50M+ instructions per sec.

I have it working with a minor glitch.  Every time a key is pressed, a binary counter composed of the LEDs increases.  However, the characters are scrambled.  I tried 19200 (and 9600 and 115200) to no avail.

No doubt I have done something wrong in the setup.  I don't have a fast Ubuntu machine and Vivado won't install under Mint without a bunch of pushups.  I really wanted to try the script driven build but I wound up just creating a project with the Vivado IDE on Win10.

It will be a while before I know enough about the project to ask questions.
« Last Edit: October 10, 2020, 08:33:32 pm by rstofer »
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #30 on: October 10, 2020, 08:32:52 pm »
Let's revert to the OP's topic for a moment.  We'll get back to RV32I following this brief break...

If you are thinking about Z80, there is a core named 'T80' at OpenCores.org.  It is excellent!  I have used it to run CP/M and I currently have it on a Nexys 2 board running the PacMan arcade game.

Again, if you want to learn Z80, why not learn from the guy who created it?

https://www.amazon.com/Microprocessor-Design-Using-Verilog-HDL/dp/0963013351

The author uses Excel spreadsheets to lay out the timing of each instruction and ultimately write the Verilog to make it play.  The use of spreadsheets seems to be unique and a very clever approach to dealing with CISC architectures.

The 6502 is easier and there is at least one example at OpenCores.  I haven't used it.

For the ARM architecture, there is a book that provides SystemVerilog and VHDL for the project:

https://www.amazon.com/Digital-Design-Computer-Architecture-ARM/dp/0128000562

This book starts with number systems and works up through a complete CPU.  It deals with coding the various small modules that get combined to build a system.  Very detailed...  The author describes pipelining but due to complexity reasons, the CPU is a multi-cycle design.  Enough information is provided to work it out if you really want a pipelined processor.

There is an earlier book by the same authors that provides Verilog and VHDL for a MIPS CPU which, once again, starts from scratch and works through all of the component pieces on the way to a finished CPU and again, multi-cycle.

https://www.amazon.com/Digital-Design-Computer-Architecture-Harris/dp/0123944244

I'm not signing up for the "RISC-V is a great FIRST project".  Yes, the RISC-V is going to be a 'better' project just because of the gcc toolchain and the ISA.  But the question is whether it is a good learning project.  The reason I proposed the LC3 project is that it is complete in 1700 lines of code and there is enough documentation to truly understand what every single signal does.  Actually, there are only some 50 odd control signals.  And there's a book!  Books are good!  There's probably a reason that so many universities are using the LC3 project and I suspect it is because it can be built in a one semester course with just a previous semester of logic design.  That, and there are other copies of the project all over the Internet.

I don't view the original question as "What's the best core?" but rather "What core is likely to be achievable by a beginner?".  Grabbing somebody else's work and trying to assimilate it just isn't as satisfying as building something yourself.

While I'm thinking about it, hamsternz's github is worth cloning just to learn how an expert writes code.

To my knowledge, there is no HDL code for the LC3.  The authors wanted to use microcode and that scheme doesn't require an FPGA (generally) because all the logic is in the microcode.  Pre-compiled, as it were.

So, for projects we have:  LC3, RISC-V, Z80, ARM and MIPS.  Pick one and have fun!  I'm not sure of RISC-V but the other 4 have books.

If you haven't checked out vhdlquiz.com, you should look at some of his free tutorials.  He has a couple of 'for pay' projects and both are excellent.  He also spends a LOT of time with simulation, a concept that wasn't available to me when I started.  My code went from keyboard to hardware in one giant stumble.

There are many sites with tutorials for SystemVerilog, Verilog or VHDL.  Pick one and get started.


 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: softcore for learning purposes
« Reply #31 on: October 10, 2020, 09:20:16 pm »
Yes, the RISC-V is going to be a 'better' project just because of the gcc toolchain and the ISA.  But the question is whether it is a good learning project.
Yes it is. Because it's very-very simple, and the instruction encoding was designed for easy implementation of decoding logic. So all you need for RV32I implementation is a simple ALU that can do additions (subtraction is addition of 2's complement of one of arguments), shifts and bitwise operations. That's it. All of those operations can be synthesized using high-level HDL commands, and so do not require complicated logic in HDL.
Again, the simplest implementation is fetch-decode-execute-memory access-register writeback (decode can be combined with execution, and even with fetch, though for FPGAs I wouldn't recommend that as it will severely limit the frequency for no good reason), and so 4-5 states FSM will get you going. Once you have that working, you can work on pipelining this core to get much higher throughput if you want.
Just take my word for it - take a specification, read RV32I section only, then start from the clean sheet and try implementing ALU which can do the following operations (below assuming Verilog/SystemVerilog, but I'm sure VHDL has equivalent operators too):
1. signed addition - operator "+", just make sure your arguments are signed.
2. signed subtraction - you can use addition with 2's complement, or operator "-". Again, make sure your arguments are treated as signed.
3. shift left (operator "<<" is all you need).
4. set less than - operator "<"
5. set less than unsigned - same as above, but cast operands to unsigned types.
6. XOR - operator "^"
7. shift right logical - operator ">>".
8. shift right arithmetic - operator ">>>".
9. OR - operator "|".
10. AND - operator "&".
That's all you will need to implement a full set of RV32I. As you can see, there is zero black magic, and all of that code would be nearly identical if you'd write it in C.
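To make the list concrete, here is a minimal software model of those ten operations, written in Python rather than an HDL so it can be run and poked at on a PC. It is only a sketch: the operation names ("add", "sltu" and so on) and the helper to_signed are names picked for this example, and a real ALU would select the operation from decoded instruction bits rather than from a string.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op: str, a: int, b: int) -> int:
    """Return the 32-bit result of one RV32I ALU operation.

    `op` uses hypothetical names chosen for this sketch.
    """
    a &= MASK32
    b &= MASK32
    shamt = b & 0x1F  # RV32I shifts use only the low 5 bits of the second operand
    if op == "add":
        result = a + b
    elif op == "sub":
        result = a - b
    elif op == "sll":
        result = a << shamt
    elif op == "slt":
        result = int(to_signed(a) < to_signed(b))
    elif op == "sltu":
        result = int(a < b)
    elif op == "xor":
        result = a ^ b
    elif op == "srl":
        result = a >> shamt
    elif op == "sra":
        result = to_signed(a) >> shamt  # Python's >> on a negative int is arithmetic
    elif op == "or":
        result = a | b
    elif op == "and":
        result = a & b
    else:
        raise ValueError(f"unknown ALU op: {op}")
    return result & MASK32  # wrap to 32 bits, as the hardware register would
```

The only places where signedness matters are "slt" versus "sltu" and "sra" versus "srl"; everything else produces the same bit pattern either way, which is why the hardware stays so small.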

Once you have that, add a fetch unit which will load commands from the memory (again, very straightforward code like command <= mem[current_instruction_pointer];).
For the decoding block, just take a look at the encodings (there is a table of command encodings at the end of the spec PDF) and you will see that the logic is very simple - you will need to know how to access vector subranges and the operator that merges subranges into a single vector ("{}" in Verilog/SV). This will tell you which ALU operation you need to do (if any) and its operands (either directly, or indirectly via register indices). For control transfer commands, unconditional jumps are the easiest - you optionally write back a register (for jump-and-link commands), change the current_instruction_pointer and restart from cycle 1. For conditional jumps, I prefer to create a mini-ALU which performs condition checks ("==", "!=", "<", ">=" is all you need), and if the result is true, you do the exact same thing as with an unconditional jump.
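The subrange-and-concatenate decoding described above can be tried out in software first. The sketch below is Python, not HDL, and the helper names (bits, sign_extend, decode, branch_taken) are invented for this example. The bit positions and the branch funct3 values follow the RV32I base encoding; only the I-type and B-type immediates are shown.

```python
def bits(word: int, hi: int, lo: int) -> int:
    """Return word[hi:lo], like a Verilog part-select."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend(value: int, width: int) -> int:
    """Treat the low `width` bits of value as a two's-complement number."""
    sign = 1 << (width - 1)
    return (value ^ sign) - sign


def decode(inst: int) -> dict:
    """Split a 32-bit RV32I instruction into its fixed-position fields."""
    fields = {
        "opcode": bits(inst, 6, 0),
        "rd": bits(inst, 11, 7),
        "funct3": bits(inst, 14, 12),
        "rs1": bits(inst, 19, 15),
        "rs2": bits(inst, 24, 20),
        "funct7": bits(inst, 31, 25),
    }
    # I-type immediate: inst[31:20], sign-extended
    fields["imm_i"] = sign_extend(bits(inst, 31, 20), 12)
    # B-type immediate, Verilog style: {inst[31], inst[7], inst[30:25], inst[11:8], 1'b0}
    imm_b = (
        (bits(inst, 31, 31) << 12)
        | (bits(inst, 7, 7) << 11)
        | (bits(inst, 30, 25) << 5)
        | (bits(inst, 11, 8) << 1)
    )
    fields["imm_b"] = sign_extend(imm_b, 13)
    return fields


def branch_taken(funct3: int, a: int, b: int) -> bool:
    """The 'mini-ALU' for conditional branches; a and b are register values."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    sa, sb = sign_extend(a, 32), sign_extend(b, 32)
    conditions = {
        0b000: a == b,    # BEQ
        0b001: a != b,    # BNE
        0b100: sa < sb,   # BLT
        0b101: sa >= sb,  # BGE
        0b110: a < b,     # BLTU
        0b111: a >= b,    # BGEU
    }
    return conditions[funct3]  # funct3 values 010 and 011 are not branches (KeyError)
```

For example, decode(0x00500513) is "addi a0, zero, 5" and yields rd 10, rs1 0 and imm_i 5. Because the register fields sit in the same place in every format, the register file can be read before the opcode is even known.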
After that implement memory access (again, very simple "mem[addr] <= value;" for writes, or "reg_data <= mem[addr];" for reads, writes will require a clock cycle, which is why I placed memory access into separate state). For non-memory related register writes, this section can be skipped entirely, though I'd recommend you implement it as a pass-through to make pipelining easier down the line.
And a final state is register writeback - "registers[reg_index] <= reg_data;".
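Putting the five states together, a toy software model of the multi-cycle FSM might look like the following. It is a sketch under heavy assumptions: it is Python rather than HDL, the class name MultiCycleCore and the helper run are made up, instruction and data memories are separate word arrays, and only ADDI, ADD, LW and SW are handled (funct3 and funct7 are ignored, so it cannot tell ADD from SUB).

```python
class MultiCycleCore:
    """Toy model: one FSM state per clock, tiny RV32I subset (ADDI, ADD, LW, SW)."""

    def __init__(self, program, data_words=16):
        self.imem = list(program)      # instruction memory, one 32-bit word per entry
        self.dmem = [0] * data_words   # separate data memory, word addressed
        self.regs = [0] * 32
        self.pc = 0
        self.state = "FETCH"
        self.cycles = 0

    def step(self):
        """Advance exactly one clock cycle (one FSM state)."""
        self.cycles += 1
        if self.state == "FETCH":
            self.inst = self.imem[self.pc // 4]
            self.state = "DECODE"
        elif self.state == "DECODE":
            i = self.inst
            self.opcode = i & 0x7F
            self.rd = (i >> 7) & 0x1F
            self.a = self.regs[(i >> 15) & 0x1F]   # rs1 value
            self.b = self.regs[(i >> 20) & 0x1F]   # rs2 value
            imm_i = i >> 20
            imm_s = ((i >> 25) << 5) | ((i >> 7) & 0x1F)
            imm = imm_s if self.opcode == 0x23 else imm_i
            self.imm = imm - 0x1000 if imm & 0x800 else imm  # sign-extend 12 bits
            self.state = "EXECUTE"
        elif self.state == "EXECUTE":
            # ADD uses two registers; ADDI, LW and SW all compute rs1 + immediate
            operand = self.b if self.opcode == 0x33 else self.imm
            self.result = (self.a + operand) & 0xFFFFFFFF
            self.state = "MEMORY"
        elif self.state == "MEMORY":
            if self.opcode == 0x03:      # LW
                self.result = self.dmem[self.result // 4]
            elif self.opcode == 0x23:    # SW
                self.dmem[self.result // 4] = self.b
            # every other instruction passes through unchanged
            self.state = "WRITEBACK"
        else:  # WRITEBACK
            if self.opcode != 0x23 and self.rd != 0:  # stores write no register; x0 stays 0
                self.regs[self.rd] = self.result
            self.pc += 4
            self.state = "FETCH"


def run(program):
    """Run a branch-free program to completion: five clocks per instruction."""
    core = MultiCycleCore(program)
    for _ in range(5 * len(program)):
        core.step()
    return core
```

Running run([0x00700093, 0x00102023, 0x00002103, 0x002081B3]), which is addi x1,x0,7; sw x1,0(x0); lw x2,0(x0); add x3,x1,x2, takes 20 clocks and leaves 14 in x3. The attributes set in one state and read in the next (inst, a, b, imm, result) are what the internal state registers would be in hardware.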

So as you can see, there is nothing very complicated here. I highly recommend you actually do all of the above yourself first, without looking at others' code. It won't take much time, trust me.

Offline prophossTopic starter

  • Contributor
  • Posts: 46
  • Country: us
Re: softcore for learning purposes
« Reply #32 on: October 10, 2020, 09:24:06 pm »
I have been looking at a few different cores that I was interested in. To be honest I am not sure what I'm looking at most of the time. I looked at Hamster's code and finally found some .vhd stuff that I recognized but most of it I had no clue what it was for. I looked at the 6502 and 8080 stuff and that seemed a bit more accessible but same issue when I went to the github repo. I looked at the books suggested by rstofer and they were very expensive but worth it I am sure. Unfortunately not an option right now but will keep in the back of my mind. The LC-3 seems very simple and practical but where would I begin? All of the different cores seem to have memory, ALU, registers, and so on but how much and what kind changes. I am trying to break this down into something I can put into a list that I can start checking off. The books seem the best idea for that but.... What approach would you take as you began putting any one of these together?
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: softcore for learning purposes
« Reply #33 on: October 10, 2020, 09:41:04 pm »
What approach would you take as you began putting any one of these together?
I think the best way to learn is to write it all yourself from the ground up. Make sure you're familiar with your HDL of choice, then follow the plan I outlined above. This will give you the basic core, and you can work on improving it later on.

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 20355
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: softcore for learning purposes
« Reply #34 on: October 10, 2020, 10:08:28 pm »
I have been looking at a few different cores that I was interested in. To be honest I am not sure what I'm looking at most of the time. I looked at Hamster's code and finally found some .vhd stuff that I recognized but most of it I had no clue what it was for. I looked at the 6502 and 8080 stuff and that seemed a bit more accessible but same issue when I went to the github repo. I looked at the books suggested by rstofer and they were very expensive but worth it I am sure. Unfortunately not an option right now but will keep in the back of my mind. The LC-3 seems very simple and practical but where would I begin? All of the different cores seem to have memory, ALU, registers, and so on but how much and what kind changes. I am trying to break this down into something I can put into a list that I can start checking off. The books seem the best idea for that but.... What approach would you take as you began putting any one of these together?

Get the original datasheets which show the internal block diagram components and interconnections, plus the instruction set and instruction encoding. Ignore the timings.

Make sure you understand which functionality is purely combinatorial and which is multiplexer based, and which has registers. Understand finite state machines, and use one or more FSMs to control the registers and multiplexers etc.

Do not try to duplicate the gate-level implementation. Do understand each block's functionality, and code that behaviourally in the HDL. Then use the HDL to structurally compose the blocks.

Do have decent test suites, so that you can prove a block works (and continues to work after you make a small change).
« Last Edit: October 10, 2020, 10:14:43 pm by tggzzz »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #35 on: October 10, 2020, 10:45:56 pm »
So, for projects we have:  LC3, RISC-V, Z80, ARM and MIPS.  Pick one and have fun!  I'm not sure of RISC-V but the other 4 have books.

Er ... does Patterson and Hennessy "Computer Organization and Design: The Hardware Software Interface" not count somehow?

https://www.amazon.com/Computer-Organization-Design-RISC-V-Architecture/dp/0128122757
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #36 on: October 10, 2020, 10:55:43 pm »
Again, the simplest implementation is fetch-decode-execute-memory access-register writeback (decode can be combined with execution, and even with fetch, though for FPGAs I wouldn't recommend that as it will severely limit the frequency for no good reason), and so 4-5 states FSM will get you going. Once you have that working, you can work on pipelining this core to get much higher throughput if you want.

If you have instructions and data in separate RAM blocks (or ROM and RAM) then you can do RISC-V with everything in a single pipe stage: PC and register contents input at one end, ripple asynch through decode, operand select, ALU, memory read or write, present new PC and register contents at the output. It'll run at 10s of MHz at least.
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: softcore for learning purposes
« Reply #37 on: October 10, 2020, 11:29:38 pm »
If you have instructions and data in separate RAM blocks (or ROM and RAM) then you can do RISC-V with everything in a single pipe stage: PC and register contents input at one end, ripple asynch through decode, operand select, ALU, memory read or write, present new PC and register contents at the output. It'll run at 10s of MHz at least.
I recommend a multi-cycle implementation because it will be easier to pipeline: you will already have all of your internal state registers, so you will only need to add pipeline registers for the control signals (and here you can take a half-step by pushing as many of your control signals into the pipeline as you can) plus some code to deal with data and control hazards. Even a very naive pipelined implementation that forces a stall each time it encounters a hazard will, on average, have much more throughput than a multi-cycle one. And you can progressively refine this implementation to optimize for whatever metric you like, be it frequency, power or resource usage.
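A back-of-the-envelope model shows why stalling on every hazard still beats the multi-cycle design. The sketch below is Python and rests on stated assumptions: a classic five-stage pipeline, no forwarding, a register file that can be written and read in the same cycle (so a dependent instruction must enter the pipeline at least three cycles after its producer), and no branches. The function names and the (rd, rs1, rs2) tuple format are invented for the example.

```python
def multicycle_cycles(instrs, states=5):
    """Multi-cycle core: every instruction walks through all FSM states."""
    return states * len(instrs)


def pipelined_cycles(instrs, depth=5, raw_gap=3):
    """Cycle count for a pipeline that stalls on read-after-write hazards.

    instrs: list of (rd, rs1, rs2) register numbers, 0 meaning x0/unused.
    raw_gap: assumed minimum distance, in cycles, between a producer and a
    consumer entering the pipeline (3 for the assumptions listed above).
    """
    ready = {}   # register -> earliest cycle a consumer may enter the pipeline
    issue = -1   # cycle in which the previous instruction entered the pipeline
    for rd, rs1, rs2 in instrs:
        issue += 1                       # at best, one instruction per cycle
        for src in (rs1, rs2):
            if src != 0:
                issue = max(issue, ready.get(src, 0))  # naive stall until ready
        if rd != 0:
            ready[rd] = issue + raw_gap
    # the last instruction still needs `depth` cycles to drain
    return issue + depth if instrs else 0
```

With four independent instructions the pipeline needs 8 cycles against 20 for the multi-cycle core. Even in the worst case here, where every instruction depends on the one before it, each extra instruction costs 3 cycles rather than 5.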

Online chris_leyson

  • Super Contributor
  • ***
  • Posts: 1544
  • Country: wales
Re: softcore for learning purposes
« Reply #38 on: October 11, 2020, 10:40:49 am »
The reason I mentioned Picoblaze is because it is a very small and useful 8 bit core and it is very well documented. It's a good example of a very compact fetch/execute architecture. KCPSM3 uses about 120 CLBs on a Spartan 3 and the later KCPSM6 uses 60 or so CLBs on a Spartan 6. A CLB, or configurable logic block, is more or less the same as Altera's LE, or logic element.

Quote
The 6502 is easier and there is at least one example at Open  Cores.  I haven't used it.
I've tried one version of the 6502 from Open Cores and on a Spartan3 it used 3500 CLBs !! Over 90% of the logic is used for instruction decoding and that is a good example of how not to write a softcore.

I've tried my own version of the 6502, using two block RAMs to expand each 8 bit instruction into a 36 bit control word. 80% to 90% of the instructions worked, so it's not quite there yet, but the design used fewer than 300 CLBs and two block RAMs, a 10X improvement over the Open Cores version. Maybe three block RAMs and a 54 bit control word would have worked. The original 6502 instruction decoder used a 21x130 bit decode ROM. 6502 block diagram attached.
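The lookup-table idea can be illustrated in software: the opcode is used as an address into a ROM, and the word that comes out is sliced into control fields. The Python sketch below is not the 36 bit format from the design above, which was not posted; the field layout, the selector encodings and all the names are invented for illustration. Only the four opcode values are real 6502 opcodes.

```python
# Hypothetical control-word layout, invented for this sketch:
# field name -> (lowest bit position, width in bits)
FIELDS = {
    "alu_op": (0, 4),
    "src_sel": (4, 2),
    "dst_sel": (6, 2),
    "fetch_operand": (8, 1),
}

# Made-up encodings for the selector fields.
ALU_PASS, ALU_ADD, ALU_INC = 0, 1, 2
SRC_A, SRC_X, SRC_MEM = 0, 1, 2
DST_A, DST_X, DST_NONE = 0, 1, 3


def pack(**values: int) -> int:
    """Build one control word from named field values (unnamed fields are 0)."""
    word = 0
    for name, value in values.items():
        low, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << low
    return word


def unpack(word: int) -> dict:
    """Slice a control word back into its named fields."""
    return {
        name: (word >> low) & ((1 << width) - 1)
        for name, (low, width) in FIELDS.items()
    }


# 256-entry decode ROM; entries not filled in default to "do nothing".
ROM = [pack(alu_op=ALU_PASS, src_sel=SRC_A, dst_sel=DST_NONE)] * 256
ROM[0xA9] = pack(alu_op=ALU_PASS, src_sel=SRC_MEM, dst_sel=DST_A, fetch_operand=1)  # LDA #imm
ROM[0x69] = pack(alu_op=ALU_ADD, src_sel=SRC_MEM, dst_sel=DST_A, fetch_operand=1)   # ADC #imm
ROM[0xAA] = pack(alu_op=ALU_PASS, src_sel=SRC_A, dst_sel=DST_X)                     # TAX
ROM[0xE8] = pack(alu_op=ALU_INC, src_sel=SRC_X, dst_sel=DST_X)                      # INX


def control_signals(opcode: int) -> dict:
    """One ROM read replaces the random logic of a hand-written decoder."""
    return unpack(ROM[opcode & 0xFF])
```

In an FPGA the ROM list would be a block RAM initialised at build time, so adding or fixing an instruction means changing a table entry instead of rewriting decode logic.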

There is also the 8-bit Gumnut core described in "The Designer's Guide to VHDL" by Peter Ashenden and there are free PDF copies out there.

If you get the instruction decoding wrong you end up with the logic equivalent of bloatware, Open Cores 6502 being just one example, there are others. ALUs, whether they are 8, 16 or 32 bit, take up very little logic, as does multiplexing for the data paths. Instruction decoding and the finite state machine that drives the core take a lot of work to get right. tggzzz outlined a good work flow: understand finite state machines and take a look at some block diagrams from older designs.
« Last Edit: October 11, 2020, 10:47:42 am by chris_leyson »
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #39 on: October 11, 2020, 04:20:51 pm »
So, for projects we have:  LC3, RISC-V, Z80, ARM and MIPS.  Pick one and have fun!  I'm not sure of RISC-V but the other 4 have books.

Er ... does Patterson and Hennessy "Computer Organization and Design: The Hardware Software Interface" not count somehow?

https://www.amazon.com/Computer-Organization-Design-RISC-V-Architecture/dp/0128122757

Of course it would but I can't recommend what I don't have.  Each of the books I linked above is on my shelf.

My copy of Patterson and Hennessy is "Computer Organization And Design" 3rd edition and was published in 2005 (India printing) and covers the MIPS design with just a wee bit of HDL.  It is a great reference on computer architecture but not on how to create a core with HDL.

I see they now have an ARM and a RISC-V version.  It would be interesting to see what they have to say about the RISC-V since that is the current topic.  Now the question is:  Wait for the announced 2d Edition (with no release date at Amazon) or buy what will instantly be obsolete?

I wish I had the briefest notion of when the 2d edition was being released.

ETA:  I went to the Morgan Kaufmann site and they're showing Jan 1, 2021 as the release date.  I don't know if that is a commitment or just pointing out that it won't happen in the next couple of months.

I can wait...
« Last Edit: October 11, 2020, 04:32:48 pm by rstofer »
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #40 on: October 11, 2020, 04:59:52 pm »
Quote
The 6502 is easier and there is at least one example at Open  Cores.  I haven't used it.
I've tried one version of the 6502 from Open Cores and on a Spartan3 it used 3500 CLBs !! Over 90% of the logic is used for instruction decoding and that is a good example of how not to write a softcore.

It's the nature of CISC architectures to have most of the logic in decoding.  The resources are limited, there are generally only a very few registers, but the complex addressing schemes chew up a lot of logic.

For elegance in code, hamsternz's RISC project is excellent.  The entire core uses less than 6% of the LUTs on an Artix 7 35T which is a fairly small chip considering I tend toward the 100T variants.  The code has some alternative bits where resources are reduced by a different coding style and the savings are substantial.  It also uses just 4% of the BlockRAM (memory size is defined to be quite small) so there is room to expand it.

The important thing is that the core can be used with a small, less expensive, board.  I haven't tried it but apparently it will work with a CMOD chip.  This allows the FPGA board to be plugged into a daughter card with the various peripherals.  Nice!
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #41 on: October 11, 2020, 10:25:08 pm »
If you want a really small RISC-V core, there is Olof Kindgren's "SeRV", a bit-serial implementation of RV32I. Most instructions take 32 clock cycles, but jumps, load/store, SLT/SLTU take 64 and shifts take between 32 and 64.

https://github.com/olofk/serv

He keeps making it smaller, but as at August 9th Olof tweeted: "255 LUT and 225 FF on iCE40 with the standard config that support interrupts and a few CSR. The minimal RV32I only config is approximately 50 LUT and 20 FF less than that". As at May that was 167 LUT and 224 FF on Artix-7, but I think it will be a little smaller than that now as the same slide shows 266 LUT and 227 FF on iCE40.

https://diode.zone/videos/watch/0230a518-e207-4cf6-b5e2-69cc09411013
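The cycle counts quoted above follow from doing the work one bit per clock. As a rough illustration of that trade, and not a description of how SERV itself is built, here is a Python sketch of a bit-serial adder: a single full adder plus a carry flip-flop, reused 32 times. The function name serial_add is made up for the example.

```python
def serial_add(a: int, b: int, width: int = 32):
    """Add two words one bit per 'clock'; return (sum, clocks used).

    Models a single full adder with a one-bit carry register, fed the
    operands least-significant bit first. Bits above `width` are ignored.
    """
    carry = 0
    result = 0
    clocks = 0
    for i in range(width):  # one loop iteration stands for one clock cycle
        abit = (a >> i) & 1
        bbit = (b >> i) & 1
        sbit = abit ^ bbit ^ carry                       # full-adder sum bit
        carry = (abit & bbit) | (carry & (abit ^ bbit))  # full-adder carry out
        result |= sbit << i
        clocks += 1
    return result, clocks
```

A parallel 32 bit adder produces the same answer in one clock but needs 32 full adders; the serial version spends 32 clocks to get away with one, which is the kind of exchange that lets a whole core fit in a couple of hundred LUTs.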
 

Offline 0db

  • Frequent Contributor
  • **
  • Posts: 336
  • Country: zm
Re: softcore for learning purposes
« Reply #42 on: October 13, 2020, 12:19:47 pm »
Wait for the announced 2d Edition (with no release date at Amazon) or buy what will instantly be obsolete?

Wait for it.
 

Offline 0db

  • Frequent Contributor
  • **
  • Posts: 336
  • Country: zm
Re: softcore for learning purposes
« Reply #43 on: October 13, 2020, 01:59:42 pm »
LC3 lacks some features. Why not add them? Document them, test them, improve them.
It seems a good approach.
There are too many people who only download stuff without sending anything back.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #44 on: October 13, 2020, 05:43:50 pm »
LC3 lacks some features. Why not add them? Document them, test them, improve them.
It seems a good approach.
There are too many people who only download stuff without sending anything back.

LC3 doesn't have opcode space to add more. There is only one free major opcode.

That's one of the main reasons the guys at Berkeley developed RISC-V instead of trying to enhance OpenRISC or MIPS or SPARC or ARM.

(The other reason is that licences weren't available at a reasonable cost, or at all.)
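The opcode-space point is easy to check. LC-3 instructions are 16 bits wide with the major opcode in the top four bits, so there are only 16 slots. The Python sketch below lists the standard assignments and finds what is left; the names LC3_OPCODES, major_opcode and free_opcodes are made up for the example.

```python
# LC-3 major opcodes: instruction bits [15:12]
LC3_OPCODES = {
    0b0000: "BR",
    0b0001: "ADD",
    0b0010: "LD",
    0b0011: "ST",
    0b0100: "JSR",   # also JSRR
    0b0101: "AND",
    0b0110: "LDR",
    0b0111: "STR",
    0b1000: "RTI",
    0b1001: "NOT",
    0b1010: "LDI",
    0b1011: "STI",
    0b1100: "JMP",   # also RET
    0b1110: "LEA",
    0b1111: "TRAP",
}


def major_opcode(inst: int) -> int:
    """Extract the 4-bit major opcode from a 16-bit LC-3 instruction."""
    return (inst >> 12) & 0xF


def free_opcodes() -> list:
    """Major opcodes with no instruction assigned."""
    return [op for op in range(16) if op not in LC3_OPCODES]
```

free_opcodes() returns a single entry, 0b1101, the reserved opcode. Any extension has to be squeezed into sub-fields behind that one value or into unused bits of existing instructions.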
 

Offline 0db

  • Frequent Contributor
  • **
  • Posts: 336
  • Country: zm
Re: softcore for learning purposes
« Reply #45 on: October 13, 2020, 06:20:13 pm »
LC3 doesn't have opcode space to add more. There is only one free major opcode.

So make the opcode 32 bits long, or rearrange it without messing too much with the data path, and you have more space for new opcodes. This requires adapting the machine layer of LCC, but it will teach a lot.

That's one of the main reasons the guys at Berkeley developed RISC-V instead of trying to enhance OpenRISC or MIPS or SPARC or ARM.

  • MIPS? You cannot touch it without receiving a legal letter (especially with the "nano MIPS"). Nasty guys there. In fact, even now that big companies like SGI are finally dead, they prefer to keep all their deprecated CPUs secret and forget them: no matter how much you wish for it, you won't see anything about MIPS-5 (R12K, R14K, R16K). They literally have an underground room where they only care about stacking their super secret paper datasheets to keep them hidden from the public. Year after year their precious paper ages even more to the color of a yellowish funeral, while the ink becomes a bit more unreadable, but that's what they want rather than releasing the docs or paying someone to make a digital copy to be shared on the internet.
  • SPARC? Far too complex, and the register windows are ... a bad idea
  • ARM? ... stands for "Advanced RISC Machine", and the "advanced" term means too much complexity for a "single person" project
  • OpenRISC? ... well it's like the HURD project. They started with something simple, their ego made it so hyper-complex that two devs cannot talk about the same detail without feeling lost in space and time

There is also an alternative: ijvm!, the didactic stack machine invented by Andrew Stuart Tanenbaum. It's not RISCy by any means, it's intended to teach, but it's very interesting, and simple enough for FPGAs and home-made compilers.
 

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 20355
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: softcore for learning purposes
« Reply #46 on: October 13, 2020, 06:44:49 pm »
  • ARM? ... stands for "Advanced RISC Machine", and the "advanced" term means too much complexity for a "single person" project

The first ARM took two people (Roger Wilson, Steve Furber) 6 man years. But that included inventing the concept, the instruction set, the custom semiconductor implementation.

It is reasonable to consider an ARM 1 processor as a single person project.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4414
  • Country: nz
Re: softcore for learning purposes
« Reply #47 on: October 13, 2020, 07:40:28 pm »
LC3 doesn't have opcode space to add more. There is only one free major opcode.

So make the opcodes 32 bits long, or rearrange them without messing too much with the data path, and you have more space for new opcodes. This requires adapting the machine layer of LCC, but it will teach you a lot.

Sure. Or just start from scratch. But either way you also have to start from scratch with the software.

Quote
That's one of the main reasons the guys at Berkeley developed RISC-V instead of trying to enhance OpenRISC or MIPS or SPARC or ARM.

  • MIPS? You cannot touch it without receiving a legal letter (especially with "nanoMIPS"). Nasty guys there. In fact, even now that big companies like SGI are finally dead, they prefer to keep all their deprecated CPUs secret and forget them: no matter how much you wish for it, you won't see anything about MIPS-5 (R12K, R14K, R16K). They literally have an underground room where all they care about is stacking their super secret paper datasheets to keep them hidden from the public. Year after year their precious paper ages a bit more towards a funereal yellow while the ink becomes a bit more unreadable, but that's what they want rather than releasing the docs or paying someone to make a digital copy to be shared on the internet.
  • SPARC? Too complex, and the register windows are ... a bad idea
  • ARM? ... stands for "Advanced RISC Machine", and the "advanced" part means too much complexity for a "single person" project
  • OpenRISC? ... well, it's like the HURD project. They started with something simple, then their egos made it so hyper-complex that two devs cannot talk about the same detail without feeling lost in space and time

All correct, more or less. Which is why it's so good they actually did RISC-V, and did it pretty well too. There are some things I think they got wrong, and I've debated them with Andrew and Krste and they even in some cases agree they could have been done better, but nothing major enough to be worth an incompatible change at this point (or even three years ago when I raised them).

NanoMIPS, incidentally, looks really nice, but it was dead on arrival. There is one 32 bit chip using it, apparently exclusive to MediaTek. The gcc patches to support NanoMIPS were sent to the gcc mailing list the day before the entire compiler team was fired, and have never been merged into upstream gcc, and probably never will be.

Quote
There is also an alternative: IJVM, the didactic stack machine invented by Andrew Stuart Tanenbaum. It's not RISCy by any means; it's intended for teaching, but it's very interesting, and simple enough for FPGAs and home-made compilers.

Anything with INVOKEVIRTUAL and NEWARRAY opcodes is not going to be a great target for home-made FPGA CPUs. If nothing else, you're going to need a garbage collector in hardware -- or a way for NEWARRAY to trap to a software garbage collector ... but the instruction set doesn't have primitives to allow you to implement one.

Also the only boolean operations provided are AND and OR, which is very limiting. I'm not clear from the spec whether those are boolean or bitwise i.e. && and || or & and | in C terms.  And there are no shifts.

As I demonstrated in a previous post in this thread about LC3, if you have ADD and AND and a test for equality to zero then you can fake up shifts and other boolean operations using a loop and some IFs, but ick.
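For anyone who missed that earlier post, here is a rough sketch in C of what "a loop and some IFs" can look like. It is not the LC3 code from that post, only an illustration restricted to ADD, AND and a compare against zero on 16-bit values. The function names (shl1, shr1, shr, or16) are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Left shift by one place is just x + x. */
static uint16_t shl1(uint16_t x)
{
    return (uint16_t)(x + x);
}

/* Logical right shift by one place: walk a source mask up from bit 1,
   and whenever that bit is set in x, add the mask one place below it. */
static uint16_t shr1(uint16_t x)
{
    uint16_t result = 0;
    uint16_t dst = 1;                  /* where the tested bit lands */
    uint16_t src = 2;                  /* bit currently being tested */
    while (src != 0) {                 /* src wraps to 0 after bit 15 */
        if ((x & src) != 0)
            result = (uint16_t)(result + dst);
        dst = src;
        src = (uint16_t)(src + src);
    }
    return result;
}

/* Right shift by count places, one place per pass. */
static uint16_t shr(uint16_t x, unsigned count)
{
    assert(count <= 16);               /* anything past the word width is wasted */
    while (count != 0) {
        x = shr1(x);
        count = count - 1;             /* on LC3 this is ADD with #-1 */
    }
    return x;
}

/* Bitwise OR: add each mask bit once if either input has it set. */
static uint16_t or16(uint16_t a, uint16_t b)
{
    uint16_t result = 0;
    uint16_t mask = 1;
    while (mask != 0) {
        if ((a & mask) != 0)
            result = (uint16_t)(result + mask);
        else if ((b & mask) != 0)
            result = (uint16_t)(result + mask);
        mask = (uint16_t)(mask + mask);
    }
    return result;
}
```

A single right shift costs 15 trips round the loop here, which is the "ick" part: it works, but a real shift instruction does the same job in one cycle.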


An instruction set that I think *would* be very interesting to do as an exercise is the integer core of Transputer. The opcodes are very simple and it has the interesting property that you can make it have any register width and address space size you want -- 8 bits, 16 bits, 32 bits, 64 bits or anything in between. As with WASM larger values are built up incrementally, 4 bits at a time in the case of Transputer. Transputer uses a stack but it's a fixed 4 elements in size, and the compiler (which is available) manages things around that.
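To make the "4 bits at a time" part concrete, here is a small sketch of how an operand gets built up from prefix bytes. It assumes the usual Transputer byte layout as I recall it from the Inmos documentation: high nibble is the function code, low nibble is data, pfix is function 0x2 and nfix is 0x6. The operand register is fixed at 32 bits here, and the names decode_one, FN_PFIX, FN_NFIX and DECODE_TRUNCATED are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { FN_PFIX = 0x2, FN_NFIX = 0x6, DECODE_TRUNCATED = 0xFF };

/* Decode one instruction starting at code[*pos].
   Returns the function code of the first non-prefix byte and stores the
   accumulated operand in *operand. Returns DECODE_TRUNCATED if the code
   ends in the middle of a prefix sequence. */
static unsigned decode_one(const uint8_t *code, size_t len,
                           size_t *pos, uint32_t *operand)
{
    uint32_t oreg = 0;                     /* operand register starts clear */

    assert(code != NULL && pos != NULL && operand != NULL);

    while (*pos < len) {
        uint8_t byte = code[(*pos)++];
        unsigned fn = (unsigned)(byte >> 4);

        oreg |= (uint32_t)(byte & 0x0F);   /* low nibble always goes in */
        if (fn == FN_PFIX) {
            oreg <<= 4;                    /* make room for the next nibble */
        } else if (fn == FN_NFIX) {
            oreg = (~oreg) << 4;           /* complement first: negative values */
        } else {
            *operand = oreg;               /* real instruction: operand complete */
            return fn;
        }
    }
    return DECODE_TRUNCATED;
}
```

With those assumptions the byte sequence 0x27 0x25 0x44 decodes as function 0x4 with operand 0x754, and 0x60 0x4F gives the same function with an all-ones operand. Widening or narrowing the machine only changes the type of oreg; the byte stream format stays the same.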
« Last Edit: October 13, 2020, 07:45:10 pm by brucehoult »
 

Offline 0db

  • Frequent Contributor
  • **
  • Posts: 336
  • Country: zm
Re: softcore for learning purposes
« Reply #48 on: October 13, 2020, 08:30:34 pm »
Anything with INVOKEVIRTUAL and NEWARRAY opcodes is not going to be a great target for home-made FPGA CPUs.

I played a lot with it during my examinations. I somehow liked it. It's not "picoJava", it's "IJVM". INVOKEVIRTUAL is actually a simple (CISCish) "jsr". Nothing special, nothing complex.

Also the only boolean operations provided are AND and OR, which is very limiting. I'm not clear from the spec whether those are boolean or bitwise i.e. && and || or & and | in C terms.  And there are no shifts.

Ijvm goes from the simplest and minimal "MIC1" to the complex "MIC5", which is technically "implementation" stuff, but starting from MIC5 there is also space for ISA revisions, and since there is space for new opcodes, you can also implement "bitwise" and "boolean" "operators" ... "shift", "rotate", "bit testing", everything, on the stack primitives.

Andrew Stuart Tanenbaum has documented it in his book, and if someone needs book support for a CPU, well, that can be interesting. It was all the hype more than ten years ago, when Java was the new great hit.
 

Online rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: softcore for learning purposes
« Reply #49 on: October 13, 2020, 08:38:35 pm »
LC3 doesn't have opcode space to add more. There is only one free major opcode.
Subtraction can be handled by taking the 2s complement and adding.  Slide 5-10 on page 3 here:

https://www.cis.upenn.edu/~milom/cse240-Fall05/handouts/Ch05.pdf

In the case of an immediate subtraction, just encode the 2s complement from the start and emit an ADD instruction.
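A quick sketch of both ideas in C, in case it helps. sub16 does the register case as a + (NOT b + 1). encode_sub_imm shows what an assembler could do for a made-up "SUB DR, SR1, #imm" pseudo-instruction: negate the immediate and emit the standard LC3 ADD-immediate word (opcode 0001, DR in bits 11:9, SR1 in bits 8:6, bit 5 set, imm5 in bits 4:0). Both function names are hypothetical, and the SUB pseudo-op itself is my assumption, not something from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* a - b computed as a + (NOT b + 1), all in 16 bits. */
static uint16_t sub16(uint16_t a, uint16_t b)
{
    uint16_t neg_b = (uint16_t)((uint16_t)~b + 1);   /* 2s complement of b */
    return (uint16_t)(a + neg_b);
}

/* Encode "SUB DR, SR1, #imm" as the LC3 instruction "ADD DR, SR1, #-imm". */
static uint16_t encode_sub_imm(unsigned dr, unsigned sr1, int imm)
{
    int negated = -imm;

    assert(dr < 8 && sr1 < 8);                 /* 3-bit register fields */
    assert(negated >= -16 && negated <= 15);   /* must fit in signed imm5 */

    return (uint16_t)((0x1u << 12)             /* ADD opcode 0001 */
                      | (dr << 9)
                      | (sr1 << 6)
                      | (1u << 5)              /* immediate mode */
                      | ((unsigned)negated & 0x1Fu));
}
```

Because imm5 only covers -16 to 15, the subtracted constant has to lie between -15 and 16; anything bigger needs the constant in a register and the NOT/ADD sequence.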

I would use the last opcode to do the shifts.  The count could be either immediate or from a register; there is enough room in the instruction to put the immediate count in SR1 and one bit of SR2 while using the last 2 bits of SR2 to specify type and direction.  We could encode arithmetic and logical shifts of up to 16 bits.  Ugly!  I haven't drawn this out, so I don't actually know that it is feasible.

 

