Author Topic: ALU in MIPS  (Read 2478 times)


Offline prophossTopic starter

  • Contributor
  • Posts: 46
  • Country: us
ALU in MIPS
« on: September 07, 2021, 01:18:13 am »
I am still fairly new to all this but have been wanting to get a softcore going. When I finally realized how much was involved I backed off that plan. I have since found some code in VHDL @ https://www.fpga4student.com/2017/09/vhdl-code-for-mips-processor.html that comes with a testbench. The starting point is the ALU. I watched some videos to try and understand what all the ALU does. The basic idea, if I understand it correctly, is that you create op codes for what all you want the ALU to do. In the case of the above it looks like a 3 bit op code that adds, subtracts, AND, OR, with a comparator(?). Should it do more? Could it do more? Is that case statement the best way to go? On the same website there is another ALU with a 4 bit op code by itself that has multiplication, division, left and right shift, and so on as well as the above mentioned functions. I guess I'm kind of wondering why they didn't use the same thing for both? Still learning here and am curious of y'alls opinions. Thanks ahead of time.
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9885
  • Country: us
Re: ALU in MIPS
« Reply #1 on: September 07, 2021, 01:50:58 am »
That's a really interesting site, thanks for the link!  The processor, as described, is a very minimal implementation.  It only has 256 words of RAM.  The ALU will do 5 operations out of a maximum of 8.  You could possibly add 3 more if you think of them.  Multiply and divide are candidates but they won't be single cycle instructions.  Multiply can be on some machines but divide is just ugly.
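To make the "5 operations out of a maximum of 8" point concrete, here is a minimal sketch of a 16 bit ALU with a 3 bit operation select built around a case statement. It is not the code from the linked site: the entity name, port names and opcode encoding are assumptions for illustration, and the comparison is treated as unsigned, which is also an assumption.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 16 bit ALU with a 3 bit operation select.
-- Names and opcode encoding are illustrative assumptions.
entity alu_sketch is
    port (
        a, b   : in  std_logic_vector(15 downto 0);
        op     : in  std_logic_vector(2 downto 0);
        result : out std_logic_vector(15 downto 0);
        zero   : out std_logic
    );
end entity alu_sketch;

architecture rtl of alu_sketch is
    signal r : std_logic_vector(15 downto 0);
begin
    process (a, b, op)
    begin
        case op is
            when "000"  => r <= std_logic_vector(unsigned(a) + unsigned(b));
            when "001"  => r <= std_logic_vector(unsigned(a) - unsigned(b));
            when "010"  => r <= a and b;
            when "011"  => r <= a or b;
            when "100"  =>                        -- comparator: set on less than
                if unsigned(a) < unsigned(b) then
                    r <= x"0001";
                else
                    r <= x"0000";
                end if;
            when others => r <= (others => '0');  -- the three unused encodings
        end case;
    end process;

    result <= r;
    zero   <= '1' when r = x"0000" else '0';      -- flag a branch-if-equal could test
end architecture rtl;
```

The three spare encodings fall into the "when others" branch, which is where extra single cycle operations (XOR, NOR and so on) could be added.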

Before you worry about how the ALU works, you need to understand how the CPU works from 10,000 feet up.  Memory contains instructions.  Some instructions direct data flow through the ALU; others are branch instructions and such.  Part of the 16 bit instruction is the ALU operation code.  Look at VHDL Code for Control Unit to see all of the instructions.

Look at the non-ALU instructions like beq, lw and sw, j, and jal.  These stand for branch if equal, load word to register, store word from register, jump, and jump and link, which is a 'call' instruction for branching to a subroutine.  The return address is stored in R2 (I believe).

This is an interesting starter CPU.  There are certainly more complete examples but this might be a good place to start.

I use the case statement for that kind of thing all the time.  The alternative might be an if-elsif-else structure and that forms a priority tree which is likely a good deal slower.

There are cleaner ways to code the process.  You can move all the default <signal <= '0';> stuff to just ahead of the case statement and then you don't need to duplicate the assignments in every case.  When a particular case needs to assert a non-default value, just write it in.  Within a process, the last assignment made to a signal is the one that takes effect, so VHDL will create hardware that uses the last value that is assigned.

Here's what it looks like for the process up to the first case:
Code:
    process(State, BEN, IR,MemReady, Immediate, PSR, Interrupt)
    begin
        -- set default values for all signals
        GateBusSelect   <= GatePC;
        MIOenable       <= '0';
        LD_MAR          <= '0';
        LD_MDR          <= '0';
        LD_IR           <= '0';
        LD_BEN          <= '0';
        LD_PC           <= '0';
        LD_Reg          <= '0';  -- many other signals omitted
       
        case State is
            when  0     =>  -- Branch Instruction
                            if BEN = '1' then
                                NextState   <= 22;  -- branch is taken
                            else
                                NextState   <= 18;  -- branch not taken   
                            end if;

Get the book "Free Range VHDL"  http://freerangefactory.org/pdf/df344hdh4h8kjfh3500ft2/free_range_vhdl.pdf

Make certain that every output is defined under every condition of the case statement or you will generate latches.  My most frequent error is adding a signal to an FSM and forgetting to add a default entry.
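The latch problem is easier to see on a small example. The sketch below is a made-up two output decoder (the entity and signal names are hypothetical) showing how default assignments ahead of the case statement keep every output defined on every path.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two output decoder; all names are illustrative.
entity no_latch_sketch is
    port (
        sel  : in  std_logic_vector(1 downto 0);
        ld_a : out std_logic;
        ld_b : out std_logic
    );
end entity no_latch_sketch;

architecture rtl of no_latch_sketch is
begin
    process (sel)
    begin
        -- Defaults: every output is assigned on every pass through the
        -- process, so no storage element is implied.
        ld_a <= '0';
        ld_b <= '0';

        case sel is
            when "01"   => ld_a <= '1';   -- overrides the default
            when "10"   => ld_b <= '1';   -- overrides the default
            when others => null;          -- defaults stand
        end case;

        -- Without the two default lines, ld_a would be unassigned when
        -- sel = "10", and synthesis would infer a latch to hold its value.
    end process;
end architecture rtl;
```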
« Last Edit: September 07, 2021, 02:13:49 am by rstofer »
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9885
  • Country: us
Re: ALU in MIPS
« Reply #2 on: September 07, 2021, 01:57:26 am »
What's it good for?

It turns out that a tiny core is often just the right thing when you need to manipulate external signals that you don't want to code in logic.  Maybe it's a disk controller, maybe it's an external device of some other kind.  Something where it is easier to handle it in CPU code than VHDL.

You can arrange for your signals to be one of the registers by not connecting a particular register to memory and just delivering inputs when the register is read.  You could use a second register for outputs.

ETA:

Better yet, create  INP and OUT instructions sort of like lw and sw.  Instead of a register address, you would have an IO address.
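One possible shape for the IO side of INP and OUT is sketched below. Everything here is an assumption for illustration: the entity name, the 2 bit IO address, the io_write strobe and the choice of switches and LEDs as the external signals. The control unit changes needed to decode the new instructions are not shown.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical IO block for INP/OUT style instructions.
entity io_ports_sketch is
    port (
        clk      : in  std_logic;
        io_addr  : in  std_logic_vector(1 downto 0);   -- assumed: 4 IO addresses
        io_write : in  std_logic;                      -- asserted by an OUT instruction
        wr_data  : in  std_logic_vector(15 downto 0);  -- register value being output
        rd_data  : out std_logic_vector(15 downto 0);  -- value returned to an INP instruction
        switches : in  std_logic_vector(15 downto 0);  -- external inputs
        leds     : out std_logic_vector(15 downto 0)   -- external outputs
    );
end entity io_ports_sketch;

architecture rtl of io_ports_sketch is
    signal led_reg : std_logic_vector(15 downto 0) := (others => '0');
begin
    -- OUT: capture the register value in the addressed output port
    process (clk)
    begin
        if rising_edge(clk) then
            if io_write = '1' and io_addr = "01" then
                led_reg <= wr_data;
            end if;
        end if;
    end process;

    leds <= led_reg;

    -- INP: combinational read mux selected by the IO address
    with io_addr select
        rd_data <= switches        when "00",
                   led_reg         when "01",
                   (others => '0') when others;
end architecture rtl;
```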

Next progression would be to add some kind of interrupt structure, probably vectored to specific addresses in low memory.
« Last Edit: September 07, 2021, 03:17:43 pm by rstofer »
 

Offline prophossTopic starter

  • Contributor
  • Posts: 46
  • Country: us
Re: ALU in MIPS
« Reply #3 on: September 08, 2021, 12:22:14 am »
I do actually have Free Range VHDL downloaded, as you suggested at an earlier date. I will take a look at the CPU and all the codes. Thanks for the help and suggestions. I will keep going with this one.
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9885
  • Country: us
Re: ALU in MIPS
« Reply #4 on: September 08, 2021, 12:36:16 am »
I just installed Vivado on a new machine so I downloaded the files and they do indeed simulate and synthesize.  I would be targeting a Digilent Nexys 4 DDR board but before I can do that, I need to convert the 16 bit pc_out and alu_result vectors to 7 segment displays.  I have no reason to believe this thing has any issues, it looks pretty clean.

The Nexys 4 DDR is obsolete and has been replaced by the Nexys A7.

https://digilent.com/shop/nexys-a7-fpga-trainer-board-recommended-for-ece-curriculum/

It just happens to have 8 7-segment digits, perfect for the application.  It's also my favorite board for playing with these kinds of things because it has a lot of gadgets on the board.

 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9885
  • Country: us
Re: ALU in MIPS
« Reply #5 on: September 08, 2021, 04:10:24 pm »
OK, I copied the constraints file from another project and I have the bitstream working on the Nexys 4 DDR and the 7 segment displays are working.

It takes a bit more work to have a switch selectable display so I can see PC and Instruction or some other combination that would include the ALU output.  This is just a MUX with the select input from a toggle switch.  I have that code laying around here somewhere.
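A switch selectable display of this kind might look like the sketch below. It assumes the eight hex digits take a 32 bit value, and the entity name, the instruction port and the display_val output are hypothetical names, not taken from the project.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical display source selector; names are illustrative.
entity display_mux_sketch is
    port (
        sel         : in  std_logic;                      -- toggle switch
        pc_out      : in  std_logic_vector(15 downto 0);
        instruction : in  std_logic_vector(15 downto 0);  -- assumed signal name
        alu_result  : in  std_logic_vector(15 downto 0);
        display_val : out std_logic_vector(31 downto 0)   -- 8 hex digits
    );
end entity display_mux_sketch;

architecture rtl of display_mux_sketch is
begin
    -- Left four digits always show the PC; the switch picks what the
    -- right four digits show.
    display_val <= pc_out & instruction when sel = '0' else
                   pc_out & alu_result;
end architecture rtl;
```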

Then there is the need to generate a single step clock.  It's a little tough to see what is happening at 100 MHz.  I'm of the view that simulation isn't hardware so regardless of what the simulator says, I want to see it run in hardware.  I have that code laying around as well.
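One common way to get single stepping is to debounce a push button and turn each press into a one clock wide pulse. The sketch below does that under a few assumptions: a 100 MHz clock, an active high button, and hypothetical entity and port names. It produces an enable pulse rather than a real clock; gating the CPU registers with that pulse is one option, not necessarily what the project above does.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single step pulse generator; names are illustrative.
entity single_step_sketch is
    port (
        clk  : in  std_logic;   -- assumed 100 MHz board clock
        btn  : in  std_logic;   -- raw push button, assumed active high
        step : out std_logic    -- one clk wide pulse per button press
    );
end entity single_step_sketch;

architecture rtl of single_step_sketch is
    signal sync     : std_logic_vector(1 downto 0) := "00";
    signal stable   : std_logic := '0';
    signal stable_d : std_logic := '0';
    signal count    : unsigned(19 downto 0) := (others => '0');
begin
    process (clk)
    begin
        if rising_edge(clk) then
            -- two flop synchronizer for the asynchronous button
            sync <= sync(0) & btn;

            -- debounce: accept a new level only after it has been held
            -- for 2**20 clocks (about 10.5 ms at 100 MHz)
            if sync(1) = stable then
                count <= (others => '0');
            else
                count <= count + 1;
                if count = to_unsigned(2**20 - 1, 20) then
                    stable <= sync(1);
                    count  <= (others => '0');
                end if;
            end if;

            stable_d <= stable;   -- delayed copy for edge detection
        end if;
    end process;

    -- high for exactly one clock after the debounced button goes high
    step <= stable and not stable_d;
end architecture rtl;
```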
 

Offline prophossTopic starter

  • Contributor
  • Posts: 46
  • Country: us
Re: ALU in MIPS
« Reply #6 on: September 09, 2021, 10:05:54 pm »
I have enjoyed your comments as you have been checking whether the code actually works, and I appreciate it! Many times the code has worked like it should and I don't know why. This time I know the code does work!

You had mentioned something about using a constraints file. What does that do? I have never dealt with one when programming my FPGA. I have a cheap Chinese board with a Cyclone IV and very little documentation. I have been able to synthesize and run code but just simple stuff so far. I looked at that Nexys board you are using but that is out of my price range. Sounds great though.
Thanks for playing along.
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9885
  • Country: us
Re: ALU in MIPS
« Reply #7 on: September 10, 2021, 01:22:58 pm »
The constraints file, in this example, defines the clk parameters and all of the IO mapping to pins.

I use a master file from Digilent and just comment out the stuff I'm not using.

There are other things that can be done in a constraints file but I don't know what they are and I have never used them.

Code:
set_property -dict {PACKAGE_PIN E3 IOSTANDARD LVCMOS33} [get_ports clk]
#set_property CLOCK_DEDICATED_ROUTE FALSE [get_nets B16_OBUF];
set_property -dict {PACKAGE_PIN C12 IOSTANDARD LVCMOS33} [get_ports reset]
# Tri-Color LEDs
#set_property -dict {PACKAGE_PIN N16 IOSTANDARD LVCMOS33} [get_ports R17]
#set_property -dict {PACKAGE_PIN R11 IOSTANDARD LVCMOS33} [get_ports G17]
#set_property -dict {PACKAGE_PIN G14 IOSTANDARD LVCMOS33} [get_ports B17]
#set_property -dict {PACKAGE_PIN N15 IOSTANDARD LVCMOS33} [get_ports R16]
#set_property -dict {PACKAGE_PIN M16 IOSTANDARD LVCMOS33} [get_ports G16]
#set_property -dict {PACKAGE_PIN R12 IOSTANDARD LVCMOS33} [get_ports B16]
# 7 Segment Anodes
set_property -dict {PACKAGE_PIN U13 IOSTANDARD LVCMOS33} [get_ports {Digits[7]}]
set_property -dict {PACKAGE_PIN K2  IOSTANDARD LVCMOS33} [get_ports {Digits[6]}]
set_property -dict {PACKAGE_PIN T14 IOSTANDARD LVCMOS33} [get_ports {Digits[5]}]
set_property -dict {PACKAGE_PIN P14 IOSTANDARD LVCMOS33} [get_ports {Digits[4]}]
set_property -dict {PACKAGE_PIN J14 IOSTANDARD LVCMOS33} [get_ports {Digits[3]}]
set_property -dict {PACKAGE_PIN T9  IOSTANDARD LVCMOS33} [get_ports {Digits[2]}]
set_property -dict {PACKAGE_PIN J18 IOSTANDARD LVCMOS33} [get_ports {Digits[1]}]
set_property -dict {PACKAGE_PIN J17 IOSTANDARD LVCMOS33} [get_ports {Digits[0]}]
# 7 Segment Cathodes
set_property -dict {PACKAGE_PIN T10 IOSTANDARD LVCMOS33} [get_ports {Segments[0]}]
set_property -dict {PACKAGE_PIN R10 IOSTANDARD LVCMOS33} [get_ports {Segments[1]}]
set_property -dict {PACKAGE_PIN K16 IOSTANDARD LVCMOS33} [get_ports {Segments[2]}]
set_property -dict {PACKAGE_PIN K13 IOSTANDARD LVCMOS33} [get_ports {Segments[3]}]
set_property -dict {PACKAGE_PIN P15 IOSTANDARD LVCMOS33} [get_ports {Segments[4]}]
set_property -dict {PACKAGE_PIN T11 IOSTANDARD LVCMOS33} [get_ports {Segments[5]}]
set_property -dict {PACKAGE_PIN L18 IOSTANDARD LVCMOS33} [get_ports {Segments[6]}]
set_property -dict {PACKAGE_PIN H15 IOSTANDARD LVCMOS33} [get_ports {Segments[7]}]
# LEDs
#set_property -dict {PACKAGE_PIN H17 IOSTANDARD LVCMOS33} [get_ports {LEDs[0]}]
#set_property -dict {PACKAGE_PIN K15 IOSTANDARD LVCMOS33} [get_ports {LEDs[1]}]
#set_property -dict {PACKAGE_PIN J13 IOSTANDARD LVCMOS33} [get_ports {LEDs[2]}]
#set_property -dict {PACKAGE_PIN N14 IOSTANDARD LVCMOS33} [get_ports {LEDs[3]}]
#set_property -dict {PACKAGE_PIN R18 IOSTANDARD LVCMOS33} [get_ports {LEDs[4]}]
#set_property -dict {PACKAGE_PIN V17 IOSTANDARD LVCMOS33} [get_ports {LEDs[5]}]
#set_property -dict {PACKAGE_PIN U17 IOSTANDARD LVCMOS33} [get_ports {LEDs[6]}]
#set_property -dict {PACKAGE_PIN U16 IOSTANDARD LVCMOS33} [get_ports {LEDs[7]}]
#set_property -dict {PACKAGE_PIN V16 IOSTANDARD LVCMOS33} [get_ports {LEDs[8]}]
#set_property -dict {PACKAGE_PIN T15 IOSTANDARD LVCMOS33} [get_ports {LEDs[9]}]
#set_property -dict {PACKAGE_PIN U14 IOSTANDARD LVCMOS33} [get_ports {LEDs[10]}]
#set_property -dict {PACKAGE_PIN T16 IOSTANDARD LVCMOS33} [get_ports {LEDs[11]}]
#set_property -dict {PACKAGE_PIN V15 IOSTANDARD LVCMOS33} [get_ports {LEDs[12]}]
#set_property -dict {PACKAGE_PIN V14 IOSTANDARD LVCMOS33} [get_ports {LEDs[13]}]
#set_property -dict {PACKAGE_PIN V12 IOSTANDARD LVCMOS33} [get_ports {LEDs[14]}]
#set_property -dict {PACKAGE_PIN V11 IOSTANDARD LVCMOS33} [get_ports {LEDs[15]}]
# Buttons
#set_property -dict {PACKAGE_PIN P17 IOSTANDARD LVCMOS33} [get_ports BtnL]
#set_property -dict {PACKAGE_PIN M17 IOSTANDARD LVCMOS33} [get_ports BtnR]
#set_property -dict {PACKAGE_PIN M18 IOSTANDARD LVCMOS33} [get_ports BtnU]
#set_property -dict {PACKAGE_PIN P18 IOSTANDARD LVCMOS33} [get_ports BtnD]
#set_property -dict {PACKAGE_PIN N17 IOSTANDARD LVCMOS33} [get_ports BtnC]
# Switches
#set_property -dict {PACKAGE_PIN J15 IOSTANDARD LVCMOS33} [get_ports {Sw[0]}]
#set_property -dict {PACKAGE_PIN L16 IOSTANDARD LVCMOS33} [get_ports {Sw[1]}]
#set_property -dict {PACKAGE_PIN M13 IOSTANDARD LVCMOS33} [get_ports {Sw[2]}]
#set_property -dict {PACKAGE_PIN R15 IOSTANDARD LVCMOS33} [get_ports {Sw[3]}]
#set_property -dict {PACKAGE_PIN R17 IOSTANDARD LVCMOS33} [get_ports {Sw[4]}]
#set_property -dict {PACKAGE_PIN T18 IOSTANDARD LVCMOS33} [get_ports {Sw[5]}]
#set_property -dict {PACKAGE_PIN U18 IOSTANDARD LVCMOS33} [get_ports {Sw[6]}]
#set_property -dict {PACKAGE_PIN R13 IOSTANDARD LVCMOS33} [get_ports {Sw[7]}]
#set_property -dict {PACKAGE_PIN T8  IOSTANDARD LVCMOS33} [get_ports {Sw[8]}]
#set_property -dict {PACKAGE_PIN U8  IOSTANDARD LVCMOS33} [get_ports {Sw[9]}]
#set_property -dict {PACKAGE_PIN R16 IOSTANDARD LVCMOS33} [get_ports {Sw[10]}]
#set_property -dict {PACKAGE_PIN T13 IOSTANDARD LVCMOS33} [get_ports {Sw[11]}]
#set_property -dict {PACKAGE_PIN H6  IOSTANDARD LVCMOS33} [get_ports {Sw[12]}]
#set_property -dict {PACKAGE_PIN U12 IOSTANDARD LVCMOS33} [get_ports {Sw[13]}]
#set_property -dict {PACKAGE_PIN U11 IOSTANDARD LVCMOS33} [get_ports {Sw[14]}]
#set_property -dict {PACKAGE_PIN V10 IOSTANDARD LVCMOS33} [get_ports {Sw[15]}]
# Serial Port
#set_property -dict {PACKAGE_PIN C4  IOSTANDARD LVCMOS33} [get_ports Tx]
#set_property -dict {PACKAGE_PIN D4  IOSTANDARD LVCMOS33} [get_ports Rx]
#set_property -dict {PACKAGE_PIN D3  IOSTANDARD LVCMOS33} [get_ports CTS]
#set_property -dict {PACKAGE_PIN E5  IOSTANDARD LVCMOS33} [get_ports RTS]
# PMod JA Pin 1
#set_property -dict {PACKAGE_PIN C17 IOSTANDARD LVCMOS33} [get_ports BaudClk]
#set_property -dict {PACKAGE_PIN E18 IOSTANDARD LVCMOS33} [get_ports SerIn]
#set_property -dict {PACKAGE_PIN D18 IOSTANDARD LVCMOS33} [get_ports KBdataReady]

create_clock -period 10.000 -name clk  -waveform {0.000  5.000} -add [get_ports -filter { NAME =~ "*clk*"  && DIRECTION == "IN" }]


This file works with my modified project where I took alu_result and pc_out and ran them through a 7 segment display entity, so the outputs are now Digits and Segments, both 8 bit vectors.  Neither of the original vectors now connects externally.

The clk signal is on 2 lines at the top of the file (one is commented out) and once again at the very bottom.  Awkward, but it works.

I leave the commented definitions for future expansion.  I never remove the unused/commented definitions.
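For anyone wondering what a 7 segment display entity like the one mentioned above could look like, here is a minimal sketch. It is not the actual entity from the modified project. It assumes a 100 MHz clock, active low anodes and cathodes, and that Segments(0) through Segments(6) are segments a through g with Segments(7) as the decimal point. Only the Digits and Segments port names come from the constraints file; the entity name and the value port are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical multiplexed 8 digit hex display driver.
entity seven_seg_sketch is
    port (
        clk      : in  std_logic;                      -- assumed 100 MHz
        value    : in  std_logic_vector(31 downto 0);  -- 8 hex digits to show
        Digits   : out std_logic_vector(7 downto 0);   -- anodes, assumed active low
        Segments : out std_logic_vector(7 downto 0)    -- dp,g..a, assumed active low
    );
end entity seven_seg_sketch;

architecture rtl of seven_seg_sketch is
    signal refresh : unsigned(19 downto 0) := (others => '0');
    signal nibble  : std_logic_vector(3 downto 0);
begin
    -- free running counter; the top 3 bits pick the active digit
    process (clk)
    begin
        if rising_edge(clk) then
            refresh <= refresh + 1;
        end if;
    end process;

    -- enable one digit at a time and pick the matching 4 bit slice
    process (refresh, value)
        variable sel : integer range 0 to 7;
    begin
        sel := to_integer(refresh(19 downto 17));
        Digits      <= (others => '1');   -- all digits off
        Digits(sel) <= '0';               -- selected digit on
        nibble      <= (others => '0');
        for i in 0 to 7 loop
            if i = sel then
                nibble <= value(4*i + 3 downto 4*i);
            end if;
        end loop;
    end process;

    -- hex to segment pattern, '0' lights a segment, decimal point off
    with nibble select
        Segments <= x"C0" when x"0", x"F9" when x"1", x"A4" when x"2",
                    x"B0" when x"3", x"99" when x"4", x"92" when x"5",
                    x"82" when x"6", x"F8" when x"7", x"80" when x"8",
                    x"90" when x"9", x"88" when x"A", x"83" when x"B",
                    x"C6" when x"C", x"A1" when x"D", x"86" when x"E",
                    x"8E" when others;
end architecture rtl;
```

With a 20 bit counter at 100 MHz, each digit is lit for about 1.3 ms and the whole display refreshes roughly 95 times per second.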
« Last Edit: September 10, 2021, 01:58:59 pm by rstofer »
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9885
  • Country: us
Re: ALU in MIPS
« Reply #8 on: September 10, 2021, 01:56:38 pm »
Quote from: prophoss on September 09, 2021, 10:05:54 pm
I looked at that Nexys board you are using but that is out of my price range. Sounds great though.
Thanks for playing along.

There are other ways to get the same type of gadgets.  You can use an IO expander like the Microchip MCP23008 for simple input/output.  It is an 8 bit part, so a bank of 16 switches takes two of them or a 16 bit part such as the MCP23S17.  It can also drive LEDs at a reasonable current (note there is a package current limit as well as a pin current limit).

https://www.mouser.com/datasheet/2/268/21919b-65915.pdf

I used the Microchip MCP23S17 for 16 console entry switches.

Maxim makes a 7 segment driver that will drive 16 7-segment display digits.  I used this on a project a long while back.  There are 8 digit drivers as well.  One way to do this is to create a PCB with switches and displays and separate SPI channels.  This can be a universal gadget used on all projects.  Just like the displays and switches on the Nexys 4 DDR board.

You would have to code up an SPI entity to interface the FPGA to the IO devices but SPI is easy.  I would never even think about using I2C.
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9885
  • Country: us
Re: ALU in MIPS
« Reply #9 on: September 10, 2021, 02:06:46 pm »
Quote from: prophoss on September 07, 2021, 01:18:13 am
On the same website there is another ALU with a 4 bit op code by itself that has multiplication, division, left and right shift, and so on as well as the above mentioned functions. I guess I'm kind of wondering why they didn't use the same thing for both? Still learning here and am curious of y'alls opinions. Thanks ahead of time.

Remember, this project is for a 1 cycle MIPS CPU.  Some instructions in the expanded CPU might not work in 1 cycle.  Very wide barrel shifters tend to add a lot of logic and the alternative of shifting one position per clock is clearly incompatible with the 1 cycle objective.  Division won't work in 1 cycle and multiplication might but might not, depending on whether the FPGA has built in multipliers.  Many do...
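As a rough illustration of what the single cycle versions of these operations look like, here is a sketch of a combinational barrel shifter and a 16 x 16 multiply written with numeric_std. The entity and port names are hypothetical. Whether the multiply maps onto built in multipliers, and whether either path meets timing in one cycle, depends on the FPGA and the tools. Division is left out because, as noted above, it won't fit in 1 cycle.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single cycle shift and multiply; names are illustrative.
entity shift_mul_sketch is
    port (
        a       : in  std_logic_vector(15 downto 0);
        b       : in  std_logic_vector(15 downto 0);
        shl_out : out std_logic_vector(15 downto 0);
        mul_out : out std_logic_vector(31 downto 0)
    );
end entity shift_mul_sketch;

architecture rtl of shift_mul_sketch is
begin
    -- Barrel shifter: shifts a left by 0 to 15 places in one step.
    -- The shift amount is taken from the low 4 bits of b.
    shl_out <= std_logic_vector(
                   shift_left(unsigned(a), to_integer(unsigned(b(3 downto 0)))));

    -- Unsigned multiply: a 16 bit by 16 bit product is 32 bits wide.
    mul_out <= std_logic_vector(unsigned(a) * unsigned(b));
end architecture rtl;
```

The alternative of shifting one position per clock would need a counter and a small state machine, which is exactly what breaks the 1 cycle objective.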
 

Offline prophossTopic starter

  • Contributor
  • Posts: 46
  • Country: us
Re: ALU in MIPS
« Reply #10 on: September 20, 2021, 12:01:43 am »
@rstofer
I have been thinking about what you have been doing. I could copy and paste the code but implementing the MIPS core on my dev board is kind of the point. I have been learning more about MIPS for the past week or so. I did not realize how different the assembly code was for each kind of processor. The assembly I learned in class was for an STM32 kind of processor. All the basics are there but the actual code is different enough that I had to go looking to see what each op-code was doing. To be honest I still don't quite know what sliu is doing in this implementation.
Doing all the reading has got me wondering what would be the best way to learn the architecture? I love learning about how it all works and playing with the assembly code. It seems like a good way to see what each part of the code actually does. Kind of like figuring out how the abstraction layer was created. Just thought I'd see what you thought about it.
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9885
  • Country: us
Re: ALU in MIPS
« Reply #11 on: September 20, 2021, 02:40:31 am »
I'm not certain of the mnemonic for sliu but if you look in the control_unit file, sliu sets alu_op to "10".  Then follow this into ALU_control and see where ALU_control is set to "100", and then follow along into the ALU to see where it says if a<b then the destination becomes x"0001" else x"0000".

So, it does a comparison on the two source registers and puts a logical value into the result register.

There's not a lot of architecture involved.  You have memory, a control unit, some registers and an ALU.  You don't have TRAP instructions, there is no interrupt structure, DMA is completely out the window...  There just isn't a lot to it.  One way to draw the architecture is to draw rectangular blocks for all the entities and draw some interconnecting lines.  Everything goes through MIPS_VHDL so all the signals are at least routed through there whether they are used in that component or not.

This is a minimal CPU and it might be worthwhile for that feature alone but 256 words of RAM isn't much of a machine.

Once again, I'm going to bring up LC3 - a RISC processor that has terrific documentation as well as a book.  Google for 'LC3' and you will find a lot of universities using this book/CPU as an undergrad course in computer architecture.  In the meantime, look at the appendices:

https://www.cs.utexas.edu/users/fussell/courses/cs310h/lectures/Lecture_10-310h.pdf
https://www.cs.colostate.edu/~cs270/.Spring21/resources/PattPatelAppA.pdf

http://people.cs.georgetown.edu/~squier/Teaching/HardwareFundamentals/LC3-trunk/docs/LC3-uArch-PPappendC.pdf

Appendix C gives the block diagram and the state transition diagram as well as a coding form for the microcode for those who use the author's approach.  I used a case statement written exactly like the transition diagram but I really like the idea of using loadable microcode.  Either way...

At line 437 in LC3.vhd (in attached .zip file), you can see the beginning of the case statement.  Note how everything follows exactly from that diagram.  Easy!

If you look at Figure C-4, you can see the block diagram and when you compare it to LC3.vhd, the code should pretty much match.  All the muxen(tm) are laid out exactly according to the block diagram.  There is also a .hex file that contains a core load for an initial program.  It just implements interrupt driven IO over a serial port.  The important part is that the interrupt system works.  Check LC3.lst for the source program.  Lower memory is reserved for traps and interrupt vectors so it drags on a little.  There is an assembler manual here:

http://www.cs.binghamton.edu/~tbarten1/CS120_Summer_2013/ClassNotes/L10-LC3_RR.htm

There's an assembler out there somewhere.  I had it but I think I may have lost it momentarily.  Google for 'LC3 tools'

Here is the 3rd edition of the book, it is newer than the one I used (2nd edition).  Mine has a blue cover and it matches the appendices that I linked above.

https://www.alibris.com/Introduction-to-Computing-Systems-From-Bits-and-Gates-to-C-C-Beyond-Yale-N-Patt/book/42539098

At some point, the authors changed to byte addressable memory and the new project is LC3b and you can find that project out on the Internet as well.  I haven't implemented it although I do have the newer book.  I'm not sure about tools for that project.  Is there a different assembler?  Beats me!

 
« Last Edit: September 20, 2021, 05:47:21 pm by rstofer »
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9885
  • Country: us
Re: ALU in MIPS
« Reply #12 on: September 20, 2021, 07:02:00 pm »
Quote from: prophoss on September 20, 2021, 12:01:43 am
@rstofer
I have been thinking about what you have been doing. I could copy and paste the code but implementing the MIPS core on my dev board is kind of the point. I have been learning more about MIPS for the past week or so. I did not realize how different the assembly code was for each kind of processor. The assembly I learned in class was for an STM32 kind of processor. All the basics are there but the actual code is different enough that I had to go looking to see what each op-code was doing.

Whether I am writing in assembly code or Fortran, I find it helpful to lay out the blocks with some kind of pseudo-code.  The relationship between pseudo operations and any particular instruction is pretty flexible.  Later, when I understand the nature of the various blocks, I start writing the actual code.

There's a programming practice known as "Top Down Design" which I follow up with "Bottom Up Coding".  I get the UART working first and leave the fancy stuff for later.  Once I can get debug output, I'm on my way.  Next up?  Functions to print formatted IO.  Not necessarily a full-blown printf() but things like putstr(), atoi(), some kind of hex output for nibbles, bytes, shorts and words.  That kind of thing.  I grab all of my string functions from "The C Programming Language" by Kernighan and Ritchie.

https://www.techopedia.com/definition/9744/top-down-design

« Last Edit: September 20, 2021, 07:03:43 pm by rstofer »
 

