Author Topic: The NEORV32 Risc-V Processor  (Read 2715 times)

Offline betocool

  • Regular Contributor
  • *
  • Posts: 57
  • Country: au
The NEORV32 Risc-V Processor
« on: June 21, 2022, 01:59:21 am »
Hi all,

I searched through the forum just to make sure I'm not repeating someone else's information. It turns out the term "neorv32" appears only once, in a post I made last year, and just in passing. As luck would have it, I was redirected to the NeoRV32 project in my interest to get a soft Risc-V processor into an FPGA. I'd looked at Litex, VexRiscV, PicoRV32, and while they were OK, either the documentation was lacking, or you'd need to invest A LOT of time to get something decent up and running, or you'd have to dabble in Migen, which is a Python way of describing hardware. Mainly it all felt a bit cumbersome.

But back on track, I went and looked into https://www.neorv32.org/, and looked at both the user manual and the datasheet.

This is so far the best and most understandable Risc-V implementation I've come across. After investing some time in reading the details, I was able to have a program compiled and up and running in minutes, on both Xilinx and Altera dev kits. The community also seems very open to questions. In my own personal opinion, the documentation and help seem less obscure than in some other open source projects I've come across.

The official repo has test examples based on hardware and software. There are a few simple tools that help you get that C code flashed into the FPGA as well, so your design is persistent. All the code is well documented and very well written, IMO.

If you're looking to get a Risc-V processor running on an FPGA, have a look at this; it's a very good place to get started. I'm curious to see what you guys think if you get around to it.

Cheers,

Alberto
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: The NEORV32 Risc-V Processor
« Reply #1 on: June 21, 2022, 03:34:23 am »
I haven't tried it, but the docs make it look well done.

I'm led to believe that LiteX (https://github.com/enjoy-digital/litex) wraps all kinds of cores, not only RISC-V, in a SoC for FPGA and makes it easy to get them going. Neorv32 is just one of the supported cores. There are also Cortex-M1, Zynq7000 (C-A9), and OpenRISC, in addition to more than a dozen RISC-V cores.

https://github.com/enjoy-digital/litex/tree/master/litex/soc/cores/cpu
 

Offline betocool

  • Regular Contributor
  • *
  • Posts: 57
  • Country: au
Re: The NEORV32 Risc-V Processor
« Reply #2 on: June 21, 2022, 04:16:14 am »
I have not looked into Litex in detail, but I did run one example, and it was as easy as "python --run --processor..." or something along those lines. In that sense it does work out of the box. Trying to make sense of what was happening in the background and how the modules are all tied up together is a bit more difficult, and let's face it, the documentation for Litex is rather lax in that sense.

I will look into Litex more in the future, and yes, they do have a wrapper for the NeoRV32 core, so it'd be interesting to see how that may add components like SDRAM and DDR3 memories and such. I suppose once you understand what happens in the background things will get easier.

Cheers,

Alberto
 

Offline miken

  • Regular Contributor
  • *
  • Posts: 86
  • Country: us
Re: The NEORV32 Risc-V Processor
« Reply #3 on: June 21, 2022, 05:30:08 am »
I agree, the NEORV32 seems like the friendliest RISC-V core to adopt. (As long as you're not allergic to VHDL.) I'm using it for my SQRL Acorn example projects.

The documentation is lighter on some of the features but since it's open source it's easy enough to investigate exactly how things work. And there are good SW examples for us hardware people to abuse ;D
 

Offline laugensalm

  • Regular Contributor
  • *
  • Posts: 96
  • Country: ch
Re: The NEORV32 Risc-V Processor
« Reply #4 on: June 21, 2022, 07:19:26 am »
I've evaluated a few RISC-V implementations in VHDL, and this is so far the only well-advertised implementation that seems to be compliant. Unfortunately it did not exist back then; however, its predecessor, 'neo430', could be integrated into an open-source SoC generator ('masocist') quite effectively.
The `f32c` and `potato` implementations did not pass the regression tests. In fact, a specific implementation of the f32c required a modified toolchain. I don't know if this is still the case, but it was unacceptable back then.
From a technical point of view, the neorv32 does not use the full potential of the RISC-V architecture with respect to pipelining, but normally this is not an issue.
The big plus with decent open-source VHDL implementations that run with GHDL is that you can legally distribute an executable of a virtual chip (i.e. without violating the GPL). Plus, passing GHDL raises the chances of being portable among different FPGA architectures, provided that the toolchain eats VHDL. The neo430 synthesized with the open-source toolchain yosys through the ghdl plugin; I'm quite sure neorv32 would, too.

 

Online pbernardi

  • Contributor
  • Posts: 7
  • Country: br
Re: The NEORV32 Risc-V Processor
« Reply #5 on: June 27, 2022, 12:59:52 am »
One could also take a look at DarkRiscV:

https://opencores.org/projects/darkriscv
https://github.com/darklife/darkriscv

A small and compact RISCV implementation, the original version was made in one night! Still an active project, with periodic updates.
 

Offline dolbeau

  • Regular Contributor
  • *
  • Posts: 69
  • Country: fr
Re: The NEORV32 Risc-V Processor
« Reply #6 on: July 01, 2022, 10:00:10 am »
Quote
(...) in my interest to get a soft Risc-V processor into an FPGA. I'd looked at Litex

Litex is a SoC generator; it can wrap several different cores (some in SMP) and supply a lot of optional peripherals. There's a repo to help generate a Linux-bootable SoC (such as linux-on-litex-vexriscv).

Quote
VexRiscV

That's what I usually use, because it's extremely configurable. At one end I've used it as an SMP Linux test platform, starting with quad-core RV32GC and adding parts of B, K and testing some stuff from P. At the other end I've used it as a micro-controller RV32I running some bare-metal code, customized to improve bandwidth usage by using a 128-bit Wishbone bypass interface to a 128-bit LiteDRAM controller (DDR3) and some custom 64-bit load/store (using pairs of 32-bit registers).

If you want something faster, Vex has a newer, superscalar OoO brother, NaxRiscv.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: The NEORV32 Risc-V Processor
« Reply #7 on: July 01, 2022, 10:50:24 am »
I've been following NaxRISCV for a little while. It's improving all the time, but it still got a fair bit lower benchmark IPC than WD SWeRV and SiFive U7 dual-issue in-order cores. Maybe it can do higher frequency to make up for it, I'm not sure.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 10170
  • Country: fr
Re: The NEORV32 Risc-V Processor
« Reply #8 on: July 01, 2022, 05:52:47 pm »
The NEORV32 is good, well written and well documented.

The downside is the performance - with an average CPI of 3-4, it's meh, so keep this in mind when comparing with other RISC-V soft cores. And comparing to actual silicon, a NEORV32 clocked at say, 100 MHz, will be very, very far in terms of performance from say, a SiFive FE310 clocked at the same frequency. Just so you know what to expect.

With that in mind, it can still be pretty useful.
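
To put rough numbers on the CPI point: instruction throughput is simply clock frequency divided by average cycles per instruction. The sketch below works this out for the 100 MHz example. The helper name `mips` and the CPI of 1 reference line are assumptions added for illustration only; they are not measured figures for the FE310 or any other chip.

```python
def mips(clock_mhz: float, cpi: float) -> float:
    """Million instructions per second for a core running at clock_mhz
    with an average of cpi clock cycles per instruction."""
    return clock_mhz / cpi


if __name__ == "__main__":
    # CPI range quoted in the post above.
    for cpi in (3.0, 4.0):
        print(f"100 MHz at CPI {cpi:.0f}: {mips(100.0, cpi):.1f} MIPS")
    # Hypothetical reference point (assumption): an ideal core retiring
    # one instruction per cycle at the same clock.
    print(f"100 MHz at CPI 1: {mips(100.0, 1.0):.1f} MIPS")
```

So at the same clock, a CPI of 3 to 4 costs roughly a factor of 3 to 4 against an ideal one-instruction-per-cycle core, before memory effects are even considered.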
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: The NEORV32 Risc-V Processor
« Reply #9 on: July 01, 2022, 11:51:06 pm »
I don't know about NEORV32, but an explicit part of the design of PicoRV32 was that if the other parts of your FPGA design allow clocking at 300 or 400 MHz then you don't have to put the CPU core in a different clock domain -- it just takes 3 or 4 cycles per instruction and sits there happily doing its 100 or whatever MIPS.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 10170
  • Country: fr
Re: The NEORV32 Risc-V Processor
« Reply #10 on: July 02, 2022, 12:47:14 am »
Quote from: brucehoult
I don't know about NEORV32, but an explicit part of the design of PicoRV32 was that if the other parts of your FPGA design allow clocking at 300 or 400 MHz then you don't have to put the CPU core in a different clock domain -- it just takes 3 or 4 cycles per instruction and sits there happily doing its 100 or whatever MIPS.

Yes, but I don't think the NEORV32 can be clocked nearly as fast as this, at least not on "modest" FPGAs. Maybe on the Virtex series. =)
They show something like 108 MHz Fmax on the Cyclone IV (or V?) IIRC.
 

Offline dolbeau

  • Regular Contributor
  • *
  • Posts: 69
  • Country: fr
Re: The NEORV32 Risc-V Processor
« Reply #11 on: July 05, 2022, 11:11:12 am »
Quote from: brucehoult
I've been following NaxRISCV for a little while. It's improving all the time, but it still got a fair bit lower benchmark IPC than WD SWeRV and SiFive U7 dual-issue in-order cores. Maybe it can do higher frequency to make up for it, I'm not sure.

Well, Nax has been a single-person job for less than a year; the fact it can reasonably be compared to commercial offerings is already an impressive result :-) Also, it's highly configurable, can be either RV32 or RV64, and is free.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: The NEORV32 Risc-V Processor
« Reply #12 on: July 05, 2022, 12:01:07 pm »
Quote from: brucehoult
I've been following NaxRISCV for a little while. It's improving all the time, but it still got a fair bit lower benchmark IPC than WD SWeRV and SiFive U7 dual-issue in-order cores. Maybe it can do higher frequency to make up for it, I'm not sure.

Quote from: dolbeau
Well, Nax has been a single-person job for less than a year; the fact it can reasonably be compared to commercial offerings is already an impressive result :-) Also, it's highly configurable, can be either RV32 or RV64, and is free.

Fair enough.

Here's a single-person, about-one-year OoO RISC-V core that can do up to 8 instructions per clock and currently gets 6.5 DMIPS/MHz (vs NaxRiscv 2.94). This is quite close to the 6.5 for SiFive's P550 core, due to appear in Intel's "Horse Creek" platform shortly.

http://www.moonbaseotago.com/

GPL3, which some people count as free.
 

Offline dolbeau

  • Regular Contributor
  • *
  • Posts: 69
  • Country: fr
Re: The NEORV32 Risc-V Processor
« Reply #13 on: July 05, 2022, 12:32:32 pm »
Quote from: brucehoult
Here's a single-person about a year OoO RISC-V core that can do up to 8 instructions per clock and currently gets 6.5 DMIPS/MHz (vs NaxRiscv 2.94)

Oh, nice! Didn't know that one.

Fast, but quoting the architectural presentation (as I can't find any specific numbers for e.g. LUT usage):
Quote
* Xilinx VU9P Ultrascale (which is really 3 dies)
* Cut down to fit (...)

Ouch. VU9P are not exactly cheap - quick googling suggests they are 5-digits in small quantity. NaxRiscv will fit in a sub-$100 mid-range Artix-7, and with 4x the clock (presumably, they are targeting ASICs rather than FPGAs so the trade-offs and targeted clock are quite different).

With the number of RISC-V (soft-)cores now available, there should be something adequate for most use cases.

(edit: typo)
« Last Edit: July 05, 2022, 12:34:15 pm by dolbeau »
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: The NEORV32 Risc-V Processor
« Reply #14 on: July 05, 2022, 09:58:18 pm »
Quote from: dolbeau
Fast, but quoting the architectural presentation (as I can't find any specific numbers for e.g. LUT usage):
Quote
* Xilinx VU9P Ultrascale (which is really 3 dies)
* Cut down to fit (...)

Ouch. VU9P are not exactly cheap - quick googling suggests they are 5-digits in small quantity. NaxRiscv will fit in a sub-$100 mid-range Artix-7, and with 4x the clock (presumably, they are targeting ASICs rather than FPGAs so the trade-offs and targeted clock are quite different).

Right. It's a big OoO core. I don't know how you'd make a useful small OoO core :-)

SiFive were able to prototype E31 and single U54 and U74 cores in an Arty (only just in the case of U74). To prototype the whole 5-core U54-MC for the HiFive Unleashed, a $3500 VC707 FPGA board was needed (now $5244). For the U74-MC in the HiFive Unmatched, a $7000 (in 2019) VCU118 was needed (well, they were $7000 in 2017; now the Xilinx site says $8394, and Digikey says $9496 and out of stock).

If I recall correctly, the VCU118 is enough for a single U84 / P550 core.

I'm not pointing to Vroom to say "Hey, you should use this in your $100 FPGA" but "this is what a one-person team can do". And, yeah, it will be aimed at ASIC implementation once completed.
 

Offline betocool

  • Regular Contributor
  • *
  • Posts: 57
  • Country: au
Re: The NEORV32 Risc-V Processor
« Reply #15 on: July 06, 2022, 01:07:39 pm »
What do you guys mean by OoO?

Cheers,

Alberto
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: The NEORV32 Risc-V Processor
« Reply #16 on: July 06, 2022, 01:20:40 pm »
Quote from: betocool
What do you guys mean by OoO?

Out of Order.

Instructions don't execute in the same order they are in the program. They go into some kind of big pool of instructions, and at each clock cycle instructions are checked to see if their input data is ready (they may be waiting on e.g. reads from memory, possibly cache misses, or long instructions such as multiply or divide). One or more of the ready instructions are then sent to execution units, depending on how many execution units you have.

In current x86 and ARM CPUs this instruction pool is typically something like 192 or 224 instructions in size or even more -- Apple's M1 appears to be something like 630.
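
A toy model can make the idea concrete. The sketch below is not how any real core is built: it assumes a made-up five-instruction program, made-up latencies (a 10-cycle load standing in for a cache miss), and a single execution unit. The names `Instr`, `run` and `PROGRAM` are invented for the example. It only shows the "pick any ready instruction from the pool" behaviour described above, compared with strict program order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    deps: tuple      # names of earlier instructions whose results are needed
    latency: int     # cycles from issue until the result is available


def run(program, width, out_of_order):
    """Return the cycle at which the last result becomes available.

    width is the number of instructions that may issue per cycle.
    Every name in deps must belong to an earlier instruction in program.
    """
    ready_at = {}             # name -> cycle at which its result is ready
    pending = list(program)   # the instruction pool, kept in program order
    cycle = 0
    while pending:
        issued = []
        for ins in pending:
            if len(issued) == width:
                break
            operands_ready = all(
                ready_at.get(d, float("inf")) <= cycle for d in ins.deps
            )
            if operands_ready:
                issued.append(ins)
            elif not out_of_order:
                break         # in-order: a stalled instruction blocks the rest
        for ins in issued:
            ready_at[ins.name] = cycle + ins.latency
            pending.remove(ins)
        cycle += 1
    return max(ready_at.values())


# Hypothetical program: a slow load, one user of it, then independent adds.
PROGRAM = [
    Instr("load", (), 10),
    Instr("use", ("load",), 1),
    Instr("add1", (), 1),
    Instr("add2", ("add1",), 1),
    Instr("add3", ("add2",), 1),
]

if __name__ == "__main__":
    print("in-order cycles:    ", run(PROGRAM, width=1, out_of_order=False))
    print("out-of-order cycles:", run(PROGRAM, width=1, out_of_order=True))
```

With these assumed latencies the in-order run finishes at cycle 14 and the out-of-order run at cycle 11, because the three adds execute while the load is still outstanding.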
 

Offline laugensalm

  • Regular Contributor
  • *
  • Posts: 96
  • Country: ch
Re: The NEORV32 Risc-V Processor
« Reply #17 on: July 06, 2022, 01:34:05 pm »
Stepping in: Check for 'out of order' execution. It's roughly speaking a hardware based rescheduling of instructions of different classes that can run in parallel on multiple resource instances while minimizing pipeline stalls. For instance, there could be an arithmetic insn, like ADD, plus a branch depending on a condition that was met earlier (independent from the ADD operands). Those could be executed simultaneously.
Since RISC-V has still inherited plenty from old MIPS concepts (aside from branch delay slot), there's some material to read from the MIPS 10k CPU designs that still applies.
However, OoO introduces some complexity that makes it close to impossible for devs@home to get full verification coverage, so there are caveats with such designs (as we had to learn from the Intel architectures).
I doubt designs of that sort really pay off on FPGAs due to their longer pipelines and way more logic congestion, so, I'd second this is rather for prototyping with an ASIC in mind, eventually.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: The NEORV32 Risc-V Processor
« Reply #18 on: July 06, 2022, 02:00:41 pm »
Why are people always so boringly interested in performance?!? WHY!????:-//

Jesus, Pipeline and Cache are evil, especially for hobby stuff, especially when you don't have the effort/time to carry their bad effect, and you randomly see topics over topics about people asking "why my debugger behaves so weird .... " it's due to the pipeline, it's due to the cache....


oh, so why do people insist with that bloody stuff? Why?  :-//

A softcore without a pipeline is not inferior, it's better by several hundred of magnitude, starting with "because it's simple" and ending with "because it doesn't add any bad side effects"

Don't you hear me? I will make t-shirts with that message :D
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 10170
  • Country: fr
Re: The NEORV32 Risc-V Processor
« Reply #19 on: July 06, 2022, 06:18:58 pm »
Quote from: brucehoult
Apple's M1 appears to be something like 630.

Wow.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: The NEORV32 Risc-V Processor
« Reply #20 on: July 07, 2022, 03:02:50 am »
Quote from: DiTBho
Jesus, Pipeline and Cache are evil, especially for hobby stuff, especially when you don't have the effort/time to carry their bad effect, and you randomly see topics over topics about people asking "why my debugger behaves so weird .... " it's due to the pipeline, it's due to the cache....

If that actually happens then that's simply a bug in the implementation.

The ISA specifies the semantics of programs (except timing), and if some "clever" microarchitecture fails to implement those semantics correctly and transparently then that microarchitecture is buggy. Not that that has never happened, but when it does it's clearly a bug, and fixable in a revised hardware design.

DRAM access times haven't increased all that much since the 1970s, so if we didn't have caches and pipelines and branch prediction and OoO our desktop PCs would still be running at maybe 16 MIPS total on shared RAM instead of 5-10 billion IPS per core.

When you're controlling physical things outside the computer you need to have enough performance to do the job, it needs to reliably meet deadlines without glitches, and being 2x or 10x or 100x faster doesn't bring any benefit.

But there are lots of applications where being 500x faster, on average, is a huge advantage, even if it's without 100% predictable timing in any particular instance.
 

Online ejeffrey

  • Super Contributor
  • ***
  • Posts: 2966
  • Country: us
Re: The NEORV32 Risc-V Processor
« Reply #21 on: July 07, 2022, 04:16:43 am »
Quote from: brucehoult
When you're controlling physical things outside the computer you need to have enough performance to do the job, it needs to reliably meet deadlines without glitches, and being 2x or 10x or 100x faster doesn't bring any benefit.

But there are lots of applications where being 500x faster, on average, is a huge advantage, even if it's without 100% predictable timing in any particular instance.

Also, for those same physical things, a computer that runs at 90% of the speed needed to meet deadlines is completely worthless.

Quote from: DiTBho
Jesus, Pipeline and Cache are evil, especially for hobby stuff, especially when you don't have the effort/time to carry their bad effect, and you randomly see topics over topics about people asking "why my debugger behaves so weird .... " it's due to the pipeline, it's due to the cache....

Cache and pipelining are essential to getting high performance out of modern processors.  While they (especially cache) do make it difficult to give exact timing values, in many cases an "unpredictable" cached, pipelined CPU will have a worst case performance that is orders of magnitude faster than if it were running on a simple non-pipelined CPU.  I'm not talking about a pathological program designed to have a cache miss and/or pipeline flush on every cycle, but for a specific program that you actually care about you can often calculate the cache miss rate and show that it will never miss a deadline.
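
As a rough illustration of that last point, a deadline check can be as simple as bounding the cycle count from an upper limit on cache misses. This is only a sketch under invented numbers: the function name `worst_case_us`, the instruction count, the miss bound and the 40-cycle miss penalty are all assumptions, not figures from any real system.

```python
def worst_case_us(instructions: int, base_cpi: float, max_misses: int,
                  miss_penalty_cycles: int, clock_mhz: float) -> float:
    """Upper bound on execution time in microseconds, assuming every
    instruction costs base_cpi cycles and at most max_misses cache misses
    occur, each adding miss_penalty_cycles."""
    cycles = instructions * base_cpi + max_misses * miss_penalty_cycles
    return cycles / clock_mhz   # cycles / (cycles per microsecond)


if __name__ == "__main__":
    # Hypothetical control loop: 2000 instructions, at most 50 misses,
    # 40 cycles per miss, 100 MHz clock, 50 us deadline.
    bound = worst_case_us(2000, 1.0, 50, 40, 100.0)
    deadline_us = 50.0
    print(f"worst case {bound:.1f} us, meets deadline: {bound <= deadline_us}")
```

The cached core's timing is not exact here, but the bound is, and that is what a deadline argument needs.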

I'm working on a hard realtime system that has many deadlines measured in nanoseconds.  I'm using tightly coupled memory with no cache, but without a pipelined core it would be completely impossible.  Furthermore, while the process is hard real time, the more performance I can get, the more complex logic the users can implement, so more is better.  I'd love to use an OoO superscalar core, but those don't fit particularly well on most FPGAs.
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2103
  • Country: ca
Re: The NEORV32 Risc-V Processor
« Reply #22 on: July 07, 2022, 05:36:50 am »
Quote from: DiTBho
Why are people always so boringly interested in performance?!? WHY!????:-//

Jesus, Pipeline and Cache are evil, especially for hobby stuff, especially when you don't have the effort/time to carry their bad effect, and you randomly see topics over topics about people asking "why my debugger behaves so weird .... " it's due to the pipeline, it's due to the cache....


oh, so why do people insist with that bloody stuff? Why?  :-//

A softcore without a pipeline is not inferior, it's better by several hundred of magnitude, starting with "because it's simple" and ending with "because it doesn't add any bad side effects"

Don't you hear me? I will make t-shirts with that message :D

Man, you're sooooo out of touch with reality, it's not even funny. ALL MCUs designed in the last decade (or even more) are pipelined! If you don't know how to debug pipeline, you'd better figure it out, and fast!

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: The NEORV32 Risc-V Processor
« Reply #23 on: July 07, 2022, 07:22:48 am »
Quote from: asmi
Man, you're sooooo out of touch with reality, it's not even funny. ALL MCUs designed in the last decade (or even more) are pipelined!

I am tired of seeing comments like this.

Really, this forum has gotten so f*cking depressing recently that answering has gotten so f*cking frustrating, especially when you comment with pleasantness and you get answers whose only purpose is to demotivate those who aim for simplicity.

Not mentioning that every single university project is multi-cycle-based rather than pipeline-based.

Not mentioning that every single project that aims for real-time and cycle-precision is multi-cycle-based rather than pipeline-based.

Not mentioning that every single project where you have to deal with a pipeline needs "critical code" and instructions like { sync, isync, fence, ... } which (in every PowerPC book) are described as "voodoo black magic code"

etc

All idiots and sooooooooooooooooooo out of touch with reality, I guess.


But, frankly I don't care. I am going to finish my coffee, take my bicycle and run for 90 Km to the lake. I will also have a swim with girls in bikini, you can stay here pumping your ego, and have fun with whatever you believe it's funny  :-+
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: The NEORV32 Risc-V Processor
« Reply #24 on: July 07, 2022, 07:25:16 am »
Quote from: ejeffrey
without a pipelined core it would be completely impossible

Oh, well, 60 MHz clock, 6-stage multi-cycle, which means a 10 MHz instruction rate.
It's enough for my real-time needs.
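
For anyone wanting to check whether a fixed-cycle core like this fits their own deadlines, the arithmetic is short. The sketch below uses the 60 MHz and 6 cycles figures from above; the function names and the 1000 ns deadline are assumptions picked purely for illustration.

```python
def ns_per_instruction(clock_mhz: int, cycles_per_instr: int) -> float:
    """Time taken by one instruction on a core with a fixed cycle count."""
    return 1000.0 * cycles_per_instr / clock_mhz


def instructions_before_deadline(deadline_ns: int, clock_mhz: int,
                                 cycles_per_instr: int) -> int:
    """Whole instructions that complete within deadline_ns."""
    cycles_available = deadline_ns * clock_mhz // 1000   # MHz * ns / 1000
    return cycles_available // cycles_per_instr


if __name__ == "__main__":
    print(ns_per_instruction(60, 6), "ns per instruction")
    # Hypothetical 1000 ns deadline.
    print(instructions_before_deadline(1000, 60, 6), "instructions fit")
```

Because every instruction takes the same number of cycles, the budget is exact rather than a bound, which is the appeal of the multi-cycle approach for cycle-precise work.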
 

