Author Topic: WCH $0.10 USD RISC-V MCU  (Read 42823 times)


Offline westfw

  • Super Contributor
  • ***
  • Posts: 4199
  • Country: us
Re: WCH $0.10 USD RISC-V MCU
« Reply #25 on: November 01, 2022, 10:17:49 pm »
Quote
There is no hardware multiplier/divider in this device.
Some stuff is so dependent on the compilers and libraries.
I've noticed (WRT ARM CM0) that code efficiency might benefit from having divide functions that operated on fewer than 32bits...
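To put a rough number on the narrower-divide idea, here is a minimal Python model of a bit-serial (restoring) unsigned divide. It is only an illustration, not code from any compiler runtime; the name `udiv_restoring` and the `bits` parameter are made up for this sketch. The point is that the loop runs once per bit of operand width, so an 8-bit divide needs a quarter of the iterations of a 32-bit one.

```python
def udiv_restoring(dividend, divisor, bits=32):
    """Unsigned restoring division; one loop iteration per bit of width.

    Returns (quotient, remainder, iterations). Hypothetical helper that
    models what a software divide routine does on a core with no divider.
    """
    mask = (1 << bits) - 1
    dividend &= mask
    divisor &= mask
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    iterations = 0
    for i in range(bits - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
        iterations += 1
    return quotient, remainder, iterations


if __name__ == "__main__":
    print(udiv_restoring(1000, 7))          # full 32-bit width
    print(udiv_restoring(200, 7, bits=8))   # same algorithm, 8-bit operands
```

Real library routines are usually smarter than this (early exit, normalisation), so treat the iteration counts as an upper bound for the naive approach.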

 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4039
  • Country: nz
Re: WCH $0.10 USD RISC-V MCU
« Reply #26 on: November 01, 2022, 10:36:50 pm »
Quote
There is no hardware multiplier/divider in this device.
Some stuff is so dependent on the compilers and libraries.
I've noticed (WRT ARM CM0) that code efficiency might benefit from having divide functions that operated on fewer than 32bits...

Note that while CM0 always has a multiply instruction, depending on what options the chip manufacturer licensed from ARM, it might take 32 cycles for the multiply .. or it might take 1 cycle (which presumably limits the clock speed because big fast CPUs usually take 3 or 4 cycles for a multiply!)
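For anyone wondering where a figure like 32 cycles can come from: one plausible explanation (an assumption on my part, not something taken from ARM documentation) is a bit-serial shift-and-add multiplier that handles one multiplier bit per step. The Python sketch below models that; `mul32_shift_add` is a made-up name. Software multiply routines on cores with no multiplier at all follow the same basic pattern.

```python
MASK32 = 0xFFFFFFFF


def mul32_shift_add(a, b):
    """32x32 -> low 32 bits, one multiplier bit per iteration (32 steps)."""
    a &= MASK32
    b &= MASK32
    product = 0
    for _ in range(32):
        if b & 1:
            # Add the shifted multiplicand when the current bit is set.
            product = (product + a) & MASK32
        a = (a << 1) & MASK32
        b >>= 1
    return product


if __name__ == "__main__":
    print(hex(mul32_shift_add(1234, 5678)))
```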
 

Offline tszaboo

  • Super Contributor
  • ***
  • Posts: 7390
  • Country: nl
  • Current job: ATEX product design
Re: WCH $0.10 USD RISC-V MCU
« Reply #27 on: November 01, 2022, 11:41:15 pm »
For a practical test of the application that may be suitable for an MCU like this (minus lack of USB here), I built this firmware https://github.com/ataradov/free-dap/tree/master/platform/samd11 with the "rv32imc" target. Everything else remained the same, so that target RV MCU would have the same memory map and layout as the SAM D11 (Cortex-M0+). I had to nop out a couple of instructions in the assembly sections, but that would not change the code size a lot.

The code size for Cortex-M0+ is 9184 bytes. The code size for RV is 10228 bytes. This is about 11% more. For comparison, rv32im code size is 13732 bytes, or 34% over rv32imc.
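The percentages above are easy to re-check. A small Python sketch using only the three sizes quoted in this post (the dictionary keys and the helper name `percent_over` are just labels I picked):

```python
# Code sizes in bytes, as quoted in the post above.
sizes = {"cortex-m0+": 9184, "rv32imc": 10228, "rv32im": 13732}


def percent_over(size, baseline):
    """How much larger `size` is than `baseline`, in percent."""
    return 100.0 * (size - baseline) / baseline


rvc_over_m0 = percent_over(sizes["rv32imc"], sizes["cortex-m0+"])
rv_over_rvc = percent_over(sizes["rv32im"], sizes["rv32imc"])

if __name__ == "__main__":
    print(f"rv32imc over Cortex-M0+: {rvc_over_m0:.1f}%")
    print(f"rv32im over rv32imc:     {rv_over_rvc:.1f}%")
```

Both figures land where the post says, roughly 11% and 34%.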
I would guess that the M0+ compiler is more mature. Plus RISC-V has all these optional instructions, so maybe they use only the necessary ones while compiling?
Quite honestly, I don't really care what core an MCU has. It's like having 30 peripherals on it, and the 31st is the core.
But yeah, I don't get the point of these small MCUs either. Chip scale package or BGA would be cheaper and smaller, and not restricted by the number of pins. Though probably more expensive for the assembly.
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11261
  • Country: us
    • Personal site
Re: WCH $0.10 USD RISC-V MCU
« Reply #28 on: November 02, 2022, 12:10:36 am »
It would be interesting to know what A32 and A64 code size is, making the same modifications as for RV. Well, and T32 too.
ARM9 size is 12664 bytes. Cortex-A5 is 13236 bytes. No Thumb in either case.

It is very aggressive with loop unrolling on Cortex-A5 by default.

I don't have 64 bit compilers installed.
Alex
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11261
  • Country: us
    • Personal site
Re: WCH $0.10 USD RISC-V MCU
« Reply #29 on: November 02, 2022, 12:12:19 am »
I would guess that the M0+ compiler is more mature. Plus RISC-V has all these optional instructions, so maybe they use only the necessary ones while compiling?
No, compressed instruction set of RV is just not as good. It is literally compressed instructions that unfold into full instructions. Thumb-2 has dedicated instructions that make sense in 16-bit format that have no direct equivalent in 32-bit word.

Looking at the ARM disassembly, the only 32-bit instruction used is "bl" for long branches. RV code uses a lot more 32-bit instructions.
« Last Edit: November 02, 2022, 12:15:53 am by ataradov »
Alex
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14481
  • Country: fr
Re: WCH $0.10 USD RISC-V MCU
« Reply #30 on: November 02, 2022, 01:57:07 am »
11% larger program is not that much, what about the core performance?
Anyways, the point is you can get 5x mcus for the same money.

Obviously that means there would be some cases where the flash is 91% full using M0+, but 101% full using RV, and that's not good. Assuming they had the same flash size in the first place. But if you're using 90% or less, then it doesn't matter at all.

Of course. And obviously if you are that tight memory-wise, things are not looking good anyway. Any change in compiler version or in code may result in code not fitting anymore. Not sure you want to put yourself in such a situation. I wouldn't, unless it was a one-shot project.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4039
  • Country: nz
Re: WCH $0.10 USD RISC-V MCU
« Reply #31 on: November 02, 2022, 03:35:50 am »
I would guess that the M0+ compiler is more mature. Plus RISC-V has all these optional instructions, so maybe they use only the necessary ones while compiling?
No, compressed instruction set of RV is just not as good. It is literally compressed instructions that unfold into full instructions. Thumb-2 has dedicated instructions that make sense in 16-bit format that have no direct equivalent in 32-bit word.

Looking at the ARM disassembly, the only 32-bit instruction used is "bl" for long branches. RV code uses a lot more 32-bit instructions.

For practical purposes, BL is the only 32 bit instruction that ARMv6-M *has*.

There are also DMB, DSB, ISB, MRS and MSR but you're not going to see a lot of those memory barriers and status register movement instructions in compiler-generated code.

Most of the time a 32 bit RISC-V instruction will be doing the work of two 16 bit Thumb instructions. The most important exceptions are that ARM squeezed 3-address add/sub into a 16 bit instruction, and Thumb has push/pop multiple as a 16 bit instruction while original RISC-V needs a 32 bit instruction to call a millicode function to save/restore multiple registers.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4199
  • Country: us
Re: WCH $0.10 USD RISC-V MCU
« Reply #32 on: November 02, 2022, 07:30:50 am »
Quote
ARM9 size is 12664 bytes. Cortex-A5 is 13236 bytes.

Could you do a CM3 or other v7m cpu?
The v6m instruction set is pretty depressing :-(
(although I've frequently observed that allowing the use of 32bit thumb instructions seems to decrease the instruction COUNT, but not the code size.  You'll get a fancy shifted-operand "LDR R1, #0x10000" (32bit instruction) instead of "LDR R1, #0x1; LSLS R1, #16" (two 16 bit instructions.)  Sometimes the M3 code is longer than the M0 code (but presumably faster.  Assuming memory can keep up.))
« Last Edit: November 02, 2022, 08:20:54 pm by westfw »
 

Offline tszaboo

  • Super Contributor
  • ***
  • Posts: 7390
  • Country: nl
  • Current job: ATEX product design
Re: WCH $0.10 USD RISC-V MCU
« Reply #33 on: November 02, 2022, 12:49:33 pm »
I would guess that the M0+ compiler is more mature. Plus RISC-V has all these optional instructions, so maybe they use only the necessary ones while compiling?
No, compressed instruction set of RV is just not as good. It is literally compressed instructions that unfold into full instructions. Thumb-2 has dedicated instructions that make sense in 16-bit format that have no direct equivalent in 32-bit word.

Looking at the ARM disassembly, the only 32-bit instruction used is "bl" for long branches. RV code uses a lot more 32-bit instructions.
OK, fair enough. I'm not sure this matters for such a small device though. I've seen projects that ran out of code space; usually they were using BLE, USB and other bloatware libraries that ate all the space. For a small device like this, there is usually a bigger version. If you are not charged unreasonably for the extra flash, as with STM32s, this is not an issue. I mean FFS, why an STM32F103 with some flash costs 10 dollars in quantity, nobody knows.
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11261
  • Country: us
    • Personal site
Re: WCH $0.10 USD RISC-V MCU
« Reply #34 on: November 02, 2022, 02:55:40 pm »
Could you do a CM3 or other v7m cpu?

CM3  is 8644   bytes. CM7 is 8868 bytes.
Alex
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4039
  • Country: nz
Re: WCH $0.10 USD RISC-V MCU
« Reply #35 on: November 03, 2022, 08:36:09 am »
For a practical test of the application that may be suitable for an MCU like this (minus lack of USB here), I built this firmware https://github.com/ataradov/free-dap/tree/master/platform/samd11 with the "rv32imc" target. Everything else remained the same, so that target RV MCU would have the same memory map and layout as the SAM D11 (Cortex-M0+). I had to nop out a couple of instructions in the assembly sections, but that would not change the code size a lot.

Could you possibly push a branch with the exact changes you made to compile it for RV? I'd like to take a look at it but I don't want to guess what you did and get different results from the start.
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11261
  • Country: us
    • Personal site
Re: WCH $0.10 USD RISC-V MCU
« Reply #36 on: November 03, 2022, 04:09:14 pm »
Here is the project the way it was used for experiments.
Alex
 
The following users thanked this post: willmore

Offline mac.6

  • Regular Contributor
  • *
  • Posts: 225
  • Country: fr
Re: WCH $0.10 USD RISC-V MCU
« Reply #37 on: November 04, 2022, 10:01:14 am »
I tried to compile a selection of my risc-v code to cm0+ and cm33, and I got an increase in the 10-30% range for risc-v in code size (per function). Of course, small functions are impacted a lot, and bigger functions tend to end up in the 10% ballpark.
What kills us is that we have pretty strict code requirements which already impact code size a lot (secure coding), and on top of that stringent MISRA-C rules that tend to break code into small functions due to cyclomatic (idiocratic) rules.
I don't use -msave-restore and probably won't, as it has not been evaluated security wise.
 

Offline woofy

  • Frequent Contributor
  • **
  • Posts: 334
  • Country: gb
    • Woofys Place
Re: WCH $0.10 USD RISC-V MCU
« Reply #38 on: November 04, 2022, 10:32:58 am »
Comparing RISCV against other architectures is not trivial. Here's Chris Celio comparing ARMV7 ARMV8 X86 and RISCV:
https://youtu.be/Ii_pEXKKYUg

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4039
  • Country: nz
Re: WCH $0.10 USD RISC-V MCU
« Reply #39 on: November 04, 2022, 10:56:43 am »
I don't use -msave-restore and probably won't, as it has not been evaluated security wise.

!!!!

It's 96 bytes of very very simple code, which would otherwise be generated inline in every function.

If you are not using -msave-restore then you are not using RISC-V as it was designed to be used in embedded where code size is more important than a couple of percent of speed. It's an integral part of the decisions to 1) not have load/store multiple, and 2) allow choice of register to store the return address in JAL & JALR. (Actually in many cases it makes code faster because with it the hot code fits into L1 cache)
« Last Edit: November 04, 2022, 11:02:20 am by brucehoult »
 
The following users thanked this post: SiliconWizard, I wanted a rude username

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4039
  • Country: nz
Re: WCH $0.10 USD RISC-V MCU
« Reply #40 on: November 04, 2022, 11:08:23 am »
Comparing RISCV against other architectures is not trivial. Here's Chris Celio comparing ARMV7 ARMV8 X86 and RISCV:
https://youtu.be/Ii_pEXKKYUg

It's a good talk, but forget you ever heard of macro-op fusion.

Far too many hacker news warriors like to claim that RISC-V relies on it for performance, while the truth is that no RISC-V core that I know of implements fusion. It *might* make sense in some kind of high end in-order CPU, but it makes very little sense in either simple CPUs (which can't afford the transistors) or high end OoO CPUs, where simple operations and tracking dependencies are preferred.
 
The following users thanked this post: willmore

Offline tim_

  • Regular Contributor
  • *
  • Posts: 239
  • Country: de
Re: WCH $0.10 USD RISC-V MCU
« Reply #41 on: November 05, 2022, 08:12:29 pm »
Has anybody already received their board and managed to set up the toolchain etc? Still waiting for mine in the mail.

It looks like quite a capable device - and the documentation is well done and in English. Also great to see 32bit devices at the low end price points, instead of all the PIC clones.

The biggest caveat seems to be the weird power management. According to the datasheet, the device uses a 1.5V core voltage (130nm?) and has an integrated LDO. However the minimum external VDD is 2.7V. A 1.2V drop is not what I would consider an LDO... Maybe they took some shortcuts to save IC area?

It's clearly not a device meant for ultra-low power operation. Besides the voltage issue, it also looks like the slow oscillator cannot be used as a main clock. The minimum current draw is around 0.5mA unless you use the deep sleep mode that requires external wake up.



« Last Edit: November 05, 2022, 08:17:44 pm by tim_ »
 

Offline tim_

  • Regular Contributor
  • *
  • Posts: 239
  • Country: de
Re: WCH $0.10 USD RISC-V MCU
« Reply #42 on: November 05, 2022, 08:17:05 pm »

The 10c is a banner price, at high volumes, but LCSC does have similar parts already

https://lcsc.com/product-detail/Microcontroller-Units-MCUs-MPUs-SOCs_PUYA-PY32F003L16S6TU_C5128435.html
claims 1.7~5.5V and 12b ADC and 32kF 4kR, all for 5000+ at US$0.1145
I think that does have a multiplier.

I got excited for some minutes, but then I remembered I already had some low cost CM0 in my stash (HK32F030 etc) that I never got around to testing due to the absence of useful documentation and concerns that these small niche suppliers may hike prices/disappear/have tons of bugs in the periphery.

WCH looks somewhat reliable at least.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14481
  • Country: fr
Re: WCH $0.10 USD RISC-V MCU
« Reply #43 on: November 05, 2022, 08:20:29 pm »
Comparing RISCV against other architectures is not trivial. Here's Chris Celio comparing ARMV7 ARMV8 X86 and RISCV:
https://youtu.be/Ii_pEXKKYUg

It's a good talk, but forget you ever heard of macro-op fusion.

Far too many hacker news warriors like to claim that RISC-V relies on it for performance, while the truth is that no RISC-V core that I know of implements fusion. It *might* make sense in some kind of high end in-order CPU, but it makes very little sense in either simple CPUs (which can't afford the transistors) or high end OoO CPUs, where simple operations and tracking dependencies are preferred.

Completely agree. Fusion makes you sound smart, but in practice almost nobody does it.

In one very simple example of fusion sometimes given, you have the 32x32->64 multiply that requires two instructions to get the full 64-bit result, and fusion would avoid multiplying twice. The same can be achieved (and this is what I've done) just registering the last multiply operation and outputting the registered result (if a subsequent multiply matches the operands and signedness), which is even better as it doesn't require the two multiply instructions to be directly consecutive, and it's much cheaper to implement than any kind of fusion.
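To illustrate the "register the last multiply" idea described above, here is a small behavioural model in Python. It is only a sketch of the concept as I read it from this post, not anybody's actual RTL; the class name `RegisteredMultiplier`, its fields, and the `multiplies` counter are all made up for the example. The high-half operations reuse the stored 64-bit product only when operands and signedness match, while the low-half `mul` only needs matching operands because the low 32 bits are the same for signed and unsigned products.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


class RegisteredMultiplier:
    """Keeps the last full 64-bit product and the operands that produced it."""

    def __init__(self):
        self.operands = None
        self.signed = None
        self.product = 0
        self.multiplies = 0  # how many times the multiplier array really ran

    def _multiply(self, a, b, signed):
        x = to_signed32(a) if signed else a & MASK32
        y = to_signed32(b) if signed else b & MASK32
        self.operands = (a & MASK32, b & MASK32)
        self.signed = signed
        self.product = (x * y) & MASK64
        self.multiplies += 1

    def _high(self, a, b, signed):
        if self.operands != (a & MASK32, b & MASK32) or self.signed != signed:
            self._multiply(a, b, signed)
        return (self.product >> 32) & MASK32

    def mulhu(self, a, b):
        return self._high(a, b, signed=False)

    def mulh(self, a, b):
        return self._high(a, b, signed=True)

    def mul(self, a, b):
        # Low half is independent of signedness, so only operands must match.
        if self.operands != (a & MASK32, b & MASK32):
            self._multiply(a, b, signed=False)
        return self.product & MASK32


if __name__ == "__main__":
    m = RegisteredMultiplier()
    hi = m.mulhu(0xFFFFFFFF, 0xFFFFFFFF)
    # ...unrelated non-multiply instructions could execute here...
    lo = m.mul(0xFFFFFFFF, 0xFFFFFFFF)
    print(hex((hi << 32) | lo), "real multiplies:", m.multiplies)
```

Unlike fusion, nothing here depends on the two instructions being adjacent; only the operand values and signedness are compared.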

Of course there are cases where it's less trivial and for which fusion would make more sense, but I'm not sure I have seen many examples of this, as, as you said, OoO in this case is a more generic approach and works much better overall.


« Last Edit: November 05, 2022, 08:22:33 pm by SiliconWizard »
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4039
  • Country: nz
Re: WCH $0.10 USD RISC-V MCU
« Reply #44 on: November 06, 2022, 12:52:18 am »
Completely agree. Fusion makes you sound smart, but in practice almost nobody does it.

Ironically, one example is that modern high end x86 and ARM cores fuse a compare with an immediately following conditional branch. Ironic because this has always been a single instruction in RISC-V.

SiFive's 7-series dual-issue cores (e.g. the U74 in HiFive Unmatched, Beagle Starlight beta, VisionFive V1, VisionFive 2) link a short forward conditional branch over a single instruction with that following instruction. It's not fusion because they don't become a single instruction. Both instructions proceed down the two pipelines in parallel and at the final stage if the branch turns out to be taken then the other instruction is turned into a NOP (the register or memory write is squashed).

Quote
In one very simple example of fusion sometimes given, you have the 32x32->64 multiply that requires two instructions to get the full 64-bit result, and fusion would avoid multiplying twice. The same can be achieved (and this is what I've done) just registering the last multiply operation and outputting the registered result (if a subsequent multiply matches the operands and signedness), which is even better as it doesn't require the two multiply instructions to be directly consecutive, and it's much cheaper to implement than any kind of fusion.

Good optimisation. I'd even speculate that its main impact might even be catching simple repeated multiplication of the same operands, not even 32x32->64 cases!

Quote
Of course there are cases where it's less trivial and for which fusion would make more sense, but I'm not sure I have seen many examples of this, as, as you said, OoO in this case is a more generic approach and works much better overall.

Some of the potential RISC-V instruction fusion candidates have simply been added as official instructions in later extensions. This loses the advantage of code running unchanged on CPUs that don't implement them, but otherwise gains the performance advantages in a simpler way, and also gives code size advantages that fusing wouldn't.

The main example that comes to mind is the Zba extension (part of the group of extensions commonly grouped as "B"): add.uw, sh1add, sh1add.uw, sh2add, sh2add.uw, sh3add, sh3add.uw, slli.uw.

These are various combinations of:

- zero extension of rs1 from 32 to 64 bits (which itself requires a SLLI;SRLI pair in base RVI)
- shift rs1 left by 0,1,2, or 3 bits (or 0..63 for slli.uw)
- add rs1 and rs2

RVI instructions replaced:

2: sh1add, sh2add, sh3add, slli.uw
3: add.uw, sh1add.uw, sh2add.uw, sh3add.uw
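For readers who want the semantics spelled out, here is a small Python model of the Zba operations on 64-bit register values, plus one base-RVI expansion to show where the "3 instructions" count comes from. The helper names (`shadd`, `shadd_uw`, `slli_uw`, `sh2add_uw_base`) are made up for this sketch, and the expansion is one common sequence, not necessarily what any particular compiler emits.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def zext32(x):
    """Zero-extend the low 32 bits of a register value."""
    return x & MASK32


def shadd(n, rs1, rs2):
    """sh1add / sh2add / sh3add for n = 1, 2, 3: rd = rs2 + (rs1 << n)."""
    return (rs2 + (rs1 << n)) & MASK64


def shadd_uw(n, rs1, rs2):
    """add.uw for n = 0, shNadd.uw for n = 1..3: rd = rs2 + (zext32(rs1) << n)."""
    return (rs2 + (zext32(rs1) << n)) & MASK64


def slli_uw(rs1, shamt):
    """slli.uw: rd = zext32(rs1) << shamt."""
    return (zext32(rs1) << shamt) & MASK64


def sh2add_uw_base(rs1, rs2):
    """The same result as sh2add.uw using three base RV64I instructions."""
    t = (rs1 << 32) & MASK64    # slli t, rs1, 32   (discard upper 32 bits)
    t = t >> 30                 # srli t, t, 30     (zero-extended, times 4)
    return (rs2 + t) & MASK64   # add  rd, rs2, t


if __name__ == "__main__":
    index = 0xFFFFFFFF80000001  # a uint32 index held sign-extended in a register
    base = 0x1000
    print(hex(shadd_uw(2, index, base)), hex(sh2add_uw_base(index, base)))
```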

Fusion might be reasonable for two instructions, but expecting it for three instructions is getting out of hand. Also, if you don't want to break the pipeline design then fused instruction sequences must all modify the same register (i.e. only modify one register), which would constrain a fused instruction sequence for the above to either have Rd distinct from Rs1 and Rs2 (the usual case) or else the same as Rs1. In the very common case where Rs1 is a loop index this means you'd need to either copy it to a new register first (making the fused sequence 1 instruction longer) or else require the first instruction of the fused sequence to be a 4 byte 3-address opcode. So either way you're looking at 6 or 8 bytes of code you're trying to fuse to get the effect of the above 4 byte instructions.

The reason for all the .uw variants is that it has been discovered that a frightening amount of code in the wild has been "optimised" by using 32 bit "unsigned", "uint32_t" etc variables as loop counters and array indexes in 64 bit code instead of the natural "int" or a 64 bit type such as "long", "unsigned long", "size_t", "ptrdiff_t", "int64_t", "uint64_t".

Legacy 32 bit code of course usually just uses "int". However using a signed 32 bit type such as "int" on amd64 or arm64 is suboptimal because a separate sign-extension step is required, as both ISAs automatically zero-extend 32 bit values to 64 bits. So a lot of people have been going around replacing int with -- not a 64 bit type, which would make sense -- but with a 32 bit unsigned type.

And this *pessimises* RISC-V, which automatically sign-extends 32 bit results to 64 bits.  RISC-V code is optimal if people use either the legacy code "int" or any 64 bit type.

Grrrrr.

One of the places this showed up rather badly is in Coremark. Early RISC-V Coremark results typedef'd the offending 32 bit unsigned variables used for array indexing to int. But then ARM and their friends said "that is an illegal modification, you must not change typedefs".

So, sh1add, sh2add, sh3add (or just plain "add" for byte arrays) work for indexes of type int32, int64, uint64 while add.uw, sh1add.uw, sh2add.uw, sh3add.uw work for indexes of type uint32 and all the bases are covered equally efficiently.

The main reason RISC-V chose to automatically sign-extend 32 bit results, by the way, is because you then only need a single set of (64 bit) compare instructions which automatically work on both signed and unsigned 32 bit values as well. If you zero-extend 32 bit results then you need both 64 bit and 32 bit compare instructions (at least for signed compares).
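The compare argument can be checked by brute force over a few corner values. The Python sketch below is my own illustration (all function names are made up): it extends 32-bit values to 64 bits either by sign extension or by zero extension, then asks whether plain 64-bit signed and unsigned less-than comparisons still agree with the true 32-bit comparisons.

```python
MASK32 = 0xFFFFFFFF


def sext32(x):
    """32-bit value sign-extended into a 64-bit register pattern."""
    x &= MASK32
    return x | 0xFFFFFFFF00000000 if x & 0x80000000 else x


def zext32(x):
    """32-bit value zero-extended into a 64-bit register pattern."""
    return x & MASK32


def as_signed32(x):
    return x - (1 << 32) if x & 0x80000000 else x


def as_signed64(x):
    return x - (1 << 64) if x >> 63 else x


def lt_signed64(a, b):
    """Models a 64-bit signed compare (slt / blt)."""
    return as_signed64(a) < as_signed64(b)


def lt_unsigned64(a, b):
    """Models a 64-bit unsigned compare (sltu / bltu)."""
    return a < b


SAMPLES = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFE, 0xFFFFFFFF]


def check(extend):
    """Return (signed_ok, unsigned_ok) for the given extension rule."""
    pairs = [(a, b) for a in SAMPLES for b in SAMPLES]
    signed_ok = all(
        lt_signed64(extend(a), extend(b)) == (as_signed32(a) < as_signed32(b))
        for a, b in pairs
    )
    unsigned_ok = all(
        lt_unsigned64(extend(a), extend(b)) == (a < b) for a, b in pairs
    )
    return signed_ok, unsigned_ok


if __name__ == "__main__":
    print("sign-extend:", check(sext32))
    print("zero-extend:", check(zext32))
```

With sign extension both 64-bit compares give the right answer for 32-bit operands; with zero extension the unsigned compare still works but the signed one breaks (0xFFFFFFFF should be -1, yet it looks like a large positive number), which is the case that needs a dedicated 32-bit compare.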
 
The following users thanked this post: paf, thm_w, edavid, SiliconWizard, woofy

Offline ali_asadzadeh

  • Super Contributor
  • ***
  • Posts: 1905
  • Country: ca
Re: WCH $0.10 USD RISC-V MCU
« Reply #45 on: November 06, 2022, 01:34:52 pm »
If existing debugging tools like J-Link, ST-Link or FT2232 tools can be used for flashing and debugging it would be very nice, since I would ask my Chinese supplier within the next week to order 1000 units. I hope I could get them for around $100.
ASiDesigner, Stands for Application specific intelligent devices
I'm a Digital Expert from 8-bits to 64-bits
 

Offline mon2Topic starter

  • Frequent Contributor
  • **
  • Posts: 463
  • Country: ca
Re: WCH $0.10 USD RISC-V MCU
« Reply #46 on: November 06, 2022, 02:36:00 pm »
@ali, you should contact WCH (factory) - their sales (and tech support) is very responsive. They replied within hours to notify that the tools for this new MCU are available through their Aliexpress webstore.

They offer a bundle with the debugger for ~5$ USD.

https://www.aliexpress.com/item/1005004895791296.html?spm=a2g0o.store_pc_newArrival.8148356.7.356e5c5bT07QQk&pdp_npi=2%40dis%21USD%21US%20%245.80%21US%20%245.51%21%21%21%21%21%402100bb4a16677452454096350e1da2%2112000030932586121%21sh

We have a few on order.

Can you update your pricing once it is confirmed? Also curious at which volume one gets the $0.10 USD each price.
 
The following users thanked this post: MT

Offline ali_asadzadeh

  • Super Contributor
  • ***
  • Posts: 1905
  • Country: ca
Re: WCH $0.10 USD RISC-V MCU
« Reply #47 on: November 06, 2022, 02:42:17 pm »
Thanks mon2 for the tips, Next week I have some orders, I will try to contact them.
 

Offline mon2Topic starter

  • Frequent Contributor
  • **
  • Posts: 463
  • Country: ca
Re: WCH $0.10 USD RISC-V MCU
« Reply #48 on: November 06, 2022, 04:29:57 pm »
Posting a sales contact @ WCH (factory).

Quote
Marketing Department:Jiamin Wang
Address: No.18, Ningshuang Road, Qinheng Technology Park, Nanjing
Telephone number: 18951773252
Email: wjm[at-remove-this]wch.cn
Nanjing Qinheng Microelectronics Co., Ltd.
 
The following users thanked this post: ali_asadzadeh

Offline ali_asadzadeh

  • Super Contributor
  • ***
  • Posts: 1905
  • Country: ca
Re: WCH $0.10 USD RISC-V MCU
« Reply #49 on: November 08, 2022, 08:43:27 am »
I asked for 1K units of CH32V003F4P6, and they told me it's $0.127. I was expecting a better price though ^-^
So I asked them when I should expect under 0.1$, waiting for their answer.
 

