So, I wanted to test the CH32L103, a lower-power variant of the WCH family with USB-PD support. Barely more expensive than the CH224K that I've used for basic USB-PD functionality, and it's a full MCU with 64KB of Flash and 20KB of RAM.
First testing it on an eval board: CH32L103C8T6-EVT-R0. Docs & SDK:
https://github.com/openwch/ch32l103
It can be flashed using a WCH Link-E programming adapter. I have one from a third party (Muse Lab) which works well and is compatible with the original WCH firmware. But note that you'll need the most recent version of the firmware for the Link-E to be able to connect to a CH32L103. I had bought a couple of those Link-E adapters last year, and of course, they didn't work with the CH32L103, so I had to update them. This tool makes updating the firmware very easy, and it works flawlessly:
https://www.wch.cn/downloads/WCH-LinkUtility_ZIP.html
It's Windows-only, but it works with Windows 7. (There's apparently a way to update from Linux, but I didn't bother.)
For the first test, I ported the test firmware I did for the CH32V307 with Bruce's "primes" benchmark.
Interestingly, the result is kinda disastrous.
CH32L103C8T6 (QingKe V4C core, RV32IMAC) @ 96 MHz : 92.9 billion cycles. That's about twice what I got with the CH32V307, and the core is close enough not to make a real difference here.
I figured out the reason quite quickly: the CH32V307 (and maybe the CH32V2xx series too, I haven't checked, but if anyone knows...), as we had discussed earlier, loads the entire Flash content into RAM before starting execution, so that you always have zero wait states. This isn't the case for the CH32L103, which is much cheaper and apparently has no extra internal RAM at all apart from the 20KB available for data. So, it executes directly from Flash, and it doesn't seem to have any kind of cache either, not even a small one. The wait states are: 0 up to 40 MHz, 1 up to 72 MHz and then 2 up to 96 MHz. So at 96 MHz, it has 2 wait states, and there you go. That's not very pretty, but that's the price of low price.
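To make the thresholds above concrete, here is a minimal sketch in C that maps a system clock frequency to the number of Flash wait states. The function name is made up for illustration and is not part of the WCH SDK; the limits are just the figures quoted above, and I'm assuming they are inclusive upper bounds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a WCH SDK function): returns the Flash wait
 * states for a given system clock, using the thresholds quoted in this
 * post: 0 up to 40 MHz, 1 up to 72 MHz, 2 up to 96 MHz.
 * Assumption: the limits are inclusive. */
int flash_wait_states(uint32_t hclk_hz)
{
    /* 96 MHz is the highest frequency mentioned; anything above is
     * outside what this sketch covers. */
    assert(hclk_hz <= 96000000u);

    if (hclk_hz <= 40000000u) {
        return 0;
    }
    if (hclk_hz <= 72000000u) {
        return 1;
    }
    return 2;
}
```

Something like this is only a lookup for reasoning about clock choices; actually programming the wait states is done through the SDK's clock setup code.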
I tried at lower frequencies, and the number of cycles was much lower and consistent with the wait states. So, that's something to know: when running from Flash, there may be no benefit whatsoever in running at higher clock frequencies, quite the contrary.
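The way to check whether a higher clock actually pays off is to convert cycle counts to wall-clock time (cycles divided by frequency) and compare. A small sketch of that arithmetic in C follows; the function names are hypothetical, and you'd feed it your own measured cycle counts.

```c
#include <stdint.h>

/* Wall-clock time in seconds for a run that took `cycles` core cycles
 * at `clock_hz`. Names are illustrative only. */
double run_time_s(uint64_t cycles, uint32_t clock_hz)
{
    return (double)cycles / (double)clock_hz;
}

/* Returns 1 if the run at the higher clock finished sooner than the run
 * at the lower clock, 0 otherwise. The higher clock only wins if its
 * extra cycles (from wait states) don't eat up the frequency gain. */
int higher_clock_wins(uint64_t cycles_lo, uint32_t clock_lo_hz,
                      uint64_t cycles_hi, uint32_t clock_hi_hz)
{
    return run_time_s(cycles_hi, clock_hi_hz) <
           run_time_s(cycles_lo, clock_lo_hz);
}
```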
What I'm curious to try next is to run the primes function from RAM at 96 MHz. It should run at zero wait state and get us back to the results I got with the CH32V307, which was about half the number of cycles.
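For reference, the usual GCC way to do this is a section attribute on the function. The sketch below is only that, a sketch: the `.ramfunc` section name and the `RAMFUNC` macro are assumptions, and it only works if your linker script places that section in RAM and the startup code copies it from Flash (I haven't checked what the WCH linker scripts provide). The function body is a simple stand-in prime counter, not Bruce's benchmark.

```c
#include <stdint.h>

/* Assumption: ".ramfunc" is a section that the linker script locates in
 * RAM (with its load address in Flash) and that startup code copies over.
 * On non-RISC-V builds the macro expands to nothing, so the code still
 * compiles and runs on a PC. */
#if defined(__riscv)
#define RAMFUNC __attribute__((section(".ramfunc"), noinline))
#else
#define RAMFUNC
#endif

/* Stand-in workload: counts primes strictly below `limit` by trial
 * division. Not the actual "primes" benchmark. */
RAMFUNC uint32_t count_primes_below(uint32_t limit)
{
    uint32_t count = 0;

    for (uint32_t n = 2; n < limit; n++) {
        uint32_t is_prime = 1;
        /* d <= n / d is the overflow-safe form of d * d <= n */
        for (uint32_t d = 2; d <= n / d; d++) {
            if (n % d == 0) {
                is_prime = 0;
                break;
            }
        }
        count += is_prime;
    }
    return count;
}
```

Keep in mind that anything the function calls (including compiler helper routines) also has to live in RAM, otherwise you're back to fetching from Flash.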
Still a pretty cool small MCU with a full USB-PD controller, which should be great for implementing all kinds of USB-powered devices.