Along with simple things like output polarity and programmable SPI bit lengths, it would be enormously valuable to have I/O peripherals with something more than a single-byte buffer. This would drop interrupt rates in direct proportion to the buffer depth, either allowing faster speeds or lower CPU burden (or some balance between those two). Yes, I know there are some MCU peripherals which have buffers but many do not and the performance impact can be huge, even if all the associated ISR does is move the latest byte in or out of the single register. You still incur the ISR overhead and the subsequent jitter in system timing.
Nowadays you are supposed to use DMA to implement buffering.
Yep, DMA capability is nice, but it should be there to free up the CPU for more useful work, not required just to get full bandwidth out of a peripheral.
On the STM32F4 series you can't get SPI to run with no pause between bytes by manual polling, even if you dedicate 100% of the CPU to it. It has a single-byte buffer, and the status flags don't clearly tell you exactly when it needs servicing. Because it is bidirectional you have to both fill and empty the buffer. In principle you can load the next byte while the previous byte is still being shifted out/in, but no matter how I followed the available status flags, something would eventually go wrong and cause a data over/underrun. The only reliable way was to send one byte, wait for it to be transferred fully, and then give it the next one (this is also how the official HAL driver does it). That leaves a small pause between finding out the transfer is done and feeding it the next byte. Feeding it with DMA works fine, though.
But for comparison, here is the process for sending in polling mode:
1) Init SPI (Turn on clocks, set register, enable peripheral)
2) Write/read data into data register
3) Wait for space in buffer
4) Repeat step 2 until done
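For the STM32F4 case above, the only reliable polling sequence I found looks roughly like this. This is just a sketch against the standard CMSIS register names (`SPI_SR_TXE`, `SPI_SR_RXNE`, `SPI_SR_BSY`); the function and buffer names are mine:

```c
#include "stm32f4xx.h"  /* CMSIS device header: SPI_TypeDef, SPI_SR_* flags */

/* Full-duplex transfer, one byte at a time. Waits for each byte to finish
 * shifting completely before loading the next one, which is exactly what
 * causes the small inter-byte pause described above. Assumes the SPI is
 * already initialized and enabled in master mode. */
static void spi_transfer_polled(SPI_TypeDef *spi, const uint8_t *tx,
                                uint8_t *rx, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        while (!(spi->SR & SPI_SR_TXE)) { }   /* wait for space in TX register */
        spi->DR = tx[i];                      /* write the next byte */
        while (!(spi->SR & SPI_SR_RXNE)) { }  /* wait for the received byte */
        rx[i] = (uint8_t)spi->DR;             /* read it back */
        while (spi->SR & SPI_SR_BSY) { }      /* wait until the shift is done */
    }
}
```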
Here is what you do with DMA mode:
1) Init SPI (Turn on clocks, set register, enable peripheral)
2) Init DMA (turn on clocks, figure out which DMA channel is able to service the SPI, move other conflicting DMA channels out of the way, link up the DMA channel with the correct peripheral, note down the channel so you can use it later)
3) Copy data into an area of RAM that is accessible by DMA
4) Flush CPU data cache to make sure it is written out there
5) Make sure the DMA is idle and ready for a new job
6) Configure the SPI peripheral for a DMA mode transfer
7) Configure the DMA with the appropriate address and data size
8) Start the DMA transfer
9) Poll the DMA status register to check if it's done; repeat step 9 otherwise
10) Terminate the DMA transfer
11) Flush the CPU data cache again
12) Retrieve the received data from the RAM area.
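To make the difference concrete, here is a rough sketch of that sequence for SPI1 on an STM32F4 (no data cache on that part, so the cache-flush steps drop out there). The stream/channel numbers come from the DMA request mapping table in the reference manual, and I'm assuming SPI1_RX on DMA2 Stream 0 / Channel 3 and SPI1_TX on DMA2 Stream 3 / Channel 3; check them against your exact part, and note all error handling is omitted:

```c
#include "stm32f4xx.h"  /* CMSIS device header: DMA2, SPI1, register bit names */

static void spi1_transfer_dma(const uint8_t *tx, uint8_t *rx, uint16_t len)
{
    RCC->AHB1ENR |= RCC_AHB1ENR_DMA2EN;            /* step 2: DMA clock on */

    while (DMA2_Stream0->CR & DMA_SxCR_EN) { }     /* step 5: wait until idle */
    while (DMA2_Stream3->CR & DMA_SxCR_EN) { }

    /* step 7: addresses, transfer count, channel select, increment, direction */
    DMA2_Stream0->PAR  = (uint32_t)&SPI1->DR;      /* RX: peripheral -> memory */
    DMA2_Stream0->M0AR = (uint32_t)rx;
    DMA2_Stream0->NDTR = len;
    DMA2_Stream0->CR   = (3u << DMA_SxCR_CHSEL_Pos) | DMA_SxCR_MINC;

    DMA2_Stream3->PAR  = (uint32_t)&SPI1->DR;      /* TX: memory -> peripheral */
    DMA2_Stream3->M0AR = (uint32_t)tx;
    DMA2_Stream3->NDTR = len;
    DMA2_Stream3->CR   = (3u << DMA_SxCR_CHSEL_Pos) | DMA_SxCR_MINC |
                         DMA_SxCR_DIR_0;

    SPI1->CR2 |= SPI_CR2_RXDMAEN | SPI_CR2_TXDMAEN; /* step 6: SPI in DMA mode */

    DMA2_Stream0->CR |= DMA_SxCR_EN;               /* step 8: go (RX first) */
    DMA2_Stream3->CR |= DMA_SxCR_EN;

    while (!(DMA2->LISR & DMA_LISR_TCIF0)) { }     /* step 9: poll until done */
    while (SPI1->SR & SPI_SR_BSY) { }

    /* step 10: clear flags and put the SPI back in normal mode */
    DMA2->LIFCR = DMA_LIFCR_CTCIF0 | DMA_LIFCR_CTCIF3;
    SPI1->CR2 &= ~(SPI_CR2_RXDMAEN | SPI_CR2_TXDMAEN);
}
```

And that is the short version, with the one-time init and all the fault checking left out.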
Okay, sure, you can skip the cache flushes if you configure the memory protection unit to mark that area as non-cacheable, but that is even more setup work. And yes, you could do other stuff while the DMA is running, but if I am sending 8 bytes through SPI at 30 Mbit/s, that takes only about 2 µs; what useful work am I going to get done in that time? If I decide to jump into an ISR, the context switch alone takes about 1 µs, so we might as well just run in circles in a loop for that time.
Why do I have problems like this on a modern 32-bit ARM MCU when I never had them on a 16-bit PIC way back? That chip also had DMA, but you could always run the peripherals at full speed with polling; the DMA was only there for when you needed to move a huge pile of data and had something useful to do in the meantime. Then you have peripherals like I2C that are horrendously slow and could really use DMA, yet need constant CPU intervention to actually complete a transfer, so the DMA is only usable once you have already done half the transfer with polling.