Author Topic: Are there any microcontrollers with dedicated OTA or OTW reprogramming hardware? (Read 2148 times)

e100 · « **on:** February 28, 2022, 04:51:22 am »

As far as I can tell deploying firmware updates is still a risky business as there is a risk of bricking the device by accidentally uploading the wrong firmware etc.

At the moment if you want a fail safe system then you have to roll your own solution by co-locating another micro which has the sole purpose of receiving firmware and reprogramming the target chip. The system I'm currently using uses a STM32F042 to manage a SAMD21 (https://omzlo.com/articles/canzero). It works but it requires a bunch of code to make it work as there don't appear to be any standards for this kind of thing (I could be wrong).

ataradov · « **Reply #1 on:** February 28, 2022, 05:12:02 am »

It is absolutely possible to design reliable systems using current hardware without a need for the external MCU. You need to clearly define what you are protecting against.

If you are uploading a correctly signed and checksummed image, then how do you define a "wrong" image? There may be a subtle error that only shows up a month after deployment. If you have a way to detect this occurrence, then you can go into the firmware update mode automatically. If you don't have a reasonable way to detect that there is an issue, then how would the dedicated HW know about it?

One commonly used way to protect against a completely broken image is to have some sort of flag that only the running firmware can set after it verifies itself (for example, by contacting a server). If the FW can't verify itself or is just really broken, then the WDT would reset the device. The bootloader should count the number of such resets, and revert to the old firmware if this counter reaches some threshold.

All this assumes the existence of a safe, immutable bootloader and a place to store the backup image.
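
The confirm-flag and reset-counter scheme above can be sketched as a small host-side model. This is only an illustration in Python, not code from any real bootloader: `BootState`, `select_slot`, `confirm` and the threshold of 3 are all assumptions, and on a real part the state would live in flash or backup registers.

```python
from dataclasses import dataclass

MAX_FAILED_BOOTS = 3  # assumed value; the post only says "some threshold"


@dataclass
class BootState:
    """State the bootloader keeps in non-volatile memory (hypothetical layout)."""
    active: str = "new"      # slot the bootloader tries first
    backup: str = "old"      # known-good image kept as a fallback
    confirmed: bool = False  # set only by running firmware after its self-check
    boot_attempts: int = 0   # boots of the unconfirmed image so far


def select_slot(state: BootState) -> str:
    """Run by the bootloader on every reset; returns the slot to boot."""
    if state.confirmed:
        return state.active
    if state.boot_attempts >= MAX_FAILED_BOOTS:
        # Too many resets without confirmation: revert to the old firmware.
        state.active, state.backup = state.backup, state.active
        state.confirmed = True
        state.boot_attempts = 0
        return state.active
    state.boot_attempts += 1
    return state.active


def confirm(state: BootState) -> None:
    """Called by the application once it has verified itself."""
    state.confirmed = True
    state.boot_attempts = 0


# An image that never confirms gets three attempts, then the old one is restored.
state = BootState()
boots = [select_slot(state) for _ in range(4)]
```

In this model the watchdog is implicit: every call to `select_slot` without an intervening `confirm` stands for a reset of firmware that failed to verify itself.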

e100 · « **Reply #2 on:** February 28, 2022, 05:33:02 am »

You'll never know if the new firmware is capable of updating itself until it tries to do it. If it fails then you are forever stuck on that version.

ataradov · « **Reply #3 on:** February 28, 2022, 05:35:00 am »

Ok, sure, but how would your dedicated HW work in that case?

If your firmware detects that the current image can't update itself, it can signal the bootloader to switch back to the old version. No need for dedicated hardware here.

Of course this implies that OTA is based on the device polling for the image, not the image being pushed to the device. There is no way to detect that other than not seeing an update in a while.

e100 · « **Reply #4 on:** February 28, 2022, 06:05:37 am »

If the part of the update process that is supposed to recognize a failed update didn't work then it'll never know it failed to update itself.

It could end up in a perpetual loop of firmware updates but never get beyond the current version.

ataradov · « **Reply #5 on:** February 28, 2022, 06:09:44 am »

But again, how would a hardware solution help here?

e100 · « **Reply #6 on:** February 28, 2022, 06:46:14 am »

Quote from: ataradov on February 28, 2022, 06:09:44 am

But again, how would a hardware solution help here?

At the factory the external hardware is used to program the target (to prove that the process works) and once deployed the external hardware never gets updated or changed in any way, therefore it is always able to download new firmware and reprogram the target regardless of what firmware the target is currently running.

ataradov · « **Reply #7 on:** February 28, 2022, 06:53:10 am »

I don't understand. Do you mean that there will be two completely independent radios connected to the same network? What stops you from doing that with just two devices you already use? It won't be any cheaper if the manufacturer integrates two radios in one package.

It seems to me that it is easier to implement a bootloader that never changes and contacts the server for the update on each reboot (or on a timer if there is an RTC). The main firmware must reset the device periodically when an update is required. If you are afraid that the firmware will lock up without a reset, a simple external supervisor could be used.

e100 · « **Reply #8 on:** February 28, 2022, 07:19:13 am »

Quote from: ataradov on February 28, 2022, 06:53:10 am

I don't understand. Do you mean that there will be two completely independent radios connected to the same network? What stops you from doing that with just two devices you already use? It won't be any cheaper if the manufacturer integrates two radios in one package.

In the case of the STM32F042 and SAMD21 in my original post, the supervisory STM32F042 is connected to a CAN bus and channels messages to and from the SAMD21 via SPI. There is no need for duplicate transceiver hardware.

Quote from: ataradov on February 28, 2022, 06:53:10 am

It seems to me that it is easier to implement a bootloader that never changes and contacts the server for the update on each reboot (or on a timer if there is an RTC). The main firmware must reset the device periodically when an update is required. If you are afraid that the firmware will lock up without a reset, a simple external supervisor could be used.

Regularly rebooting the system and waiting (perhaps minutes, dead in the water) for a firmware download to complete seems like a poor solution.

ataradov · « **Reply #9 on:** February 28, 2022, 07:43:21 am »

You are not waiting for the FW to download all the time, just when there is a new firmware.

What if the part of the new firmware that talks over the SPI bus breaks? You will always be able to find some weak point like this when you consider the most unlikely scenarios. But in practice this is not an issue at all. There is a certain level of risk that is to be accepted.

2N3055 · « **Reply #10 on:** February 28, 2022, 08:07:38 am »

Some really unusual statements here:

"As far as I can tell deploying firmware updates is still a risky business as there is a risk of bricking the device by accidentally uploading the wrong firmware etc."

- Wrong. It is trivial to check whether the firmware is right for the target, both by version (is it the right type for the hardware) and by checking the integrity of the downloaded image.

"You'll never know if the new firmware is capable of updating itself until it tries to do it. If it fails then you are forever stuck on that version."

"If the part of the update process that is supposed to recognize a failed update didn't work then it'll never know it failed to update itself."

- What do you mean? Are you writing random code and then testing over the air whether it can update something out there in the world? Don't you test your code at all? I understand firmware having hidden errors, but not being sure whether it will upload at all should not be possible. That part must be tested before even considering it firmware to upload to production.

Like Alex says (and he knows about this...), you need a single processor, a trusted bootloader, additional storage (to store the interim image and backups) and a watchdog-like procedure that is part of the trusted bootloader and triggers a restore of the previous image if the boot after an update doesn't go as planned.

You are obviously deliberately vague about what you are planning to do. What kind of system is this? Why are you planning to do en masse unattended firmware updates of devices all the time? What is the communications channel for firmware pushes? Is the channel prone to errors so you get corrupted files?

When designing a system (any system, really) the most important thing is to solve only the problems that are necessary and not those we impose on ourselves. Or shall I say, the important thing is not to solve a problem at any cost, but to take a step back, look at the big picture and realize it is easier to remove the source of the problems altogether than to create solutions to complications you invented yourself.
Keep it simple.

For instance: is the channel prone to errors so you get corrupted files? In that case make sure you cannot get errors. Use a protocol that has built-in error correction. Do checksums/CRC. Don't just download a broken file, blindly flash it into the controller and then invent failsafe recovery procedures.

You need to step back and rethink.

e100 · « **Reply #11 on:** February 28, 2022, 08:15:50 am »

Quote from: ataradov on February 28, 2022, 07:43:21 am

What if the part of the new firmware that talks over the SPI bus breaks?

The factory test at the beginning proves that the SPI works. If the new firmware you download has buggy comms then you use the supervisor to download a new version. Remember, the target can be running any firmware or none at all.

Quote from: ataradov on February 28, 2022, 07:43:21 am

You will always be able to find some weak point like this when you consider the most unlikely scenarios. But in practice this is not an issue at all. There is a certain level of risk that is to be accepted.

What if the device is on another continent and you have to pay for your own time, travel and accommodation to go and fix it because you saved $5 by going for the "mostly works" instead of the "always works" option?

ataradov · « **Reply #12 on:** February 28, 2022, 08:19:21 am »

Quote from: e100 on February 28, 2022, 08:15:50 am

The factory test at the beginning proves that the SPI works. If the new firmware you download has buggy comms then you use the supervisor to download a new version. Remember, the target can be running any firmware or none at all.

But the target still runs some form of a bootloader that talks over SPI? Or do you also have an SWD connection? If it is SPI only, then what stops your bad firmware from erasing the bootloader and bricking the device?

e100 · « **Reply #13 on:** February 28, 2022, 09:20:18 am »

Quote from: ataradov on February 28, 2022, 08:19:21 am

Quote from: e100 on February 28, 2022, 08:15:50 am

The factory test at the beginning proves that the SPI works. If the new firmware you download has buggy comms then you use the supervisor to download a new version. Remember, the target can be running any firmware or none at all.

But the target still runs some form of a bootloader that talks over SPI? Or do you also have an SWD connection? If it is SPI only, then what stops your bad firmware from erasing the bootloader and bricking the device?

The URL I provided in the first post has links to the architecture documentation, source code and schematics. Hopefully that will be able to answer some of your questions. Perhaps there are flaws in it, I don't know, it's a complex system and I don't pretend to understand half of it.
From a software perspective it's not a finished thing. Chipageddon means they cannot source parts to make new boards to fund development so it is what it is, at least for the time being.

I didn't come here to promote (or defend) a specific solution created by someone else. I merely used it as an example of a system that in my experience has been un-brickable and therefore worthy of attention. I'm pretty good at breaking things and so was pleasantly surprised that it has survived my stress testing over many months and thousands of firmware updates. It definitely has issues, but none that have rendered hardware unusable.

The unanswered question still remains, are there any microcontrollers with dedicated OTA or OTW reprogramming hardware?

ataradov · « **Reply #14 on:** February 28, 2022, 09:24:07 am »

Quote from: e100 on February 28, 2022, 09:20:18 am

The URL I provided in the first post has links to the architecture documentation, source code and schematics. Hopefully that will be able to answer some of your questions. Perhaps there are flaws in it, I don't know, it's a complex system and I don't pretend to understand half of it.

The schematic shows no SWD connection from the CAN MCU, only SPI. So, the wrong firmware can brick that device.

Quote from: e100 on February 28, 2022, 09:20:18 am

The unanswered question still remains, are there any microcontrollers with dedicated OTA or OTW reprogramming hardware?

I don't think there are, as it makes no commercial sense. There are well proven solutions that are considered good enough by everyone, including largest corporations shipping millions of devices. There is no need to over-complicate things.

Also, all this is really a problem only if your device has no user interface at all and is not accessible for maintenance. This is very rare. For other devices you can always have a recovery procedure triggered by pressing a button or doing some other action with the device. You will likely need that UI for initial commissioning anyway.

TomS_ · « **Reply #15 on:** February 28, 2022, 06:58:14 pm »

Quote from: e100 on February 28, 2022, 04:51:22 am

As far as I can tell deploying firmware updates is still a risky business as there is a risk of bricking the device by accidentally uploading the wrong firmware etc.

You can "package" your firmware along with some "headers", including a checksum of the application code. When you upload the firmware package to the device, it checks the headers to ensure that it is intended for this device, and that the checksum is correct. At this point you're in a good place and only need to program the newly received code into the device flash.

If you're not doing that at a minimum, then you're just opening yourself up to programming the wrong firmware in. Certainly it would seem to be a fairly common practice in my experience.
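
A rough sketch of such a package check, written in Python purely to show the idea. The header layout (magic, hardware ID, version, length, CRC32), the `FWPK` magic and the function names are all made up for this example and are not taken from any particular project.

```python
import struct
import zlib

# Hypothetical 16-byte header: magic, hardware ID, version, payload length, CRC32.
HEADER = struct.Struct("<4sHHII")
MAGIC = b"FWPK"


def pack_firmware(hw_id: int, version: int, payload: bytes) -> bytes:
    """Prepend the header to the application code."""
    crc = zlib.crc32(payload)
    return HEADER.pack(MAGIC, hw_id, version, len(payload), crc) + payload


def check_package(package: bytes, expected_hw_id: int) -> bool:
    """Return True only if the package is meant for this device and is intact."""
    if len(package) < HEADER.size:
        return False
    magic, hw_id, _version, length, crc = HEADER.unpack_from(package)
    payload = package[HEADER.size:]
    return (
        magic == MAGIC
        and hw_id == expected_hw_id
        and length == len(payload)
        and zlib.crc32(payload) == crc
    )


package = pack_firmware(hw_id=0x0042, version=7, payload=b"\x01\x02\x03\x04")
ok_for_this_board = check_package(package, expected_hw_id=0x0042)
ok_for_other_board = check_package(package, expected_hw_id=0x0043)
```

A CRC like this only catches accidental corruption and mismatched targets. It says nothing about who built the image, which is what signing is for.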

Quote

At the moment if you want a fail safe system then you have to roll your own solution by co-locating another micro which has the sole purpose of receiving firmware and reprogramming the target chip. The system I'm currently using uses a STM32F042 to manage a SAMD21 (https://omzlo.com/articles/canzero). It works but it requires a bunch of code to make it work as there don't appear to be any standards for this kind of thing (I could be wrong).

This can easily be done with a single micro. The method I have been using is to "partition" the micro's internal flash into two: one partition (just several KB in size) holds the bootloader code, and the rest of it holds the application code. The partitions are sized accordingly so that neither of them crosses a flash page erasure boundary, so in theory they are both safe from each other as long as something catastrophic doesn't go wrong during erase operations.
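
As a worked example of the page-boundary sizing, here is the arithmetic in Python. The page size, flash size and bootloader size are assumed numbers chosen for illustration, so check the datasheet for the real erase granularity. Offsets are relative to the start of flash.

```python
PAGE_SIZE = 1024        # assumed erase-page size in bytes
FLASH_SIZE = 64 * 1024  # assumed total internal flash
BOOTLOADER_SIZE = 6000  # assumed size of the compiled bootloader


def round_up_to_page(size: int, page_size: int = PAGE_SIZE) -> int:
    """Smallest multiple of page_size that is >= size."""
    return -(-size // page_size) * page_size


def layout(bootloader_size: int = BOOTLOADER_SIZE):
    """Return (offset, length) for the bootloader and application partitions."""
    boot_len = round_up_to_page(bootloader_size)
    return (0, boot_len), (boot_len, FLASH_SIZE - boot_len)


bootloader_part, application_part = layout()
```

With these assumed numbers the bootloader partition takes 6 pages (6144 bytes) and the application starts at offset 6144, so erasing any application page can never touch bootloader code.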

I use an external SPI EEPROM to hold new application code, because flash based SPI EEPROMs are dirt cheap.

On boot, the bootloader compares a version number of the loaded application code and the externally stored application code, and if they are different, and if a checksum of the stored application code checks out OK, it erases the application area of the internal flash and programs the externally stored version in.

Before jumping into the application code, whether an upgrade was just performed or not, the bootloader checksums the loaded application code to make sure it isn't corrupt. If it's OK, it jumps into it, if it's not OK then it can attempt to load in the externally stored version as above. If both the internal and external application code are corrupt, well, you're kind of screwed at that point, so I just flash "SOS" on a status LED.
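
The boot sequence described above could be modelled roughly like this. It is a Python sketch of the decision logic only, not the actual bootloader: the `Image` type, the use of CRC32 and the `"jump"` / `"sos"` return values are assumptions made for the example.

```python
import zlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Image:
    """Application code plus the metadata stored with it (hypothetical)."""
    version: int
    code: bytes
    crc: int  # checksum stored alongside the image

    def is_valid(self) -> bool:
        return zlib.crc32(self.code) == self.crc


def boot(internal: Optional[Image],
         external: Optional[Image]) -> Tuple[str, Optional[Image]]:
    """Return (action, image that ends up in internal flash)."""
    external_ok = external is not None and external.is_valid()

    # Upgrade if the stored version differs and its checksum checks out.
    if external_ok and (internal is None or internal.version != external.version):
        internal = Image(external.version, external.code, external.crc)

    # Always checksum the internal copy before jumping, upgraded or not.
    if internal is not None and internal.is_valid():
        return "jump", internal

    # Internal copy is corrupt: try to restore it from the external copy.
    if external_ok:
        return "jump", Image(external.version, external.code, external.crc)

    # Both copies are bad: nothing left but to flash "SOS" on the LED.
    return "sos", None
```

Note that the version comparison is "different", not "newer", as in the description, so storing an older image externally would downgrade the device in this model.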

PCB.Wiz · « **Reply #16 on:** February 28, 2022, 07:08:04 pm »

Quote from: e100 on February 28, 2022, 09:20:18 am

The unanswered question still remains, are there any microcontrollers with dedicated OTA or OTW reprogramming hardware?

Short answer : Yes.
You are effectively asking for a ROM loader, and many MCUs have that feature, for OTW reprogramming.

I've not dug into wireless MCUs but many of those are multi-core and I'm sure they have secure bootloaders in their toolkits.

Google finds this in seconds

https://docs.silabs.com/bluetooth/latest/general/firmware-upgrade/secure-ota-dfu

ataradov · « **Reply #17 on:** February 28, 2022, 07:18:32 pm »

A ROM loader won't help if your firmware is toast and can't invoke it. There are no ROMs with sophisticated recovery logic.

PlainName · « **Reply #18 on:** February 28, 2022, 08:41:16 pm »

ESP32 almost gets there and the process can be used elsewhere (it's baked into the ESP-IDF SDK but there's nothing hardware-specific to it). The flash is partitioned for the default (factory) code and 1 or more OTA partitions. The bootloader checks the last set OTA partition and runs the code in it (you can enable all kinds of version checking in the OTA downloading code, but let's assume you ignore all that and just program in any old thing). The new code has to verify that it's good at some point before a reset - if it doesn't then on reset the bootloader reloads the previously good partition. With a watchdog (preferably hardware) it's pretty robust since anything not the right stuff will cause a reboot and hence reversion to good code.

But... suppose your new code tells the bootloader it is perfectly fine, but you've screwed up the OTA downloading stuff. You're toast. The only way to guard against this kind of thing is with another processor to do the updating. But then you have the problem of how do you update that one...

ataradov · « **Reply #19 on:** February 28, 2022, 08:45:04 pm »

Quote from: dunkemhigh on February 28, 2022, 08:41:16 pm

The new code has to verify that it's good at some point before a reset - if it doesn't then on reset the bootloader reloads the previously good partition.

This is exactly the algorithm I described. This is an industry standard practice that has been used for ages.

And yes, this does not address the issue raised above where the code is generally ok, but no longer able to perform OTA function. You are locked with the last verified firmware with no way to recover.

There is no general way to tell that you did not intend to upload a dummy blinky firmware. If it was signed correctly - it is a correct firmware.

