Author Topic: Asymmetric multiprocessing considered harmful  (Read 3482 times)

Offline tabemann Topic starter

  • Contributor
  • Posts: 43
  • Country: us
Asymmetric multiprocessing considered harmful
« on: April 15, 2023, 10:48:15 pm »
For a long time I have been considering whether I should make a RISC-V port of my Forth-based RTOS, zeptoforth, which currently targets ARM Cortex-M0+/M4/M7. When I first saw the Pine64 Ox64 mentioned, I thought "maybe I will do that sometime in the near future" - it is (or will be, once you can buy it) price-competitive with the ARM Cortex-M0+ RP2040-based boards that I have been focusing on lately, and significantly more powerful. But when I saw the actual specs of the thing I immediately recoiled - three cores, each running a different RISC-V architecture, one of them 64-bit and two of them 32-bit?! Designing an RTOS to take advantage of such a design is insane! Were I to make a port, I would just use the 64-bit core and ignore the two 32-bit cores.

This contrasts with the RP2040, which is a very suitable target for symmetric multiprocessing because its two cores are identical. Also consider the STM32H745 DISCOVERY, which I own - it has separate ARM Cortex-M4 and Cortex-M7 cores, and their asymmetry makes it a less attractive target; were I to target it I would probably just target the Cortex-M7 core and largely ignore the Cortex-M4 core (though the board's cost, and thus the lower demand for support, makes me unlikely to bother; my board is still in its packaging, for one).

All of this considered, I wish manufacturers would just make symmetric designs and not bother with asymmetric ones. They sound attractive to the kind of people who say "well, we can do high performance computing on the 64-bit core, and low-power operation on one of the 32-bit cores, all at the same time!", but they make targeting them with practical RTOS designs one major PITA.
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 26907
  • Country: nl
    • NCT Developments
Re: Asymmetric multiprocessing considered harmful
« Reply #1 on: April 15, 2023, 10:56:49 pm »
In my view those designs are not meant to have all cores used by the same OS, but to have a strict separation between high-level communication features, which typically work well under a pre-emptive multitasking OS, and tasks that need to achieve predictable and very short (sub-1 ms) time intervals and that do not need (or would even be hindered by) an OS.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 
The following users thanked this post: Siwastaja, tooki

Offline coppice

  • Super Contributor
  • ***
  • Posts: 8652
  • Country: gb
Re: Asymmetric multiprocessing considered harmful
« Reply #2 on: April 15, 2023, 11:11:26 pm »
This looks like a typical radio SoC. You have cores where you put a radio stack, get it approved and then leave it alone as much as you possibly can. Then you have cores where you can put applications, modify them, and not sink yourself back into the mire of complex approval processes every time. You very much DON'T want a single OS running across those cores.
 

Offline tabemann Topic starter

  • Contributor
  • Posts: 43
  • Country: us
Re: Asymmetric multiprocessing considered harmful
« Reply #3 on: April 15, 2023, 11:41:42 pm »
That is the only way this design even makes sense. I was thinking "even if they were going to have an asymmetric design, could they have been kind enough to use the same architecture across each of them?" - even in the case of my STM32H745, both cores at least are ARMv7-M, so I could use one compiler for both of them, were I ever to bother with targeting that design. But with such a radically asymmetric design as this, you can't even use the same compiler configuration for all the cores at once! You have to compile different parts of your code separately, and then integrate them together somehow. So it would make sense if those 32-bit cores were meant to run binary blobs that you, the programmer, have not even implemented (especially since the thing is marketed as supporting WiFi, BLE, and Zigbee).
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14488
  • Country: fr
Re: Asymmetric multiprocessing considered harmful
« Reply #4 on: April 16, 2023, 12:34:53 am »
A bit more difficult /= harmful.

And separation of concerns isn't a bad thing.
 
The following users thanked this post: tooki

Offline tabemann Topic starter

  • Contributor
  • Posts: 43
  • Country: us
Re: Asymmetric multiprocessing considered harmful
« Reply #5 on: April 16, 2023, 12:46:52 am »
A bit more difficult /= harmful.

Not simply a bit more difficult, but practically unusable for a conventional multiprocessing design. And yet, at the same time, it is probably too closely coupled for the opposite, i.e. the kind of completely separated radio stack I personally favor. In a way, it looks as if someone took something like the ESP32, which is a basically symmetric design where the radio stack lives in the same space as user code and under the same RTOS, and made it bizarrely asymmetric.

And separation of concerns isn't a bad thing.

It isn't - I personally favor designs where WiFi or Bluetooth radios are completely separate from the main MCU, because then the main MCU is not complicated by having to deal with the inner workings of the (in most cases) binary blobbed radio. Too bad the two RP2040-based designs with radio that I've looked at both have issues - the Pico W has major issues with licensing due to "non-commercial" restrictions on the CYW43 driver made available by Damien George (and thus any drivers derived from it), and the Wio RP2040's ESP8285 radio is simply very buggy and unreliable, such that I have had to abandon support for it.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4040
  • Country: nz
Re: Asymmetric multiprocessing considered harmful
« Reply #6 on: April 16, 2023, 01:04:26 am »
I really don't get your problem here.

The RP2040 is also asymmetric, with two C-M0+ cores and then the very very limited (far more so than RV32EMC) PIO.

The ox64's 320 MHz E907 core is easily 50% more powerful than the two RP2040 CM0 cores combined, not even counting the FPU. With 32 registers vs the 16 on the CM0 you could write two tasks that each use half of the registers and switch between them in a couple of clock cycles, either using JAL/RET or interrupt/MRET. Note that on RISC-V, x0 (the ZERO register) is the *only* register with a dedicated function -- any register can serve equally well as stack pointer, any register can serve equally well as link register. (The C extension is optimised around the standard ABI, but the only difference that makes is how often you can use a 2-byte opcode instead of a 4-byte opcode, which is done transparently by the assembler)

The E902 can do the job of PIO, but is a lot more powerful -- I imagine it's intended mostly to run the software stack for the radio. It's also more powerful than one CM0 core on the RP2040.

And THEN you have the 480 MHz 64 bit Linux core on top of all that. And 64 MB RAM instead of 0.26 MB.

It seems an amazing value to me. And you can only call RP2040 "symmetrical" if you ignore the PIO.

The three cores on the BL808 at least all use the same basic instruction set and the same compiler.

The R Pi Foundation do win on their comprehensive documentation and example code.
 
The following users thanked this post: langwadt, tooki, MK14

Offline tabemann Topic starter

  • Contributor
  • Posts: 43
  • Country: us
Re: Asymmetric multiprocessing considered harmful
« Reply #7 on: April 16, 2023, 01:23:05 am »
Yes, the 64-bit core on this design is by itself much more powerful than the RP2040; even without the other cores it would be essentially either a very small SBC or a very large MCU, depending on how you look at it. But sometimes sheer power is not the be-all and end-all of things.

Take for instance the RP2040 - a big part of the advantage of a dual-core design is that I can put most of my code on one core and time-critical code on the other core, and yet have them share not only memory space but also code and multitasking constructs. I cannot do this on the single-core designs I support, i.e. the STM32F407, the STM32F411, the STM32F746, and the STM32L476, even though, say, the STM32F746 can run circles around the RP2040 when it comes to throughput. This is also why I specifically did not implement load-balancing between cores on the RP2040, even though I contemplated it: I realized there is value in being able to explicitly stick everything where timing does not matter on one core, and particular things that are timing-sensitive (and which are too complex to be implemented via PIO) on another core.

And yes, you could say this is why the 32-bit cores have been added to this design: so you can run time-critical stuff independently of the 64-bit core, just as I described with the RP2040. However, the fact that a different architecture was chosen for each core makes things, well, inconvenient. Were I to port zeptoforth to this board, I'd immediately run into one of two problems. Either I could not use one compiler (zeptoforth, for the record, includes a native-code compiler) for all three cores and could not share code across them, or, if it turns out that some subset of RISC-V will run on all three cores (I must admit that I have not looked at the details of RISC-V that closely), the shared code would be limited to the lowest common denominator supported by all three cores.
 

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 3722
  • Country: us
Re: Asymmetric multiprocessing considered harmful
« Reply #8 on: April 16, 2023, 03:44:40 am »
I could not share code across all three cores,

You are not supposed to do that, it defeats the entire purpose of this sort of SoC.  You are supposed to run your main OS on the big core and treat the small cores as microcontrollers that happen to be on the same die.

This is basically the equivalent of saying that a modern Intel laptop cpu is bad because you can't share code between the x86_64 cores and the GPU.

This is very common in the ARM world, to have cortex-A cores for the user OS, and Cortex-M cores as microcontrollers.  They can be for realtime tasks, power management, or security, depending on the application.

Given that all three are RISC-V variants, you can use the same compiler with appropriate architecture flags, but you won't be using the same binary or migrating code from one core to another.

There are asymmetric cases where you do share code, such as big.LITTLE.  Then you ideally want the big and little cores to have the same ISA.  Intel had a problem with that on their recent CPUs: because the E cores didn't support AVX-512, they ended up having to disable it on the performance cores as well.
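
As a rough illustration of the "same compiler, different architecture flags" point, here is a minimal build-script sketch. `-march` and `-mabi` are real GCC/Clang RISC-V options, but the ISA/ABI strings assigned to each core, the core labels, and the `riscv64-unknown-elf-gcc` driver name (assumed to be a multilib build) are assumptions for illustration only; the actual values would have to come from the SoC documentation.

```python
# Hypothetical per-core ISA/ABI strings; the real values depend on the SoC.
CORE_FLAGS = {
    "big64":   ("rv64imafdc", "lp64d"),   # assumed 64-bit application core
    "mid32":   ("rv32imafc",  "ilp32f"),  # assumed 32-bit core with an FPU
    "small32": ("rv32emc",    "ilp32e"),  # assumed minimal 32-bit core
}


def compile_command(core, source, cc="riscv64-unknown-elf-gcc"):
    """Build the compiler command line for one source file and one core."""
    march, mabi = CORE_FLAGS[core]
    obj = f"{source.rsplit('.', 1)[0]}.{core}.o"
    return [cc, f"-march={march}", f"-mabi={mabi}", "-c", source, "-o", obj]


def compile_for_all_cores(source):
    """Same source, one object file per core; the objects are not interchangeable."""
    return {core: compile_command(core, source) for core in CORE_FLAGS}
```

Each command list could be handed to `subprocess.run`. The source file is shared, but every core still gets its own object file, which matches the point above that the binary itself does not migrate between cores.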
 
The following users thanked this post: langwadt, tooki, MK14

Offline tabemann Topic starter

  • Contributor
  • Posts: 43
  • Country: us
Re: Asymmetric multiprocessing considered harmful
« Reply #9 on: April 16, 2023, 04:00:18 am »
The key thing is that this arrangement makes it hard for zeptoforth to support generating code for all three cores, because it would have to have code generators that would put out instructions for each core separately, and furthermore, because it inlines much of itself into the code it generates, it would have to have triple the code to inline, one version for each core. Furthermore, any code that is compiled would have to be compiled in triplicate, with one version for each core. Of course, this arrangement would be, well, impractical. Consequently, the only real way to practically make use of all but one of the cores is to include precompiled blobs and to not support runtime compilation of code. This is fine if your goal is simply to support a WiFi/BLE/Zigbee stack on a core, which probably is the real intent here, but if one is not using such a stack it is essentially wasted silicon.
 

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 3722
  • Country: us
Re: Asymmetric multiprocessing considered harmful
« Reply #10 on: April 16, 2023, 04:50:02 am »
This is fine if your goal is simply to support a WiFi/BLE/Zigbee stack on a core, which probably is the real intent here, but if one is not using such a stack it is essentially wasted silicon.

Radio operation seems to be the intent but there are lots of other ways people use small cores like this.  If it doesn't work for your application that's fine but lots of devices use asymmetric processors.

Wasted silicon is basically a non-problem.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8179
  • Country: fi
Re: Asymmetric multiprocessing considered harmful
« Reply #11 on: April 16, 2023, 05:09:34 am »
You have completely misunderstood the idea, no wonder it hurts trying to push the square peg through the round hole.

The point is not to abstract the whole chip as something which runs a general-purpose OS scheduling arbitrary tasks/threads onto those cores.

The idea is to, for example, use the smaller CM4 core to run a dedicated bare-metal project which does a well-defined task of its own and communicates with the other core, which can then run a different bare-metal or maybe OS project. In such a case, the fact that they are of different (but similar) architectures is only a tiny bit of mental load.

If you want to support these things in your OS, the best approach is exactly to ignore the small auxiliary cores and only target the "main" core. Users who need the small cores then know exactly what they are doing.
« Last Edit: April 16, 2023, 05:11:46 am by Siwastaja »
 
The following users thanked this post: tooki

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14488
  • Country: fr
Re: Asymmetric multiprocessing considered harmful
« Reply #12 on: April 16, 2023, 05:10:55 am »
The key thing is that this arrangement makes it hard for zeptoforth to support generating code for all three cores, because it would have to have code generators that would put out instructions for each core separately, and furthermore, because it inlines much of itself into the code it generates, it would have to have triple the code to inline, one version for each core. Furthermore, any code that is compiled would have to be compiled in triplicate, with one version for each core. Of course, this arrangement would be, well, impractical. Consequently, the only real way to practically make use of all but one of the cores is to include precompiled blobs and to not support runtime compilation of code. This is fine if your goal is simply to support a WiFi/BLE/Zigbee stack on a core, which probably is the real intent here, but if one is not using such a stack it is essentially wasted silicon.

I understand that your issue here is not so much with the non-homogeneous multi-core architecture, but with the architecture of your tool.
I have no doubt making it able to handle all cores kind of transparently is going to be a significant endeavour, and as others have said, I'm not sure if it's the way to go.

But OTOH, that could be interesting as a generalization of your tool. Of course, I wouldn't suggest doing this *only* for a particular target, if you're interested in the concept, but more as a generalization of your system (that I don't really know.) Those non-homogeneous multi-core SoCs are likely to become even more common in the future IMO.
That would imply adding ways for the user to assign specific tasks to a specific core though, as the whole point of these SoCs is to do precisely that.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4040
  • Country: nz
Re: Asymmetric multiprocessing considered harmful
« Reply #13 on: April 16, 2023, 05:11:31 am »
The key thing is that this arrangement makes it hard for zeptoforth to support generating code for all three cores, because it would have to have code generators that would put out instructions for each core separately, and furthermore, because it inlines much of itself into the code it generates, it would have to have triple the code to inline, one version for each core.

Unlikely. All three cores can run the same RV32EMC code, if you are careful about generating it. The whole memory map is in the lowest 4 GB, and everything except the boot ROM is in the first 2 GB, so you can happily use the big fast 64 bit core as a 32 bit core if you want to. Loading a 32 bit pointer from RAM will invisibly sign-extend it to 64 bits, which is the same as zero-extending it since all addresses have the hi bit cleared. So then you can happily dereference it as if it was a 64 bit pointer all along. You can have functions just save and restore the lo 32 bits of registers (including function return addresses). The only two things you'd have to watch would be to not depend on arithmetic overflow wrapping around at 2^32 and you'd have to be careful about using a left shift followed by a right shift for extracting zero-extended or sign-extended bitfields. If you put the shift count into a register then you can just use 63-N (or just -N) all the time as the 32 bit cores will only look at the lower 5 bits. (It's a spec violation if not) I'm not sure whether those 32 bit cores will trap with illegal instruction if you set bit 5 of the shift amount in a shift with an immediate operand for the shift count. I think those encodings might be reserved now, but weren't in 2019.

If you ignore the smallest core then the same code can run on the bigger 32 bit core and the 64 bit core using all 32 registers, and single-precision FP as well.
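
The two hazards described above (sign-extending 32-bit loads and overflow wraparound) can be modelled in a few lines. This is only a sketch of the register arithmetic in Python, not code for the chip; the function names and the example addresses are made up for illustration.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def lw_rv64(word32):
    """Model RV64 `lw`: load 32 bits and sign-extend into a 64-bit register."""
    word32 &= MASK32
    if word32 & 0x8000_0000:
        word32 |= MASK64 ^ MASK32   # replicate bit 31 into bits 63:32
    return word32


def add_rv32(a, b):
    """Model `add` on a 32-bit core: the result wraps at 2**32."""
    return (a + b) & MASK32


def add_rv64(a, b):
    """Model `add` on a 64-bit core: the result wraps at 2**64."""
    return (a + b) & MASK64
```

With these, `lw_rv64(0x2000_1000)` stays `0x2000_1000` because the address is below 2 GB, while `lw_rv64(0x9000_0000)` becomes `0xFFFF_FFFF_9000_0000`. Likewise `add_rv32(0xFFFF_FFFF, 1)` is `0`, but `add_rv64(0xFFFF_FFFF, 1)` is `0x1_0000_0000`, which is why shared code cannot rely on wraparound at 2^32.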
 
The following users thanked this post: Someone

Offline tabemann Topic starter

  • Contributor
  • Posts: 43
  • Country: us
Re: Asymmetric multiprocessing considered harmful
« Reply #14 on: April 16, 2023, 05:58:16 am »
Sign extension is exactly an area that I was thinking would be tricky, even if otherwise lowest common denominator code can be generated. In general I would probably only want 32-bit cells; 64-bit would be a waste, and would introduce unnecessary incompatibility with other versions of zeptoforth. Also, I have no need for greater than 16 registers; zeptoforth on ARM Cortex-M does not even make use of all 16 registers available to it.
 

Offline tabemann Topic starter

  • Contributor
  • Posts: 43
  • Country: us
Re: Asymmetric multiprocessing considered harmful
« Reply #15 on: April 16, 2023, 06:05:39 am »
You have completely misunderstood the idea, no wonder it hurts trying to push the square peg through the round hole.

The point is not to abstract the whole chip as something which runs a general-purpose OS scheduling arbitrary tasks/threads onto those cores.

The idea is to, for example, use the smaller CM4 core to run a dedicated bare-metal project which does a well-defined task of its own and communicates with the other core, which can then run a different bare-metal or maybe OS project. In such a case, the fact that they are of different (but similar) architectures is only a tiny bit of mental load.

If you want to support these things in your OS, the best approach is exactly to ignore the small auxiliary cores and only target the "main" core. Users who need the small cores then know exactly what they are doing.

What I would ideally want is for one compiler to be able to compile code shared between all cores, and then to offload tasks onto the cores best suited to them (e.g. high-throughput but high-latency code onto the biggest core, and low-throughput but low-latency code onto the smaller cores). This way the user could have one codebase, rather than having to separately compile code offline for a particular core and include it as essentially an opaque binary alongside the other code. As I have now been informed, most likely a lowest common denominator architecture can be compiled to that will run on all three cores, which would greatly simplify this; not being that familiar with RISC-V, this was something I had been uncertain about. Of course, that is unnecessary if all that is going to run on the other cores is precompiled stacks independent of one's own codebase.
 

Offline tabemann Topic starter

  • Contributor
  • Posts: 43
  • Country: us
Re: Asymmetric multiprocessing considered harmful
« Reply #16 on: April 16, 2023, 06:09:46 am »
I understand that your issue here is not so much with the non-homogeneous multi-core architecture, but with the architecture of your tool.
I have no doubt making it able to handle all cores kind of transparently is going to be a significant endeavour, and as others have said, I'm not sure if it's the way to go.

But OTOH, that could be interesting as a generalization of your tool. Of course, I wouldn't suggest doing this *only* for a particular target, if you're interested in the concept, but more as a generalization of your system (that I don't really know.) Those non-homogeneous multi-core SoCs are likely to become even more common in the future IMO.
That would imply adding ways for the user to assign specific tasks to a specific core though, as the whole point of these SoCs is to do precisely that.

Currently tasks are always specifically assigned to cores, by design; there is no automatic core assignment (other than to default to running a new task on the same core as the task that spawned it), unlike in many OS'es, for the very reason that the user may want to choose which core to run code on.
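
To make the assignment rule concrete, here is a minimal sketch of explicit core pinning with the "inherit the spawner's core" default. It is a Python model only; `Task` and `spawn` are hypothetical names and are not zeptoforth's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    core: int   # the core this task is pinned to; never changed by a scheduler


def spawn(name: str, parent: Task, core: Optional[int] = None) -> Task:
    """Create a task pinned to `core`, or to the parent's core if none is given."""
    return Task(name, parent.core if core is None else core)
```

For example, starting from `main = Task("main", 0)`, `spawn("logger", main)` lands on core 0, `spawn("sampler", main, core=1)` is explicitly placed on core 1, and anything the sampler spawns without an explicit core stays on core 1.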
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4040
  • Country: nz
Re: Asymmetric multiprocessing considered harmful
« Reply #17 on: April 16, 2023, 08:48:52 am »
Sign extension is exactly an area that I was thinking would be tricky

As an example, to sign-extend an 8 bit value to full register:

On RV64I:

Code:
slli x,x,56
srai x,x,56

On RV32I:

Code:
slli x,x,24
srai x,x,24

The RV64 code might work on a sloppy RV32 core that doesn't strictly raise an illegal-instruction trap on undefined opcodes, but just looks at bits 4:0 of the literal (not bits 5:0 as RV64 does) and sees 56&0x1f = 24.

Works on both RV64 and RV32 (guaranteed by the spec):

Code:
li y,-8
sll x,x,y
sra x,x,y
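
The reason the register-count version is portable can be checked with a small model of the shift semantics: register shifts use only the low log2(XLEN) bits of the count, so -8 acts as 24 on RV32 and as 56 on RV64. The Python below is only a sketch that mimics the three instructions above; the function name is made up.

```python
def sign_extend_byte(value, xlen):
    """Model `li y,-8; sll x,x,y; sra x,x,y` on a core with XLEN-bit registers."""
    mask = (1 << xlen) - 1
    shamt = -8 & (xlen - 1)       # 24 when xlen == 32, 56 when xlen == 64
    x = (value << shamt) & mask   # sll: shift left, truncate to XLEN bits
    if x >> (xlen - 1):           # top bit set: reinterpret as a negative value
        x -= 1 << xlen
    return (x >> shamt) & mask    # sra: >> on a negative Python int is arithmetic
```

Here `sign_extend_byte(0x80, 32)` gives `0xFFFFFF80` and `sign_extend_byte(0x80, 64)` gives `0xFFFFFFFFFFFFFF80`, while `0x7F` comes back unchanged on both widths.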
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
Re: Asymmetric multiprocessing considered harmful
« Reply #18 on: April 16, 2023, 11:21:07 am »
one core performing AES{crypt, decrypt}
one core performing ZIP{compress, decompress}
one core performing other CPU tasks

I am done ;D
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 
The following users thanked this post: RandallMcRee

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14488
  • Country: fr
Re: Asymmetric multiprocessing considered harmful
« Reply #19 on: April 16, 2023, 07:22:50 pm »
Currently tasks are always specifically assigned to cores, by design; there is no automatic core assignment (other than to default to running a new task on the same core as the task that spawned it), unlike in many OS'es, for the very reason that the user may want to choose which core to run code on.

So your main issue is generating code for different cores from the same language then, and what you call a common code base?

Those different cores in general are likely to have not just differences in the instruction sets, but also peripherals and various specific features that you need to address anyway. It's not just about one core having "lower latency", one core having higher CPU performance, etc. It's also that completely different things can be achieved with them.

As the core assignment is in all likelihood done statically (I wouldn't see a point of doing that dynamically if it's not automatic), your "compiler" can generate code adapted for each as it knows which core does what.

Now one benefit of having a common code base would be to make it easier to communicate between cores - that would be the real added value here IMO, so providing communications channels through some kind of mailbox.
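
As one possible shape for such a mailbox, here is a minimal single-producer, single-consumer ring buffer sketch. It models only the index logic in Python; the `Mailbox` class and its method names are hypothetical, and on real hardware the buffer would live in shared RAM, with memory barriers (and possibly an inter-core interrupt) needed around the index updates.

```python
class Mailbox:
    """Single-producer, single-consumer ring buffer; use one per direction."""

    def __init__(self, slots=8):
        self.buf = [None] * slots
        self.head = 0   # next slot to read; written only by the consumer
        self.tail = 0   # next slot to write; written only by the producer

    def send(self, msg):
        """Producer side: returns False instead of blocking when full."""
        nxt = (self.tail + 1) % len(self.buf)
        if nxt == self.head:
            return False          # one slot is kept free to tell full from empty
        self.buf[self.tail] = msg
        self.tail = nxt           # publish only after the slot has been filled
        return True

    def receive(self):
        """Consumer side: returns None when the mailbox is empty."""
        if self.head == self.tail:
            return None
        msg = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        return msg
```

Because each index has exactly one writer, no lock is needed between the two cores, only correct ordering of the writes. Note that `None` doubles as the "empty" marker here, so it cannot itself be sent as a message.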
 

Offline tabemann Topic starter

  • Contributor
  • Posts: 43
  • Country: us
Re: Asymmetric multiprocessing considered harmful
« Reply #20 on: April 16, 2023, 07:54:41 pm »
There is a difference between having different peripherals and requiring completely separate language implementations. A lowest common denominator ISA would allow all the cores to share the same compiler and much of the same code, even if each core requires its own multitasker and its own peripheral support. Were I to break my STM32H745 DISCOVERY out of its packaging this would be the case, since both the Cortex-M4 and the Cortex-M7 cores support ARMv7-M. But without a lowest common denominator ISA the code generator would need to support multiple ISAs, and any kernel code or generated code would have to be kept in multiple versions, one for each ISA it needs to execute under.
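
As a rough illustration of what "multiple versions" means for a code generator, the sketch below encodes one primitive, "add a small constant to a register", for two unrelated ISAs: 16-bit Thumb (`ADDS Rdn, #imm8`) and RV32I (`ADDI rd, rd, imm`). The function and type names are hypothetical and are not taken from zeptoforth; only the two instruction encodings are real.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical backend selector for an on-target code generator. */
typedef enum { ISA_THUMB, ISA_RV32I } isa_t;

/* Thumb 16-bit encoding: ADDS Rdn, #imm8 -> 0011 0 Rdn(3) imm8(8).
 * Only r0-r7 are reachable, and the instruction updates the flags. */
static size_t emit_addi_thumb(uint8_t *buf, unsigned rdn, unsigned imm8)
{
    assert(rdn < 8 && imm8 < 256);
    uint16_t insn = (uint16_t)(0x3000u | (rdn << 8) | imm8);
    buf[0] = (uint8_t)(insn & 0xFFu);  /* little-endian halfword */
    buf[1] = (uint8_t)(insn >> 8);
    return 2;
}

/* RV32I encoding: ADDI rd, rs1, imm -> imm[11:0] rs1 000 rd 0010011,
 * here with rs1 == rd. No flags exist to update. */
static size_t emit_addi_rv32(uint8_t *buf, unsigned rd, unsigned imm12)
{
    assert(rd < 32 && imm12 < 2048);   /* non-negative immediates only */
    uint32_t insn = ((uint32_t)imm12 << 20) | ((uint32_t)rd << 15)
                  | ((uint32_t)rd << 7) | 0x13u;
    for (int i = 0; i < 4; i++)
        buf[i] = (uint8_t)(insn >> (8 * i)); /* little-endian word */
    return 4;
}

/* The front end stays the same; every primitive needs one emitter per ISA. */
static size_t emit_addi(isa_t isa, uint8_t *buf, unsigned reg, unsigned imm)
{
    switch (isa) {
    case ISA_THUMB: return emit_addi_thumb(buf, reg, imm);
    case ISA_RV32I: return emit_addi_rv32(buf, reg, imm);
    }
    return 0;
}
```

Even for this one primitive the instruction size, register range, immediate range and flag behaviour all differ, and the same is true of every other primitive the compiler emits and of any hand-written assembly in the kernel.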
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6265
  • Country: fi
    • My home page and email address
Re: Asymmetric multiprocessing considered harmful
« Reply #21 on: April 17, 2023, 12:47:51 pm »
What do you gain from using the exact same compiler for all cores?

I gain nothing.  I habitually create build machinery so that I can trivially switch compilers; and I very often use custom build stages to generate or manipulate data.  On ARMv7E-M for example, I want my freestanding C/C++ code to be compiled to sensible machine code using both GCC and clang.  It is the same source code, implementing the same overall design, just compiled using various tools for the different types of cores.  I like asymmetric multiprocessing a lot.  Calling it harmful, even in jest/hyperbole, is utterly stupid in my view.

(As I've described elsewhere, I even like to pick my programming language based on the need.  For example, for fully hosted (running under a fully featured OS), I currently like to use Python 3 and Qt 5 for the user interface, because that lets the end users tweak/modify/fix the user interface without having to install any kind of development tools; and it also makes for a nice license break line if I want to include proprietary machinery in a dynamically linked native library.)

This reminds me of a discussion I had a few years ago with a self-professed "MPI Expert", who categorically declared asynchronous I/O harmful and dangerous, just because they themselves could not wrap their mind around how to do it effectively and efficiently without issues.

Instead of forcing your preferred model onto the hardware, either
  • Pick only hardware that suits your preferred model

    or
     
  • Pick a model that well exploits the hardware features
Otherwise, you're essentially recommending/suggesting/demanding others to stop using hardware or models that do not suit you, just because they do not suit you.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8179
  • Country: fi
Re: Asymmetric multiprocessing considered harmful
« Reply #22 on: April 17, 2023, 01:24:35 pm »
Any decent compiler should support all the cores you can buy from the series anyway, and then it's just a matter of compiling with different command line options. I don't see it as relevant whether the cores are integrated on the same die and sold together, or bought separately. For example, if your compiler supports Cortex-M7 but not Cortex-M4, it's of little use.
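
A small sketch of the "same source, different command line options" approach with an external toolchain: the file below would be compiled once per core (for example with GCC's `-mcpu=cortex-m4` or `-march=rv32imac`), and it picks its per-core branch from macros the compiler predefines for the selected target. The function names are made up for illustration, and the list of macros checked is only a sample.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns a label for the core family this translation unit was built for.
 * The macros are predefined by the compiler from the target options;
 * nothing here is decided at run time. */
static const char *build_target(void)
{
#if defined(__riscv) && (__riscv_xlen == 64)
    return "riscv64";
#elif defined(__riscv)
    return "riscv32";
#elif defined(__ARM_ARCH_7EM__)
    return "armv7e-m";   /* e.g. Cortex-M4 and Cortex-M7 */
#elif defined(__ARM_ARCH_6M__)
    return "armv6-m";    /* e.g. Cortex-M0+ */
#else
    return "other";      /* e.g. a host build used for testing */
#endif
}

/* Convenience check, e.g. for selecting a per-core code path. */
static int target_is(const char *name)
{
    assert(name != NULL);
    return strcmp(build_target(), name) == 0;
}
```

This works because the compiler runs on a development machine and is simply invoked once per target. It does not carry over directly to a compiler that lives on the target itself.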

Having the exact same ISA (or binary compatibility) would be handy only in the case of automagic core assignment, purely from a performance resource viewpoint. But that's not the point of multicore microcontrollers at all.
 

Offline newbrain

  • Super Contributor
  • ***
  • Posts: 1719
  • Country: se
Re: Asymmetric multiprocessing considered harmful
« Reply #23 on: April 17, 2023, 02:54:24 pm »
Any decent compiler should support all the cores you can buy from the series anyway, and then it's just a matter of compiling with different command line options.
I habitually create build machinery so that I can trivially switch compilers;
I think the point is being missed a bit in constantly referring to CLI options or build tools.
OP's perspective is quite different: we are not talking about external tools, but about a compiler that's an integral part of the Forth interpreter running on the target, with the inherent limitations.

It is quite common for Forth implementations to include assemblers and all non-tethered Forths include a Forth compiler.
Depending on the Forth model chosen, a word (≈ Forth function) can be compiled to threaded code or native code.
Mecrisp Stellaris chose the latter model; GForth, for example, the former (a modified direct threading).

I can understand that implementing different compiler "flavours" for different target cores is a PITA, and breaks the simple model "compile a word and run it on any core".
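
For readers who have not met threaded code, here is a minimal sketch in portable C. It is closest to what is usually called call threading, chosen because it can be written without assembly; it is not how GForth or Mecrisp Stellaris are actually implemented, and all names are hypothetical. A compiled word is just an array of cells that an inner interpreter walks, whereas a native-code Forth would emit machine instructions for the core it runs on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_DEPTH 16

typedef struct vm vm_t;
typedef void (*prim_fn)(vm_t *);

/* A cell is either a pointer to a primitive or an inline literal. */
typedef union { prim_fn fn; intptr_t lit; } cell_t;

struct vm {
    intptr_t stack[STACK_DEPTH]; /* data stack */
    size_t sp;                   /* number of items on the stack */
    const cell_t *ip;            /* next cell of the running thread */
};

static void push(vm_t *vm, intptr_t v)
{
    assert(vm->sp < STACK_DEPTH);
    vm->stack[vm->sp++] = v;
}

static intptr_t pop(vm_t *vm)
{
    assert(vm->sp > 0);
    return vm->stack[--vm->sp];
}

/* Primitives: the only part that is real machine code. */
static void prim_lit(vm_t *vm) { push(vm, (vm->ip++)->lit); } /* consume inline literal */
static void prim_dup(vm_t *vm) { intptr_t a = pop(vm); push(vm, a); push(vm, a); }
static void prim_mul(vm_t *vm) { intptr_t b = pop(vm); intptr_t a = pop(vm); push(vm, a * b); }
static void prim_add(vm_t *vm) { intptr_t b = pop(vm); intptr_t a = pop(vm); push(vm, a + b); }

/* Inner interpreter: fetch the next cell and call it, until a NULL cell. */
static void run(vm_t *vm, const cell_t *thread)
{
    vm->ip = thread;
    while (vm->ip->fn != NULL) {
        prim_fn f = (vm->ip++)->fn;
        f(vm);
    }
}

/* Roughly: : SQUARE-PLUS-ONE ( n -- n*n+1 ) DUP * 1 + ; */
static const cell_t square_plus_one[] = {
    { .fn = prim_dup }, { .fn = prim_mul },
    { .fn = prim_lit }, { .lit = 1 }, { .fn = prim_add },
    { .fn = NULL }
};
```

With this model the compiler only appends cells, so it needs no per-ISA instruction encoder; only the primitives and the inner loop are per-ISA. The cells would still have to resolve to each core's own primitives, and cell width differs between 32-bit and 64-bit cores, so even a threaded word is not automatically runnable on every core of a mixed-ISA part.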

That said, the advantages of having differently capable cores for different purposes in the same MCU have been quite clearly explained.
So the BL808 is definitely not an easy or maybe even "good" target for (this) Forth, as Nominal Animal said:
  • Pick only hardware that suits your preferred model

    or
     
  • Pick a model that well exploits the hardware features
Nandemo wa shiranai wa yo, shitteru koto dake.
 

Offline coppice

  • Super Contributor
  • ***
  • Posts: 8652
  • Country: gb
Re: Asymmetric multiprocessing considered harmful
« Reply #24 on: April 17, 2023, 03:27:26 pm »
This looks like a typical radio SoC. You have cores where you put a radio stack, get it approved and then leave it alone as much as you possibly can. Then you have cores where you can put applications, modify them, and not sink yourself back into the mire of complex approval processes every time. You very much DON'T want a single OS running across those cores.
That is the only way this design even makes sense.
That's the only sense the design was ever intended to make. Look at any of the competing devices. They are ALL highly asymmetric. It's not a bug, it's a feature. The more locked down the radio stack is, the easier it is to convince approvals people that it's securely locked away, and you don't need to keep re-approving every time you change the applications code. Buses, peripherals and memory spaces are all usually partitioned, partly to help with this isolation. You don't want the apps processor being able to tinker with values in peripherals the certified stack is supposed to be managing. So, they use different cores. Who cares? Compilers usually handle entire series of cores. So, whichever you choose, you just select the target core in your compile scripts, and to the programmer they all look much the same. It would be good to be able to run the same RTOS on all the cores, but not essential. Most people use third party stacks, running on whatever RTOS the developer used. What you run on the apps processor could be very different. It would be a terrible idea to run the same instance of an RTOS on all the cores.

The bottom line is if you make this device more symmetric, nobody in the radio business will buy it.
 

