Author Topic: Dhrystone 2.1 on mcus  (Read 41840 times)


dannyf (topic starter)
« Reply #50 on: March 07, 2014, 12:49:37 am »
Added numbers for LM4F120 (CM4) and LPC1343 (CM3).

================================
https://dannyelectronics.wordpress.com/
 

BravoV
« Reply #51 on: March 07, 2014, 04:37:45 am »

Quote
LM4F120:             2,914,     MDK-ARM,      optimized (-O3 + time)
AVR90USB1286:  237,       gcc-avr

Quote
Added numbers for LM4F120 (CM4) ....

Cool numbers. I never did a thorough comparison on my own, but this at least confirms my limited observation of the TI CM4: it "feels" so fast, even with my bloated noob code. By the way, I migrated from AVR, which was my sole MCU in the past. Thanks.

The worst part is that I have become too spoiled and lazy, which totally screws up my effort to learn code optimization.  :palm:
« Last Edit: March 07, 2014, 08:29:45 am by BravoV »
 

dannyf (topic starter)
« Reply #52 on: March 07, 2014, 12:24:20 pm »
Quote
Just want to make sure we agree that we are calculating results per MHz and not per MIPS.

That's what I suspected. I used an 8MHz crystal in my test and used 4MHz for the calculation, as the CPU is actually running at 4MHz. I understand if you used 8MHz - both approaches have a rationale. Thus I kept the numbers the way they are, knowing that people may think one is more valid than the other.

 

dannyf (topic starter)
« Reply #53 on: March 08, 2014, 01:59:11 pm »
STM32F4 added for gcc-arm (running at 16MHz): in line with the F3's numbers.
 

dannyf (topic starter)
« Reply #54 on: March 08, 2014, 02:09:23 pm »
STM32F4 under iar-arm added. I think this is the first time in this test where gcc-arm is faster than iar-arm, with optimization turned on. The validity of the test with optimization turned on, however, is dubious without further investigation.
 

dannyf (topic starter)
« Reply #55 on: March 08, 2014, 03:22:32 pm »
Also added are the STM32F4 numbers under mdk.

IAR and Keil seem to be running neck and neck. GCC appears to be quite a bit slower than either IAR or Keil.
 

nctnico
« Reply #56 on: March 08, 2014, 08:49:55 pm »
Replace the C library functions with internal ones and test again to make sure it's not the C library making the difference between IAR and GCC.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

dannyf (topic starter)
« Reply #57 on: March 09, 2014, 12:46:30 am »
Added LPC1114 (=CM0 running at 12MHz). The unoptimized Dhrystone/MHz number is actually quite comparable to the STM8S' (=good old 6502).

The old clunker isn't that bad, after all. :)

Or to put it another way, the OEMs are reasonably honest when they say that the CM0/1 chips are meant to compete with the 8-bitters.

 

jaxbird
« Reply #58 on: March 09, 2014, 11:51:38 am »
Quote
Just want to make sure we agree that we are calculating results per MHz and not per MIPS.

That's what I suspected. I used an 8MHz crystal in my test and used 4MHz for the calculation, as the CPU is actually running at 4MHz. I understand if you used 8MHz - both approaches have a rationale. Thus I kept the numbers the way they are, knowing that people may think one is more valid than the other.


In my tests, using the internal oscillator, the configuration was like this:

Fosc = ((7.37MHz * M) / N1) / N2

Where M, N1 and N2 are PLLFBD, PLLPOST and PLLPRE.

And the actual values used:

M = 65    
N1 = 2   
N2 = 3

Giving Fosc = 79.841 MHz. (+/- 2%)

The datasheet/reference does list most of the instructions as executing in a single cycle, but I do find that a bit questionable as it's clearly not oscillator clock cycles they are referring to.
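
For anyone who wants to check the arithmetic, here is a minimal sketch of the formula above in Python. The function name is made up for illustration, and the 7.37MHz input and the M, N1, N2 values are simply the ones quoted in this post, not read back from any register.

```python
def pll_output_mhz(f_in_mhz, m, n1, n2):
    """Fosc = ((Fin * M) / N1) / N2, the formula quoted in the post."""
    return f_in_mhz * m / n1 / n2

FRC_MHZ = 7.37  # nominal internal oscillator frequency, as quoted above

fosc_mhz = pll_output_mhz(FRC_MHZ, m=65, n1=2, n2=3)
tolerance_mhz = fosc_mhz * 0.02  # the +/- 2% oscillator tolerance quoted above

print(f"Fosc = {fosc_mhz:.3f} MHz, +/- {tolerance_mhz:.2f} MHz")
```

The 2% tolerance alone moves the result by roughly 1.6MHz either way, so the last digits of the Fosc figure should not be taken too seriously.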

Analog Discovery Projects: http://www.thestuffmade.com
Youtube random project videos: https://www.youtube.com/user/TheStuffMade
 

nctnico
« Reply #59 on: March 09, 2014, 12:49:36 pm »
Quote
Added LPC1114 (=CM0 running at 12MHz). The unoptimized Dhrystone/MHz number is actually quite comparable to the STM8S' (=good old 6502).

The old clunker isn't that bad, after all. :)

Or to put it another way, the OEMs are reasonably honest when they say that the CM0/1 chips are meant to compete with the 8-bitters.

What you wrote above makes all my alarm bells ring. I very much doubt that comparing unoptimised results has any real value. Unoptimised code is mostly used for debugging purposes, where each line of code is represented by some assembly language. The aim is not even to make production-grade code, as no-one in their right mind would use that in a product. A real test would be to optimise for size and for speed.
« Last Edit: March 09, 2014, 12:59:45 pm by nctnico »
 

dannyf (topic starter)
« Reply #60 on: March 09, 2014, 01:10:46 pm »
Quote
The datasheet/reference does list most of the instructions as executing in a single cycle, but I do find that a bit questionable as it's clearly not oscillator clock cycles they are referring to.

You are using a 24H part, right? Take a look at the clock tree. Fcy is 1/2 of Fosc, assuming doze is not set.

On 24F parts, you have to go through the datasheet to find that out.
 

Kjelt
« Reply #61 on: March 09, 2014, 10:48:17 pm »
Quote
STM8S' (=good old 6502).

You keep saying that. I didn't know that; is there some info on it?
I know the STM8 also has only the X and Y registers (unfortunately; if they had added some extra registers that would have been nice).
But is that the only similarity?
 

jaxbird
« Reply #62 on: March 23, 2014, 01:24:32 pm »
Quote
The datasheet/reference does list most of the instructions as executing in a single cycle, but I do find that a bit questionable as it's clearly not oscillator clock cycles they are referring to.

You are using a 24H part, right? Take a look at the clock tree. Fcy is 1/2 of Fosc, assuming doze is not set.

On 24F parts, you have to go through the datasheet to find that out.

Yeah, it's defined as Fcy = Fosc / 2. So a minimum of 2 oscillator clocks is required per instruction.

Anyway, not important, my motivation was primarily to find the main reason for the large differences in measured performance. I believe we agree this is where our calculations differ, so I'm satisfied :)
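
To make the factor-of-two difference concrete, here is a small Python sketch. The raw score is a made-up placeholder, not a measurement from this thread, and `per_mhz` is just an illustrative helper name; only the Fcy = Fosc / 2 relation and the roughly 79.84MHz Fosc come from the posts above.

```python
def per_mhz(dhrystones_per_second, clock_mhz):
    """Normalize a raw Dhrystone score by the chosen clock frequency."""
    return dhrystones_per_second / clock_mhz

raw_score = 40_000.0      # HYPOTHETICAL Dhrystones/s, for illustration only
fosc_mhz = 79.84          # oscillator clock from the PLL calculation
fcy_mhz = fosc_mhz / 2    # instruction clock, Fcy = Fosc / 2

score_per_fosc = per_mhz(raw_score, fosc_mhz)  # dividing by oscillator MHz
score_per_fcy = per_mhz(raw_score, fcy_mhz)    # dividing by instruction MHz

print(f"per Fosc MHz: {score_per_fosc:.1f}")
print(f"per Fcy MHz:  {score_per_fcy:.1f}")
```

Whatever the raw score is, the Fcy-based figure comes out exactly twice the Fosc-based one, which would account for a 2x gap between two otherwise identical measurements.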

 

GiskardReventlov
« Reply #63 on: March 25, 2014, 04:24:27 pm »
Quote
Didn't run on 8051 but would expect it to hold its own reasonably well.

Hold its own what?  I would say that this all falls under the category of "premature optimization", but maybe not.
Can you provide a few cases where this kind of performance is the keystone of a design?
I'm curious to know where this metric would be the top design decision.

I've been learning more about uCs and I see that some have 8051 cores in them; I don't know if any of the ones you tested do.
Do you know if any do?
 

dannyf (topic starter)
« Reply #64 on: March 25, 2014, 10:58:57 pm »
Added (simulated) scores for C51 (an NXP P87C51MC2, picked in order to hold the data). Scores are obtained in simulation under Keil C51, with a 24MHz crystal, and calculated off a 2MHz core frequency (I think the chip is a 12-cycle C51, so 24MHz / 12 = 2MHz).

 

dannyf (topic starter)
« Reply #65 on: March 26, 2014, 12:44:22 pm »
STM8s are advertised by ST as a 0.25DMIPS/MHz chip. At 1757 dhrystones/s per DMIPS, that translates into about 440 dhrystones/MHz, consistent with our measurements here.

Unfortunately, for the CMx chips, we are getting about 50 - 75% of the numbers published by ARM / vendors.
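
For reference, the DMIPS conversion can be sketched in a few lines of Python. The only fixed fact here is the usual convention that 1 DMIPS corresponds to 1757 Dhrystones per second (the VAX 11/780 reference score); the function names are made up for illustration.

```python
VAX_11_780_DHRYSTONES_PER_SECOND = 1757  # reference score that defines 1 DMIPS

def dmips_per_mhz_to_dhrystones_per_mhz(dmips_per_mhz):
    """Convert a vendor DMIPS/MHz figure to Dhrystones/s per MHz."""
    return dmips_per_mhz * VAX_11_780_DHRYSTONES_PER_SECOND

def dhrystones_per_mhz_to_dmips_per_mhz(dhrystones_per_mhz):
    """Convert a measured Dhrystones/s per MHz score to DMIPS/MHz."""
    return dhrystones_per_mhz / VAX_11_780_DHRYSTONES_PER_SECOND

stm8_claim = dmips_per_mhz_to_dhrystones_per_mhz(0.25)
print(stm8_claim)  # 439.25
```

Going the other way, a measured score divided by 1757 gives the DMIPS/MHz figure to hold against a vendor's claim.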
 

dannyf (topic starter)
« Reply #66 on: March 26, 2014, 02:42:38 pm »
The DMIPS/MHz numbers for the 8051 vary a lot, from lows of <0.1DMIPS/MHz to highs of 0.5DMIPS/MHz, with 0.25DMIPS/MHz (about 400+ dhrystones per MHz) being quoted quite often.

It is fairly remarkable that a chip from the 1980s is as fast as a chip introduced in the last 10 years (STM8).

Not sure what the 6500 has in terms of dhrystone scores.
 

westfw
« Reply #67 on: March 26, 2014, 05:13:09 pm »
Quote
Fairly remarkable in that a chip from the 1980s is as fast as [a newer chip]
You're still measuring DMIPS/MHz, right?  That's not "speed", that's just "architectural efficiency at running C code" or something like that.  The RISC claim is not so much that their architectures are fundamentally faster, just that they permit building SIMPLER chips, which in turn allows the clock rate to be pushed up and give you an overall faster chip.
 

hans
« Reply #68 on: March 26, 2014, 05:37:57 pm »
That's true. The Pentium 3 at 1GHz was faster than a Pentium 4 at 1GHz, but the Pentium 4 could clock far higher with its new pipeline design. At the end of its range it maxed out just under 4GHz or so, and we haven't seen much higher since (for example, my i5 3570K steps up to 3.9GHz under a 1-core load). Since then, the push for more performance has come from multi-threading and more efficient CPUs, with larger/better caches, more instructions to play with (if programs are built to use them), etc.

An interesting dimension to add is power consumption per MHz. From that you could calculate a performance/energy figure: you have both Dhrystones/MHz and mA/MHz, and dividing the first by the second gives a Dhrystone/mA ratio, or simply put "computing efficiency". That would be interesting for low-power electronics such as battery-powered devices, where the main load is the MCU doing work on a regular basis.
I don't know if it's acceptable to take these figures from the datasheet; they can depend a lot on which peripherals are turned on (ARM) or on the supply voltage.
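
As a rough sketch of that ratio in Python: the function name and both input values below are placeholders for illustration, not figures from any datasheet or from the results in this thread.

```python
def dhrystones_per_ma(dhrystones_per_mhz, ma_per_mhz):
    """Computing efficiency: (Dhrystones/s per MHz) / (mA per MHz).

    The MHz terms cancel, leaving Dhrystones/s per mA of supply current.
    """
    if ma_per_mhz <= 0:
        raise ValueError("current per MHz must be positive")
    return dhrystones_per_mhz / ma_per_mhz

# HYPOTHETICAL inputs: replace with a measured score and a datasheet current.
example_score = 430.0    # Dhrystones/s per MHz
example_current = 0.5    # mA per MHz, run mode, peripherals off (assumed)

efficiency = dhrystones_per_ma(example_score, example_current)
print(f"{efficiency:.0f} Dhrystones/s per mA")
```

Because the MHz terms cancel, the figure only holds as long as current really scales linearly with clock frequency, which is one more reason to treat datasheet-derived numbers with care.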

I think I have a board lying around with a PIC32 on it. I will see if I can run the test on that too, see how well MIPS4k compares. They claim 1.65DMIPS/MHz on that.
« Last Edit: March 26, 2014, 05:40:29 pm by hans »
 

westfw
« Reply #69 on: March 26, 2014, 05:42:16 pm »
(In this case, we're saying STM8 is as fast as CM0 (in DMIPS/MHz), but STM8 tops out at 16MHz, while CM0 in the same price range run 48-72MHz...)  (I count that as about 4x the DMIPS/Dollar...)
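
A quick Python sketch of where the "about 4x" comes from, assuming equal DMIPS/MHz and equal price for both parts. The 0.25 DMIPS/MHz value is the STM8 figure quoted earlier in the thread and cancels out of the ratio anyway; the helper name is illustrative only.

```python
def total_dmips(dmips_per_mhz, clock_mhz):
    """Overall throughput is per-MHz efficiency times clock frequency."""
    return dmips_per_mhz * clock_mhz

SAME_EFFICIENCY = 0.25  # DMIPS/MHz, assumed identical for both parts

stm8 = total_dmips(SAME_EFFICIENCY, 16)      # STM8 at 16 MHz
cm0_low = total_dmips(SAME_EFFICIENCY, 48)   # CM0 at 48 MHz
cm0_high = total_dmips(SAME_EFFICIENCY, 72)  # CM0 at 72 MHz

ratio_low = cm0_low / stm8
ratio_high = cm0_high / stm8
print(f"CM0 advantage: {ratio_low:.1f}x to {ratio_high:.1f}x")
```

At the same price the clock ratio alone gives 3x to 4.5x, which brackets the 4x estimate.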
 

dannyf (topic starter)
« Reply #70 on: March 26, 2014, 09:08:25 pm »
Added the simulated results for PIC32MX320F128H, under an old C32 compiler.

The unoptimized figure translates to 0.75 DMIPS/MHz, and the optimized one to 2.0 DMIPS/MHz - not that believable, however.
 

dannyf (topic starter)
« Reply #71 on: March 26, 2014, 09:09:24 pm »
Quote
I will see if I can run the test on that too, see how well MIPS4k compares. They claim 1.65DMIPS/MHz on that.

Would love to see where the real thing comes out to be.
 

Kjelt
« Reply #72 on: March 27, 2014, 08:04:14 am »
Quote
(In this case, we're saying STM8 is as fast as CM0 (in DMIPS/MHz), but STM8 tops out at 16MHz, while CM0 in the same price range run 48-72MHz...)  (I count that as about 4x the DMIPS/Dollar...)

Small correction: the STM8 tops out at 24MHz, but then needs an external oscillator.
The CM0 will give a great increase in speed BUT at the cost of code size: the code size of a CM0 is 30-35% larger than the STM8 code size, and that also adds to the final cost.
 

dannyf (topic starter)
« Reply #73 on: March 27, 2014, 03:43:49 pm »
The dhrystone benchmark for the 6502 (actually 65C02) that I can find suggests 0.022 DMIPS/MHz (not sure if it is scaled by 2 or not). That would translate into a dhrystone score of 30 / MHz. Slower than a PIC, :)
 

GiskardReventlov
« Reply #74 on: March 27, 2014, 05:37:28 pm »
Was curious when I discovered that 8051 (and 80C51?) cores are in a lot of uC. A quick digikey search shows 600-700 (plus have to subtract tape&reel, etc.). So let's say 500 (but still less if you subtract pkg types), but still a lot.  Or is that the only way to get an 8051? i.e. they only come as a core?

What does the C in 80C51 designate? (or 65C02)
 

