Author Topic: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO  (Read 27316 times)

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #175 on: November 22, 2019, 09:47:02 pm »
Finding the code size from within the program itself is hacky and depends on findPrimes() and main() staying adjacent in the final binary, in that order. It seems to be reasonably reliable on gcc with -O1, but not at higher optimization levels.

Use nm:

*I* know how to find the real code size -- I've been doing that on many platforms for many years, as you can see in the results.

I added the hacky calculation in the program itself because last week not one of the people who submitted a result from a machine I don't have told me the code size.
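For readers who haven't seen the trick, here is a minimal sketch of the in-program size estimate described above. It is not the benchmark's actual code: find_primes_stub() and after_stub() are hypothetical stand-ins for findPrimes() and main(), and casting a function pointer to uintptr_t is an assumption that holds on gcc for common targets but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdint.h>

/* Byte distance between two code addresses. Only meaningful if `end`
   really follows `start` in the linked image. */
uint32_t code_span(uintptr_t start, uintptr_t end)
{
    assert(end >= start);  /* trips if the two addresses are in the wrong order */
    return (uint32_t)(end - start);
}

/* Hypothetical stand-ins for the benchmark's findPrimes() and main(). */
int find_primes_stub(void) { return 0; }
int after_stub(void)       { return 1; }

/* Estimate the size of find_primes_stub(), ASSUMING the toolchain placed
   after_stub() immediately behind it. Nothing in the language promises
   that, which is why the thread calls this hacky. */
uint32_t estimate_size(void)
{
    return code_span((uintptr_t)find_primes_stub, (uintptr_t)after_stub);
}
```

If the compiler or linker reorders the two functions, the result is garbage (or the assert fires), so a figure obtained this way is best checked against nm or the disassembly.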
 

Offline maginnovision

  • Super Contributor
  • ***
  • Posts: 1962
  • Country: us
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #176 on: November 22, 2019, 10:15:21 pm »
If you copy and paste the code, build with gcc -O1, and run it, you should get the right answer. I didn't check my avr binary, but I have checked my xs1 and xs2 binaries; I just didn't add them to the posts.
 

Offline GeorgeOfTheJungle

  • Super Contributor
  • ***
  • !
  • Posts: 2699
  • Country: tr
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #177 on: November 25, 2019, 11:55:30 am »
Quote
Can you look at the disassembly and see if the accesses are single instruction?

I'm using the arduino IDE, I don't know how to do that with this.

The Arduino IDE actually makes it relatively simple to do this. Make a trivial change to your source code -- maybe add and delete a character -- and then hit the "check"/"compile" button. In the panel at the bottom (maybe make it bigger) you'll see a line like the following, with the first (very long) "word" ending in gcc and the path to your eventual executable binary file in the middle (here Blink.ino.elf) after a "-o". For certain targets the first word might end in ld instead of gcc.

/home/bruce/software/arduino-1.8.10/hardware/teensy/../tools/arm/bin/arm-none-eabi-gcc -O1 -Wl,--gc-sections,--relax -T/home/bruce/software/arduino-1.8.10/hardware/teensy/avr/cores/teensy4/imxrt1062.ld -mthumb -mcpu=cortex-m7 -mfloat-abi=hard -mfpu=fpv5-d16 -o /tmp/arduino_build_829669/Blink.ino.elf /tmp/arduino_build_829669/sketch/Blink.ino.cpp.o /tmp/arduino_build_829669/core/core.a -L/tmp/arduino_build_829669 -larm_cortexM7lfsp_math -lm -lstdc++

Open a terminal window (from your OS, nothing to do with gcc) and copy and paste the bit with gcc or ld and the output file. Don't try to run it yet!

/home/bruce/software/arduino-1.8.10/hardware/teensy/../tools/arm/bin/arm-none-eabi-gcc  /tmp/arduino_build_829669/Blink.ino.elf

Now just replace the "gcc" bit by "objdump -d":

/home/bruce/software/arduino-1.8.10/hardware/teensy/../tools/arm/bin/arm-none-eabi-objdump -d  /tmp/arduino_build_829669/Blink.ino.elf

You can run that.

If you can't scroll your terminal window backwards then you might want to put " | more" (or " | less") on the end, or redirect the output to a file with " >/home/bruce/myDisassembly.txt" or whatever other location or name you want. (Your name probably isn't Bruce...)

If the compiler is gcc then you can get an assembly language listing by instead finding the line that compiled your code ("Blink.ino.cpp") to an object file ("-o .../Blink.ino.cpp.o"). You can just copy and paste the whole line into your console/terminal window and re-run it. If you add to the end " -g -Wa,-adhl" then you'll get a listing printed to the terminal with the original lines of C code, the generated assembly language, and the binary (hex) code for the instructions.

Thank you Sir!

This:
Code: [Select]
void loop () {
  register uint32_t mask= 0xa;
  register volatile uint32_t* toggle= (volatile uint32_t*) 0x4200408c;
  while (1) {
    *toggle= mask;
    *toggle= mask;
    *toggle= mask;
    *toggle= mask;
  }
}

Gives:
Quote
00000194 <loop>:
     194:   230a4a03    .word   0x230a4a03
     198:   6013         str   r3, [r2, #0]
     19a:   6013         str   r3, [r2, #0]
     19c:   6013         str   r3, [r2, #0]
     19e:   6013         str   r3, [r2, #0]
     1a0:   e7fa         b.n   198 <loop+0x4>
     1a2:   bf00         nop
     1a4:   4200408c    .word   0x4200408c

=> Not much room for improvement I guess...  :(
« Last Edit: November 25, 2019, 12:01:27 pm by GeorgeOfTheJungle »
The further a society drifts from truth, the more it will hate those who speak it.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #178 on: November 25, 2019, 01:47:15 pm »
Quote
00000194 <loop>:
     194:   230a4a03    .word   0x230a4a03
     198:   6013         str   r3, [r2, #0]
     19a:   6013         str   r3, [r2, #0]
     19c:   6013         str   r3, [r2, #0]
     19e:   6013         str   r3, [r2, #0]
     1a0:   e7fa         b.n   198 <loop+0x4>
     1a2:   bf00         nop
     1a4:   4200408c    .word   0x4200408c

And here we see the quite extraordinary phenomenon of a toolchain's "objdump" not understanding code generated by the compiler from the same toolchain! I've seen this with my Teensy 4.0.

The 230a4a03 should I believe (based on the rest of the code) disassemble to:

   194:      4a03      ldr      r2, [pc, #12]
   196:      230a      movs r3, #10

Those are not arcane or new instructions! They are absolutely standard original Thumb instructions present right from the ARM7TDMI in 1994.
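The hand decoding above can be checked mechanically. The following is only a sketch that recognises the two 16-bit Thumb encodings involved (LDR literal and MOVS immediate); the type and function names are made up for illustration, and it is in no way a general disassembler.

```c
#include <stdint.h>

/* Kinds of 16-bit Thumb instruction this sketch recognises. */
enum thumb_kind { THUMB_UNKNOWN, THUMB_LDR_LITERAL, THUMB_MOVS_IMM };

struct thumb_insn {
    enum thumb_kind kind;
    unsigned reg;   /* destination register number, r0..r7 */
    unsigned imm;   /* byte offset for LDR, immediate value for MOVS */
};

/* Split a little-endian 32-bit word into its two halfwords. The low
   halfword sits at the lower address, so it executes first. */
void split_word(uint32_t word, uint16_t *first, uint16_t *second)
{
    *first  = (uint16_t)(word & 0xffffu);
    *second = (uint16_t)(word >> 16);
}

/* Decode one halfword; anything not recognised comes back as THUMB_UNKNOWN. */
struct thumb_insn decode_thumb16(uint16_t hw)
{
    struct thumb_insn out = { THUMB_UNKNOWN, 0, 0 };
    unsigned top5 = (unsigned)hw >> 11;
    unsigned reg  = ((unsigned)hw >> 8) & 7u;
    unsigned imm8 = (unsigned)hw & 0xffu;

    if (top5 == 0x09u) {            /* 01001: LDR Rt, [PC, #imm8*4] */
        out.kind = THUMB_LDR_LITERAL;
        out.reg  = reg;
        out.imm  = imm8 * 4u;
    } else if (top5 == 0x04u) {     /* 00100: MOVS Rd, #imm8 */
        out.kind = THUMB_MOVS_IMM;
        out.reg  = reg;
        out.imm  = imm8;
    }
    return out;
}

/* Address a PC-relative literal load reads from: the PC value used is the
   instruction address plus 4, rounded down to a multiple of 4. */
uint32_t literal_address(uint32_t insn_addr, unsigned offset)
{
    return ((insn_addr + 4u) & ~3u) + offset;
}
```

Feeding it 0x230a4a03 gives 0x4a03 (ldr r2, [pc, #12]) at 0x194 followed by 0x230a (movs r3, #10) at 0x196, and literal_address(0x194, 12) works out to 0x1a4, which is where the listing shows the .word 0x4200408c.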
 

Offline GeorgeOfTheJungle

  • Super Contributor
  • ***
  • !
  • Posts: 2699
  • Country: tr
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #179 on: November 25, 2019, 02:18:04 pm »
Quote
00000194 <loop>:
     194:   230a4a03    .word   0x230a4a03
     198:   6013         str   r3, [r2, #0]
     19a:   6013         str   r3, [r2, #0]
     19c:   6013         str   r3, [r2, #0]
     19e:   6013         str   r3, [r2, #0]
     1a0:   e7fa         b.n   198 <loop+0x4>
     1a2:   bf00         nop
     1a4:   4200408c    .word   0x4200408c

And here we see the quite extraordinary phenomenon of a toolchain's "objdump" not understanding code generated by the compiler from the same toolchain! I've seen this with my Teensy 4.0.

The 230a4a03 should I believe (based on the rest of the code) disassemble to:

   194:      4a03      ldr      r2, [pc, #12]
   196:      230a      movs r3, #10

Those are not arcane or new instructions! They are absolutely standard original Thumb instructions present right from the ARM7TDMI in 1994.

And I retouched it a bit, because objdump gave me this:

Quote
00000194 <loop>:
     194:   230a4a03    .word   0x230a4a03
     198:   6013         str   r3, [r2, #0]
     19a:   6013         .short   0x6013
     19c:   6013         str   r3, [r2, #0]
     19e:   6013         .short   0x6013
     1a0:   e7fa         b.n   198 <loop+0x4>
     1a2:   bf00         nop
     1a4:   4200408c    .word   0x4200408c
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #180 on: November 25, 2019, 02:39:03 pm »
That's just insane.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 4122
  • Country: fi
    • My home page and email address
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #181 on: November 25, 2019, 03:38:22 pm »
Technically, gcc/cc1/g++ and gas/gdb/objdump are different packages in the same toolchain, which explains why gdb/objdump sometimes disagree with gcc about what the binary code actually is.

Nevertheless, that is an obvious bug in binutils-gdb, and should be reported to the bugzilla with reproducible examples.

That said, bugs #10288 and #10924 have been open since 2009, and are about ARM7TDMI instruction decoding.  It looks like nobody cared enough to do it properly.  Most likely, companies used just enough resources to get support into GCC, and let the users worry about the rest of the toolchain.
 

Offline GeorgeOfTheJungle

  • Super Contributor
  • ***
  • !
  • Posts: 2699
  • Country: tr
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #182 on: November 25, 2019, 06:51:58 pm »
I had never seen a µC do a jmp in zero cycles before.  :-+

Code: [Select]
while (1) *toggle= mask;
How fast can those Longan RISC-Vs toggle a gpio? 1/4th the µC clock? 10MHz was the max I could get on an esp32@240MHz.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 10171
  • Country: fr
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #183 on: November 25, 2019, 07:11:08 pm »
I had never seen a µC do a jmp in zero cycles before.  :-+

Code: [Select]
while (1) *toggle= mask;

While I'm not sure there were any MCUs per se that had this, there were certainly CPUs in general that could loop with zero overhead, using some kind of "REP" prefix instruction. Not exactly a general "jmp" of course, but it could still be used for many things.
(Don't some of the Microchip PICs have something like this in their instruction set? Maybe the dsPIC?)
 

Offline coppice

  • Super Contributor
  • ***
  • Posts: 6964
  • Country: gb
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #184 on: November 25, 2019, 07:13:35 pm »
I had never seen a µC do a jmp in zero cycles before.  :-+
There have been some DSP-oriented controller cores which offered zero-cycle loop overhead, just like most full-on DSP cores.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #185 on: November 26, 2019, 12:28:39 am »
I had never seen a µC do a jmp in zero cycles before.  :-+
There have been some DSP-oriented controller cores which offered zero-cycle loop overhead, just like most full-on DSP cores.

The new ARMv8.1-M spec includes zero-overhead loops:

[Attached image]

We think there's a better way -- stay tuned :-)
 

Offline Berni

  • Super Contributor
  • ***
  • Posts: 4295
  • Country: si
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #186 on: November 26, 2019, 06:17:07 am »
Oh that's neat. I had no idea that ARM could do that.

Do things like this get used by the C compiler when you write a for loop with a known length?
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: RPi 4 / STM32 / ESP32 / Teensy 4 / RISC-V GAZPACHO
« Reply #187 on: November 26, 2019, 08:21:48 am »
Oh that's neat. I had no idea that ARM could do that.

I'd expect it's going to be a year or two before they'll be shipping any cores that can do this.

Quote
Do things like this get used by the C compiler when you write a for loop with a known length?

That will be extremely easy to add to gcc and llvm, yes.
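To make the question concrete, this is the kind of loop being talked about: the trip count is known before the loop starts and the body has no early exit. It is an illustration only. sum_words() is a made-up name, and nothing here claims that any particular gcc or llvm release actually emits the ARMv8.1-M low-overhead loop instructions for it.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum n words. The loop runs exactly n times with no break or return in
   the body, so the trip count can be handed to a hardware loop counter
   once, up front, instead of being tested by a compare-and-branch on
   every iteration. */
uint32_t sum_words(const uint32_t *data, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += data[i];
    }
    return total;
}
```

A loop with a data-dependent exit (say, scanning for a terminator) is a harder fit, because the count isn't known when the loop is set up.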
 

