EEVblog Electronics Community Forum

Electronics => Beginners => Topic started by: t4rmo on March 22, 2021, 06:45:42 pm

Title: Questions about reverse engineering of a bootloader
Post by: t4rmo on March 22, 2021, 06:45:42 pm
Hi, I'm learning reverse engineering and practicing on an old router (Huawei HG532s).
I extracted the router's flash contents with a script from this post:
https://wikidevi.wi-cat.ru/Huawei_HG532s

I used binwalk to look at the different sections, and with binwalk -e I located the first section, which seems to be the bootloader. I analyzed this MIPS section with Ghidra and found a lot of undefined references. I know that I have to fix the memory map, but I don't know how, because the memory maps on the page linked above point to a lot of places outside the bootloader.
This is my problem: I want to repair the references because some strings appear when the router boots, and I would like to find where they are referenced in the code.

Router booting:

RT63365 at Wed Dec 17 16:09:06 CST 2014 version 0.8
Memory size 32MB
Found SPI Flash 8MiB S25FL064A at 0xb0000000
Press any key in 3 secs to enter boot command mode.
Search PHY addr and found PHY addr=0

For example, I would like to find where the string "Press any key in 3 secs" is referenced.


The question is: how can I correct the references so that I can locate where the strings are used and learn more about how the bootloader works?
Title: Re: Questions about reverse engineering of a bootloader
Post by: Renate on March 25, 2021, 12:54:06 pm
I'm ARM, not MIPS and I'm not familiar with the tools you are using, but...

Finding strings is obviously easy.
Finding where they are being used is more complicated.
That's especially true if the code was compiled as PIC (position-independent code).

Here's an example of the kind of thing you see often in ARM:
Code:
e59f116c  ldr  r1, [pc, #364] ; load offset to string
e7948003  ldr  r8, [r4, r3]   ; do unrelated work getting other arg
e08f1001  add  r1, pc, r1     ; make PC relative offset an absolute address
e1a00008  mov  r0, r8         ; finish getting other arg
eb0244f7  bl   2afa24         ; call function on (something, string)

00816810                      ; offset to string
So you can see that the two instructions necessary to generate a string address aren't even consecutive.
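
As a rough sketch of the arithmetic in that listing: in ARM (not Thumb) state, reading pc gives the address of the current instruction plus 8. The instruction addresses below are hypothetical, since the listing does not show them; only the immediate (#364) and the offset word (0x00816810) come from the example.

```python
ARM_PC_AHEAD = 8  # in ARM state, pc reads as the current instruction address + 8


def literal_address(ldr_addr, imm):
    """Address of the word loaded by `ldr rX, [pc, #imm]`."""
    return (ldr_addr + ARM_PC_AHEAD + imm) & 0xFFFFFFFF


def string_address(add_addr, offset):
    """Absolute address produced by `add rX, pc, rX` when rX holds `offset`."""
    return (add_addr + ARM_PC_AHEAD + offset) & 0xFFFFFFFF


# Hypothetical address of the ldr instruction (not given in the listing).
LDR_ADDR = 0x1000
# The add is the third instruction, two 4-byte instructions later.
ADD_ADDR = LDR_ADDR + 8

print(hex(literal_address(LDR_ADDR, 364)))        # where the offset word lives
print(hex(string_address(ADD_ADDR, 0x00816810)))  # where the string lives
```

A tool has to pair the ldr with the later add on the same register to recover the string address, which is why a plain linear disassembly listing does not show it.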

The first thing that I'd do is look generally for the equivalent of "add rx,pc,rx".
If you see lots of them then your code is probably PIC.
If you see lots more of loading a constant value and using it directly then your code probably isn't.
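
A minimal sketch of that check, assuming 32-bit ARM (A32) instruction words that have already been read into a list. The function name is made up for illustration; the mask matches an unshifted register ADD with pc as the first operand and ignores the condition field.

```python
def is_pic_add(word):
    """True if `word` encodes `add rX, pc, rX` (no shift, flags not set)."""
    # Bits 27..20 = 0x08 (ADD, register operand, S=0), Rn = pc (15),
    # bits 11..4 = 0 (no shift applied to Rm).
    if (word & 0x0FFF0FF0) != 0x008F0000:
        return False
    rd = (word >> 12) & 0xF
    rm = word & 0xF
    return rd == rm


# The five instruction words from the listing above.
words = [0xE59F116C, 0xE7948003, 0xE08F1001, 0xE1A00008, 0xEB0244F7]
pic_adds = sum(is_pic_add(w) for w in words)
print(pic_adds, "of", len(words), "instructions look like add rX, pc, rX")
```

Run over a whole image, a high count suggests PIC; this says nothing about MIPS, where the idioms are different.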

I have no idea if your tools can follow this stuff.
Standard GNU command-line tools like objdump do not.
I wrote my own little utility for ARM that can track this kind of thing.
Title: Re: Questions about reverse engineering of a bootloader
Post by: ve7xen on March 25, 2021, 09:06:07 pm
If you don't get the memory map correct, then all your absolute references will be incorrect, and for most architectures that means the disassembly will be practically useless.

I looked at another dump for this hardware, which might not match yours exactly but is probably close. Disassembling the first instructions in the boot ROM (at +0x280), I can see three instructions that perform an absolute jump to 0x81fb0290. Assuming this is the entry point / reset vector, it's a reasonable guess that the image is loaded at a base address of 0x81fb0000. If I then update the base address in the memory map in Ghidra, the disassembly starts to make a lot more sense, and many of the references to strings start to resolve.
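
A rough sketch of that reasoning in Python. The three instruction words are hypothetical stand-ins (a lui / ori / jr sequence, one common way to do an absolute jump on MIPS); the real dump may use different registers or addiu instead of ori. Big-endian byte order is assumed, and so is the idea that the jump lands at file offset 0x290, right after the jump's delay slot.

```python
import struct


def decode_abs_jump(code, offset, endian=">"):
    """Target of `lui rX, hi; ori/addiu rX, rX, lo; jr rX` at `offset`, else None."""
    lui, low, jr = struct.unpack_from(endian + "III", code, offset)
    if (lui >> 26) != 0x0F:                      # opcode 0x0f = lui
        return None
    reg = (lui >> 16) & 0x1F
    target = (lui & 0xFFFF) << 16
    if ((low >> 21) & 0x1F) != reg or ((low >> 16) & 0x1F) != reg:
        return None
    imm = low & 0xFFFF
    opcode = low >> 26
    if opcode == 0x0D:                           # ori: zero-extended immediate
        target |= imm
    elif opcode == 0x09:                         # addiu: sign-extended immediate
        signed = imm - 0x10000 if imm & 0x8000 else imm
        target = (target + signed) & 0xFFFFFFFF
    else:
        return None
    # jr rX: opcode 0 (SPECIAL), funct 0x08, rs = rX
    if (jr >> 26) != 0 or (jr & 0x3F) != 0x08 or ((jr >> 21) & 0x1F) != reg:
        return None
    return target


# Hypothetical image: padding up to 0x280, then lui $t0,0x81fb; ori $t0,$t0,0x290; jr $t0
image = bytes(0x280) + struct.pack(">III", 0x3C0881FB, 0x35080290, 0x01000008)

target = decode_abs_jump(image, 0x280)
ASSUMED_LANDING_OFFSET = 0x290                   # assumption: file offset the jump lands on
base = target - ASSUMED_LANDING_OFFSET
print(hex(target), hex(base))
```

The resulting base is what goes into the image base / memory map setting in Ghidra.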

There are still references outside the boot rom. These could be pointing to MMIO, peripherals, RAM, other areas of flash etc.
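
To sort those leftover references, something like the sketch below can help. The RAM and flash regions come from the boot log and the guesses in this thread; the 64 KiB bootloader size is only an assumption, and anything that falls outside the table needs the SoC documentation to identify.

```python
# (name, start, size) -- the bootloader entry comes first because it lies inside RAM.
REGIONS = [
    ("bootloader image", 0x81FB0000, 64 * 1024),          # size is an assumption
    ("RAM",              0x80000000, 32 * 1024 * 1024),   # "Memory size 32MB"
    ("SPI flash",        0xB0000000, 8 * 1024 * 1024),    # "8MiB ... at 0xb0000000"
]


def classify(addr):
    """Name of the first region containing `addr`, or a catch-all label."""
    for name, start, size in REGIONS:
        if start <= addr < start + size:
            return name
    return "unknown (MMIO, peripheral, or unmapped)"


for addr in (0x81FB0290, 0x80001000, 0xB0000000, 0xBFB00000):
    print(hex(addr), "->", classify(addr))
```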
Title: Re: Questions about reverse engineering of a bootloader
Post by: hermitengineer on March 25, 2021, 10:37:45 pm
Like ve7xen says, your first task is to find out where the processor begins running, as that is where the image must be loaded.  Then you set your address base to there.

Second task is to find all the special addresses of the processor.  For instance, the Renesas RZ line has multiple SPI, timers, CANBUS, UART/USARTs, video, etc, in addition to a few megs of RAM.  You need to make sure your disassembler knows all of those addresses too so that it can give you an idea what it's doing.  Get the data manual for the processor in question so that you can enter all of that as necessary.  In addition, processors such as the RZ can map flash memory as external memory so that it can be read with native instructions, so there may also be some dynamic memory configuration to watch out for.

Also, remember that firmware may have initialized, mutable data.  The initialization will be stored in the firmware, but the first thing the program has to do then is to unpack the values into RAM before running main(), because it sure can't use them as intended when they're in flash ROM.
Title: Re: Questions about reverse engineering of a bootloader
Post by: t4rmo on March 31, 2021, 09:25:57 pm
Thank you very much, ve7xen, you nailed it.
I also tried getting there on my own by comparing the distances between the strings with the values loaded by the load instructions, as described in this post:

https://reverseengineering.stackexchange.com/questions/16612/is-there-a-method-for-guessing-the-addresses-for-unknown-areas-in-bare-metal-f
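
A minimal sketch of that idea, not necessarily the exact method from the linked answer: every (pointer, string offset) pair votes for the base address pointer - offset, and the base that collects the most votes wins. The offsets and pointer values below are made up for illustration, and the 0x1000 alignment filter is an assumption.

```python
from collections import Counter


def guess_base(string_offsets, pointers, align=0x1000):
    """Return (base, votes) for the most frequent aligned `pointer - offset`, or None."""
    votes = Counter()
    for pointer in pointers:
        for offset in string_offsets:
            base = pointer - offset
            if base >= 0 and base % align == 0:
                votes[base] += 1
    return votes.most_common(1)[0] if votes else None


# Hypothetical file offsets of three strings found in the image.
string_offsets = [0x9A40, 0x9A70, 0x9B10]
# Hypothetical absolute values rebuilt from load instructions in the code.
pointers = [0x81FB9A40, 0x81FB9A70, 0x81FB9B10, 0xBFB00100]

print(guess_base(string_offsets, pointers))
```

With real data there are far more strings and pointers, so the correct base should stand out clearly from accidental matches.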

It was nice to see that your results and mine look the same. Your help was very useful and accurate. Your post explains in an easy way what took me so long to find out. :)
Title: Re: Questions about reverse engineering of a bootloader
Post by: t4rmo on April 01, 2021, 10:12:50 pm
To finish the topic: I think I don't fully understand what the image base and the start address of a segment are in a system, because I don't understand why the image base is higher than the start address.
In this case the image base is 0xb0000000 and the start address is 0x81fb0000.
Title: Re: Questions about reverse engineering of a bootloader
Post by: ve7xen on April 01, 2021, 11:24:35 pm
Quote from: t4rmo on April 01, 2021, 10:12:50 pm
To finish the topic: I think I don't fully understand what the image base and the start address of a segment are in a system, because I don't understand why the image base is higher than the start address.
In this case the image base is 0xb0000000 and the start address is 0x81fb0000.

Well, the processor seems to be poorly documented (publicly), so there is a lot of guesswork involved here. My guess would be that the code you're looking at is a second-stage bootloader, which is copied into RAM by some other code (possibly embedded in the SoC), or that the SoC has some similar mechanism to map or copy the bootloader into RAM. I believe that 0x80000000 is the start of RAM, so the fact that addresses in this range make sense probably indicates that it runs from RAM. Considering this device has 32 MB of RAM, that would put the top of this block 256 KB below the top of RAM, which is a sensible place to put the bootloader during early boot.
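
The arithmetic behind that estimate, as a quick sketch. The RAM start is the guess stated above, and the 64 KiB block size is an assumption that happens to reproduce the 256 KB figure; it is not something read from the image.

```python
RAM_BASE = 0x80000000             # guessed start of RAM
RAM_SIZE = 32 * 1024 * 1024       # "Memory size 32MB" from the boot log
LOAD_BASE = 0x81FB0000            # guessed load address of the bootloader
ASSUMED_BLOCK_SIZE = 64 * 1024    # assumption: size of the bootloader block

ram_top = RAM_BASE + RAM_SIZE
gap_from_base_kib = (ram_top - LOAD_BASE) // 1024
gap_from_block_top_kib = (ram_top - (LOAD_BASE + ASSUMED_BLOCK_SIZE)) // 1024

print(hex(ram_top))               # first address past the end of RAM
print(gap_from_base_kib)          # KiB between the load base and the top of RAM
print(gap_from_block_top_kib)     # KiB between the end of the block and the top of RAM
```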