Author Topic: Microchips mid range mcu register file map  (Read 6725 times)

Offline pyroesp (Topic starter)

  • Regular Contributor
  • *
  • Posts: 180
  • Country: be
    • Nicolas Electronics
Microchips mid range mcu register file map
« on: July 11, 2012, 11:48:27 pm »
Hi everyone,

I have been working on an MPASM decompiler for 14-bit core PIC MCUs (mid-range MCUs) for a couple of days now.
I've got a semi-working decompiler as of now, and I have been improving it a lot.

My decompiler doesn't know about SFRs yet, so every file register access is decompiled as 'var_XXX'.
I found this pdf : http://ww1.microchip.com/downloads/en/devicedoc/33023a.pdf
It's the PICmicro mid range mcu reference manual.

I've made a CSV file containing the 35 instructions and I wanted to make another one with the SFRs, but it seems that the register file map in that pdf isn't correct/complete.

I'm currently working with a 12F675 and the SFR map in the pdf (fig 6-5, p. 6-10) is different from the one in the 12F675 datasheet.
There are different SFR names in the datasheet.

So, my questions are:
How would you proceed from here?
Should I make one SFR CSV file per mid-range mcu ? That's going to be a lot of files :-\ .

Thanks to anyone helping out,

pyroesp

PS: CSV file looks like this :
Quote
ADDWF,000111dfffffff,0
ANDWF,000101dfffffff,0
CLRF,0000011fffffff,0
CLRW,0000010xxxxxxx,0
Each line holds the instruction mnemonic, its binary format and an exception flag. The exception flag tells the decompiler whether it's a CALL-type or a GOTO-type instruction.
That way I can tell a function address apart from a label address.
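
A minimal sketch of how a pattern such as `000111dfffffff` could be turned into a mask/value pair and matched against a 14-bit word. The names `opcode_pattern`, `pattern_compile`, `pattern_matches` and `pattern_field` are made up for illustration and are not taken from the decompiler.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: fixed bits of one instruction pattern. */
typedef struct {
    uint16_t mask;   /* 1 where the pattern has a literal '0' or '1' */
    uint16_t value;  /* the literal bits themselves */
} opcode_pattern;

/* Compile a 14-character pattern such as "000111dfffffff".
 * Letters (d, f, b, k, x) are treated as "don't care" bits.
 * Returns 0 on success, -1 if the pattern is not 14 characters long. */
static int pattern_compile(const char *bits, opcode_pattern *out)
{
    if (strlen(bits) != 14)
        return -1;
    out->mask = 0;
    out->value = 0;
    for (int i = 0; i < 14; i++) {
        out->mask = (uint16_t)(out->mask << 1);
        out->value = (uint16_t)(out->value << 1);
        if (bits[i] == '0' || bits[i] == '1') {
            out->mask |= 1;
            out->value |= (uint16_t)(bits[i] - '0');
        }
    }
    return 0;
}

/* True when the fixed bits of the word agree with the pattern. */
static int pattern_matches(const opcode_pattern *p, uint16_t word)
{
    return (word & 0x3FFF & p->mask) == p->value;
}

/* Collect the bits of `word` that sit under the letter `field`,
 * most significant bit first (e.g. 'f' gives the register address). */
static uint16_t pattern_field(const char *bits, char field, uint16_t word)
{
    uint16_t result = 0;
    for (int i = 0; i < 14; i++) {
        if (bits[i] == field)
            result = (uint16_t)((result << 1) | ((word >> (13 - i)) & 1));
    }
    return result;
}
```

With this, the word 0x07A0 matches the ADDWF line and yields f = 0x20 and d = 1, so the CSV stays the only place where opcodes are described.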

PPS: The decompiler doesn't use any OS-specific functions, so it should be cross-platform for you Linux and Mac fans.
« Last Edit: July 12, 2012, 12:08:39 am by pyroesp »
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 8380
Re: Microchips mid range mcu register file map
« Reply #1 on: July 12, 2012, 10:43:04 am »
Quote
Should I make one SFR CSV file per mid-range mcu ? That's going to be a lot of files :-\ .
This seems to be the best way to do it. An alternative is to define the "common" registers in a base file, and model-specific ones in separate files for each model, to minimise duplication. Writing something to extract the data from all the datasheets and produce the files automatically would also be a good idea if there are enough models you want to support.

Obviously each model is going to have a different set of registers, that's why their part numbers are different.
 

Offline pyroesp (Topic starter)

  • Regular Contributor
  • *
  • Posts: 180
  • Country: be
    • Nicolas Electronics
Re: Microchips mid range mcu register file map
« Reply #2 on: July 12, 2012, 03:30:18 pm »
I think I'll go with a common register CSV and model specific register CSV.

I could set some arguments to the program like "prog.exe -h hexfilepath -mcu midrange -p PICXXFXXX".
'midrange' tells the program to use the common register file for midrange mcu and PICXXFXXX the model specific register file.
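
A rough sketch of the common-plus-specific lookup: the model-specific table is searched first, so it can override a name from the common table. The tables are hard-coded here instead of being parsed from CSV, and `sfr_entry`, `find_in` and `sfr_name` are hypothetical names. The entries shown (PORTA at 0x05 in the common map, GPIO at 0x05 and TRISIO at 0x85 on the 12F675) are only a small excerpt.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t    address;  /* full address, bank offset included */
    const char *name;
} sfr_entry;

#define COUNT_OF(a) (sizeof (a) / sizeof (a)[0])

/* Stand-in for the "midrange" common CSV (excerpt only). */
static const sfr_entry common_sfrs[] = {
    { 0x02, "PCL" }, { 0x03, "STATUS" }, { 0x04, "FSR" }, { 0x05, "PORTA" },
};

/* Stand-in for the PIC12F675 model-specific CSV (excerpt only). */
static const sfr_entry pic12f675_sfrs[] = {
    { 0x05, "GPIO" }, { 0x85, "TRISIO" },
};

static const char *find_in(const sfr_entry *table, size_t count, uint16_t addr)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].address == addr)
            return table[i].name;
    }
    return NULL;
}

/* Model-specific names win over common ones.
 * NULL means "not an SFR", so the caller falls back to var_XXX. */
static const char *sfr_name(uint16_t addr)
{
    const char *name = find_in(pic12f675_sfrs, COUNT_OF(pic12f675_sfrs), addr);
    if (name == NULL)
        name = find_in(common_sfrs, COUNT_OF(common_sfrs), addr);
    return name;
}
```

Searching the specific table first means a model file only has to list what differs from the common map.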

Making a program to extract data from datasheets will be a waste of time to do. Using a scripting language wouldn't be a bad idea, but I'm a C/asm guy.

Anyways, thanks for your ideas.

pyroesp
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4277
  • Country: us
Re: Microchips mid range mcu register file map
« Reply #3 on: July 13, 2012, 12:38:36 am »
Quote
Making a program to extract data from datasheets will be a waste of time to do.
Extract them from the Microchip-provided include (.h) files distributed with their freeware compilers.
(Watch the licensing: you may not be permitted to distribute those .h files yourself, but everyone should be able to get them.)
 

Offline notsob

  • Frequent Contributor
  • **
  • Posts: 704
  • Country: au
Re: Microchips mid range mcu register file map
« Reply #4 on: July 13, 2012, 12:44:45 am »
AND the MOST common failing with PICs - READ the ERRATA for the model you are working with
 

Offline poorchava

  • Super Contributor
  • ***
  • Posts: 1672
  • Country: pl
  • Troll Cave Electronics!
Re: Microchips mid range mcu register file map
« Reply #5 on: July 13, 2012, 12:34:17 pm »
+1 with both of my hands and both of my feet for that. I wonder if at some point the errata won't exceed the size of the datasheet itself (dspic30?)
I love the smell of FR4 in the morning!
 

Offline pyroesp (Topic starter)

  • Regular Contributor
  • *
  • Posts: 180
  • Country: be
    • Nicolas Electronics
Re: Microchips mid range mcu register file map
« Reply #6 on: July 13, 2012, 05:28:54 pm »
Thanks for the ideas guys.

I thought about the include (.inc) files used when writing MPASM code, but they don't only contain register addresses.
They also contain other values (bit positions, for example), and I'm not sure how to tell my program what is a register address and what isn't.

Anyways, got a weird bug in the decompiler, codeblocks debugger doesn't help (at all !).
I've also started programming what I described: one common register file and one model-specific register file.

Instructions are stored in an instruction folder, common registers in a common folder and model-specific registers in a specific folder.
Parsing the CSV and hex files doesn't seem to be a problem. It's just the decompiler function that does some weird sh** when I want to read a variable in a structure.

pyroesp
 

Offline pyroesp (Topic starter)

  • Regular Contributor
  • *
  • Posts: 180
  • Country: be
    • Nicolas Electronics
Re: Microchips mid range mcu register file map
« Reply #7 on: July 14, 2012, 05:46:08 pm »
Good news guys, I think my decompiler is pretty much done. Still need to add something to generate an EQU list for the variables and add SFRs to it.

Can some of you send me some mid-range PIC assembler code together with its hex file (not too big please)?
I'd like to test the decompiler a bit.

pyroesp

EDIT:
Damn, I forgot about bank switching. I need to add something that tracks the STATUS register so the program knows which offset to add to an address after a bank switch:
bank 0 -> 0x00
bank 1 -> 0x80
bank 2 -> 0x100
bank 3 -> 0x180

Pain in the *** if you ask me.
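
As a small sketch, the offsets in the list above follow directly from the RP1:RP0 bits (bits 6 and 5) of STATUS. The function names `bank_offset` and `full_address` are invented for this example.

```c
#include <stdint.h>

#define STATUS_RP0 5  /* bank select bit 0 */
#define STATUS_RP1 6  /* bank select bit 1 */

/* Offset of the selected bank: 0x00, 0x80, 0x100 or 0x180. */
static uint16_t bank_offset(uint8_t status)
{
    uint16_t bank = (uint16_t)((status >> STATUS_RP0) & 0x03);
    return (uint16_t)(bank << 7);
}

/* Combine the 7-bit f operand of an instruction with the bank offset. */
static uint16_t full_address(uint8_t status, uint8_t f)
{
    return (uint16_t)(bank_offset(status) | (f & 0x7F));
}
```

For instance, with RP0 set (STATUS = 0x20) an operand of 0x05 resolves to 0x85.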
« Last Edit: July 14, 2012, 10:21:12 pm by pyroesp »
 

Offline rbk17c

  • Newbie
  • Posts: 7
Re: Microchips mid range mcu register file map
« Reply #8 on: July 31, 2012, 11:22:49 pm »
I think tracking the BANK is next to impossible. - Try tracing down an error, caused by this  :o :o :o :( >:( >:( >:( .

Perhaps you should only attempt to translate if you use 'access' mode, like movwf 0xf0, ACCESS.
On PIC16s it might be next to impossible; consider decoding:

banksel TRIS
 call setsome
bcf BSR, 1
 call setsome
...
setsome:
 movlw 0x55
 movwf TRIS
 return

- Gulp !
Good luck anyway,

/holger

 

Offline pyroesp (Topic starter)

  • Regular Contributor
  • *
  • Posts: 180
  • Country: be
    • Nicolas Electronics
Re: Microchips mid range mcu register file map
« Reply #9 on: August 01, 2012, 01:23:40 am »
BANKSEL TRIS is just a directive for setting the bank.
The assembler converts it into actual instructions.
BANKSEL is not an ASM instruction.

Not sure what the assembler translates it to; I'll have a quick look at the MPASM user guide.

EDIT:
MPASM User Guide : http://ww1.microchip.com/downloads/en/devicedoc/33014j.pdf
see 4.7 p.52 : "For 14-bit instruction width (most PIC12/PIC16) devices, bit set/clear instructions on the STATUS register will be generated".
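
Since BANKSEL ends up as plain BCF/BSF instructions on STATUS, a decompiler can spot them in the hex words. The sketch below decodes a single word and updates a tracked STATUS value. `track_status` is a hypothetical name, and the sketch deliberately ignores every other way of writing STATUS (MOVWF, CLRF and so on).

```c
#include <stdint.h>

#define STATUS_ADDR 0x03  /* STATUS sits at 0x03 in every bank */

/* Return the tracked STATUS value after executing `word`.
 * Only BCF (01 00bb bfff ffff) and BSF (01 01bb bfff ffff) are handled;
 * any other instruction leaves the tracked value unchanged. */
static uint8_t track_status(uint8_t status, uint16_t word)
{
    uint16_t opcode = word & 0x3C00;
    uint8_t  bit    = (uint8_t)((word >> 7) & 0x07);
    uint8_t  f      = (uint8_t)(word & 0x7F);

    if (f != STATUS_ADDR)
        return status;
    if (opcode == 0x1000)                       /* BCF STATUS, bit */
        return (uint8_t)(status & ~(1u << bit));
    if (opcode == 0x1400)                       /* BSF STATUS, bit */
        return (uint8_t)(status | (1u << bit));
    return status;
}
```

Here 0x1683 is BSF STATUS, RP0 and 0x1283 is BCF STATUS, RP0, the pair a BANKSEL between bank 0 and bank 1 would typically produce.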

EDIT 2:
Also, it's a decompiler: hex to ASM code, that's it. It does not track any errors.
The compiler will warn the user if there are any banking issues, and if there are, it won't compile the code.

Here is a link to the first release of the decompiler : https://www.eevblog.com/forum/projects-designs-and-technical-stuff/mpasm-decompiler/

PS: Don't know what BSR is ... ? Can't find anything in midrange PIC pdf

EDIT 3:
BSR is for enhanced midrange PIC microcontrollers. My decompiler is mid-range only as of now.

Anyway, it's just a matter of keeping track of the BSR register. BANKSEL will probably generate something like BSF BSR, 1 or so.
It's not impossible, because I already did it for the mid-range MCUs.

Cheers,

pyroesp
« Last Edit: August 01, 2012, 01:44:28 am by pyroesp »
 

Offline rbk17c

  • Newbie
  • Posts: 7
Re: Microchips mid range mcu register file map
« Reply #10 on: August 01, 2012, 02:56:53 pm »
Oh, yes sorry. Let me try again.
This could be found on a pic12f629:

bcf 0x03, 5 (STATUS, RP0 =0)
call set_open_tristate
bsf 0x03, 5 (STATUS, RP0 =1)
call set_open_tristate
...
set_open_tristate
    movlw 0xff
    movwf 0x05       (GPIO or TRISIO)
    return

How will you ever decode 0x05? It's GPIO on the first call and TRISIO on the second. :-(

What I am trying to say is: tracking STATUS ranges from hopeless to impossible.
How about instead you just assume BANK0 and translate to something like:
    movwf GPIO      ; or TRISIO
i.e. use comments in your decoded output?
Recompiling should give you the same code.
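
A possible shape for that output, as a sketch only: always print the bank 0 name and append the bank 1 name as a comment when it differs. The small table is an excerpt for a two-bank part like the 12F629/675, and `lookup` and `format_movwf` are invented names.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    uint16_t    address;
    const char *name;
} sfr_entry;

/* Excerpt of a two-bank register map (assumed, not a full table). */
static const sfr_entry sfrs[] = {
    { 0x03, "STATUS" }, { 0x05, "GPIO" },
    { 0x83, "STATUS" }, { 0x85, "TRISIO" },
};

static const char *lookup(uint16_t addr)
{
    for (size_t i = 0; i < sizeof sfrs / sizeof sfrs[0]; i++) {
        if (sfrs[i].address == addr)
            return sfrs[i].name;
    }
    return NULL;
}

/* Write "movwf <bank 0 name> ; or <bank 1 name>" into buf.
 * Falls back to the raw address when bank 0 has no known SFR there. */
static void format_movwf(uint8_t f, char *buf, size_t size)
{
    const char *bank0 = lookup((uint16_t)(f & 0x7F));
    const char *bank1 = lookup((uint16_t)(0x80 | (f & 0x7F)));

    if (bank0 == NULL)
        snprintf(buf, size, "movwf 0x%02X", (unsigned)(f & 0x7F));
    else if (bank1 == NULL || strcmp(bank0, bank1) == 0)
        snprintf(buf, size, "movwf %s", bank0);
    else
        snprintf(buf, size, "movwf %s ; or %s", bank0, bank1);
}
```

Registers that are mirrored in both banks, such as STATUS, get no comment because there is nothing ambiguous about them.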

please don't give up - you are getting there. :-)

/holger



 

Offline pyroesp (Topic starter)

  • Regular Contributor
  • *
  • Posts: 180
  • Country: be
    • Nicolas Electronics
Re: Microchips mid range mcu register file map
« Reply #11 on: August 01, 2012, 03:12:44 pm »
Oh , I see what you did there ...
Yes this will be a problem.

The decompiler will probably see it as a TRISIO if the bank didn't change between the call and the actual function.
That's a tricky one.

That's a flaw in my decompiler.
It just decodes every word into ASM code. If it's a call, it doesn't jump to it, it just continues to decode.

So if you have something like this:
Code: [Select]
    BCF STATUS, RP0
    CALL data_out
    BSF STATUS, RP0
...
data_out
    MOVLW 0xFF
    MOVWF GPIO

It will get decoded as MOVWF TRISIO  :-\.

Anyway, it's just a "visual" bug. TRISIO has the same address as GPIO apart from the 0x80 bank offset.
It should still assemble correctly after decompiling, even though it says TRISIO instead of GPIO.

pyroesp

PS: I'm not giving up, I actually released the decompiler already, even though it has some flaws. But I'm no expert so I think it's pretty good for now :P

EDIT:
For the bug I just mentioned, I could keep an array holding, for each function, its address and the STATUS register value just before the CALL to it.
All I need to do is check whether the address of the current hex word matches one of the addresses in that array.
If it does, I set the tracked STATUS register to the value stored in the array.

If the same function is called multiple times, only the first call is saved in the array.
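
The array described in this edit could look roughly like the following. This is a sketch with a fixed-size table; `call_record`, `call_table`, `call_table_record` and `call_table_lookup` are hypothetical names, and the size limit is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_FUNCTIONS 64  /* arbitrary limit for this sketch */

typedef struct {
    uint16_t address;  /* address of the called function */
    uint8_t  status;   /* tracked STATUS just before the CALL */
} call_record;

typedef struct {
    call_record entries[MAX_FUNCTIONS];
    size_t      count;
} call_table;

/* Remember STATUS at a CALL. Only the first call to a given function
 * is kept. Returns 1 if stored, 0 if already known or the table is full. */
static int call_table_record(call_table *t, uint16_t target, uint8_t status)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].address == target)
            return 0;
    }
    if (t->count == MAX_FUNCTIONS)
        return 0;
    t->entries[t->count].address = target;
    t->entries[t->count].status = status;
    t->count++;
    return 1;
}

/* If `address` is the entry of a recorded function, copy the saved
 * STATUS into *status and return 1; otherwise return 0. */
static int call_table_lookup(const call_table *t, uint16_t address, uint8_t *status)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].address == address) {
            *status = t->entries[i].status;
            return 1;
        }
    }
    return 0;
}
```

Keeping only the first call is what makes this a best guess: a second call from a different bank is silently ignored here.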
« Last Edit: August 01, 2012, 03:18:22 pm by pyroesp »
 

Offline rbk17c

  • Newbie
  • Posts: 7
Re: Microchips mid range mcu register file map
« Reply #12 on: August 01, 2012, 07:55:47 pm »
:) I agree, It's good.

I do not see the previous method as a bug. I am sure I could think up 10 more ways to make the status impossible to predict. :-)

My point was that there is NO way to know the bank, at a given position in the code. Trying to track it will increase the likelihood of getting it right,
but it could still be wrong, leading to confusion.

Better to specify that you ALWAYS use BANK0-name, and give a hint (comment) that it might as well be something else; like:
  movwf PIR1     ; alternatives - PIE1
- That would ALWAYS be right.

OK - I give you that doing this for a pic16f1826 would be just ridiculous  ;)   :
  movwf PIR1     ; alternatives - PIE1, CM1CON0, EEADRL, SSP1BUF, CCPR1L, ...

And how cool is a disassembler that adds comments!

/holger

 

Offline pyroesp (Topic starter)

  • Regular Contributor
  • *
  • Posts: 180
  • Country: be
    • Nicolas Electronics
Re: Microchips mid range mcu register file map
« Reply #13 on: August 01, 2012, 11:01:25 pm »
Quote
:) I agree, It's good.
Thanks  ;D.

Quote
Better to specify that you ALWAYS use BANK0-name, and give a hint (comment) that it might as well be something else; like:
  movwf PIR1     ; alternatives - PIE1
- That would ALWAYS be right.

OK - I give you that doing this for a pic16f1826 would be just ridiculous  ;)   :
  movwf PIR1     ; alternatives - PIE1, CM1CON0, EEADRL, SSP1BUF, CCPR1L, ...

And how cool is a disassembler that adds comments!
I'll try to add the comments to the code, but I'm still going to track the bank, so it will still write the register it thinks is used at that point,
even though it might be wrong :P.

Quote
I am sure I could think up 10 more ways to make the status impossible to predict. :-)
I'll be happy to see that, could learn a few tricks ;).

pyroesp
 

Offline baljemmett

  • Supporter
  • ****
  • Posts: 665
  • Country: gb
Re: Microchips mid range mcu register file map
« Reply #14 on: August 01, 2012, 11:45:36 pm »
Quote
Better to specify that you ALWAYS use BANK0-name, and give a hint (comment) that it might as well be something else; like:

It should be possible to show the bank n name where you're sure bank n will be selected -- e.g. by keeping track of which program words can be reached indirectly (-> are the target of a goto, call or skip, or within the range of words reachable by a manipulation of PCL, remembering that in the latter case you need to track the possible values of PCLATH too) -- and then if a statement can only be reached sequentially and the bank selection bits are in a known state at that point, print the relevant register name.  If there's any doubt at all, print the alternatives in a comment.  Or print best-guess and always list the alternatives just in case.
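
A first step toward this could be a pass that marks every word that can be reached non-sequentially. The sketch below is simplified: it handles GOTO, CALL and the four skip instructions, assumes the whole program sits in one 2K page (so PCLATH can be ignored), and does not cover computed jumps through PCL or the return point after a CALL. `mark_jump_targets` is an invented name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Set is_target[i] to 1 for every word that is the destination of a
 * GOTO or CALL, or that can be landed on by a skip instruction. */
static void mark_jump_targets(const uint16_t *code, size_t n, uint8_t *is_target)
{
    memset(is_target, 0, n);
    for (size_t pc = 0; pc < n; pc++) {
        uint16_t w = code[pc] & 0x3FFF;
        size_t target = n;  /* n means "no target" */

        if ((w & 0x3000) == 0x2000) {
            /* CALL (10 0kkk kkkk kkkk) or GOTO (10 1kkk kkkk kkkk) */
            target = w & 0x07FF;
        } else if ((w & 0x3F00) == 0x0B00 ||   /* DECFSZ */
                   (w & 0x3F00) == 0x0F00 ||   /* INCFSZ */
                   (w & 0x3800) == 0x1800) {   /* BTFSC / BTFSS */
            target = pc + 2;                   /* word after the skipped one */
        }
        if (target < n)
            is_target[target] = 1;
    }
}
```

Words left unmarked can only be entered from the instruction before them, which is the case where a known bank state can be carried forward safely.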

You can then get fancy by keeping track of whence indirect control flow came and performing the same operation to see if a selected bank can be determined definitely at those jump sites.  And then you're a mere hop, skip and jump away from needing to solve the halting problem to make any further progress ;)

Doing the second part gives you the ability to provide cross-references too -- 'this code is called from X, Y and Z and is the target of a goto at A and B' style stuff, which is a rather nice feature to have in a disassembler anyway.  I believe IDA provides this with automatically-generated comments at each point that can be reached indirectly, for instance.
 

Offline pyroesp (Topic starter)

  • Regular Contributor
  • *
  • Posts: 180
  • Country: be
    • Nicolas Electronics
Re: Microchips mid range mcu register file map
« Reply #15 on: August 02, 2012, 03:40:41 am »
@baljemmett: Not sure I understood everything you said ( :-[ ), but I think you gave me an awesome idea with which I should be able to fix this bank issue.
Let me know what you guys think about it.

Sorry, it's a pretty HUGE post.

My current version of the decompiler just generates a string array. Each string holds a variable declaration (using EQU), an instruction plus its arguments (if any), an ORG, a label or a function.

The idea is to use a structure that holds one word from the hex file, its address, the status value at that point, the instruction, the arguments of the instruction if needed, and so on.

Something like this:
Code: [Select]
#include <stdint.h> // for uint8_t, uint16_t and int16_t

struct sDecompiler{
    uint8_t first; // tells the decompiler it's the first instruction
    uint16_t bank_offset; // tells the decompiler which bank is selected

    /* register */
    uint8_t status; // status variable

    /* decompiler notifications */
    uint8_t type; // GOTO-type instruction, CALL-type instruction, RETURN-instruction, LABEL address, FUNCTION address or neither of those
    uint8_t used;
    uint8_t status_change;

    /* hex file */
    uint16_t PC; // PC
    uint16_t hex_word; // word at address PC

    /* instruction and arguments */
    char *instruction; // string of the used instruction in hex_word

    int16_t f; // -1 if not used, 0x00-0x7F if used
    int16_t d; // -1 if not used, 0-1 if used
    int16_t b; // -1 if not used, 0x0-0x7 if used
    int16_t k; // -1 if not used, literal or goto/call address if used

    /* var, lbl, fct structure */
    var_str *var; // string and address of variables
    lbl_str *lbl; // string and address of labels
    fct_str *fct; // string and address of functions

    /* pointer to instructions */
    struct sDecompiler *prev_instr; // if first instruction then NULL, else previous instruction pointer
    struct sDecompiler *next_instr; // next instruction pointer, NULL if not allocated
};

The decompiler will now work in stages or phases (FSM style).

First stage:
Allocate one sDecompiler structure for each address/word in the hex file.
Set the 'first' variable in the first sDecompiler and set the prev_instr/next_instr pointers.
Set the STATUS register to its POR (power-on reset) value.
Set everything else to its unused/nulled state.
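
A sketch of this first stage, using a cut-down version of the structure (`instr_node`, with only some of the fields) so that it compiles on its own. Allocating all nodes as one array is a choice made for this example, not something the plan above requires. The POR value 0x18 assumes the usual mid-range reset state of STATUS (bank bits clear, TO and PD set).

```c
#include <stdint.h>
#include <stdlib.h>

#define STATUS_POR 0x18  /* assumed reset value: RP1:RP0 = 0, TO = PD = 1 */

/* Trimmed-down stand-in for struct sDecompiler. */
struct instr_node {
    uint8_t  first;
    uint8_t  status;
    uint16_t bank_offset;
    uint16_t PC;
    uint16_t hex_word;
    int16_t  f, d, b, k;             /* -1 while unused */
    struct instr_node *prev_instr;
    struct instr_node *next_instr;
};

/* Allocate and link one node per program word.
 * Returns NULL on allocation failure or when count is 0.
 * The caller releases the whole list with a single free(). */
static struct instr_node *stage1_alloc(const uint16_t *words, size_t count)
{
    if (count == 0)
        return NULL;
    struct instr_node *nodes = calloc(count, sizeof *nodes);
    if (nodes == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        nodes[i].first = (i == 0);
        nodes[i].status = STATUS_POR;
        nodes[i].PC = (uint16_t)i;
        nodes[i].hex_word = words[i];
        nodes[i].f = nodes[i].d = nodes[i].b = nodes[i].k = -1;
        nodes[i].prev_instr = (i == 0) ? NULL : &nodes[i - 1];
        nodes[i].next_instr = (i + 1 < count) ? &nodes[i + 1] : NULL;
    }
    return nodes;
}
```

Because the nodes are contiguous, the node for a GOTO or CALL target can be reached by indexing with the target PC instead of walking the list.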

Second stage:
Starting from the first sDecompiler, set the instruction string.
Fill in the instruction's arguments (f, d, b and k).
Set the type variable to its appropriate value.

If the instruction is a CALL instruction:
- Set the type variable of that instruction to CALL-type
- Go to the function called
- Set the type of the first instruction to FUNCTION
- Set the type variable to RETURN-type when you reach the return instruction
- Go back to the CALL instruction and continue decompiling

If it's a GOTO instruction:
- Set the type variable of that instruction to GOTO-type
- Go to the sDecompiler containing the PC used in the GOTO instruction
- Set the type of that instruction to LABEL
- Go back to the GOTO instruction and continue decompiling

Third stage:
From the first sDecompiler, go through every instruction. Check for any BCF or BSF instruction.
Change the status variable, if needed, and fill the bank_offset with either 0x00, 0x80, 0x100 or 0x180 for every instruction.
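
The third stage amounts to one linear pass over the words. The sketch below stores, for each instruction, the bank offset in effect when it executes, and only reacts to BCF/BSF on STATUS. `stage3_banks` is a hypothetical name and the arrays stand in for the sDecompiler list.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill bank_offset[i] with 0x00, 0x80, 0x100 or 0x180 for each word,
 * following BCF/BSF STATUS instructions in address order only. */
static void stage3_banks(const uint16_t *code, size_t n,
                         uint8_t por_status, uint16_t *bank_offset)
{
    uint8_t status = por_status;

    for (size_t i = 0; i < n; i++) {
        /* Bank seen by instruction i itself (RP1:RP0 are bits 6:5). */
        bank_offset[i] = (uint16_t)(((status >> 5) & 0x03) << 7);

        uint16_t w = code[i] & 0x3FFF;
        if ((w & 0x387F) == 0x1003) {          /* BCF or BSF with f = STATUS */
            uint8_t bit = (uint8_t)((w >> 7) & 0x07);
            if (w & 0x0400)                    /* bit 10 set: BSF */
                status = (uint8_t)(status | (1u << bit));
            else                               /* bit 10 clear: BCF */
                status = (uint8_t)(status & ~(1u << bit));
        }
    }
}
```

Run on the earlier BCF / CALL / BSF example, this pass assigns offset 0x80 to the MOVWF inside the function, which is exactly the GPIO-versus-TRISIO mix-up the fourth stage is meant to detect.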

Fourth stage:
Starting from the first sDecompiler, go through each instruction and search for a GOTO or CALL instruction.

If it's a GOTO, check if the used variable is set to 1. If it is, skip, else do the following.

If it's a GOTO and wants to jump back to a previous PC:
Check if the status variable of the GOTO instruction is the same as the one in the sDecompiler structure containing that PC.
If it is the same, skip to the next instruction.
If it's not the same :
- Set the used variable of GOTO to 1, so the decompiler knows this GOTO has already been processed once before.
- Go to the sDecompiler containing the PC of the GOTO instruction.
- Set the status register to the one of the GOTO instruction.
- Set the status_change variable to 1, so the decompiler knows the status register changed during decompiling. The following registers will then be written as addresses instead of names, with comments listing the possible SFRs (as rbk17c suggested).
- Do this for each sDecompiler structure until you find a BSF or BCF instruction that changes the STATUS register (and thus the bank), or until you get to the sDecompiler structure containing the GOTO instruction.
- If you come across a CALL instruction, go to that function and get back when you reach the RETURN, unless you find an instruction BSF or BCF that changes the STATUS register.
- Go back to the instruction after the GOTO.

If it's a GOTO and wants to jump to a further PC:

Check if the status variable of the GOTO instruction is the same as the one in the sDecompiler structure containing that PC.
If it is the same, skip to the next instruction.
If it's not the same :
- Set the used variable of GOTO to 1, so the decompiler knows this GOTO has already been processed once before.
- Go to the sDecompiler containing the PC of the GOTO instruction.
- Set the status register to the one of the GOTO instruction.
- Set the status_change variable to 1, so the decompiler knows the status register changed during decompiling. The following registers will then be written as addresses instead of names, with comments listing the possible SFRs (as rbk17c suggested).
- Do this for each sDecompiler structure until you find a BSF or BCF instruction that changes the STATUS register (and thus the bank), or until you reach the end of the code.
- If you come across a CALL instruction, go to that function and get back when you reach the RETURN, unless you find an instruction BSF or BCF that changes the STATUS register.
- Go back to the instruction after the GOTO.

If it's a CALL, check if the used variable is set to 0 (or better said, not 1).
If it is:
- Set all the status variables in the function to the same value as the one in the CALL instruction's sDecompiler structure, until you find a BSF or BCF instruction that changes the STATUS register (and thus the bank) or until you reach the RETURN instruction.
- Set the used variables of all the CALLs to the same function to 1. This prevents the decompiler from processing the same function twice.
- Go back to the instruction after the CALL.

If it's not ( used = 1 ), check if the status register on that CALL is the same as the first instruction of that function.
If it is, skip, else :
- Set the status_change register of all the instructions in the function to 1 until you find an instruction BSF or BCF that changes the STATUS register, thus changing the bank or until you reach the RETURN instruction.
- Go back to the instruction after the CALL.
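
The CALL handling above boils down to comparing the bank at every call site of a function. One compact way to express that, offered only as a sketch, is a merge that ends in a "conflict" state as soon as two call sites disagree; that state plays the role of status_change = 1. The names `BANK_UNSEEN`, `BANK_CONFLICT`, `merge_entry_bank` and `function_entry_bank` are invented for this example.

```c
#include <stddef.h>

enum {
    BANK_UNSEEN   = -1,  /* no call site looked at yet */
    BANK_CONFLICT = -2   /* call sites disagree: list alternatives instead */
};

/* Combine what is known so far with the bank (0..3) at one more call site. */
static int merge_entry_bank(int current, int incoming)
{
    if (current == BANK_CONFLICT)
        return BANK_CONFLICT;
    if (current == BANK_UNSEEN)
        return incoming;
    return (current == incoming) ? current : BANK_CONFLICT;
}

/* Bank on entry to a function, given the bank at each of its call sites. */
static int function_entry_bank(const int *call_site_banks, size_t count)
{
    int bank = BANK_UNSEEN;
    for (size_t i = 0; i < count; i++)
        bank = merge_entry_bank(bank, call_site_banks[i]);
    return bank;
}
```

A function whose result is a plain bank number can have its registers named directly; a conflict means falling back to addresses with the candidate SFRs in a comment.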

If it's a CALL (used or not !), go to the RETURN instruction of that function and check the status variable.
If the status variable of the RETURN instruction is the same as the status variable of the CALL, skip, else:
- Set the status variable of the instructions after the CALL to the same value as the one in the RETURN instruction until you find an instruction BSF or BCF that changes the STATUS register, thus changing the bank or until you reach the end of the code.
-- Note : If a CALL instruction is found, jump into the function and then go back when you find an instruction BSF or BCF that changes the STATUS register or when you reach the RETURN instruction.
-- Note : If it's the RETURN instruction go back to setting the status register, else go back to the instruction after the first CALL.


Once all this processing is done I can generate a string for each instruction, add a variable list, add labels, functions and even comments to SFR where needed.


God, this took forever to make and think through. Had to take a paper and write some stuff down to help.
Hope you guys can give me some feedback on this.

Cheers,

pyroesp
 

