@baljemmet: Not sure I understood everything you said, but I think you gave me an awesome idea for fixing this bank issue.
Let me know what you guys think about it.
Sorry, it's a pretty HUGE post.
My current version of the decompiler just generates a string array. Each string holds a variable declaration (using EQU), an instruction plus its arguments (if needed), an ORG, a label or a function.
The idea is to use a structure that keeps one word from the hex file, its address, the STATUS value at that point, the instruction, the arguments of the instruction if needed, and a few other fields.
Something like this:
#include <stdint.h>

/* var_str, lbl_str and fct_str are defined elsewhere in the decompiler */
struct sDecompiler {
    uint8_t first;        // tells the decompiler it's the first instruction
    uint16_t bank_offset; // tells the decompiler which bank is selected
    /* register */
    uint8_t status;       // status variable
    /* decompiler notifications */
    uint8_t type;         // GOTO-type, CALL-type or RETURN instruction,
                          // LABEL address, FUNCTION address, or none of those
    uint8_t used;         // 1 once this GOTO/CALL has been processed (fourth stage)
    uint8_t status_change; // 1 if STATUS was changed here during decompiling
    /* hex file */
    uint16_t PC;          // PC
    uint16_t hex_word;    // word at address PC
    /* instruction and arguments */
    char *instruction;    // string of the instruction encoded in hex_word
    int16_t f;            // -1 if not used, 0x00-0x7F if used
    int16_t d;            // -1 if not used, 0-1 if used
    int16_t b;            // -1 if not used, 0x0-0x7 if used
    int16_t k;            // -1 if not used, literal or goto/call address if used
    /* var, lbl, fct structure */
    var_str *var;         // string and address of variables
    lbl_str *lbl;         // string and address of labels
    fct_str *fct;         // string and address of functions
    /* pointer to instructions */
    struct sDecompiler *prev_instr; // NULL for the first instruction, else previous instruction
    struct sDecompiler *next_instr; // next instruction, NULL if not allocated
};
The decompiler will now work in stages or phases (FSM style).
First stage:
Allocate one sDecompiler structure for each address/word in the hex file.
Set the first variable on the first sDecompiler and link the prev_instr/next_instr pointers.
Set the STATUS register to its POR value.
Set everything else to its unused/nulled state.
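A rough sketch of what the first stage could look like. It uses a trimmed-down stand-in for sDecompiler (struct node), and the names stage1_alloc, stage1_free and STATUS_POR are made up for this example. The POR value 0x18 (TO and PD set, bank 0) is an assumption, so check it against the datasheet of the actual part.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed POR value of STATUS (TO and PD set, bank 0); check the datasheet. */
#define STATUS_POR 0x18u

/* Trimmed-down stand-in for sDecompiler: only the fields stage one touches. */
struct node {
    uint8_t  first;
    uint8_t  status;
    uint16_t bank_offset;
    uint16_t PC;
    uint16_t hex_word;
    int16_t  f, d, b, k;          /* -1 means "not used" */
    struct node *prev_instr;
    struct node *next_instr;
};

/* Free a whole list starting at its head. */
void stage1_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next_instr;
        free(head);
        head = next;
    }
}

/* Build one node per program word and link them in address order.
 * Returns the head, or NULL if n is 0 or an allocation fails. */
struct node *stage1_alloc(const uint16_t *words, size_t n)
{
    struct node *head = NULL, *tail = NULL;

    for (size_t i = 0; i < n; i++) {
        struct node *cur = calloc(1, sizeof *cur);
        if (cur == NULL) {
            stage1_free(head);
            return NULL;
        }
        cur->first    = (i == 0);
        cur->status   = STATUS_POR;
        cur->PC       = (uint16_t)i;
        cur->hex_word = words[i];
        cur->f = cur->d = cur->b = cur->k = -1;
        cur->prev_instr = tail;
        cur->next_instr = NULL;
        if (tail != NULL)
            tail->next_instr = cur;
        else
            head = cur;
        tail = cur;
    }
    return head;
}
```

Here the PC is simply the index of the word, which assumes the hex file has already been flattened into one contiguous array.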
Second stage:
From the first sDecompiler, set the instruction string.
Fill in the instructions' arguments (f, d, b and k).
Set the type variable to its appropriate value.
If the instruction is a CALL instruction:
- Set the type variable of that instruction to CALL-type
- Go to the function called
- Set the type of the first instruction to FUNCTION
- Set the type variable to RETURN-type when you reach the return instruction
- Go back to the CALL instruction and continue decompiling
If it's a GOTO instruction:
- Set the type variable of that instruction to GOTO-type
- Go to the sDecompiler containing the PC used in the GOTO instruction
- Set the type of that instruction to LABEL
- Go back to the GOTO instruction and continue decompiling
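For the field-extraction part of the second stage, here is a minimal sketch. It assumes the 14-bit mid-range PIC16 encoding (which matches the 7-bit f and 3-bit b ranges in the structure) and only handles CALL, GOTO, the four bit-oriented instructions, RETLW and RETURN; everything else comes back as "?". The names decode_word, struct decoded and enum itype are hypothetical. Marking the targets as LABEL or FUNCTION is not shown.

```c
#include <stdint.h>

enum itype { T_NONE, T_GOTO, T_CALL, T_RETURN };

struct decoded {
    const char *instruction;   /* mnemonic, "?" if not handled here */
    enum itype  type;
    int16_t     f, d, b, k;    /* -1 if not used */
};

/* Decode a subset of the 14-bit mid-range instruction set. */
struct decoded decode_word(uint16_t w)
{
    static const char *const bit_ops[4] = { "BCF", "BSF", "BTFSC", "BTFSS" };
    struct decoded r = { "?", T_NONE, -1, -1, -1, -1 };

    w &= 0x3FFF;                              /* keep the 14 program bits */
    if ((w & 0x3800) == 0x2000) {             /* 10 0kkk kkkk kkkk */
        r.instruction = "CALL";
        r.type = T_CALL;
        r.k = (int16_t)(w & 0x07FF);
    } else if ((w & 0x3800) == 0x2800) {      /* 10 1kkk kkkk kkkk */
        r.instruction = "GOTO";
        r.type = T_GOTO;
        r.k = (int16_t)(w & 0x07FF);
    } else if ((w & 0x3000) == 0x1000) {      /* 01 oobb bfff ffff */
        r.instruction = bit_ops[(w >> 10) & 3];
        r.b = (int16_t)((w >> 7) & 7);
        r.f = (int16_t)(w & 0x7F);
    } else if ((w & 0x3C00) == 0x3400) {      /* 11 01xx kkkk kkkk */
        r.instruction = "RETLW";
        r.type = T_RETURN;
        r.k = (int16_t)(w & 0xFF);
    } else if (w == 0x0008) {
        r.instruction = "RETURN";
        r.type = T_RETURN;
    }
    return r;
}
```

For example, the word 0x1683 should decode to BSF with f = 0x03 and b = 5, which is the usual "BSF STATUS, RP0".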
Third stage:
From the first sDecompiler, go through every instruction. Check for any BCF or BSF instruction.
Change the status variable, if needed, and fill the bank_offset with either 0x00, 0x80, 0x100 or 0x180 for every instruction.
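A minimal sketch of the third stage, again assuming a mid-range PIC16 where STATUS sits at address 0x03 and RP1:RP0 are STATUS bits 6 and 5. The function names are made up for the example, and it works on plain arrays instead of the linked list to keep it short. Each entry holds the STATUS value in effect when that instruction executes, and the pass is purely linear (no GOTO/CALL handling, that is the fourth stage's job).

```c
#include <stddef.h>
#include <stdint.h>

#define STATUS_ADDR 0x03u   /* assumed address of STATUS (mid-range PIC16) */

/* Return the STATUS value after executing w, if w is a BCF/BSF on STATUS. */
uint8_t apply_bit_op(uint8_t status, uint16_t w)
{
    w &= 0x3FFF;
    if ((w & 0x3800) != 0x1000)          /* neither BCF nor BSF */
        return status;
    if ((w & 0x007F) != STATUS_ADDR)     /* targets another register */
        return status;
    uint8_t bit = (uint8_t)(1u << ((w >> 7) & 7));
    return (w & 0x0400) ? (uint8_t)(status | bit)    /* BSF */
                        : (uint8_t)(status & ~bit);  /* BCF */
}

/* Map RP1:RP0 (STATUS bits 6:5) to 0x00, 0x80, 0x100 or 0x180. */
uint16_t bank_offset(uint8_t status)
{
    return (uint16_t)(((status >> 5) & 3u) << 7);
}

/* Linear pass: record the STATUS and bank seen by every instruction. */
void stage3_linear(const uint16_t *words, size_t n, uint8_t por_status,
                   uint8_t *status_out, uint16_t *bank_out)
{
    uint8_t cur = por_status;
    for (size_t i = 0; i < n; i++) {
        status_out[i] = cur;
        bank_out[i]   = bank_offset(cur);
        cur = apply_bit_op(cur, words[i]);   /* takes effect from i + 1 on */
    }
}
```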
Fourth stage:
From the first sDecompiler.
Go through each instruction. Search for a GOTO or CALL instruction.
If it's a GOTO, check if the used variable is set to 1. If it is, skip, else do the following.
If it's a GOTO and wants to jump back to a previous PC:
Check if the status variable of the GOTO instruction is the same as the one in the sDecompiler structure containing that PC.
If it is the same, skip to the next instruction.
If it's not the same:
- Set the used variable of GOTO to 1, so the decompiler knows this GOTO has already been processed once before.
- Go to the sDecompiler containing the target PC of the GOTO instruction.
- Set the status register to the one of the GOTO instruction.
- Set the status_change variable to 1, so the decompiler knows there has been a change in the status register during the decompiling. This will write the address of the next used registers instead of their names and add comments with the possible SFRs used. (Like rbk17c)
- Do this for each sDecompiler structure until you find a BSF or BCF instruction that changes the STATUS register (and thus the bank), or until you get to the sDecompiler structure containing the GOTO instruction.
- If you come across a CALL instruction, go to that function and get back when you reach the RETURN, unless you find an instruction BSF or BCF that changes the STATUS register.
- Go back to the instruction after the GOTO.
If it's a GOTO and wants to jump to a further PC:
Check if the status variable of the GOTO instruction is the same as the one in the sDecompiler structure containing that PC.
If it is the same, skip to the next instruction.
If it's not the same:
- Set the used variable of GOTO to 1, so the decompiler knows this GOTO has already been processed once before.
- Go to the sDecompiler containing the target PC of the GOTO instruction.
- Set the status register to the one of the GOTO instruction.
- Set the status_change variable to 1, so the decompiler knows there has been a change in the status register during the decompiling. This will write the address of the next used registers instead of their names and add comments with the possible SFRs used. (Like rbk17c)
- Do this for each sDecompiler structure until you find a BSF or BCF instruction that changes the STATUS register (and thus the bank), or until you reach the end of the code.
- If you come across a CALL instruction, go to that function and get back when you reach the RETURN, unless you find an instruction BSF or BCF that changes the STATUS register.
- Go back to the instruction after the GOTO.
If it's a CALL, check if the used variable is set to 0 (or better said, not 1).
If it is:
- Set all the status variables to the same value as the one used in the CALL instruction sDecompiler structure until you find an instruction BSF or BCF that changes the STATUS register, thus changing the bank or until you reach the RETURN instruction.
- Set the used variable of every CALL to the same function to 1. This prevents the decompiler from processing the same function twice.
- Go back to the instruction after the CALL.
If it's not (used = 1), check if the status variable of that CALL is the same as the one of the first instruction of that function.
If it is, skip, else:
- Set the status_change register of all the instructions in the function to 1 until you find an instruction BSF or BCF that changes the STATUS register, thus changing the bank or until you reach the RETURN instruction.
- Go back to the instruction after the CALL.
If it's a CALL (used or not!), go to the RETURN instruction of that function and check the status variable.
If the status variable of the RETURN instruction is the same as the status variable of the CALL, skip, else:
- Set the status variable of the instructions after the CALL to the same value as the one in the RETURN instruction until you find an instruction BSF or BCF that changes the STATUS register, thus changing the bank or until you reach the end of the code.
-- Note : If a CALL instruction is found, jump into the function and then go back when you find an instruction BSF or BCF that changes the STATUS register or when you reach the RETURN instruction.
-- Note : If it's the RETURN instruction go back to setting the status register, else go back to the instruction after the first CALL.
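The step that keeps coming back in the fourth stage ("set the status until you find a BSF or BCF that changes STATUS, or until you reach some stop point") could be pulled into one helper. This is only a sketch under the same mid-range PIC16 assumptions (STATUS at 0x03); the names writes_status_bit and propagate_status are made up, it works on plain arrays, and it does not follow CALLs into functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Non-zero if w is a BCF or BSF whose target register is STATUS (0x03). */
int writes_status_bit(uint16_t w)
{
    w &= 0x3FFF;
    return (w & 0x3800) == 0x1000 && (w & 0x007F) == 0x03;
}

/* Starting at index `target`, overwrite the recorded STATUS with
 * `new_status` and flag status_change, up to index `stop` (exclusive).
 * A BCF/BSF on STATUS still executes under the incoming value, so it is
 * updated too, and the walk ends right after it.
 * Returns the number of instructions that were updated. */
size_t propagate_status(const uint16_t *words, uint8_t *status,
                        uint8_t *status_change, size_t target, size_t stop,
                        uint8_t new_status)
{
    size_t count = 0;
    for (size_t i = target; i < stop; i++) {
        status[i] = new_status;
        status_change[i] = 1;
        count++;
        if (writes_status_bit(words[i]))
            break;
    }
    return count;
}
```

For a backward GOTO, stop would be the index of the GOTO itself; for a forward GOTO or the instructions after a CALL, it would be the end of the code.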
Once all this processing is done I can generate a string for each instruction, add a variable list, add labels, functions and even comments to SFR where needed.
God, this took forever to write and think through. I had to grab some paper and write things down to help.
Hope you guys can give me some feedback on this.
Cheers,
pyroesp