Author Topic: 8051: Open Source Disassembler  (Read 7107 times)


Offline sandeepkumar (Topic starter)

  • Newbie
  • Posts: 4
  • Country: in
8051: Open Source Disassembler
« on: December 08, 2020, 12:55:40 pm »
Greetings
I have an AT89S52 microcontroller development board which I am using to learn microcontrollers. I am new to electronics and similar stuff, but I have some experience in writing programs. I had some free time, so I watched tons of Ben Eater videos and learnt how to write good assembly language programs for the dev board. I also decided to make an assembler for it. (I have zero knowledge of compilers/assemblers.) So I started researching how an assembler works. So far, I have made a disassembler (which converts the hex file into an asm file). Yes, I got sidetracked, but at least I made something. I'd appreciate it if you took a look at it (It is Open Source) and gave me some tips and advice for the assembler.

Forgive me for not documenting and commenting the code :-\

Link to Disassembler page:
https://github.com/w0qs1/disgeek51
« Last Edit: December 08, 2020, 12:57:44 pm by sandeepkumar »
 

Online newbrain

  • Super Contributor
  • ***
  • Posts: 1719
  • Country: se
Re: 8051: Open Source Disassembler
« Reply #1 on: December 08, 2020, 02:54:35 pm »
It is Open Source
Thanks for your effort!
I have zero experience with 8051, but I have some with software licenses:
This is not really Open Source until you state which license it is under (e.g. BSD, MIT) and possibly put a corresponding license file in the repo.
As it stands, probably the only thing one is allowed to do with the code is read it.

Disclaimer: IANAL, TINLA.
Nandemo wa shiranai wa yo, shitteru koto dake.
 
The following users thanked this post: sandeepkumar

Offline MarkL

  • Supporter
  • ****
  • Posts: 2133
  • Country: us
Re: 8051: Open Source Disassembler
« Reply #2 on: December 08, 2020, 04:52:28 pm »
If you are not aware of it, you might want to take a look at SDCC, if only to see how it works.  Besides being a C compiler, it also has an 8051/8052 assembler and disassembler included in the package.  All open source.

  https://sourceforge.net/projects/sdcc/
 
The following users thanked this post: sandeepkumar

Offline Peabody

  • Super Contributor
  • ***
  • Posts: 2008
  • Country: us
Re: 8051: Open Source Disassembler
« Reply #3 on: December 08, 2020, 05:06:45 pm »
There is also the Naken Assembler, which includes a disassembler, for 8051 and many other processors.  Very small, very fast, command line.

https://www.mikekohn.net/micro/naken_asm.php
 
The following users thanked this post: sandeepkumar

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4040
  • Country: nz
Re: 8051: Open Source Disassembler
« Reply #4 on: December 09, 2020, 12:15:34 am »
Had a quick look. Seems fine. You could probably refactor some stuff and make it a bit shorter, but it's fine.

An assembler is just the opposite :-) You already have all the information you need in your opcodes definition file.

One way to do it would be to read the opcodes definitions into an array but replace strings such as "code addr" with a regexp that matches a code address.

Then you can take each line of assembly language, strip off comments and convert whitespace runs to a single space each, and then test each opcode for a regexp match against your assembly language line.
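
A minimal sketch of the regexp idea above, in Python. The template strings ("code addr", "#immed"), the table layout and the function names are assumptions for illustration; the project's actual opcode definition file may look different. Only four well-known 8051 opcodes are listed.

```python
import re

# Hypothetical opcode definitions: (opcode byte, template string).
OPCODES = [
    (0x00, "NOP"),
    (0x02, "LJMP code addr"),
    (0x04, "INC A"),
    (0x74, "MOV A,#immed"),
]

# Placeholder strings and the regexp fragment substituted for each one.
NUMBER = r"(0x[0-9A-Fa-f]+|[0-9]+)"
PLACEHOLDERS = {"code addr": NUMBER, "#immed": "#" + NUMBER}


def template_to_regex(template):
    """Escape the template, then swap each placeholder for its regexp."""
    pattern = re.escape(template)
    for name, fragment in PLACEHOLDERS.items():
        pattern = pattern.replace(re.escape(name), fragment)
    return re.compile("^" + pattern + "$", re.IGNORECASE)


COMPILED = [(opcode, template_to_regex(t)) for opcode, t in OPCODES]


def normalize(line):
    """Strip the comment, collapse whitespace, drop spaces around commas."""
    line = line.split(";", 1)[0]
    line = re.sub(r"\s+", " ", line).strip()
    return re.sub(r"\s*,\s*", ",", line)


def parse_number(text):
    """Accept decimal or 0x-prefixed hexadecimal literals."""
    return int(text, 16) if text.lower().startswith("0x") else int(text)


def match_line(line):
    """Return (opcode, [operand values]) for the first matching template."""
    text = normalize(line)
    for opcode, regex in COMPILED:
        m = regex.match(text)
        if m:
            return opcode, [parse_number(g) for g in m.groups()]
    return None
```

For example, `match_line("mov a, #0x41 ; comment")` finds the `MOV A,#immed` entry and hands back the opcode together with the operand value; emitting the operand bytes in the right order is left to the caller.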
 
The following users thanked this post: sandeepkumar

Offline sandeepkumar (Topic starter)

  • Newbie
  • Posts: 4
  • Country: in
Re: 8051: Open Source Disassembler
« Reply #5 on: December 09, 2020, 06:02:16 am »
I thought SDCC was just a C Compiler. I could have asked this forum before starting my own program :palm:
Thank you
 

Offline _joost_

  • Contributor
  • Posts: 16
  • Country: us
Re: 8051: Open Source Disassembler
« Reply #6 on: December 20, 2020, 05:22:21 am »
To write your own assembler (or language ‘x’ compiler) you will generally want to transform your assembly into a series of tokens and then parse these tokens according to the grammar of your assembly language. A tokenizer is essentially a lookup table of all lexical elements of your language.

You should have a look at lex & yacc: lex is the tokenizer, yacc the parser. All Unix platforms have these tools; I'm not sure about Windows. Lex & yacc produce C; if you prefer Python, look into the SLY library.
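
To give a feel for what the tokenizer stage produces, here is a hand-rolled sketch using only Python's standard `re` module. It is not lex or SLY, and the token names and the set of punctuation characters are assumptions chosen for illustration.

```python
import re

# Token kinds and their patterns, tried in this order.
TOKEN_SPEC = [
    ("COMMENT", r";.*"),
    ("NUMBER", r"0[xX][0-9A-Fa-f]+|[0-9]+"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("PUNCT", r"[#@,:+]"),
    ("SKIP", r"[ \t]+"),
    ("MISMATCH", r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(line):
    """Split one source line (without its newline) into (kind, text) pairs."""
    tokens = []
    for m in TOKEN_RE.finditer(line):
        kind, text = m.lastgroup, m.group()
        if kind in ("SKIP", "COMMENT"):
            continue  # whitespace and comments carry no meaning
        if kind == "MISMATCH":
            raise ValueError(f"unexpected character {text!r}")
        tokens.append((kind, text))
    return tokens
```

A parser, whether generated by yacc or written by hand, would then consume this token list instead of raw text.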
« Last Edit: December 20, 2020, 05:24:50 am by _joost_ »
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14490
  • Country: fr
Re: 8051: Open Source Disassembler
« Reply #7 on: December 20, 2020, 06:29:10 pm »
I thought SDCC was just a C Compiler. I could have asked this forum before starting my own program :palm:
Thank you

SDCC comes with a C compiler, an assembler/disassembler and a linker, like most compilers.

Writing your own tools is not necessarily a complete waste of time though - if you learned something while doing this, then it's not wasted time.
It also gives you the opportunity to add some features you have in mind that could otherwise be hard to add to existing code (even if open source), since you won't know or understand that code nearly as well.

Anyway - just so you don't feel bad. But certainly, to avoid reinventing the wheel when it's not strictly necessary/or if you have limited time, it's usually best to look for existing solutions first.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4040
  • Country: nz
Re: 8051: Open Source Disassembler
« Reply #8 on: December 20, 2020, 09:03:52 pm »
To write your own assembler (or language ‘x’ compiler) you will generally want to transform your assembly into a series of tokens and then parse these tokens according to the grammar of your assembly language. A tokenizer is essentially a lookup table of all lexical elements of your language.

You should have a look at lex & yacc: lex is the tokenizer, yacc the parser. All Unix platforms have these tools; I'm not sure about Windows. Lex & yacc produce C; if you prefer Python, look into the SLY library.

lex and yacc seem like real overkill for a simple assembler you write for your own use -- just yet another complex thing to learn first.

Assembly language is pretty simple. I'd just:

1) read a line of text. Remove any trailing comment, remove leading and trailing whitespace, compress other white space to a single space character
2) if nothing remains, go to step 1
3) split the line at the remaining spaces
4) if the first item ends with a colon then it's a label. Add it and the current location counter (.) to the symbol table and remove it from the line.
5) If nothing remains, go to step 1
6) the first word is an instruction mnemonic. Check it is valid.
7) The rest of the line is arguments for the instruction. These are pretty simple for 8051 :-)
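
The steps above can be sketched in a few lines of Python. The function name, the `MNEMONICS` subset and the use of `;` as the comment character are assumptions for illustration; a `;` inside a character literal would need extra care.

```python
import re

# Illustrative subset of 8051 mnemonics, not the full instruction set.
MNEMONICS = {"NOP", "MOV", "ADD", "INC", "SJMP", "LJMP", "RET"}


def parse_line(line, symbols, location):
    """Return (mnemonic, args), or None for a blank or label-only line.

    symbols is the symbol table (a dict); location is the current
    location counter, recorded for any label found on the line.
    """
    line = line.split(";", 1)[0]                 # 1) drop trailing comment
    line = re.sub(r"\s+", " ", line).strip()     #    and tidy whitespace
    if not line:                                 # 2) nothing left
        return None
    words = line.split(" ")                      # 3) split at spaces
    if words[0].endswith(":"):                   # 4) label definition
        symbols[words[0][:-1]] = location
        words = words[1:]
    if not words:                                # 5) label-only line
        return None
    mnemonic = words[0].upper()                  # 6) validate mnemonic
    if mnemonic not in MNEMONICS:
        raise ValueError(f"unknown mnemonic: {words[0]}")
    rest = " ".join(words[1:])                   # 7) comma-separated args
    args = [a.strip() for a in rest.split(",") if a.strip()]
    return mnemonic, args
```

The caller keeps `location` up to date by adding the length of each encoded instruction, which is what makes the label addresses come out right.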
 

Offline _joost_

  • Contributor
  • Posts: 16
  • Country: us
Re: 8051: Open Source Disassembler
« Reply #9 on: December 21, 2020, 06:02:53 am »
Nah, Lex & Yacc are not difficult - perhaps a bit strange looking at first, but there is a ton of documentation on them.  Using Lex and Yacc makes it easy to keep everything organized and simple to push improvements/fixes/additions through.

6) the first word is an instruction mnemonic. Check it is valid.
"check if it is valid" implies several steps, not just the mnemonic code but also the addressing mode being used.  Easy to specify in yacc, annoying in <your language of choice> and not scalable.

But to each his own.
 

Online T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 21699
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: 8051: Open Source Disassembler
« Reply #10 on: December 21, 2020, 06:34:56 am »
"check if it is valid" implies several steps, not just the mnemonic code but also the addressing mode being used.  Easy to specify in yacc, annoying in <your language of choice> and not scalable.

The state machine method is to match the mnemonic, which determines criteria used to parse the subsequent tokens.

This is just formalized in a grammatical system, of course.
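
As a rough sketch of that idea without a grammar tool: classify each operand by addressing mode, then look up the combination of mnemonic and operand kinds in a table. The names `classify`, `encode` and `ENCODINGS` are made up for this example, and the table holds only a handful of 8051 encodings, so it is nowhere near a complete assembler.

```python
import re


def parse_number(text):
    """Parse a decimal or 0x-prefixed hex literal that fits in one byte."""
    value = int(text, 16) if text.lower().startswith("0x") else int(text)
    if not 0 <= value <= 0xFF:
        raise ValueError(f"value out of byte range: {text}")
    return value


def classify(operand):
    """Return (kind, value) describing one operand's addressing mode."""
    op = operand.strip().upper()
    if op == "A":
        return ("A", None)
    m = re.fullmatch(r"R([0-7])", op)
    if m:
        return ("Rn", int(m.group(1)))
    m = re.fullmatch(r"@R([01])", op)
    if m:
        return ("@Ri", int(m.group(1)))
    if op.startswith("#"):
        return ("#imm", parse_number(op[1:]))
    return ("direct", parse_number(op))


# (mnemonic, operand kinds...) -> base opcode; register number is added on.
ENCODINGS = {
    ("MOV", "A", "#imm"): 0x74,
    ("MOV", "A", "direct"): 0xE5,
    ("MOV", "A", "@Ri"): 0xE6,
    ("MOV", "A", "Rn"): 0xE8,
    ("MOV", "Rn", "#imm"): 0x78,
    ("ADD", "A", "#imm"): 0x24,
}


def encode(mnemonic, operands):
    """Encode one instruction, or raise ValueError for an invalid form."""
    parsed = [classify(o) for o in operands]
    key = (mnemonic.upper(),) + tuple(kind for kind, _ in parsed)
    if key not in ENCODINGS:
        raise ValueError(f"no such addressing mode: {key}")
    opcode, tail = ENCODINGS[key], []
    for kind, value in parsed:
        if kind in ("Rn", "@Ri"):
            opcode += value          # register number lives in the opcode
        elif kind in ("#imm", "direct"):
            tail.append(value)       # one operand byte follows
    return bytes([opcode] + tail)
```

The "is it valid" check then reduces to whether the key exists in the table, at the cost of keeping that table complete by hand.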

The real question, I suppose, is how much trouble you are willing to spend, implementing and updating these rules.  For an assembler, it's not a whole lot of work, overall.  For a higher level language, you're going to want something to streamline the process; even more so if it's an experimental, evolving language.

More specifically, the state machine has an exponential number of states, many of which need to be implemented by their own code path.  When N is even just a modest size, it's almost guaranteed intractable; when still very small, it may be tractable.

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14490
  • Country: fr
Re: 8051: Open Source Disassembler
« Reply #11 on: December 21, 2020, 06:29:46 pm »
I agree with Bruce here. Unless you actually want to take such a project as an opportunity to learn lex/yacc/bison... then it's not just overkill, but also a huge waste of time IMHO.
Now if OTOH, you master those tools, and are actually less inclined to write a parser yourself, then why not.

But, in any case, it also depends on what exactly you want to implement. Of course a disassembler alone (which was the original topic) doesn't require any kind of parser. (Or you may reply that it does, but it's such a simple "parser" of binary data that it wouldn't make sense to use a classic parser for this.)
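
To illustrate how simple that "parser" of binary data can be, here is a table-driven sketch in Python. The table layout and function name are assumptions, and only five 8051 opcodes are included; anything else is emitted as a `DB` byte.

```python
# opcode -> (mnemonic tag, total instruction length in bytes)
TABLE = {
    0x00: ("NOP", 1),
    0x02: ("LJMP", 3),
    0x22: ("RET", 1),
    0x74: ("MOV A,#", 2),
    0x80: ("SJMP", 2),
}


def disassemble(code, origin=0):
    """Return a list of (address, text) pairs for the given bytes."""
    lines, pc = [], 0
    while pc < len(code):
        opcode = code[pc]
        name, length = TABLE.get(opcode, (None, 1))
        operands = code[pc + 1:pc + length]
        addr = origin + pc
        if name is None or len(operands) != length - 1:
            # Unknown opcode or truncated instruction: emit a raw byte.
            text, length = f"DB 0x{opcode:02X}", 1
        elif name == "LJMP":
            target = (operands[0] << 8) | operands[1]   # high byte first
            text = f"LJMP 0x{target:04X}"
        elif name == "SJMP":
            rel = operands[0] - 256 if operands[0] > 127 else operands[0]
            target = (addr + 2 + rel) & 0xFFFF          # relative to next instr
            text = f"SJMP 0x{target:04X}"
        elif name == "MOV A,#":
            text = f"MOV A, #0x{operands[0]:02X}"
        else:
            text = name
        lines.append((addr, text))
        pc += length
    return lines
```

A real tool would first extract the bytes from the hex file and would need the full 256-entry table, but the decode loop itself stays about this size.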

For an assembler, you'll need a parser, and then its complexity will largely depend on what you really want to support in your assembler. Merely translating assembly code is easy. But now if you want to support complex expressions - needed for defining constants, for instance, or for implementing some kind of macro-assembler, then it becomes a bit more complex. It's not ultra difficult, certainly, but if you've never implemented that before, it will probably be much harder to get right than you initially thought.

Now you'd have alternatives for this, of course: you can implement a very basic assembler with a very simple parser, and then support constants and macros using an existing preprocessor to preprocess the input assembly code. Flexible enough. The parsing itself for a simple assembler would then be a matter of a few hours of work, even if you've never done it before. Just my 2 cents.

 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
Re: 8051: Open Source Disassembler
« Reply #12 on: December 21, 2020, 06:42:17 pm »
The real complexity for me is handling user errors and reporting them in a human-understandable way (what? where? why?), even suggesting how the end user can fix his/her mistakes.

That's the real hard job :D
« Last Edit: December 21, 2020, 06:54:56 pm by DiTBho »
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4040
  • Country: nz
Re: 8051: Open Source Disassembler
« Reply #13 on: December 21, 2020, 10:39:37 pm »
The real complexity for me is handling user errors and reporting them in a human-understandable way (what? where? why?), even suggesting how the end user can fix his/her mistakes.

That's the real hard job :D

And completely unnecessary for a personal fun project you do just because you don't want to have to write hex opcodes and recalculate branch offsets in your head every time you change something.

I did a simple 6502 assembler in Applesoft BASIC on the Apple ][ as a 17 year old in 1980 with no computer science or programming knowledge of anything that wasn't in the owner's manual for exactly that reason.
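
For the 8051, the branch offset arithmetic an assembler takes off your hands looks roughly like this. This is a sketch for the 2-byte `SJMP rel` instruction only, and the function name is made up for the example. The offset is a signed byte counted from the address of the instruction that follows the jump.

```python
def sjmp_offset(instr_addr, target_addr):
    """Return the rel byte for an SJMP located at instr_addr."""
    rel = target_addr - (instr_addr + 2)   # relative to the next instruction
    if not -128 <= rel <= 127:
        raise ValueError(f"target out of range for SJMP: offset {rel}")
    return rel & 0xFF                      # two's complement in one byte
```

A jump to itself at any address gives `0xFE`, which is why an endless loop assembles to the bytes `80 FE`.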
 

Offline ledtester

  • Super Contributor
  • ***
  • Posts: 3039
  • Country: us
Re: 8051: Open Source Disassembler
« Reply #14 on: December 22, 2020, 12:30:11 am »
Back in the day "meta-assemblers" or "universal (cross) assemblers" -- programs which could assemble machine language for any processor just given a specification -- were popular, e.g.:

https://www.fourmilab.ch/documents/univac/manuals/pdf/Software/UP-8453_MASM_Programmers_Ref_1977.pdf

https://archive.org/details/universalcrossas803komi/page/2/mode/2up

« Last Edit: December 22, 2020, 01:01:47 am by ledtester »
 

Online T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 21699
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: 8051: Open Source Disassembler
« Reply #15 on: December 22, 2020, 02:15:33 am »
I have a copy of TASM, which covers, well, I'll copy part of the listing,

Code: [Select]
TASM.EXE          - TASM Assembler, executable
TASMP.EXE         - TASM Assembler, executable (DOS Protected Mode)
TASM48.TAB        - 8048 Instruction definition table
TASM51.TAB        - 8051 Instruction definition table
TASM65.TAB        - 6502 Instruction definition table
TASM85.TAB        - 8085 Instruction definition table
TASM80.TAB        - Z80  Instruction definition table
TASM05.TAB        - 6805 Instruction definition table
TASM3210.TAB      - TMS32010  Instruction definition table
TASM3225.TAB      - TMS32025  Instruction definition table
TASM68.TAB        - 6800/6801 Instruction definition table
TASM70.TAB        - TMS7000   Instruction definition table
TASM96.TAB        - 8096      Instruction definition table

Not sure how expressive that is (could it handle the complexity or arguments of x86 or ARM ISAs?) but that's certainly some coverage of the older ones. :)

Tim
 

Offline _joost_

  • Contributor
  • Posts: 16
  • Country: us
Re: 8051: Open Source Disassembler
« Reply #16 on: December 22, 2020, 03:35:04 am »
I agree with Bruce here. Unless you actually want to take such a project as an opportunity to learn lex/yacc/bison... then it's not just overkill, but also a huge waste of time IMHO.
Now if OTOH, you master those tools, and are actually less inclined to write a parser yourself, then why not.

Opinions, of course; I am very well versed in lex and yacc and never had much trouble getting familiar with them. With a tutorial or two, one grasps the core concepts, then it is just a bit of practice (for assembly; a more advanced language takes time, of course).  These tools allow one to be busy with the “application part” of the problem, rather than the “coding part”. The grammar syntax serves as good documentation and allows for expansion (such as simple, one-instruction conditional logic and loops) if wanted.

I don’t see the “huge overkill” - unless you had some minutely small tuned loop in mind, which IMHO is not needed with today's CPUs and memory.
 

Offline cj7hawk

  • Newbie
  • Posts: 8
  • Country: au
Re: 8051: Open Source Disassembler
« Reply #17 on: December 22, 2020, 02:51:12 pm »
Nice to see this cool project. I remember writing my own assemblers and disassemblers back in the 80s.

I noticed you're using the Atmel series processors, so if you want to get into other 8051s in the series, I wrote an SPI programmer for them and it was published in DIYODE magazine about a year ago. It also includes a basic SPI programmer that can be made with off-the-shelf parts (a single logic IC, some resistors, caps, etc.) to program the AT89LP4052 from a serial port, so you don't have to purchase a programmer... And there was software to then turn the AT89LP4052 into a super-programmer that can program just about anything with an SPI interface, or even run displays.

Might be a handy thing to make when you progress from the development board to making your own designs. And it's a free design that anyone can make at home. Simple enough to even make on veroboard in about an hour... Beginner level circuitry.

The entire DIY project can be found here.
https://diyodemag.com/projects/serial_to_spi_programmer

Hope this might be of use -

David
 

