EEVblog Electronics Community Forum
Products => Computers => Programming => Topic started by: metertech58761 on November 27, 2022, 06:47:11 pm
-
When going through a bit of 6800 assembly code (actually 6801 / HC11), I came across this section and I know it can be rearranged better, but I cannot get it to work properly. I would like to see if anyone can point out how to properly rearrange these instructions, especially given that the RTS is smack dab in the middle and that a BRA is needed to continue the loop. Now, this is after I already cleaned the 'spaghetti' out of the first half and replaced a loop counter with CPX.
What this does is convert a 7-digit BCD value (readout+7 is MSB, readout+1 is LSB, readout+0 is used as a checksum digit elsewhere) into a 3-byte binary value.
addrCalc CLC
CLR temp6
CLR addrL
CLR addrM
CLR addrH
LDX readout+7
func01_1 LDAA addrM
LDAB addrL
ADDB $00,X
STAB addrL
BCC func01_2
ADDA #$01
STAA temp6
STAA addrM
BCC func01_2
LDAA addrH
ADDA #$01
STAA addrH
LDAA temp6
func01_2 CPX readout
BNE func01_5
RTS
func01_5 ASLD
STAB addrL
STAA temp6
STAA addrM
ROL addrH
LDAA addrH
STAA temp7
LDAA temp6
ASLD
ROL temp7
ASLD
ROL temp7
ADDB addrL
STAB addrL
ADCA addrM
STAA addrM
BCC func01_6
INC addrH
func01_6 LDAA temp7
ADDA addrH
STAA addrH
DEX
BRA func01_1
-
Friends don't just throw 68HC11 assembly at friends and ask what's wrong, friends describe the idea of what the code tries to implement (you mostly did, but not fully), and how. ;)
Is this packed (two decimal digits per byte) or unpacked BCD (each byte being 0..9 only)?
Or is the problem that you use an unpacked algorithm, but the data is really packed?
If it is packed BCD, I guess that your input is something like
Byte Bits Description
B0 0..3 Checksum digit, ignored
B0 4..7 D0, least significant BCD digit we're interested in
B1 0..3 D1
B1 4..7 D2
B2 0..3 D3
B2 4..7 D4
B3 0..3 D5
B3 4..7 D6, most significant BCD digit we're interested in
but your description kinda implies unpacked BCD. Which is it?
In any case, output is bytes O0, O1, O2, such that
O0 + 256*(O1 + 256*O2) = D0 + 10*(D1 + 10*(D2 + 10*(D3 + 10*(D4 + 10*(D5 + 10*D6)))))
i.e.
O0 + 256*O1 + 65536*O2 = D0 + 10*D1 + 100*D2 + 1000*D3 + 10000*D4 + 100000*D5 + 1000000*D6
right? (If you use MSB, just swap O0 and O2 above, they're just labels anyway. I like to use arithmetic numbering, which starts at zero, and increases as the significance increases, because that matches the math: value = sum digit[n]*radix^n. I'm not a "little-endian zealot".)
The core operation here, if I were to implement this from scratch, would be
A ← A*10+B = (A << 3) + (A << 1) + B
where A is the three-byte register, initialized to zero. For 68HC11, we of course can only shift/rotate one bit at a time, so we need a temporary three-byte area for the shifting, since we need to do that sum. We can do with A, B, D accumulators, if we do it before we extract the new digit B, though.
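The shift-and-add identity above is easy to sanity-check off-target. A minimal sketch in Python, assuming the digits are fed most significant first (D6 down to D0) and the accumulator is truncated to 24 bits; the function names are made up for illustration:

```python
def times10_plus(a, digit):
    """A*10 + digit using only shifts and adds, kept to 24 bits."""
    return ((a << 3) + (a << 1) + digit) & 0xFFFFFF

def horner(digits_msd_first):
    """Evaluate D0 + 10*(D1 + 10*(...)) by feeding D6 first."""
    acc = 0
    for d in digits_msd_first:
        acc = times10_plus(acc, d)
    return acc

value = horner([7, 6, 5, 4, 3, 2, 1])      # D6..D0
o2, o1, o0 = (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF
print(value, hex(o2), hex(o1), hex(o0))
```

This only checks the arithmetic; it says nothing about carries between bytes on the real part.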
Can I use the NXP M68HC11E manual (https://www.nxp.com/docs/en/data-sheet/M68HC11E.pdf) for reference, or does your architecture have a different instruction set?
-
In this instance, yes, it is unpacked BCD, entered from a keypad before being passed to this routine.
The HC11 instruction set will be just fine.
If you have a better algorithm in mind, I'd be all ears. If you say you need a third temporary byte, use the mnemonic temp5.
Whoever wrote the original code did not know the first thing about 68xx assembly - what I started with was atrocious!
-
I would implement the following pseudocode:
Set addr to 0
Set X to address of first digit
Optional: jump to First
Loop:
Multiply addr by 2
Copy addr to temp
Multiply addr by 2
Multiply addr by 2
Add temp to addr
First:
Load next decimal digit,
and add it to addr
Increment X
If we are not past the digit buffer yet,
jump to Loop
otherwise return from subroutine
noting that temp=addr*2; addr=temp*2*2+temp is equivalent to addr=10*addr.
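The pseudocode can be sketched in Python to check the control flow before touching the assembly. This is only a model under two assumptions: addr is a plain integer masked to 24 bits, and the digits are visited most significant first, which the multiply-then-add order requires. The name convert is hypothetical.

```python
MASK24 = 0xFFFFFF

def convert(digits_msd_first):
    addr = 0                               # Set addr to 0
    first = True                           # "jump to First" on entry
    for digit in digits_msd_first:
        if not first:
            addr = (addr * 2) & MASK24     # Multiply addr by 2
            temp = addr                    # Copy addr to temp
            addr = (addr * 2) & MASK24     # Multiply addr by 2
            addr = (addr * 2) & MASK24     # Multiply addr by 2
            addr = (addr + temp) & MASK24  # Add temp: 8x + 2x = 10x
        first = False
        addr = (addr + digit) & MASK24     # First: add the next digit
    return addr

print(convert([0, 0, 6, 4, 7, 3, 8]))
```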
I have a nagging feeling we could use the Y register to our advantage here, perhaps via XGDY, which exchanges D and Y, to avoid using temp5 and speed things up... but me brain isn't working well enough now, captain!
I do not have an emulator handy (and can't recall where I could get one quickly for Linux), so consider the following implementation suspect, probably buggy, in need of checking:
The conversion starts with clearing addrH:addrM:addrL, and I do assume addrH has the lowest address and addrL the highest address:
CLR addrL ; addrL = 0
CLR addrM ; addrM = 0
CLR addrH ; addrH = 0
Use X for the address of incoming digits, in increasing order of importance.
LDX #readout+1 ; address of least significant digit
JMP first
Next, we have the loop body, which multiplies addr by 10, using temp5 as a scratchpad.
loop LDD addrM ; A=addrM, B=addrL
LSLD
STD addrM
LDAA addrH
ROLA
STAA addrH ; addr now multiplied by 2,
STAA temp5 ; and temp5=addrH
LDAA addrM ; We didn't mangle B above, so reloading addrM suffices
LSLD
ROL temp5 ; second doubling done
LSLD
ROL temp5 ; third doubling done
ADDD addrM ; A=A+addrM, B=B+addrL, carry set if necessary
STD addrM ; addrL and addrM updated
LDAA temp5
ADCA addrH
STAA addrH ; addrH updated
We then want to load the next digit to B,
first LDAB $00,X
and add it to the 24-bit addr:
CLRA
ADDD addrM ; A=addrM, B=addrL+digit
STD addrM ; addrL and addrM updated
LDAA #$0
ADCA addrH
STAA addrH ; addrH updated
Advance to next digit, if any left.
INX
CPX #readout+8 ; past the incoming data?
BLO loop
RTS
Can you follow the logic above? Could you test if it works?
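Lacking an emulator, the multiply-by-ten block can at least be checked with a byte-level model. The sketch below is Python, not 68HC11 code; it assumes LSLD and ROL move the carry as described in the comments and that the loads and stores in between leave the carry alone. Helper names are invented for the model.

```python
def lsld(a, b):
    """LSLD: shift D (A:B) left one bit; returns (a, b, carry)."""
    d = ((a << 8) | b) << 1
    return (d >> 8) & 0xFF, d & 0xFF, (d >> 16) & 1

def rol(byte, carry):
    """ROL / ROLA: rotate left through carry; returns (byte, carry)."""
    v = (byte << 1) | carry
    return v & 0xFF, (v >> 8) & 1

def times10(h, m, l):
    """Model of the 'loop' block: addrH:addrM:addrL = 10 * addr (mod 2**24)."""
    a, b, c = lsld(m, l)        # LDD addrM / LSLD
    m, l = a, b                 # STD addrM
    h, c = rol(h, c)            # LDAA addrH / ROLA / STAA addrH
    temp5 = h                   # STAA temp5
    a, b, c = lsld(m, l)        # LDAA addrM / LSLD
    temp5, c = rol(temp5, c)    # ROL temp5 (second doubling)
    a, b, c = lsld(a, b)        # LSLD
    temp5, c = rol(temp5, c)    # ROL temp5 (third doubling)
    d = ((a << 8) | b) + ((m << 8) | l)            # ADDD addrM
    m, l, c = (d >> 8) & 0xFF, d & 0xFF, d >> 16   # STD addrM, carry kept
    h = (temp5 + h + c) & 0xFF  # LDAA temp5 / ADCA addrH / STAA addrH
    return h, m, l

for x in (0, 1, 25, 0x1999, 765432, 0x123456):
    h, m, l = times10(x >> 16, (x >> 8) & 0xFF, x & 0xFF)
    assert (h << 16) | (m << 8) | l == (10 * x) & 0xFFFFFF
print("times10 matches 10*x for the sample values")
```

A model like this checks the arithmetic only; addressing modes and branch targets still need a real assembler or simulator.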
If it is not fast enough, I'm pretty sure it can be sped up somewhat. I never write fast code on the first run, I strive for correctness.
(In fact, if you think about it, it might be useful to call something like this directly from the keypad, using an extra byte to count the number of digits already pressed. Would also save about seven bytes of RAM. Even if the 68HC11 is running off a 32kHz clock or something, the above is faster than even the fastest human pressing digits.)
-
Just want to confirm a few details - the 24-bit address, it is stored as high / mid / low.
Given the atrocious code I began with, I don't think speed is terribly important (the clock speed of the MPU was a mere 1.0MHz - about the same as a Commodore 64)
Also, there's plenty of space in the ROM - especially after my cleanup / optimization of the overall code, so as long as the code is readable, we should be fine.
-
I think you can study this kind of routine in the old Motorola BUFFALO/11 source.
The assembly source is open source.
-
Do you know of any emulation environments for 68HC11? They have existed for sure, but are there any that haven't bit-rotted too much?
The instruction set is small enough that a simulator for the symbolic assembly would be easy enough to write, so that functions like the above could be simulated and checked for correctness. Binary simulators are much more work.
-
I have a few good emulators and debuggers for Windows XP(1)
None for Linux. Tried a few, all toyish.
I prefer real hardware, with "noICE" support.
(noICE is a commercial tool, again, for Windows).
This stuff is useful for the HC-timers and misc rather than for the HC11 ISA.
For GNU/Linux I wrote a downloader tool.
Kind of avr-dude for hc11. Nothing special.
It supports both BUFFALO and EVB/EVBU boards.
(1) I specifically bought a second hand laptop for those Windows XP dev tools.
There is a nice "talker" written in hc11 assembly, the host control program is written in Delphi Pascal.
I'd like to re-write it in C-xmotif. One day :D
-
- THRSim11 v5.30
- Proteus v8.0 (comes with an hc11 simulator)
-
Would a symbolic ASM simulator (running on human-readable assembly source, not modeling any peripherals, only the instruction set) be of any use?
-
LDX readout+7
In 680x assembly, that would have to be
LDX #readout+7
to make any sense.
Given the whole thing, including sample 'readout', I could probably assemble that for 6809 and see how it behaves.
-
Would a symbolic ASM simulator (running on human-readable assembly source, not modeling any peripherals, only the instruction set) be of any use?
limited use, hc11 makes sense for its peripherals :-//
ummm, there should be sim support in gdb.
or, repo like this (https://github.com/f4grx/hc11-sim) (hc11-sim). Never tried.
-
Do you know of any emulation environments for 68HC11? They have existed for sure, but are there any that haven't bit-rotted too much?
The instruction set is small enough that a simulator for the symbolic assembly would be easy enough to write, so that functions like the above could be simulated and checked for correctness. Binary simulators are much more work.
You could look around for a 68HC11 evaluation board. Motorola made several different models using the BUFFALO monitor S/W. There were also some third party evaluation boards providing similar features.
An evaluation board will generally have a fairly reasonable debugger built into it and should prove adequate for most program testing purposes. An evaluation board is also useful when you come to add extra peripheral devices to your system, and it allows you to debug on the actual hardware rather than trying to simulate those peripheral devices in a S/W emulator.
-
You could look around for a 68HC11 evaluation board. Motorola made several different models using the BUFFALO monitor S/W. There were also some third party evaluation boards providing similar features.
I had one made by Axiom, something like 25 years ago!
-
you have
BCC func01_2
and then
func01_2 CPX readout
BNE func01_5
RTS
Won't exiting the latter with a RTS after initially branching there and consequently not pushing the stack eventually cause a stack underflow?
-
You say 'readout+0 is used as a checksum digit elsewhere', but it will get added in because your final comparison to exit is with readout, not readout+1
Also I don't know how you distinguish between an address as a literal (usually expressed as #address) and an address as a source/destination.
I would have thought you'd be using LDX #readout+7 and CPX #readout+1 but I don't know what assembler you are using.
The code is a bit messy, possibly could use D register (A+B) to advantage in places and I would not call it addrL/M/H as addr implies address, maybe BinRsL/M/H (Binary result) but that's cosmetic. Some documentation would be good. 'here we multiply by 10' 'here we add in the next digit' etc. It may not help you but it will help the next poor sod that has to look at it. I'd just get it going and walk away - too many people fall into the "if it ain't broke, fix it until it is" mindset. Microsoft updates, I'm looking at you.
-
Do you know of any emulation environments for 68HC11? They have existed for sure, but are there any that haven't bit-rotted too much?
There is one here: http://www.hvrsoftware.com/6800emu.htm
-
Also I don't know how you distinguish between an address as a literal (usually expressed as #address) and an address as a source/destination.
I would have thought you'd be using LDX #readout+7 and CPX #readout+1 but I don't know what assembler you are using.
I already mentioned that. For the 680x assemblers I encountered, # was used to indicate immediate addressing, where the byte or bytes following the opcode is the information being accessed. So, "LDX readout" would load the contents of readout into the index register. "LDX #readout" would assemble to have the two byte address of readout in the instruction, and it is that value that the instruction puts into the index register.
Using a vintage assembler of the era (which allows only UPPERCASE symbols of maximum length 6!):
1 0000 ORG 0
2 0000 31 READO FCC '1234' directly addressable (one byte address)
3 0100 ORG $100
4 0100 35 READI FCC '5678' extended addressing required (two byte address)
5
6 0104 DE 00 LDX READO loads contents (first two bytes) of READO into X
7 0106 FE 01 00 LDX READI loads contents (first two bytes) of READI into X
8
9 0109 CE 00 00 LDX #READO loads two byte address of READO into x
10 010C CE 01 00 LDX #READI loads two byte address of READI into x
11
12 END
If the assembler used here is different, how do you load the contents of readout into the register?
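The difference between the two forms can be mimicked with a toy memory image. This is a Python sketch, not an assembler: the dictionary layout and the ldx helper are assumptions made for the illustration, using the same READO/READI addresses as the listing.

```python
# Toy memory image matching the listing: '1234' at $0000, '5678' at $0100.
memory = {0x0000 + i: ord(ch) for i, ch in enumerate("1234")}
memory.update({0x0100 + i: ord(ch) for i, ch in enumerate("5678")})

symbols = {"READO": 0x0000, "READI": 0x0100}

def ldx(operand):
    """Return the value LDX would leave in X for 'SYM' or '#SYM'."""
    if operand.startswith("#"):
        return symbols[operand[1:]]               # immediate: the address itself
    addr = symbols[operand]
    return (memory[addr] << 8) | memory[addr + 1]  # contents, high byte first

for op in ("READO", "READI", "#READO", "#READI"):
    print(f"LDX {op:7} -> X = ${ldx(op):04X}")
```

Without the #, X ends up holding the ASCII characters stored at the label rather than a pointer to them.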
-
This compiles using m68hc11-as, the GNU assembler for MC68HC11 from package binutils-m68hc1x version 2.18:
; compile using
; m68hc11-as example.s -o example.o
.data
temp5:
.byte 0
addr:
.byte 0 ; Most significant byte
.byte 0
.byte 0 ; Least significant byte
readout:
.byte 0 ; Checksum byte
.byte 0 ; Least significant digit (0-9)
.byte 0
.byte 0
.byte 0
.byte 0
.byte 0
.byte 0 ; Most significant digit (0-9)
endreadout:
.text
convert:
clr addr+0
clr addr+1
clr addr+2
ldx #readout+1 ; address of least significant digit
jmp .first
loop:
ldd addr+1 ; Least significant 16 bits of addr
lsld
std addr+1
ldaa addr+0
rola
staa addr+0 ; addr now multiplied by 2,
staa temp5 ; and temp5=most significant 8 bits of addr
ldaa addr+1 ; We only mangled A above, so reloading addr+1 suffices
lsld
rol temp5 ; second doubling done
lsld
rol temp5 ; third doubling done
addd addr+1 ; (A:B) = (A:B)+(addr+1:addr+2)
std addr+1
ldaa temp5
adca addr+0
staa addr+0 ; addr fully updated
first:
ldab 0,x ; New digit
clra
addd addr+1 ; Add to least significant 16 bits of addr
std addr+1 ; addrL and addrM updated
ldaa #0 ; Does not affect carry (clra would)
adca addr+0 ; Add carry and most significant 8 bits of addr to A
staa addr+0 ; and addr now fully updated
inx
cpx #endreadout ; past the incoming data?
blo .loop
rts
The operand syntax is
#value for immediates (8 and 16 bit, stored in the instruction)
*zero for value read from a zero page address
.label for relative jumps within -128..+127 bytes of the beginning of the next instruction
In other words,
addd addr ; adds the contents of [addr:addr+1] to the D register (A:B)
addd *addr ; adds the contents of [addr:addr+1] to the D register (A:B), addr being in the zero page
addd #addr ; adds the address of addr itself, as a 16-bit constant, to the D register (A:B)
which the MC68HC11 calls extended (EXT), direct (DIR), and immediate (IMM), respectively.
The most complex instructions are brclr and brset,
brclr *addr #mask .label
brclr addr,x #mask .label
brclr addr,y #mask .label
that branch to label if bits specified in byte mask are all clear (brclr) or all set (brset).
I'm using Linux, so I don't have an emulator to check if the above code works. I guess I have to write my own, dammit. Fortunately, the opcode coding is extremely straightforward; no bit extraction or such, as it is just a simple bytecode.
-
GAS syntax is not compatible with Motorola assembly syntax.
-
You cannot recompile BUFFALO with GAS.
You need to adjust the code.
Edit:
Just checked, BUFFALO has interesting routines, BCD conversion etc.
-
As far as testing environments go, I have a small W7 box I picked up after finding that neither Parallels nor VMWare Fusion would work on an M1 Mac. :rant:
(things may have changed since then, but this small PC was VERY reasonable in price and was probably a better choice for my remaining Windows needs in the end)
I have both THRSim11 and 6811IDE (former for full compile, latter for quick routine testing)
MIS42N, ozcar: Yes, well spotted. That was a typo - yes, the LDX and CPX opcodes should be immediate-mode (in other words, #readout+7 and #readout)
Circlotron: I'm not sure, but that code is as fragile as it is atrocious. Any attempt to make it flow better visually causes the code to fall apart in a multitude of brainfarts, hence my question to see if it could be rewritten a bit better.
Oh, and there are other parts of the code that are even worse, but that's another thread.
The reason I use 'addr' for the 24-bit binary result is that is exactly what it is - the address of the unit to be tested (this is part of some code for a piece of test equipment I reverse engineered a while back)
And regarding the arrangement of the readout bytes - that was dictated by the requirements of the Intel 8279 display / keyboard controller IC.
If I ever finish optimizing the code, I can probably redo it to work on something juuuust a bit more modern.
Anyway, the data readout portion of the display is 8 digits, the MSB being on the left (readout+7), going right towards readout+1, then the LSB (readout+0) contains the checksum digit (for example - if 64738 is entered as the address, the sum of the digits is 28, which is lopped off to leave just the 8. When the enter key is pressed, the code checks to ensure 8 was entered as the checksum before passing the data down to this routine).
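As a quick illustration of the checksum rule just described (sum the digits, keep the last digit of the sum), here is a Python sketch; the function name is made up and this is not how the firmware does it:

```python
def checksum_digit(address):
    """Sum the decimal digits and keep only the last digit of the sum."""
    return sum(int(ch) for ch in str(address)) % 10

print(checksum_digit(64738))   # 6+4+7+3+8 = 28, so this prints 8
```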
Nominal Animal: Your code didn't work. :-[ The X register address just kept drifting upwards and never exited.
-
Nominal Animal: Your code didn't work. :-[ The X register address just kept drifting upwards and never exited.
Dammit. The INX ; CPX #endreadout ; BLO .loop should have worked. The loop order is correct, it progresses from least significant digit to most significant digit.
Oh well, time to create a simulator (in Python, I think, for maximal OS portability and user modifiability without a toolchain) to lick this.
-
Can't resist trying this.
I downloaded what is supposed to be a 6811IDE emulator from http://www.hvrsoftware.com/download.htm. After unpacking, the program is SDK6811.exe; just run it, it doesn't need installing.
When I used the ADDD instruction, it did not seem to deal with the C carry flag properly, setting it when it shouldn't. So I replaced it with ADDB and ADCA. I'm assuming this is a bug in the emulator and ADDD will work properly in firmware. LDD and STD seemed to work OK.
This code appears to work OK in the emulator - just save it as a text file, load it into the emulator with the LOAD button and RUN. The value 7654321 is hard coded, and converts to 0x74CBB1 (location 0003-5)
bra convert
temp5 .byte 0
binV0 .byte 0 ; big endian
binV1 .byte 0
binV2 .byte 0
read0 .byte 0 ; Checksum byte
.byte 1 ; Little endian value
.byte 2 ; 7,654,321
.byte 3
.byte 4
.byte 5
.byte 6
read7 .byte 7 ; first digit
convert clr binV0 ; result = 0
clr binV1
clr binV2
ldx #read7 ; point to first digit
jmp first
loop ldaa binV0 ; Multiply binary value by 10
staa temp5
ldd binV1 ; 2 shifts = multiply by 4
aslb
rola
rol temp5
aslb
rola
rol temp5
addb binV2 ; add the original = multiply by 5
adca binV1
std binV1
ldaa temp5
adca binV0
staa binV0
asl binV2 ; 1 shift = multiply by 10
rol binV1
rol binV0
first ldaa 0,x ; add in next digit
adda binV2
staa binV2
bcc next
inc binV1
bne next
inc binV0
next dex
cpx #read0 ; all done?
bne loop
rts
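For anyone who wants to check the routine without the emulator, here is a byte-level model in Python. It is a sketch of the arithmetic only (times 4 by shifting, plus the original for times 5, one more shift for times 10, then the digit with a rippled carry); the list layout mirrors read0..read7 and the function name is an assumption.

```python
def convert(readout):
    """readout[0] is the checksum, readout[7] the first (most significant) digit."""
    v0 = v1 = v2 = 0                          # binV0..binV2, big endian
    for x in range(7, 0, -1):                 # ldx #read7 ... dex until read0
        if x != 7:                            # first pass jumps straight to 'first'
            t, a, b = v0, v1, v2              # temp5, A, B
            for _ in range(2):                # two shifts: times 4
                n = ((t << 16) | (a << 8) | b) << 1
                t, a, b = (n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF
            s = b + v2                        # addb binV2
            b, c = s & 0xFF, s >> 8
            s = a + v1 + c                    # adca binV1
            a, c = s & 0xFF, s >> 8
            t = (t + v0 + c) & 0xFF           # adca binV0: now times 5
            n = ((t << 16) | (a << 8) | b) << 1   # asl/rol/rol: times 10
            v0, v1, v2 = (n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF
        s = v2 + readout[x]                   # first: add in next digit
        v2 = s & 0xFF
        if s > 0xFF:                          # carry out of the low byte
            v1 = (v1 + 1) & 0xFF              # inc binV1
            if v1 == 0:
                v0 = (v0 + 1) & 0xFF          # inc binV0
    return v0, v1, v2

print([hex(v) for v in convert([0, 1, 2, 3, 4, 5, 6, 7])])
```

The model gives 0x74, 0xCB, 0xB1 for 7654321, the same bytes the emulator run shows.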
-
I have that downloaded and it is a handy app to test 68xx code fragments once you understand how it expects the code to be formatted.
Just tried it and seems to work like a champ. Thank you!
-
So, I did what I thought of before - assembled it for 6809 and lo, it worked. I had to adjust the source to keep the assembler used happy, but the instructions are unchanged. At least, unchanged by me, the assembler did its usual thing of converting the DEX instruction (which 6809 does not have) into LEAX -1,X .
0000 20 0C BRA CONVERT
0002 00 TEMP5 FCB 0
0003 00 BINV0 FCB 0 ; BIG ENDIAN
0004 00 BINV1 FCB 0
0005 00 BINV2 FCB 0
0006 00 READ0 FCB 0 ; CHECKSUM BYTE
0007 01 FCB 1 ; LITTLE ENDIAN VALUE
0008 02 FCB 2 ; 7,654,321
0009 03 FCB 3
000A 04 FCB 4
000B 05 FCB 5
000C 06 FCB 6
000D 07 READ7 FCB 7 ; FIRST DIGIT
000E 0F 03 CONVERT CLR BINV0 ; RESULT = 0
0010 0F 04 CLR BINV1
0012 0F 05 CLR BINV2
0014 8E 000D LDX #READ7 ; POINT TO FIRST DIGIT
>0017 7E 003A JMP FIRST
001A 96 03 LOOP LDAA BINV0 ; MULTIPLY BINARY VALUE BY 10
001C 97 02 STAA TEMP5
001E DC 04 LDD BINV1 ; 2 SHIFTS = MULTIPLY BY 4
0020 58 ASLB
0021 49 ROLA
0022 09 02 ROL TEMP5
0024 58 ASLB
0025 49 ROLA
0026 09 02 ROL TEMP5
0028 DB 05 ADDB BINV2 ; ADD THE ORIGINAL = MULTIPLY BY 5
002A 99 04 ADCA BINV1
002C DD 04 STD BINV1
002E 96 02 LDAA TEMP5
0030 99 03 ADCA BINV0
0032 97 03 STAA BINV0
0034 08 05 ASL BINV2 ; 1 SHIFT = MULTIPLY BY 10
0036 09 04 ROL BINV1
0038 09 03 ROL BINV0
003A A6 84 FIRST LDAA 0,X ; ADD IN NEXT DIGIT
003C 9B 05 ADDA BINV2
003E 97 05 STAA BINV2
0040 24 06 BCC NEXT
0042 0C 04 INC BINV1
0044 26 02 BNE NEXT
0046 0C 03 INC BINV0
0048 30 1F NEXT DEX
004A 8C 0006 CPX #READ0 ; ALL DONE?
004D 26 CB BNE LOOP
004F 39 RTS
0 ERROR(S) DETECTED
SYMBOL TABLE:
BINV0 0003 BINV1 0004 BINV2 0005 CONVER 000E FIRST 003A LOOP 001A NEXT 0048
READ0 0006 READ7 000D TEMP5 0002
+++GET EETEST.BIN.1
+++MON
- SP=C073 US=C073 DP=00 IX=CC9C IY=C847
- PC=D379 A=4E B=00 CC: E F - I - Z - -
>E 0000-004F
- 0000 200C0000 00000001 02030405 06070F03 ...............
- 0010 0F040F05 8E000D7E 003A9603 9702DC04 .......~.:......
- 0020 58490902 58490902 DB059904 DD049602 XI..XI..........
- 0030 99039703 08050904 0903A684 9B059705 ................
- 0040 24060C04 26020C03 301F8C00 0626CB39 $...&...0....&.9
>^P PC=D379 0000
>B 004F
>G
- SP=C073 US=C073 DP=00 IX=0006 IY=C847
- PC=004F A=B1 B=D8 CC: E F - I - Z - -
>E 0000-004F
- 0000 200C2E74 CBB10001 02030405 06070F03 ..t............
- 0010 0F040F05 8E000D7E 003A9603 9702DC04 .......~.:......
- 0020 58490902 58490902 DB059904 DD049602 XI..XI..........
- 0030 99039703 08050904 0903A684 9B059705 ................
- 0040 24060C04 26020C03 301F8C00 0626CB39 $...&...0....&.9
>
Another view of it running at the bottom, so yes not run on a real 6809, but I even have a real one that I could probably get to run it.
-
I have that downloaded and it is a handy app to test 68xx code fragments once you understand how it expects the code to be formatted.
Just tried it and seems to work like a champ. Thank you!
You are welcome. I didn't thoroughly test it. The add 1 byte to 3 bytes (label 'first') needs thrashing. I think it's OK, but this is the first bit of code I've written for 680x.
-
So, I did what I thought of before - assembled it for 6809 and lo, it worked. I had to adjust the source to keep the assembler used happy, but the instructions are unchanged. At least, unchanged by me, the assembler did its usual thing of converting the DEX instruction (which 6809 does not have) into LEAX -1,X .
As I'm sure you are aware, they went to some trouble to make the 6809 source code compatible with the 6800, but the binary encodings are completely different.
As another example, the LDAA 0,X was implemented as LDA ,X with no offset at all. The 6809 can do 0, 5, 8 or 16 bit offsets from X, Y, S, or U. The 6800 can only do 8 bit offsets from X.
-
THRSim11 is the best one for me
I have a couple of hc11 boards, I could attach to the internet, but I don't have the technology to redirect the serial monitor to a browser.
I have developed a downloader that resets the board before downloading a program, so it could also be possible to interface it to a browser in order to download a binary-file (s19) to the target.
-
6809 opcodes are different, but I wouldn't call them completely different. Here is the same code assembled for 6800:
1 0000 20 0C BRA CONVERT
2
3 0002 00 TEMP5 FCB 0
4
5 0003 00 BINV0 FCB 0 ; BIG ENDIAN
6 0004 00 BINV1 FCB 0
7 0005 00 BINV2 FCB 0
8
9 0006 00 READ0 FCB 0 ; CHECKSUM BYTE
10 0007 01 FCB 1 ; LITTLE ENDIAN VALUE
11 0008 02 FCB 2 ; 7,654,321
12 0009 03 FCB 3
13 000A 04 FCB 4
14 000B 05 FCB 5
15 000C 06 FCB 6
16 000D 07 READ7 FCB 7 ; FIRST DIGIT
17
18 000E 7F 00 03 CONVERT CLR BINV0 ; RESULT = 0
19 0011 7F 00 04 CLR BINV1
20 0014 7F 00 05 CLR BINV2
21 0017 CE 00 0D LDX #READ7 ; POINT TO FIRST DIGIT
22 001A 7E 00 44 JMP FIRST
23
24 001D 96 03 LOOP LDA A BINV0 ; MULTIPLY BINARY VALUE BY 10
25 001F 97 02 STA A TEMP5
26 0021 01 01 01 LDD BINV1 ; 2 SHIFTS = MULTIPLY BY 4
** UNRECOGNIZABLE MNEMONIC
27 0024 58 ASL B
28 0025 49 ROL A
29 0026 79 00 02 ROL TEMP5
30 0029 58 ASL B
31 002A 49 ROL A
32 002B 79 00 02 ROL TEMP5
33 002E DB 05 ADD B BINV2 ; ADD THE ORIGINAL = MULTIPLY BY 5
34 0030 99 04 ADC A BINV1
35 0032 01 01 01 STD BINV1
** UNRECOGNIZABLE MNEMONIC
36 0035 96 02 LDA A TEMP5
37 0037 99 03 ADC A BINV0
38 0039 97 03 STA A BINV0
39 003B 78 00 05 ASL BINV2 ; 1 SHIFT = MULTIPLY BY 10
40 003E 79 00 04 ROL BINV1
41 0041 79 00 03 ROL BINV0
42
43 0044 A6 00 FIRST LDA A 0,X ; ADD IN NEXT DIGIT
44 0046 9B 05 ADD A BINV2
45 0048 97 05 STA A BINV2
46 004A 24 08 BCC NEXT
47 004C 7C 00 04 INC BINV1
48 004F 26 03 BNE NEXT
49 0051 7C 00 03 INC BINV0
50 0054 09 NEXT DEX
51 0055 8C 00 06 CPX #READ0 ; ALL DONE?
52 0058 26 C3 BNE LOOP
53 005A 39 RTS
Assembled strictly for 6800, LDD and STD have quite rightly been rejected, but many of the assembled instructions, BRA, JMP, ASLA, ROLA, and more are identical. Other instructions like CLR BINV0 are only different because 6800 does not have a Direct addressing mode variant, while 6809 does have, and the 6809 opcode for CLR Extended is the same as for CLR Extended on 6800. But yes, there are many differences too.
6809 indexed addressing is very different to 6800. Besides the 0, 5, 8 and 16 bit offsets from X, Y, S or U, it also has register offset (A, B, or D) from X, Y, S or U, 0 offset from X, Y, S, or U with auto post-increment or pre-decrement by 1 or 2, 8 and 16 bit offsets from PC, and then a whole bunch of indirect indexed modes.
-
Hc11 should have special x and y opcodes not implemented in 6800.
I remember something from an old project :-//
-
The main reason the 6801 / HC11 instruction set was called for is that other parts of this code uses the D (combined A / B) register for certain bit-shift operations and the MUL opcode for one math section.
The code also has a routine that goes the other way, from binary to packed BCD (as well as a separate routine that unpacks BCD), but I want to get a better understanding of the arguments passed to the routine first before I ask for help picking apart that bit of code.
-
The main reason the 6801 / HC11 instruction set was called for is that other parts of this code uses the D (combined A / B) register for certain bit-shift operations and the MUL opcode for one math section.
I note that the code in the original post for some reason used ASLD but not LDD/STD, while the code as proposed by MIS42N that I tried on 6809 uses LDD and STD, but not ASLD. So, using ASLD as well and I now have this (also rearranged a bit as I saw no reason for FIRST to be last):
98 438A 7F 43 7C CONVERT CLR BINV0 ; RESULT = 0
99 438D 7F 43 7D CLR BINV1
100 4390 7F 43 7E CLR BINV2
101 4393 CE 43 86 LDX #READ7 ; SETUP FOR FIRST DIGIT
102
103 4396 A6 00 LOOP LDA A 0,X ; NEXT DIGIT TO ADD IN
104 4398 BB 43 7E ADD A BINV2
105 439B B7 43 7E STA A BINV2
106 439E 24 08 BCC NEXT
107 43A0 7C 43 7D INC BINV1
108 43A3 26 03 BNE NEXT
109 43A5 7C 43 7C INC BINV0
110 43A8 09 NEXT DEX
111 43A9 8C 43 7F CPX #READ0 ; ALL DONE?
112 43AC 27 2B BEQ DONE ; YES
113
114 43AE B6 43 7C LDA A BINV0 ; MULTIPLY BINARY VALUE BY 10
115 43B1 B7 43 7B STA A TEMP5
116 43B4 FC 43 7D LDD BINV1 ; 2 SHIFTS = MULTIPLY BY 4
117 43B7 05 ASL D
118 43B8 79 43 7B ROL TEMP5
119 43BB 05 ASL D
120 43BC 79 43 7B ROL TEMP5
121 43BF F3 43 7D ADD D BINV1 ; ADD THE ORIGINAL = MULTIPLY BY 5
122 43C2 FD 43 7D STD BINV1
123 43C5 B6 43 7B LDA A TEMP5
124 43C8 B9 43 7C ADC A BINV0
125 43CB B7 43 7C STA A BINV0
126 43CE 78 43 7E ASL BINV2 ; 1 SHIFT = MULTIPLY BY 10
127 43D1 79 43 7D ROL BINV1
128 43D4 79 43 7C ROL BINV0
129 43D7 20 BD BRA LOOP
130
131 43D9 DONE EQU *
For something a bit different, that was assembled for 6803, and tested on the pretend 6803 that I had to hand, which explains the somewhat weird address that was assembled for.
-
I note that the code in the original post for some reason used ASLD but not LDD/STD, while the code as proposed by MIS42N that I tried on 6809 uses LDD and STD, but not ASLD. So, using ASLD as well and I now have this (also rearranged a bit as I saw no reason for FIRST to be last):
I had not encountered the 6800 before. So I just did what worked in the emulator. No attempt to optimise further. I tried writing ASLD but what I wrote wasn't accepted. At the time I remember thinking there probably wasn't one, but it is more likely a typo on my part. Mea culpa.
-
To get more familiarity with the 6800, I wrote a binary to packed decimal routine. Normally one would use the DAA instruction, but the emulator doesn't implement it. So in the emulator, the DAA instruction is emulated by a subroutine EmDAA. Two levels of emulation!
The subroutine EmDAA is a bit obscure, it was based on a table in the Motorola_M6800_Programming_Reference_Manual_M68PRM(D)_Nov76.pdf which I found online.
bra convert
Pdec0 .byte 0 ; result
Pdec1 .byte 1 ; big endian
Pdec2 .byte 2 ; packed decimal
Pdec3 .byte 3
binV0 .byte $74 ; big endian
binV1 .byte $CB
binV2 .byte $B1
count .byte 0
convert clr Pdec0 ; result = 0
clr Pdec1
clr Pdec2
clr Pdec3
ldaa #24 ; bits to shift
staa count
loop1 asl binV2 ; get hi bit of binary
rol binV1
rol binV0
ldx #binV0
loop2 dex
ldaa 0,X
adca 0,X
bsr EmDAA ; emulated DAA
staa 0,X
cpx #Pdec0
bne loop2
dec count
bne loop1
rts
; emulate the DAA instruction
EmDAA tab
tpa
anda #$20 ; isolate half carry
bcs EmCy
bne EmCnHy ; C clear H set
; C clear H clear
tba
anda #$0F
cmpa #$0A
bcs xxx
EmCnHy addb #$06
xxx cmpb #$A0
bcs EmXCn
Em60 addb #$60
EmXCy tba
sec
rts
EmXCn clc
tba
rts
; C set
EmCy bne EmCyHy ; C set H set
tba
anda #$0F
cmpa #$0A
bcs Em60
EmCyHy addb #$06
bra Em60
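The same shift-and-adjust idea can be modelled in Python to see what the routine should produce. This is a sketch of the arithmetic only: it shifts one binary bit out at a time, doubles each packed BCD byte with the incoming carry, and applies a decimal adjust. The daa helper follows the usual rule (add $06 if H is set or the low nibble is above 9, add $60 if C is set or the value is above $99); flag side effects of the other instructions are not modelled, and all names are made up for the illustration.

```python
def daa(a, c, h):
    """Decimal adjust after an add; returns (adjusted byte, carry)."""
    adjust = 0
    if h or (a & 0x0F) > 9:
        adjust |= 0x06
    if c or a > 0x99:
        adjust |= 0x60
    carry = 1 if (c or a > 0x99) else 0
    return (a + adjust) & 0xFF, carry

def bin24_to_packed_bcd(value):
    pdec = [0, 0, 0, 0]                      # Pdec0..Pdec3, big endian
    for _ in range(24):                      # count = 24 bits to shift
        carry = (value >> 23) & 1            # asl/rol/rol: top bit into C
        value = (value << 1) & 0xFFFFFF
        for i in (3, 2, 1, 0):               # from Pdec3 down to Pdec0
            a = pdec[i]
            total = a + a + carry            # ldaa 0,X / adca 0,X
            half = ((a & 0x0F) * 2 + carry) > 0x0F
            pdec[i], carry = daa(total & 0xFF, total > 0xFF, half)
    return pdec

print(" ".join(f"{b:02X}" for b in bin24_to_packed_bcd(0x74CBB1)))
```

For $74CBB1 the model yields 07 65 43 21, i.e. the 7,654,321 used earlier in the thread.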
-
What sort of strange emulator implements the half-carry flag but not the DAA instruction?
Having discovered some three decades ago (!) that the DAA instruction is a bit tricky to emulate, I grabbed your EmDAA routine and subjected it to the DAA testing from Wolfgang Schwotzer's CPUTEST program. That is really for testing 6809 emulation (I think I found it bundled with a 6809 emulator, but I'm no longer sure exactly where). However, I think the 6800 DAA and 6809 DAA should be the same. Edit: found it here: https://github.com/aladur/flexemu/blob/master/src/tools/cputest.txt .
That is not a total torture test, but it does test DAA for 16 different accumulator A values for all four combinations of H and C flags, so a total of 256 tests. It found a problem on the final test for H=0, C=0, with A=$FF (should result in A=$65, but instead got A=$05). It stops at that point, so maybe it would find other problems too.
But then maybe you don't need a completely accurate DAA emulation there - I did not get a chance to see how you use the DAA, and maybe any differences don't matter for your purposes. One way to give it a good thorough test would be to combine the binary->decimal and decimal->binary conversions, and check that you get back the value you started with. You could start at zero and just add one each time - could be a good speed test for the emulator too.
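A round-trip check of this kind can be prototyped off-target first. The following is only a rough Python sketch, not the 6800 routine itself: the function names bin_to_packed and packed_to_bin are made up for illustration, and the effect of the adca/DAA pair (double a packed byte, add the incoming carry, decimal adjust) is modelled arithmetically rather than through the H and C flags.

```python
def bin_to_packed(value, nbytes=4, nbits=24):
    """Shift-and-double conversion; returns big-endian packed BCD bytes.

    Assumption: mirrors the loop1/loop2 structure of the posted routine,
    but models 'adca 0,X' followed by DAA as plain decimal arithmetic.
    """
    pdec = [0] * nbytes
    for bit in range(nbits - 1, -1, -1):
        carry = (value >> bit) & 1               # next binary bit, MSB first
        for i in range(nbytes - 1, -1, -1):      # least significant byte first
            digits = 10 * (pdec[i] >> 4) + (pdec[i] & 0x0F)
            carry, rest = divmod(2 * digits + carry, 100)
            pdec[i] = ((rest // 10) << 4) | (rest % 10)
    return pdec


def packed_to_bin(pdec):
    """Inverse conversion: big-endian packed BCD bytes back to an integer."""
    value = 0
    for byte in pdec:
        value = value * 100 + 10 * (byte >> 4) + (byte & 0x0F)
    return value


# The test value from the listing: $74CBB1 is 7654321 decimal.
packed = bin_to_packed(0x74CBB1)
print(" ".join(f"${b:02X}" for b in packed))
```

Counting up from zero and comparing packed_to_bin(bin_to_packed(v)) with v gives the round trip described above. Note that it exercises DAA only with the inputs the conversion actually produces, so it cannot stand in for a flag-by-flag DAA test.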
-
An oversight on my part. It is caused when there is a half carry when $06 is added to B, and the high nibble of B was $Fx. You are right, it would not be encountered the way it is being used. The A values are (packed decimal value)*2 + carry from previous (if there is one). So the high nibble of B will be even, and a half carry will make it odd.
Can be fixed by changing
EmCnHy addb #$06
xxx cmpb #$A0
to
EmCnHy addb #$06
bcs Em60
xxx cmpb #$A0
So even if I had done the binary->decimal->binary it wouldn't have shown an error. It needs something like $77+$88 (both legit packed decimal numbers) which after DAA should be $65 + C. So it does need fixing if someone is relying on it.
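For anyone who wants to check the $77+$88 case without a 6800 to hand, here is a small Python reference model. It is a sketch under assumptions: add8 and daa are illustrative names, and the correction rule used (add $06 if H is set or the low nibble exceeds 9; add $60 and set C if C is set or the value exceeds $99) is the commonly quoted formulation, not a transcription of the Motorola table.

```python
def add8(a, b, cin=0):
    """8-bit add; returns (result, H, C) as an ADDA/ADCA would leave them."""
    h = ((a & 0x0F) + (b & 0x0F) + cin) >> 4     # carry out of bit 3
    total = a + b + cin
    return total & 0xFF, h, total >> 8


def daa(a, h, c):
    """Decimal adjust after addition; returns (adjusted A, carry out)."""
    adjust = 0
    if h or (a & 0x0F) > 9:
        adjust += 0x06
    if c or a > 0x99:
        adjust += 0x60
        c = 1
    return (a + adjust) & 0xFF, c


raw, h, c = add8(0x77, 0x88)       # raw is $FF with H=0 and C=0
adjusted, carry = daa(raw, h, c)   # adjusted is $65 with carry set, i.e. 165
print(f"${raw:02X} H={h} C={c} -> ${adjusted:02X} C={carry}")
```

The intermediate value here is exactly the A=$FF, H=0, C=0 input that tripped the CPUTEST run, which is why a plain packed-decimal addition can hit the bug even though the doubling in the conversion routine never does.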
-
Is there a reference that describes the individual operations exactly?
For example, I thought half carry (H) was defined as
R := (val1 & 0x0F) OP (val2 & 0x0F)
H := (R < 0) OR (R > 9)
where val1 and val2 are the operands, and OP is the arithmetic operation done on them.
Is there an exact reference for these? Actually, the only ones I'm missing are the condition codes' exact rules, and the division and multiplication instructions.
-
I added the extra instruction, and now it sails through all the tests without complaining.
I just noticed my dodgy arithmetic above, it does a total of 64 DAA tests, not 256. 64 DAA tests should be enough for anybody?
-
I never did much with HC11, and don't remember much about it other than it had MUL, FDIV and IDIV instructions. For 6800, 6803 and 6809 at least I think the half-carry was only meant for addition. I happen to have a real, printed on paper, version of the 6809 reference manual lying here and from that it looks like H is set in a useful way only for 8-bit ADD and ADC instructions. For other instructions including 16-bit ADD, 8-bit or 16-bit SUB, and MUL, H is either "not affected" or "undefined".
My first 6809 emulation was written in 68000 assembly language. 68000 not having half-carry, for the 6809 add instructions that needed valid H, I just added the low nybbles separately and used bit 4 (0x10 position) of the result as H.
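The low-nybble trick can be sketched in a few lines of Python. This is an illustration only, with made-up function names; it also shows an equivalent XOR form that an emulator can use when the full 8-bit sum is already available. Note that H is the carry out of bit 3, so it is set when the nybble sum exceeds 15, not when it exceeds 9.

```python
def half_carry_nibble(a, b, cin=0):
    """H for an 8-bit add: add the low nybbles separately, take bit 4."""
    return (((a & 0x0F) + (b & 0x0F) + cin) >> 4) & 1


def half_carry_xor(a, b, cin=0):
    """Same flag recovered from the full sum: bit 4 of a ^ b ^ (a + b + cin)."""
    return ((a ^ b ^ (a + b + cin)) >> 4) & 1


print(half_carry_nibble(0x05, 0x05))   # 5 + 5 = 10: above 9, but no carry out of bit 3
print(half_carry_nibble(0x08, 0x08))   # 8 + 8 = 16: carry out of bit 3
```

The XOR form works because bit 4 of the sum is a4 XOR b4 XOR (carry into bit 4), so XORing the operands back out leaves just the carry.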
-
For 6800, 6803 and 6809 at least I think the half-carry was only meant for addition.
D'oh! You're right; that applies to the 68HC11 as well. :-+
-
I was thinking I had used the Half Carry for something in the past, and finally found it. It was an ATmega328P project where I was holding twelve 16-bit values in register pairs. The 16 bits were split 4/12 and I needed to compare the 12-bit values with one in the register pair ZH/ZL. Usually that would mean ANDing off the high 4 bits. Fortunately the processor sets the H bit on compares as well as on arithmetic operations, so this works:
cp r0,ZL
cpc r1,ZH
brhs n2insR0
This would be really useful in the code for the emulated DAA instead of having to transfer the value to another register and AND $0F to get rid of the high bits. Why does the 6800 not do this on compares? Compares are usually implemented in the processor as an arithmetic operation, discarding the result and keeping the status. One would think more circuitry would be needed to set the H flag only on addition and not otherwise.
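Why the cp/cpc/brhs sequence compares only the low 12 bits can be checked with a short Python model. This is a sketch that assumes H after the cpc on the high bytes is the borrow out of bit 3 of that byte (bit 11 of the 16-bit value), including the borrow propagated from the low-byte cp; the function name is made up for illustration.

```python
def h_after_cp_cpc(a, b):
    """H flag after 'cp lo' then 'cpc hi' on 16-bit a and b.

    Modelled as the borrow into bit 12 of a - b, which equals
    bit 12 of a ^ b ^ (a - b).
    """
    diff = (a - b) & 0xFFFF
    return ((a ^ b ^ diff) >> 12) & 1


# The top 4 bits differ, but only the low 12 bits decide the flag.
print(h_after_cp_cpc(0xF123, 0x0124))   # $123 < $124, so H is set
print(h_after_cp_cpc(0x0124, 0xF123))   # $124 >= $123, so H is clear
```

Under that assumption, brhs branches exactly when the low 12 bits of the r1:r0 pair are lower (unsigned) than the low 12 bits of ZH:ZL, whatever is stored in the upper 4 bits.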
-
The 6800 reference manual says that H is "not affected" by compare and subtract instructions. As for why, maybe because subtract and compare were implemented by negating the second operand and then adding, and they couldn't be bothered to sort out the half-borrow, but did not want it to be "undefined"?
Somewhat strangely, the 6809 reference manual that I have ("Original issue, March 1, 1981"), says that H is "undefined" for both 8-bit compare and 8-bit subtracts. That would make source code compatibility with 6800 a bit tricky, so I did a test, this time on a real 6809, and for the few values that I tested, H was actually "not affected", so as stated for 6800. Now if I tested with another 6809, or perhaps even the same 6809 tomorrow, maybe the results would be different, but I don't think that is very likely. But your guess is better than mine as to why they would say "undefined" in the fine manual if you could rely on it being "not affected".