Cleaning up a bit of 6800 assembly code

**metertech58761**:

When going through a bit of 6800 assembly code (actually 6801 / HC11), I came across this section and I know it can be rearranged better, but I cannot get it to work properly. I would like to see if anyone can point out how to properly rearrange these instructions, especially given that the RTS is smack dab in the middle and that a BRA is needed to continue the loop. Now, this is after I already cleaned the 'spaghetti' out of the first half and replaced a loop counter with CPX.

What this does is convert a 7-digit BCD value (readout+7 is MSB, readout+1 is LSB, readout+0 is used as a checksum digit elsewhere) into a 3-byte binary value.
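As a reference for what the routine should produce, here is a minimal Python sketch of the conversion. The function name `bcd_to_addr` and the list layout (index 0 is the checksum digit, index 7 the most significant digit) are assumptions made for illustration, mirroring the readout+0 to readout+7 description above. It is a model for checking results, not HC11 code.

```python
def bcd_to_addr(readout):
    """Convert unpacked BCD digits readout[1..7] to (addrH, addrM, addrL).

    readout[7] is the most significant digit, readout[1] the least
    significant; readout[0] (the checksum digit) is ignored here.
    """
    if len(readout) != 8 or any(not 0 <= d <= 9 for d in readout):
        raise ValueError("expected 8 unpacked BCD digits (0..9)")
    value = 0
    for i in range(7, 0, -1):          # readout+7 down to readout+1
        value = value * 10 + readout[i]
    # Split the result into high / mid / low bytes.
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


# Digits of 1234567 with the least significant digit at index 1;
# index 0 is the checksum slot. 1234567 = 0x12D687.
print(bcd_to_addr([0, 7, 6, 5, 4, 3, 2, 1]))
```

Any rearranged version of the assembly routine can be checked against this model for a handful of inputs, including the largest value 9999999.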

    addrCalc  CLC
              CLR   temp6
              CLR   addrL
              CLR   addrM
              CLR   addrH
              LDX   readout+7
    func01_1  LDAA  addrM
              LDAB  addrL
              ADDB  $00,X
              STAB  addrL
              BCC   func01_2
              ADDA  #$01
              STAA  temp6
              STAA  addrM
              BCC   func01_2
              LDAA  addrH
              ADDA  #$01
              STAA  addrH
              LDAA  temp6
    func01_2  CPX   readout
              BNE   func01_5
              RTS
    func01_5  ASLD
              STAB  addrL
              STAA  temp6
              STAA  addrM
              ROL   addrH
              LDAA  addrH
              STAA  temp7
              LDAA  temp6
              ASLD
              ROL   temp7
              ASLD
              ROL   temp7
              ADDB  addrL
              STAB  addrL
              ADCA  addrM
              STAA  addrM
              BCC   func01_6
              INC   addrH
    func01_6  LDAA  temp7
              ADDA  addrH
              STAA  addrH
              DEX
              BRA   func01_1

**Nominal Animal**:

Friends don't just throw 68HC11 assembly at friends and ask what's wrong, friends describe the idea of what the code tries to implement (you mostly did, but not fully), and how. ;)

Is this packed (two decimal digits per byte) or unpacked BCD (each byte being 0..9 only)?

Or is the problem that you use an unpacked algorithm, but the data is really packed?

If it is packed BCD, I guess that your input is something like

| Byte | Bits | Description |
|------|------|-------------|
| B0 | 0..3 | Checksum digit, ignored |
| B0 | 4..7 | D0, least significant BCD digit we're interested in |
| B1 | 0..3 | D1 |
| B1 | 4..7 | D2 |
| B2 | 0..3 | D3 |
| B2 | 4..7 | D4 |
| B3 | 0..3 | D5 |
| B3 | 4..7 | D6, most significant BCD digit we're interested in |

but your description kinda implies unpacked BCD. Which is it?

In any case, output is bytes O0, O1, O2, such that

O0 + 256*(O1 + 256*O2) = D0 + 10*(D1 + 10*(D2 + 10*(D3 + 10*(D4 + 10*(D5 + 10*D6)))))

i.e.

O0 + 256*O1 + 65536*O2 = D0 + 10*D1 + 100*D2 + 1000*D3 + 10000*D4 + 100000*D5 + 1000000*D6

right? (If you use MSB, just swap O0 and O2 above, they're just labels anyway. I like to use arithmetic numbering, which starts at zero, and increases as the significance increases, because that matches the math: value = sum digit[n]*radix^n. I'm not a "little-endian zealot".)
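A quick Python check of the two forms above, as a sketch only: the helper names `nested`, `expanded` and `split_bytes` are made up for illustration, and the digit list is indexed D0..D6 with D0 least significant, matching the arithmetic numbering used here.

```python
def nested(d):
    """D0 + 10*(D1 + 10*(D2 + ...)), with d[0] = D0 (least significant)."""
    value = 0
    for digit in reversed(d):          # start from D6, the innermost term
        value = value * 10 + digit
    return value


def expanded(d):
    """D0 + 10*D1 + 100*D2 + ... as an explicit sum of powers of ten."""
    return sum(digit * 10**n for n, digit in enumerate(d))


def split_bytes(value):
    """Return (O0, O1, O2), with O0 the least significant byte."""
    return value & 0xFF, (value >> 8) & 0xFF, (value >> 16) & 0xFF


digits = [7, 6, 5, 4, 3, 2, 1]         # D0..D6 of 1234567
assert nested(digits) == expanded(digits) == 1234567

o0, o1, o2 = split_bytes(nested(digits))
assert o0 + 256 * (o1 + 256 * o2) == 1234567

# The largest 7-digit value still fits in three bytes.
assert 9_999_999 < 2**24
```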

The core operation here, if I were to implement this from scratch, would be

A ← A*10+B = (A << 3) + (A << 1) + B

where A is the three-byte register, initialized to zero, and B is the next decimal digit. The 68HC11 can of course only shift or rotate one bit at a time, so we need a temporary three-byte area to hold one of the shifted values while we form the sum. We can make do with the A and B accumulators (together forming D), though, if we do the shifting before we load the new digit into B.
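To make the shift-and-add identity concrete, here is a small Python sketch that builds A*10+B from one-bit shifts only, truncated to 24 bits. The names `MASK24`, `shift_left_once` and `times10_plus` are illustrative, not anything from the HC11 toolchain.

```python
MASK24 = 0xFFFFFF


def shift_left_once(value):
    """One-bit left shift of a 24-bit value; returns (result, carry_out)."""
    return (value << 1) & MASK24, (value >> 23) & 1


def times10_plus(a, b):
    """A*10 + B from one-bit shifts and adds only, truncated to 24 bits."""
    twice, _carry = shift_left_once(a)         # A << 1
    eight = twice
    for _ in range(2):                         # two more shifts give A << 3
        eight, _carry = shift_left_once(eight)
    return (eight + twice + b) & MASK24


assert times10_plus(123456, 7) == 1234567
```

Keeping the doubled value around while shifting twice more is exactly why a temporary is needed: both A << 1 and A << 3 must exist at the moment of the add.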

Can I use the NXP M68HC11E manual for reference, or does your architecture have a different instruction set?

**metertech58761**:

In this instance, yes, it is unpacked BCD, entered from a keypad before being passed to this routine.

The HC11 instruction set will be just fine.

If you have a better algorithm in mind, I'd be all ears. If you say you need a third temporary byte, use the mnemonic temp5.

Whoever wrote the original code did not know the first thing about 68xx assembly - what I started with was atrocious!

**Nominal Animal**:

I would implement the following pseudocode:

    Set addr to 0
    Set X to address of first digit
    Optional: jump to First

    Loop:
        Multiply addr by 2
        Copy addr to temp
        Multiply addr by 2
        Multiply addr by 2
        Add temp to addr

    First:
        Load next decimal digit,
          and add it to addr
        Advance X to the next digit
        If we are not past the digit buffer yet,
          jump to Loop
        otherwise return from subroutine

noting that temp=addr*2; addr=temp*2*2+temp is equivalent to addr=10*addr.
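The pseudocode can be transcribed almost line for line into Python as a sanity check. This is a sketch under one stated assumption: the digits are handed over most significant first, since multiplying by ten before each add only produces the right value in that order. The function name `convert` is illustrative.

```python
MASK24 = 0xFFFFFF


def convert(digits):
    """digits: most significant first, each 0..9. Returns the 24-bit value."""
    addr = 0                                   # Set addr to 0
    first = True                               # stands in for "jump to First"
    for digit in digits:
        if not first:
            addr = (addr * 2) & MASK24         # Multiply addr by 2
            temp = addr                        # Copy addr to temp
            addr = (addr * 2) & MASK24         # Multiply addr by 2
            addr = (addr * 2) & MASK24         # Multiply addr by 2
            addr = (addr + temp) & MASK24      # Add temp to addr
        first = False
        addr = (addr + digit) & MASK24         # Load next digit, add to addr
    return addr


print(hex(convert([1, 2, 3, 4, 5, 6, 7])))     # 1234567 = 0x12d687
```

Skipping the multiply on the first pass is only a small saving, since multiplying zero by ten is harmless; that is why the jump to First is marked optional.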

I have a nagging feeling we could use the Y register to our advantage here, perhaps via XGDY, which exchanges D and Y, to avoid using temp5 and speed things up... but me brain isn't working well enough now, captain!

I do not have an emulator handy (and can't recall where I could get one quickly for Linux), so consider the following implementation suspect, probably buggy, in need of checking:

The conversion starts with clearing addrH:addrM:addrL, and I do assume addrH has the lowest address and addrL the highest address:

            CLR   addrL         ; addrL = 0
            CLR   addrM         ; addrM = 0
            CLR   addrH         ; addrH = 0

Use X for the address of the incoming digits. Because each pass multiplies addr by ten before adding the next digit, the digits have to be consumed from the most significant (readout+7) down to the least significant (readout+1).

            LDX   #readout+7    ; address of most significant digit
            JMP   first

Next comes the part of the loop that multiplies addr by 10, using temp5 as a scratchpad.

    loop    LDD   addrM         ; A=addrM, B=addrL
            LSLD
            STD   addrM
            LDAA  addrH
            ROLA
            STAA  addrH         ; addr now multiplied by 2,
            STAA  temp5         ; and temp5=addrH
            LDAA  addrM         ; We didn't mangle B above, so reloading addrM suffices
            LSLD
            ROL   temp5         ; second doubling done
            LSLD
            ROL   temp5         ; third doubling done
            ADDD  addrM         ; A=A+addrM, B=B+addrL, carry set if necessary
            STD   addrM         ; addrL and addrM updated
            LDAA  temp5
            ADCA  addrH
            STAA  addrH         ; addrH updated

We then want to load the next digit to B,

    first   LDAB  $00,X

and add it to the 24-bit addr:

            CLRA
            ADDD  addrM         ; D = addrM:addrL + digit, carry set if necessary
            STD   addrM         ; addrL and addrM updated
            LDAA  #$0           ; LDAA leaves the carry alone (CLRA would clear it)
            ADCA  addrH
            STAA  addrH         ; addrH updated

Advance to the next digit, if any are left.

            DEX
            CPX   #readout      ; past the least significant digit?
            BNE   loop
            RTS

Can you follow the logic above? Could you test if it works?
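Without an emulator, one way to test the logic is a byte-level Python model that mimics the carry handling of the instructions involved. This is only a sketch: the helper names are made up, the flag behaviour modelled is limited to the carry bit, and the digits are assumed to be walked from the most significant (readout+7) down to readout+1.

```python
def lsld(a, b):
    """LSLD: shift the 16-bit D = A:B left by one. Returns (a, b, carry)."""
    d = (a << 8) | b
    carry = (d >> 15) & 1
    d = (d << 1) & 0xFFFF
    return d >> 8, d & 0xFF, carry


def rol(byte, carry):
    """ROL/ROLA: rotate left through carry. Returns (byte, carry_out)."""
    return ((byte << 1) | carry) & 0xFF, (byte >> 7) & 1


def addd(a, b, hi, lo):
    """ADDD: D = D + hi:lo. Returns (a, b, carry)."""
    total = ((a << 8) | b) + ((hi << 8) | lo)
    return (total >> 8) & 0xFF, total & 0xFF, total >> 16


def adca(a, byte, carry):
    """ADCA: A = A + byte + carry (the carry out is not needed here)."""
    return (a + byte + carry) & 0xFF


def convert(digits):
    """digits most significant first. Returns (addrH, addrM, addrL)."""
    addr_h = addr_m = addr_l = 0
    for n, digit in enumerate(digits):
        if n:                                     # first pass enters at 'first'
            a, b, c = lsld(addr_m, addr_l)        # LDD addrM / LSLD
            addr_m, addr_l = a, b                 # STD addrM
            addr_h, _ = rol(addr_h, c)            # LDAA addrH / ROLA / STAA addrH
            temp5 = addr_h                        # STAA temp5
            a, b, c = lsld(addr_m, addr_l)        # LDAA addrM / LSLD
            temp5, _ = rol(temp5, c)              # ROL temp5
            a, b, c = lsld(a, b)                  # LSLD
            temp5, _ = rol(temp5, c)              # ROL temp5
            a, b, c = addd(a, b, addr_m, addr_l)  # ADDD addrM
            addr_m, addr_l = a, b                 # STD addrM
            addr_h = adca(temp5, addr_h, c)       # LDAA temp5 / ADCA addrH / STAA
        a, b, c = addd(0, digit, addr_m, addr_l)  # LDAB 0,X / CLRA / ADDD addrM
        addr_m, addr_l = a, b                     # STD addrM
        addr_h = adca(0, addr_h, c)               # LDAA #0 / ADCA addrH / STAA
    return addr_h, addr_m, addr_l


assert convert([1, 2, 3, 4, 5, 6, 7]) == (0x12, 0xD6, 0x87)   # 1234567
```

The model relies on the stores and loads between the shifts and adds leaving the carry untouched, which is the property the assembly sequence depends on as well.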

If it is not fast enough, I'm pretty sure it can be sped up somewhat. I never write fast code on the first run, I strive for correctness.

(In fact, if you think about it, it might be useful to call something like this directly from the keypad, using an extra byte to count the number of digits already pressed. Would also save about seven bytes of RAM. Even if the 68HC11 is running off a 32kHz clock or something, the above is faster than even the fastest human pressing digits.)
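The keypad idea can be sketched in Python as an accumulator that is updated once per key press, with a single counter in place of the digit buffer. The class name `KeypadAccumulator` and its methods are hypothetical, and the sketch assumes keys arrive most significant digit first, as they would when a number is typed in normally.

```python
class KeypadAccumulator:
    """Accumulate decimal key presses directly into a 24-bit binary value."""

    def __init__(self, max_digits=7):
        self.max_digits = max_digits
        self.count = 0              # the extra byte counting digits pressed
        self.addr = 0               # the 24-bit result so far

    def press(self, digit):
        """Fold one key press into addr; returns False once the entry is full."""
        if not 0 <= digit <= 9:
            raise ValueError("digit must be 0..9")
        if self.count >= self.max_digits:
            return False
        self.addr = (self.addr * 10 + digit) & 0xFFFFFF
        self.count += 1
        return True

    def as_bytes(self):
        """Return (addrH, addrM, addrL)."""
        return (self.addr >> 16) & 0xFF, (self.addr >> 8) & 0xFF, self.addr & 0xFF


acc = KeypadAccumulator()
for key in [1, 2, 3, 4, 5, 6, 7]:
    acc.press(key)
print(acc.as_bytes())               # bytes of 1234567 = 0x12D687
```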

**metertech58761**:

Just want to confirm a few details - the 24-bit address, it is stored as high / mid / low.

Given the atrocious code I began with, I don't think speed is terribly important (the clock speed of the MPU was a mere 1.0MHz - about the same as a Commodore 64)

Also, there's plenty of space in the ROM - especially after my cleanup / optimization of the overall code, so as long as the code is readable, we should be fine.
