Cleaning up a bit of 6800 assembly code

When going through a bit of 6800 assembly code (actually 6801 / HC11), I came across this section and I know it can be rearranged better, but I cannot get it to work properly. I would like to see if anyone can point out how to properly rearrange these instructions, especially given that the RTS is smack dab in the middle and that a BRA is needed to continue the loop. Now, this is after I already cleaned the 'spaghetti' out of the first half and replaced a loop counter with CPX.

What this does is convert a 7-digit BCD value (readout+7 is MSB, readout+1 is LSB, readout+0 is used as a checksum digit elsewhere) into a 3-byte binary value.

addrCalc   CLC   
   CLR   temp6
   CLR   addrL
   CLR   addrM
   CLR   addrH
   LDX   readout+7
func01_1   LDAA   addrM
   LDAB   addrL
   ADDB   $00,X
   STAB   addrL
   BCC   func01_2
   ADDA   #$01
   STAA   temp6
   STAA   addrM
   BCC   func01_2
   LDAA   addrH
   ADDA   #$01
   STAA   addrH
   LDAA   temp6
func01_2   CPX   readout
   BNE   func01_5
func01_5   ASLD   
   STAB   addrL
   STAA   temp6
   STAA   addrM
   ROL   addrH
   LDAA   addrH
   STAA   temp7
   LDAA   temp6
   ROL   temp7
   ROL   temp7
   ADDB   addrL
   STAB   addrL
   ADCA   addrM
   STAA   addrM
   BCC   func01_6
   INC   addrH
func01_6   LDAA   temp7
   ADDA   addrH
   STAA   addrH
   BRA   func01_1

Nominal Animal:
Friends don't just throw 68HC11 assembly at friends and ask what's wrong, friends describe the idea of what the code tries to implement (you mostly did, but not fully), and how.  ;)

Is this packed (two decimal digits per byte) or unpacked BCD (each byte being 0..9 only)?
Or is the problem that you use an unpacked algorithm, but the data is really packed?

If it is packed BCD, I guess that your input is something like
   Byte   Bits    Description
    B0    0..3    Checksum digit, ignored
    B0    4..7    D0, least significant BCD digit we're interested in
    B1    0..3    D1
    B1    4..7    D2
    B2    0..3    D3
    B2    4..7    D4
    B3    0..3    D5
    B3    4..7    D6, most significant BCD digit we're interested in
but your description kinda implies unpacked BCD.  Which is it?

In any case, output is bytes O0, O1, O2, such that
    O0 + 256*(O1 + 256*O2) = D0 + 10*(D1 + 10*(D2 + 10*(D3 + 10*(D4 + 10*(D5 + 10*D6)))))
    O0 + 256*O1 + 65536*O2 = D0 + 10*D1 + 100*D2 + 1000*D3 + 10000*D4 + 100000*D5 + 1000000*D6
right?  (If you use MSB, just swap O0 and O2 above, they're just labels anyway.  I like to use arithmetic numbering, which starts at zero, and increases as the significance increases, because that matches the math: value = sum digit[n]*radix^n.  I'm not a "little-endian zealot".)
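To make that relation concrete, here is a minimal C sketch of it, not 68HC11 code. The function name digits_to_bytes and the array layout (d[0] = D0, out[0] = O0) are my own labels for illustration, and it assumes unpacked digits in the range 0..9.

```c
#include <stdint.h>

/* d[0] = D0 (least significant digit) ... d[6] = D6 (most significant).
   out[0] = O0 (least significant byte) ... out[2] = O2.
   Assumes every d[i] is in 0..9. */
void digits_to_bytes(const uint8_t d[7], uint8_t out[3])
{
    uint32_t value = 0;

    /* Nested form: D0 + 10*(D1 + 10*(D2 + ...)), evaluated from D6 down. */
    for (int i = 6; i >= 0; i--)
        value = value * 10 + d[i];

    out[0] = (uint8_t)(value & 0xFF);          /* O0 */
    out[1] = (uint8_t)((value >> 8) & 0xFF);   /* O1 */
    out[2] = (uint8_t)((value >> 16) & 0xFF);  /* O2 */
}
```

For the digits of 1234567 this gives O2:O1:O0 = $12:$D6:$87. The largest input, 9999999, is $98967F, so three bytes are always enough.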

The core operation here, if I were to implement this from scratch, would be
    A ← A*10+B = (A << 3) + (A << 1) + B
where A is the three-byte register, initialized to zero.  For 68HC11, we of course can only shift/rotate one bit at a time, so we need a temporary three-byte area for the shifting, since we need to do that sum.  We can do with A, B, D accumulators, if we do it before we extract the new digit B, though.

Can I use the NXP M68HC11E manual for reference, or does your architecture have a different instruction set?

In this instance, yes, it is unpacked BCD, entered from a keypad before being passed to this routine.

The HC11 instruction set will be just fine.

If you have a better algorithm in mind, I'd be all ears. If you say you need a third temporary byte, use the mnemonic temp5.

Whoever wrote the original code did not know the first thing about 68xx assembly - what I started with was atrocious!

Nominal Animal:
I would implement the following pseudocode:
    Set addr to 0
    Set X to address of first (most significant) digit
    Optional: jump to First
  Loop:
    Multiply addr by 2
    Copy addr to temp
    Multiply addr by 2
    Multiply addr by 2
    Add temp to addr

  First:
    Load next decimal digit,
      and add it to addr

    Step X to the next digit
    If we are not past the digit buffer yet,
      jump to Loop
      otherwise return from subroutine

noting that temp=addr*2; addr=temp*2*2+temp is equivalent to addr=10*addr.
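The same steps in C, as a rough sketch for checking the arithmetic on a PC. The name decimal_to_binary and its parameters are made up for this example, and it assumes digits[0] is the most significant digit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: digits[0] is the most significant digit, each 0..9. */
uint32_t decimal_to_binary(const uint8_t *digits, size_t count)
{
    uint32_t addr = 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t temp;

        addr *= 2;          /* addr = 2 * old value */
        temp = addr;        /* keep the doubled value */
        addr *= 2;
        addr *= 2;          /* addr = 8 * old value */
        addr += temp;       /* addr = 10 * old value */

        addr += digits[i];  /* add the next decimal digit */
    }
    return addr & 0xFFFFFFu; /* the result lives in three bytes */
}
```

On the first pass addr is still zero, so the multiplication does nothing. That is why the jump to First is optional: it only saves a few cycles.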

I have a nagging feeling we could use the Y register to our advantage here, perhaps via XGDY, which exchanges D and Y, to avoid using temp5 and speed things up... but me brain isn't working well enough now, captain!

I do not have an emulator handy (and can't recall where I could get one quickly for Linux), so consider the following implementation suspect, probably buggy, in need of checking:

The conversion starts with clearing addrH:addrM:addrL, and I do assume addrH has the lowest address and addrL the highest address:
        CLR   addrL             ; addrL = 0
        CLR   addrM             ; addrM = 0
        CLR   addrH             ; addrH = 0

Use X for the address of incoming digits, in decreasing order of importance, because each multiplication by ten has to happen before the next less significant digit is added.  Note the # (immediate mode): we want the address itself, not the contents of memory at that address.
        LDX   #readout+7        ; address of most significant digit
        JMP   first

Next, we have the loop body that multiplies addr by 10, using temp5 as a scratchpad.
  loop  LDD   addrM             ; A=addrM, B=addrL
        ASLD                    ; D doubled, carry = bit shifted out
        STD   addrM
        ROL   addrH             ; addr now multiplied by 2,
        LDAA  addrH
        STAA  temp5             ; and temp5=addrH
        LDAA  addrM             ; We didn't mangle B above, so reloading addrM suffices
        ASLD
        ROL   temp5             ; second doubling done
        ASLD
        ROL   temp5             ; third doubling done
        ADDD  addrM             ; D=D+addrM:addrL (16-bit add), carry set if necessary
        STD   addrM             ; addrL and addrM updated
        LDAA  temp5
        ADCA  addrH
        STAA  addrH             ; addrH updated

We then want to load the next digit to B, with A cleared so that D holds just the digit,
  first CLRA
        LDAB  $00,X

and add it to the 24-bit addr:
        ADDD  addrM             ; D=addrM:addrL+digit, carry set if necessary
        STD   addrM             ; addrL and addrM updated
        LDAA  #$0               ; not CLRA, which would clear the carry
        ADCA  addrH
        STAA  addrH             ; addrH updated

Advance to next digit, if any left.
        DEX
        CPX   #readout          ; readout+0 is the checksum digit, so stop there
        BNE   loop
        RTS

Can you follow the logic above?  Could you test if it works?
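Without an emulator, one way to check the carry handling is a byte-level model in C. This is only a sketch: the addr24 struct and the helper names are invented here, the digits are assumed to arrive most significant first, and it models the arithmetic (a 16-bit shift or add on the low word, then the carry folded into the high byte), not the exact instruction sequence.

```c
#include <stdint.h>

/* Hypothetical model of the three RAM bytes, stored high / mid / low. */
typedef struct { uint8_t h, m, l; } addr24;

/* Shift left by one bit: 16-bit shift of m:l (like ASLD),
   then rotate the shifted-out bit into h (like ROL). */
static void shl24(addr24 *a)
{
    uint16_t d = (uint16_t)((a->m << 8) | a->l);
    uint8_t carry = (uint8_t)(d >> 15);

    d = (uint16_t)(d << 1);
    a->m = (uint8_t)(d >> 8);
    a->l = (uint8_t)d;
    a->h = (uint8_t)((a->h << 1) | carry);
}

/* 24-bit add: 16-bit add on m:l (like ADDD),
   then add-with-carry on h (like ADCA). */
static void add24(addr24 *a, const addr24 *b)
{
    uint32_t low = (uint32_t)((a->m << 8) | a->l)
                 + (uint32_t)((b->m << 8) | b->l);

    a->m = (uint8_t)(low >> 8);
    a->l = (uint8_t)low;
    a->h = (uint8_t)(a->h + b->h + (low >> 16));
}

/* digits[0] is the most significant digit, each 0..9. */
addr24 convert24(const uint8_t *digits, int count)
{
    addr24 a = {0, 0, 0};

    for (int i = 0; i < count; i++) {
        addr24 twice;
        addr24 digit = {0, 0, digits[i]};

        shl24(&a);           /* a = 2 * old */
        twice = a;
        shl24(&a);
        shl24(&a);           /* a = 8 * old */
        add24(&a, &twice);   /* a = 10 * old */
        add24(&a, &digit);   /* plus the new digit */
    }
    return a;
}
```

Comparing h:m:l against a plain 32-bit calculation for a few inputs (including 9999999, which should give $98:$96:$7F) exercises the carries between the bytes.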

If it is not fast enough, I'm pretty sure it can be sped up somewhat.  I never write fast code on the first run, I strive for correctness.

(In fact, if you think about it, it might be useful to call something like this directly from the keypad, using an extra byte to count the number of digits already pressed.  Would also save about seven bytes of RAM.  Even if the 68HC11 is running off a 32kHz clock or something, the above is faster than even the fastest human pressing digits.)

Just want to confirm a few details - the 24-bit address, it is stored as high / mid / low.

Given the atrocious code I began with, I don't think speed is terribly important (the clock speed of the MPU was a mere 1.0MHz - about the same as a Commodore 64)

Also, there's plenty of space in the ROM - especially after my cleanup / optimization of the overall code, so as long as the code is readable, we should be fine.

