LOL, crazy, see this trick
Back when 8086/80186/80286-class machines had less than 640 kB of RAM, it was common for games to use self-modifying code.
Starflight (1986) used this heavily, including storing the modifications on-disk: to restart the game from scratch, you had to reinstall it (create new copies of the game disks)!
OK, seriously: that takes too many opcodes. Let's implement the classic print_string with a string pointer as its argument.
Well, this takes 33 bytes. Note that the character to be sent is now in the AH register (not AL):
[bits 16]

EOS                     EQU (0)

SERIAL_BASE             EQU (0x3F8)
SERIAL_THR              EQU (SERIAL_BASE+0)     ; Transmit Hold Register (w)
SERIAL_RBR              EQU (SERIAL_BASE+0)     ; Receive Buffer Register (r)
SERIAL_LSR              EQU (SERIAL_BASE+5)     ; Line Status Register (r)

SERIAL_IS_DATA_READY    EQU (0x01)
SERIAL_IS_OVERRUN       EQU (0x02)
SERIAL_IS_PARITY_ERROR  EQU (0x04)
SERIAL_IS_FRAMING_ERROR EQU (0x08)
SERIAL_IS_BREAK         EQU (0x10)
SERIAL_IS_THR_EMPTY     EQU (0x20)
SERIAL_IS_EMPTY         EQU (0x40)
SERIAL_IS_FIFO_ERROR    EQU (0x80)

; Character to be sent in AH
; Clobbers AL
serial_putc:
        push dx
        mov dx, SERIAL_LSR
.serial_putc_wait:
        in al, dx                       ; poll the line status
        test al, SERIAL_IS_THR_EMPTY
        jz .serial_putc_wait            ; spin until the hold register is free
        mov dx, SERIAL_THR
        mov al, ah
        out dx, al                      ; send the character
        pop dx
        ret

; String to be sent in DS:SI
; SI will point to EOS
serial_puts:
        push ax
.serial_puts_next:
        mov ah, [si]
        cmp ah, EOS
        je .serial_puts_done
        call serial_putc
        inc si
        jmp .serial_puts_next
.serial_puts_done:
        pop ax
        ret
But if you don't need the single-character routine, this takes only 29 bytes:
[bits 16]

EOS                     EQU (0)

SERIAL_BASE             EQU (0x3F8)
SERIAL_THR              EQU (SERIAL_BASE+0)     ; Transmit Hold Register (w)
SERIAL_RBR              EQU (SERIAL_BASE+0)     ; Receive Buffer Register (r)
SERIAL_LSR              EQU (SERIAL_BASE+5)     ; Line Status Register (r)

SERIAL_IS_DATA_READY    EQU (0x01)
SERIAL_IS_OVERRUN       EQU (0x02)
SERIAL_IS_PARITY_ERROR  EQU (0x04)
SERIAL_IS_FRAMING_ERROR EQU (0x08)
SERIAL_IS_BREAK         EQU (0x10)
SERIAL_IS_THR_EMPTY     EQU (0x20)
SERIAL_IS_EMPTY         EQU (0x40)
SERIAL_IS_FIFO_ERROR    EQU (0x80)

; String to be sent in DS:SI
; SI will point to EOS
serial_puts:
        push ax
        push dx
.serial_puts_next:
        mov ah, [si]
        cmp ah, EOS
        jnz .serial_puts_one
        pop dx                          ; EOS reached: restore and return
        pop ax
        ret
.serial_puts_one:
        inc si
        mov dx, SERIAL_LSR
.serial_puts_wait:
        in al, dx                       ; poll the line status
        test al, SERIAL_IS_THR_EMPTY
        jz .serial_puts_wait            ; spin until the hold register is free
        mov dx, SERIAL_THR
        mov al, ah
        out dx, al                      ; send the character
        jmp .serial_puts_next
Often, preserving AX isn't useful, so if you let the routine clobber that too, you can omit the push/pop AX and save another 2 bytes.
The reason I didn't use lodsb in either is that "lodsb; mov ah, al" and "mov ah, [si]; inc si" are both 3 bytes, but the former depends on the direction flag (DF) and the latter does not.
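For comparison, this is roughly what the lodsb flavour would look like. It's only a sketch: serial_puts_lodsb is a made-up name, and the constants are repeated just so the snippet assembles on its own. Unless you can guarantee that the caller left DF clear, you need the cld, which costs a byte and changes a flag the caller may care about. Also note that SI ends up one past EOS rather than on it.

```nasm
[bits 16]

EOS                     EQU (0)
SERIAL_BASE             EQU (0x3F8)
SERIAL_THR              EQU (SERIAL_BASE+0)     ; Transmit Hold Register (w)
SERIAL_LSR              EQU (SERIAL_BASE+5)     ; Line Status Register (r)
SERIAL_IS_THR_EMPTY     EQU (0x20)

; String to be sent in DS:SI
; SI will point one past EOS (lodsb has already stepped over it)
; Clobbers AX, forces DF=0
serial_puts_lodsb:
        push dx
        cld                             ; make sure lodsb increments SI
.serial_puts_lodsb_next:
        lodsb                           ; AL = [DS:SI], then SI = SI + 1
        cmp al, EOS
        je .serial_puts_lodsb_done
        mov ah, al                      ; park the character, AL is needed for polling
        mov dx, SERIAL_LSR
.serial_puts_lodsb_wait:
        in al, dx
        test al, SERIAL_IS_THR_EMPTY
        jz .serial_puts_lodsb_wait
        mov dx, SERIAL_THR
        mov al, ah
        out dx, al
        jmp .serial_puts_lodsb_next
.serial_puts_lodsb_done:
        pop dx
        ret
```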
If you look at the compare-to-EOS parts, you'll find that adding tests for other characters to get true printf-like functionality (in a separate function!) isn't too difficult. I just recommend that you make it caller-cleanup (i.e., you only look at the parameters on the stack, you don't pop them off). For integers, I recommend you use the slow DIV/IDIV-by-radix method (with the value in DX*65536+AX), constructing the digits from right to left in a temporary buffer on the stack (16 chars including EOS suffice for 32-bit integers in octal, decimal, and hexadecimal).
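To make the integer part concrete, here's a rough sketch of the DIV-by-radix idea, unsigned only. The names u32_to_str and serial_put_u32 are mine, not part of the code above, and I'm assuming SS = DS so that a buffer on the stack can be read through DS:SI. The division is done in two steps (high word first, then the low word with the remainder carried in DX), because a single DIV of DX:AX by the radix faults whenever the quotient doesn't fit in 16 bits.

```nasm
[bits 16]

EOS     EQU     (0)             ; same terminator as in the listings above

; Hypothetical helper: unsigned 32-bit value -> EOS-terminated digit string
; In:  DX:AX = value (DX*65536+AX), BX = radix (8, 10 or 16)
;      DS:SI = one past the end of a scratch buffer (16 bytes is plenty)
; Out: DS:SI = first (leftmost) digit
; Clobbers AX, CX, DX
u32_to_str:
        dec     si
        mov     byte [si], EOS  ; terminator is stored first, at the far right
.u32_to_str_next:
        mov     cx, ax          ; CX = low word
        mov     ax, dx          ; AX = high word
        xor     dx, dx
        div     bx              ; AX = high quotient, DX = high remainder
        xchg    ax, cx          ; AX = low word, CX = high quotient
        div     bx              ; AX = low quotient, DX = digit (0..radix-1)
        xchg    dx, cx          ; DX:AX = value / radix, CL = digit
        add     cl, '0'
        cmp     cl, '9'
        jbe     .u32_to_str_store
        add     cl, 'A'-'9'-1   ; digits 10..15 become 'A'..'F'
.u32_to_str_store:
        dec     si
        mov     [si], cl        ; build the string from right to left
        mov     cx, ax
        or      cx, dx          ; anything left to convert?
        jnz     .u32_to_str_next
        ret

; Hypothetical wrapper: print DX:AX in decimal through serial_puts
; Assumes SS = DS, because the buffer lives on the stack but is read via DS:SI
; Clobbers AX, DX
serial_put_u32:
        push    bx
        push    cx
        push    si
        push    bp
        mov     bp, sp
        sub     sp, 16          ; temporary buffer at [BP-16 .. BP-1]
        mov     si, bp          ; one past the end of that buffer
        mov     bx, 10
        call    u32_to_str
        call    serial_puts     ; routine from the listings above
        mov     sp, bp          ; drop the buffer
        pop     bp
        pop     si
        pop     cx
        pop     bx
        ret
```

For signed values you'd test the sign first, negate DX:AX, and store a '-' in front after the loop, rather than trying to get IDIV to do the 32-bit work directly.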