I've attached some PIC asm code to implement a 40-bit binary to 13-digit decimal (BCD) conversion.
My approach is based upon Scott Dattalo's analysis of John Payson's code found in the piclist archive pointed to by jpanhalt in an earlier post
https://www.eevblog.com/forum/microcontrollers/20-bit-bin2bcd-(assembly)/msg5193069/#msg5193069Some of the following is paraphrasing Scott's notes based on my own understanding. I tend to be verbose, so bear with me.
We understand that a decimal number such as 3478 can be broken down as a power-of-10 weighted sum of the digits:
3478 = 3*1000 + 4*100 * 7*10 + 8*1
Generally:
N = b
3*1000 + b
2*100 + b
1*10 + b
0 = b
3*10
3 + b
2*10
2 + b
1*10
1 + b
0*10
0 where b
n , n=[0..3] are the decimal digits
Equally, a hexadecimal (hex) number can be written as a power-of-16 (0x10) weighted sum of its digits. For a 16-bit hex number:
N = a
3*16^3 + a
2*16^2 + a
1*16^1 + a
0*16^0
where a
n , n=[0..3] are the hexadecimal digits (16 is a decimal number for brevity).
Then, expanding each 16
xN = a
3*4096 + a
2*256 + a
1*16 + a
0If we desire to convert the hexadecimal number with digits a
n into a decimal number with digits b
n, we can re-write the equations to help. As Scott said, there are infinitely many ways to do this, since we can manipulate the equations in any valid algebraic way. We start with:
N = a
3*4096 + a
2*256 + a
1*16 + a
0N = (4a
3)*1000 + (0a
3+2a
2)*100 + (9a
3+5a
2+1a
1)*10 + (6a
3+6a
2+6a
1+1a
0)
This gives us a sum of powers-of-ten, which is what we need to represent a decimal number. It looks a lot like the first equation above:
N = b
3*1000 + b
2*100 + b
1*10 + b
0Then, equations for the decimal digits b
0..b
4 (5 digits for a general 16-bit case) can be extracted from the above:
b
3 = 4a
3b
2 = 2a
2b
1 = 9a
3 + 5a
2 + a
1b
0 = 6a
3 + 6a
2 + 6a
1 + a
0One may notice that there is no b
4, the most significant digit, and that all of those equations can result in a number greater than 9. To extract the actual decimal digits, we need to normalize each to limit it to 0..9, just like you would "carry" when adding numbers in grade school. After that, we will have the set of decimal digits b
0..b
4 representing the same number as the hexadecimal number with digits a
0..a
3Scott's description of John's highly optimized code goes on to describe various manipulations of the above equations which can reduce the work involved in the calculations (e.g. fewer multiplies) and normalizations. These are interesting, and if you are as fascinated by optimization as I am, it is worth a read.
I needed to convert a 40 bit number to decimal. That is 10 hexadecimal digits, so we'll end up with a rather larger set of equations.
N = a
9*16
9 + a
8*16
8 + a
7*16
7 + a
6*16
6 + a
5*16
5 + a
4*16
4 + a
3*16
3 + a
2*16
2 + a
1*16 + a
0N = a
9*68719476736 + a
8*4294967296 + a
7*268435456 + a
6*16777216 + a
5*1048576 + a
4*65536 + a
3*4086 + a
3*256 + a
1*16 + a
0We can now create the trivial set of equations for the powers-of-ten b
0..b
12.
Let's rearrange it as a visual aid to make the job of extracting powers-of-ten easier
N = a9 * 6 8 7 1 9 4 7 6 7 3 6
+ a8 * 4 2 9 4 9 6 7 2 9 6
+ a7 * 2 6 8 4 3 5 4 5 6
+ a6 * 1 6 7 7 7 2 1 6
+ a5 * 1 0 4 8 5 7 6
+ a4 * 6 5 5 3 6
+ a3 * 4 0 9 6
+ a2 * 2 5 6
+ a1 * 1 6
+ a0 * 1
Each column represents a b
n.
b
10= 6a
9b
9 = 8a
9 + 4a
8b
8 = 7a
9 + 2a
8 + 2a
7b
7 = 1a
9 + 9a
8 + 6a
7 + 1a
6b
6 = 9a
9 + 4a
8 + 8a
7 + 6a
6 + 1a
5b
5 = 4a
9 + 9a
8 + 4a
7 + 7a
6 + 0a
5b
4 = 7a
9 + 6a
8 + 3a
7 + 7a
6 + 4a
5 + 6a
4b
3 = 6a
9 + 7a
8 + 5a
7 + 7a
6 + 8a
5 + 5a
4 + 4a
3b
2 = 7a
9 + 2a
8 + 4a
7 + 2a
6 + 5a
5 + 5a
4 + 0a
3 + 2a
2b
1 = 3a
9 + 9a
8 + 5a
7 + 1a
6 + 7a
5 + 3a
4 + 9a
3 + 5a
2 + 1a
1b
0 = 6a
9 + 6a
8 + 6a
7 + 6a
6 + 6a
5 + 6a
4 + 6a
3 + 6a
2 + 6a
1 + 1a
0You can see right away that this set of equations is far more complex than the 16-bit case described above. In addition, after doing those calculations, the b
n will be much larger than in the 16-bit case, so there is more work to normalize them, and we have many more digits to normalize.
I'm using a lowly PIC with no multiply instruction, and that is a lot of multiplies! Recognize that most a
n are used multiple times at various multiples from 1 to 9. For example, we use multiples 1,3,4,6,7,8,9 of a
9. So we can simply add a
9 to itself multiple times, using the intermediate result to accumulate a partial result of each b
n which uses it. Repeat for each a
n. This avoids using any multiply operations.
One gotcha, also not present in the 16-bit case, is that the accumulated b
n can overflow. We need to detect that and account for it. Checking for overflow is easy, just check the carry bit of STATUS after adding. One effective way to account for overflow is to subtract 200 from the number, then add 2 to b
n+2. In my code, I check for carry only after at least 17*a
n are accumulated into any b
n, since 17*0xF = 255. After this point, carry must always be checked for that b
n.
After finishing these calculations, we still need to normalize all b
n into the range 0..9 so that they represent decimal digits. The trivial way is to repeatedly subtract 10 from b
n then add 1 to b
n+1, until b
n is less than 10. Due to the potentially large size of each b
n, this may take many iterations. The number of iterations can be reduced by first using a larger number, such as 100.
The code here is considerably larger than John's original 16-bit code or the adapted 20-bit code posted in this thread, but given the scale of the work for a 40-bit number, it is reasonable. Execution time is also reasonable given that no special optimizations were made to the equations. Code size is approx 430 instructions, and execution time for worst case 0xFFFFFFFFFF is 998 instruction cycles.
(edit: attached code removed due to bug, see revised code below)