Author Topic: 20-bit BIN2BCD (Assembly) (Read 2755 times)

jpanhalt · « **on:** November 28, 2023, 02:50:15 pm »

Just FYI
Attached is the code I have been working on for converting 20-bit binary to BCD. More discussion of that is here:
https://www.eevblog.com/forum/microcontrollers/converting-bcd(mm)-to-bcd(inch)/

It is based on a polynomial approach developed by Payson and posted on PICList many years ago. Related links:

Link and explanation to 16-bit BIN2BCD by Payson w/ explanation by Dattalo:
https://pic.hallikainen.org/techref/microchip/math/radix/b2bu-16b5d.htm

Link to 17-bit BIN2BCD:
https://pic.hallikainen.org/techref/microchip/math/radix/b2bu-17b5d.htm

DavidAlfa · « **Reply #1 on:** November 28, 2023, 03:13:29 pm »

I think you'd better post your code in github, much easier to find by anyone if described with the proper tags (PIC, pic16, assembler, binary to bcd, bin2bcd ...).
Here it'll get lost in no time unless someone specifically searches for "bin2cbd".

Sacodepatatas · « **Reply #2 on:** November 28, 2023, 03:22:53 pm »

Quote from: DavidAlfa

...someone specifically searches for "bin2cbd".

Somebody searching for a "binary to cannabidiol" converter will not have much luck at eevblog.... 🤣

jpanhalt · « **Reply #3 on:** November 28, 2023, 03:33:37 pm »

Thank you for adding marijuana to what Google will index for that code.

They deserve it.

As an aside, I was in grad school when a 1-step synthesis for THC was published. It got a huge response, and the starting material soon disappeared from every chemical catalog.

DavidAlfa · « **Reply #4 on:** November 28, 2023, 03:40:38 pm »

now that's a hard conversion...

macboy · « **Reply #5 on:** December 02, 2023, 05:01:19 am »

Thank you for posting.
I had a look, hoping to further adapt to my 40 bit use case. Unfortunately, the original code posted to piclist is very cleverly optimised to the specific 16 bit conversion, and is not easily extended beyond the 20 bits that you took it to.
I was however very intrigued by this method of BCD conversion, using accumulated powers of ten derived from each hex digit. So I wrote up the equations for the 40 bit (10 hex digits, 13 decimal digits) case, and wrote entirely new code to implement them. Those equations are vastly bigger and more complex than the relatively simple 16 bit conversion. My code is written in assembly for baseline PIC (12 bit instruction set). The code is rather big at around 550 instructions. Worst case run time I've seen is 900 cycles. That's about three to four times as fast as my double-dabble implementation, which is astounding. It is also no bigger than that slower implementation due to a big look up table I used in that one. Critically it also fits within the ~ 1900 cycle budget that I can afford within the main periodic loop of my target project. There is definitely room to reduce the code size while increasing execution time; I did lots of stuff inline instead of with loops or subroutines, to avoid lots of expensive branching. Using a mid-range (14 bit instruction) PIC could save a few dozen instructions and cycles too. I can clean up the code and post it for anyone interested.

jpanhalt · « **Reply #6 on:** December 02, 2023, 07:08:07 am »

The approach I used based on PICList and the 17-bit version is limited to 20 bits. At least, that seems to be the case with the band aid addition. Starting anew, one could probably extend it, and with the enhanced mid-range instructions, there is less justification for keeping every equation negative. I tried some versions without offsets. The code was longer, but they still worked. That was several years ago. The 12-bit core devices do have limitations, and as I recall, Microchip has abandoned the 12F label for its newer 8-pin chips.

I would be very interested in seeing your approach and how you extended it.

John

macboy · « **Reply #7 on:** December 04, 2023, 05:17:00 pm »

I've attached some PIC asm code to implement a 40-bit binary to 13-digit decimal (BCD) conversion.

My approach is based upon Scott Dattalo's analysis of John Payson's code found in the piclist archive pointed to by jpanhalt in an earlier post https://www.eevblog.com/forum/microcontrollers/20-bit-bin2bcd-(assembly)/msg5193069/#msg5193069

Some of the following is paraphrasing Scott's notes based on my own understanding. I tend to be verbose, so bear with me.

We understand that a decimal number such as 3478 can be broken down as a power-of-10 weighted sum of the digits:
3478 = 3*1000 + 4*100 * 7*10 + 8*1
Generally:
N = b₃*1000 + b₂*100 + b₁*10 + b₀
= b₃*10³ + b₂*10² + b₁*10¹ + b₀*10⁰
where b_n , n=[0..3] are the decimal digits

Equally, a hexadecimal (hex) number can be written as a power-of-16 (0x10) weighted sum of its digits. For a 16-bit hex number:
N = a₃*16^3 + a₂*16^2 + a₁*16^1 + a₀*16^0
where a_n , n=[0..3] are the hexadecimal digits (16 is a decimal number for brevity).
Then, expanding each 16^x
N = a₃*4096 + a₂*256 + a₁*16 + a₀

If we desire to convert the hexadecimal number with digits a_n into a decimal number with digits b_n, we can re-write the equations to help. As Scott said, there are infinitely many ways to do this, since we can manipulate the equations in any valid algebraic way. We start with:

N = a₃*4096 + a₂*256 + a₁*16 + a₀
N = (4a₃)*1000 + (0a₃+2a₂)*100 + (9a₃+5a₂+1a₁)*10 + (6a₃+6a₂+6a₁+1a₀)

This gives us a sum of powers-of-ten, which is what we need to represent a decimal number. It looks a lot like the first equation above:
N = b₃*1000 + b₂*100 + b₁*10 + b₀

Then, equations for the decimal digits b₀..b₄ (5 digits for a general 16-bit case) can be extracted from the above:
b₃ = 4a₃
b₂ = 2a₂
b₁ = 9a₃ + 5a₂ + a₁
b₀ = 6a₃ + 6a₂ + 6a₁ + a₀

One may notice that there is no b₄, the most significant digit, and that all of those equations can result in a number greater than 9. To extract the actual decimal digits, we need to normalize each to limit it to 0..9, just like you would "carry" when adding numbers in grade school. After that, we will have the set of decimal digits b₀..b₄ representing the same number as the hexadecimal number with digits a₀..a₃

Scott's description of John's highly optimized code goes on to describe various manipulations of the above equations which can reduce the work involved in the calculations (e.g. fewer multiplies) and normalizations. These are interesting, and if you are as fascinated by optimization as I am, it is worth a read.

I needed to convert a 40 bit number to decimal. That is 10 hexadecimal digits, so we'll end up with a rather larger set of equations.
N = a₉*16⁹ + a₈*16⁸ + a₇*16⁷ + a₆*16⁶ + a₅*16⁵ + a₄*16⁴ + a₃*16³ + a₂*16² + a₁*16 + a₀
N = a₉*68719476736 + a₈*4294967296 + a₇*268435456 + a₆*16777216 + a₅*1048576 + a₄*65536 + a₃*4086 + a₃*256 + a₁*16 + a₀
We can now create the trivial set of equations for the powers-of-ten b₀..b₁₂.
Let's rearrange it as a visual aid to make the job of extracting powers-of-ten easier

N = a₉ * 6 8 7 1 9 4 7 6 7 3 6
  + a₈ *   4 2 9 4 9 6 7 2 9 6
  + a₇ *     2 6 8 4 3 5 4 5 6
  + a₆ *       1 6 7 7 7 2 1 6
  + a₅ *         1 0 4 8 5 7 6
  + a₄ *             6 5 5 3 6
  + a₃ *               4 0 9 6
  + a₂ *                 2 5 6
  + a₁ *                   1 6
  + a₀ *                     1

Each column represents a b_n.

b₁₀= 6a₉
b₉ = 8a₉ + 4a₈
b₈ = 7a₉ + 2a₈ + 2a₇
b₇ = 1a₉ + 9a₈ + 6a₇ + 1a₆
b₆ = 9a₉ + 4a₈ + 8a₇ + 6a₆ + 1a₅
b₅ = 4a₉ + 9a₈ + 4a₇ + 7a₆ + 0a₅
b₄ = 7a₉ + 6a₈ + 3a₇ + 7a₆ + 4a₅ + 6a₄
b₃ = 6a₉ + 7a₈ + 5a₇ + 7a₆ + 8a₅ + 5a₄ + 4a₃
b₂ = 7a₉ + 2a₈ + 4a₇ + 2a₆ + 5a₅ + 5a₄ + 0a₃ + 2a₂
b₁ = 3a₉ + 9a₈ + 5a₇ + 1a₆ + 7a₅ + 3a₄ + 9a₃ + 5a₂ + 1a₁
b₀ = 6a₉ + 6a₈ + 6a₇ + 6a₆ + 6a₅ + 6a₄ + 6a₃ + 6a₂ + 6a₁ + 1a₀

You can see right away that this set of equations is far more complex than the 16-bit case described above. In addition, after doing those calculations, the b_n will be much larger than in the 16-bit case, so there is more work to normalize them, and we have many more digits to normalize.

I'm using a lowly PIC with no multiply instruction, and that is a lot of multiplies! Recognize that most a_n are used multiple times at various multiples from 1 to 9. For example, we use multiples 1,3,4,6,7,8,9 of a₉. So we can simply add a₉ to itself multiple times, using the intermediate result to accumulate a partial result of each b_n which uses it. Repeat for each a_n. This avoids using any multiply operations.

One gotcha, also not present in the 16-bit case, is that the accumulated b_n can overflow. We need to detect that and account for it. Checking for overflow is easy, just check the carry bit of STATUS after adding. One effective way to account for overflow is to subtract 200 from the number, then add 2 to b_n+2. In my code, I check for carry only after at least 17*a_n are accumulated into any b_n, since 17*0xF = 255. After this point, carry must always be checked for that b_n.

After finishing these calculations, we still need to normalize all b_n into the range 0..9 so that they represent decimal digits. The trivial way is to repeatedly subtract 10 from b_n then add 1 to b_n+1, until b_n is less than 10. Due to the potentially large size of each b_n, this may take many iterations. The number of iterations can be reduced by first using a larger number, such as 100.

The code here is considerably larger than John's original 16-bit code or the adapted 20-bit code posted in this thread, but given the scale of the work for a 40-bit number, it is reasonable. Execution time is also reasonable given that no special optimizations were made to the equations. Code size is approx 430 instructions, and execution time for worst case 0xFFFFFFFFFF is 998 instruction cycles.

(edit: attached code removed due to bug, see revised code below)

jpanhalt · « **Reply #8 on:** December 04, 2023, 08:58:18 pm »

1000 Tcy is still a lot less than the time for Hemsley and other methods on PICList for just 32-bit. Good work.

Thanks for posting. I will play later. The idea of avoiding negatives had occurred to me as a necessity as the bits increased past 20 or so, unless you wanted to go double precision. ( I haven't looked at your code yet.) However, "normalization" with newer enhanced PICs becomes easier using subtraction with positive values.

Thanks again.

John

macboy · « **Reply #9 on:** December 05, 2023, 10:36:19 pm »

What good piece of software has no bugs? First, I'm not setting the digit 13 (only 0 or 1) properly. Probably introduced that when cleaning up the code. Easy to fix. But I also ran a test vector of 10000 random numbers through it and 5 had the wrong result. I'll search for that issue as time allows.

macboy · « **Reply #10 on:** December 06, 2023, 01:51:28 pm »

The bug was a simple typo preventing one digit from being processed, causing it to later overflow only under very specific conditions.
I attached fixed code here, including some added code to read test numbers from the UART, and output the result on UART. This is meant to be run in MPLAB SIM with a prepared file as input. The output file is then compared to a prepared expected output file, which should be identical. The input and expected output can be prepared by a script or with help of a spreadsheet, etc. The attached files contain almost 10000 randomly generated numbers to convert to BCD. MPLAB SIM takes only a few seconds to run the test.
Good testing makes good software.

jpanhalt · « **Reply #11 on:** December 06, 2023, 06:48:39 pm »

Thank you.

I find it very difficult to proofread anything I write, including code. That is particularly true when it is fresh in my mind. I see what I wanted to say, not what is actually on paper. Reviewing something a week or so later always helps.

SiliconWizard · « **Reply #12 on:** December 06, 2023, 11:20:58 pm »

Definitely, and that's also why having stuff reviewed by others can help as well.

jpanhalt · « **Reply #13 on:** January 03, 2024, 03:32:16 pm »

Encouraged by macboy and not really comfortable with a band aid solution, I decided to work on a full 24-bit Binary to BCD conversion using a polynomial approach (attached) to fill the holiday week.

1) It seems to work in simulation.
2) I decided to post a development version with all the variations I tried to get normalization to run a little faster. I have not carefully gone over the comments.
3) I have tested a variety of values to check for potential overflow problems. The attached Test Problems does not include all of them. It is simply 0x000000 to 0xFFFFFF in which each nibble (ai) is duplicated.
4) Statistics:
a) Average Tcy to the fill the basic equations was 88 (range 78 to 100)
b) Average Tcy for each of the normalization protocols was: 263 (-10 only), 243 (-20 + -10), 220 (-30 + -10), 210 (-40 +
-10) and 217 (-50 + -10). That's a pretty small savings. I will use either -10 alone or -10 and -40. The macros can be
cleaned up, if one is only going to use one.
c) Adding the averages, it's about 300 to 350 Tcy total, depending on which normalization one uses.

EDIT: There is an error in the Assembly code. I forgot to add b3 (thousands). That has been corrected. The corrected average times are: 204 for -20, 199 for -30, 238 for -40, and 202 for -50, which gives a total of about 300 Tcy., with -20, -30, and -50 being about the same. I will probably use -30. The revised current code has been uploaded. As there is no way to edit attachments, i deleted the code with the error.

themadhippy · « **Reply #14 on:** January 03, 2024, 03:46:58 pm »

jeez so much hard graft,but i suppose thats the advancement of technology ,to think a few years ago it was as hard as just chucking a 74185 or 2 into the circuit

macboy · « **Reply #15 on:** January 03, 2024, 04:38:30 pm »

Quote from: themadhippy on January 03, 2024, 03:46:58 pm

jeez so much hard graft,but i suppose thats the advancement of technology ,to think a few years ago it was as hard as just chucking a 74185 or 2 into the circuit

A 74185 or 2 ... or 27.
Yes, 27 of those ICs for 20-bit binary to BCD conversion. Of course, you got the result in circa microsecond time scale, orders of magnitude faster than the microcontroller. I can't imagine the 40-bit hardware version. If you have a binary number on a bus, and need a BCD number on a bus, very quickly, then a hardware solution might be the one. If you have a 40 bit number in microcontroller memory, and need to consume the result in the microcontroller, then a software solution is called for. Horses for Courses or whatever.
The conversion between number bases (the fundamental problem being solved here) is non-trivial especially for large numbers. For some, creating and optimizing the algorithm is fun and rewarding because of the challenge.

jpanhalt · « **Reply #16 on:** January 03, 2024, 04:50:35 pm »

This ultimately is for my project to reuse the linear scales from my Mitutoyo DRO on my Bridgeport-type mill on my lathe. At 16 MHz, 1000 Tcy to go from BCD mm to BCD inch won't be a noticeable delay. Of course, that DRO could be used directly, but not for controlling any of the lathe functions, such as controlling threading. The slowest part will be the scales, which can only be read about every 30 to 50 ms or so.

PCB.Wiz · « **Reply #17 on:** January 03, 2024, 10:12:24 pm »

Quote from: jpanhalt on January 03, 2024, 04:50:35 pm

This ultimately is for my project to reuse the linear scales from my Mitutoyo DRO on my Bridgeport-type mill on my lathe. At 16 MHz, 1000 Tcy to go from BCD mm to BCD inch won't be a noticeable delay. Of course, that DRO could be used directly, but not for controlling any of the lathe functions, such as controlling threading. The slowest part will be the scales, which can only be read about every 30 to 50 ms or so.

I looked at my notes for 8051, and found a Bin32ToBCD10R, that does 32bit BIN to 10 BCD digits, in 77 bytes and 70~85us on an older 1T 8051 @ 24MHz sysclk.

jpanhalt · « **Reply #18 on:** January 03, 2024, 10:22:27 pm »

Sure, but that MCU is not a PIC. Please convert it to the PIC RISC so we can compare. The midrange PIC's have very limited math capability and lots of peripherals.

jpanhalt · « **Reply #19 on:** January 04, 2024, 04:10:06 pm »

There was an error in the code attached to post #14. That has been corrected. See edit there.


EEVblog Main Site	EEVblog on Youtube	EEVblog on Twitter	EEVblog on Facebook	EEVblog on Odysee

Author Topic: 20-bit BIN2BCD (Assembly) (Read 2755 times)

Share me