I am trying to understand why we use BCD codes. My only understanding is that instead of operating on the whole decimal number, each decimal digit is converted to a 4-bit (or 8-bit) binary code and then operated on, which is what makes BCD an easy code to work with. Can you please help me understand an application (with an easy example if possible), or the reasoning behind why we do this? Are there any links to further understand this concept?
Please feel free to correct me if my understanding of BCD is itself not correct.
BCD largely means that each digit in a number is coded into 4 bits, usually with two digits packed into an 8-bit byte.
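For example, the decimal number 45 is 0010 1101 as plain binary, but 0100 0101 in packed BCD: a 4 in the upper nibble and a 5 in the lower nibble of the same byte.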
As you can probably infer from most of the answers, BCD isn't really commonly used today due to the large computing power available even in the lowest-cost processors. In the days when BCD was prevalent, it was far less computationally expensive to take a number in, store it in BCD, do all your math in BCD, and then output each digit from BCD. Nowadays it doesn't really matter.
As an example, imagine adding the numbers 45 and 12, entered as strings. Without BCD, you'd enter 45, the computer would convert the ASCII '4' to the number 4, multiply it by 10, then convert the '5' to the number 5 and add it to the previous result. The same would be repeated with 12, so you'd end up with the binary numbers 45 and 12, which you can then simply add. To get the result back out you have to divide it by 10, giving you 5 (integer math), convert that 5 to the ASCII character '5', and output it. Then you take the 5, multiply it by 10, subtract this result (50) from the result of the addition (57), giving you 7, convert the 7 to ASCII, and output it.
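Here's a rough sketch of that binary path in C, assuming two-digit inputs (the helper name is just for illustration):

```c
#include <stdio.h>

/* Sketch of the non-BCD path: parse two-digit ASCII strings into binary,
 * add them, then convert back to ASCII with divide/remainder by 10. */
static int ascii_to_binary(const char *s) {
    int tens = s[0] - '0';        /* '4' -> 4 */
    int ones = s[1] - '0';        /* '5' -> 5 */
    return tens * 10 + ones;      /* the multiply by 10 is the costly step */
}

int main(void) {
    int sum = ascii_to_binary("45") + ascii_to_binary("12");  /* plain binary add: 57 */

    char out[3];
    out[0] = (char)('0' + sum / 10);   /* 57 / 10 = 5 -> '5' (the divide is the costly step) */
    out[1] = (char)('0' + sum % 10);   /* 57 - 50 = 7 -> '7' */
    out[2] = '\0';

    printf("%s\n", out);   /* prints 57 */
    return 0;
}
```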
Note that multiplying and dividing by 10 were fairly expensive operations, generally taking a lot of time (relatively speaking). This is because most early processors had no native multiply instruction, so it had to be done with some sort of multiplication routine. If you knew you were multiplying or dividing by 10 you could optimize a routine to do this, but you were still looking at a lot of processor cycles.
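As a rough illustration of that kind of special-case optimization (a sketch, not any particular processor's routine), a multiply by a constant 10 can be reduced to two shifts and an add:

```c
#include <stdint.h>

/* x * 10 == x * 8 + x * 2, i.e. two shifts and an add.
 * Assumes x is small enough that the shifts don't overflow 16 bits. */
static uint16_t mul10(uint16_t x) {
    return (uint16_t)((x << 3) + (x << 1));
}
```

Division by 10 had no equally cheap trick, so it typically fell back to repeated subtraction or a multiply-by-reciprocal routine, which is a big part of why the binary path above was slow.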
For comparison's sake, let's just say that with older processors, 100 or so instruction cycles wouldn't be unreasonable to expect for this operation. This might be a bit off (or even way low), but it's good enough for this discussion.
If you were doing the same thing with BCD, it would be much simpler: you'd take the first digit from your string, '4', convert it to the value 4 with a single subtraction, and place it in the upper nibble so it represents 40. Convert the '5' to the number 5 (another subtraction), then bitwise-OR it into the existing value. Depending on the number of registers, this is probably about a 3-4 instruction operation so far. Repeat with the second number. The addition can either be done without any additional effort on some processors (which handle BCD arithmetic natively), or it takes another 3-4 instructions to adjust the result if the processor doesn't natively support it. Once you have the result, outputting it is easy: convert the top 4 bits to the first character and output it, then convert the bottom 4 bits to the second character and output it. Probably another 3-4 instructions.
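Here's a sketch of that BCD path in C, assuming two-digit values so everything fits in one packed byte. On processors with hardware BCD support (for example the decimal-adjust instruction DAA on the 8080/Z80/x86, or the 6502's decimal mode) the adjust step below collapses into a single instruction or disappears entirely:

```c
#include <stdio.h>
#include <stdint.h>

static uint8_t ascii_to_bcd(const char *s) {
    return (uint8_t)(((s[0] - '0') << 4) | (s[1] - '0'));   /* "45" -> 0x45 */
}

static uint8_t bcd_add(uint8_t a, uint8_t b) {
    uint16_t sum = a + b;                               /* binary add of the packed bytes */
    if (((a & 0x0F) + (b & 0x0F)) > 9) sum += 0x06;     /* low digit overflowed past 9: adjust */
    if ((sum & 0x1F0) > 0x90) sum += 0x60;              /* high digit overflowed past 9: adjust */
    return (uint8_t)sum;                                /* carry beyond two digits ignored here */
}

int main(void) {
    uint8_t result = bcd_add(ascii_to_bcd("45"), ascii_to_bcd("12"));   /* 0x45 + 0x12 -> 0x57 */

    /* Output: each nibble converts straight back to an ASCII digit. */
    putchar('0' + (result >> 4));    /* high nibble -> '5' */
    putchar('0' + (result & 0x0F));  /* low nibble  -> '7' */
    putchar('\n');
    return 0;
}
```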
So the BCD implementation comes to around 15 instructions. A clever assembly programmer might even be able to trim that down to 10 or so.
With modern CPUs, which run billions of instructions a second, the added overhead isn't a big issue, especially since you're not really outputting characters anymore but drawing them on the screen, so the actual math to strip out each digit is rather simple in comparison. Multiplication and division are also far more efficient on modern CPUs and with code generated by modern compilers, as the CPU often has instructions that will do a very wide multiply in a couple of cycles.
But back when BCD was big, computers were lucky to run a million instructions per second, with 500,000 instructions per second being pretty common, so using an efficient coding like BCD was a big deal. There are still places where BCD is probably more efficient, usually in embedded and small-processor applications, such as code that has to run on a ten-cent processor clocked at 32 kHz for some cost-critical application, or something driving a display, and so on. But in most cases it's pretty much not used anymore.