Thought I'd start a thread on this, as it seems:
1. There isn't all that much hands-on info out there (e.g. just a few threads on StackOverflow).
2. I've seen one example, which I haven't read in detail yet; it looks overly verbose, concentrating more on the abstraction than the mechanism. I'll read it in detail eventually. Also relevant, the Langdon paper:
https://www.cs.cmu.edu/~aarti/Class/10704/Intro_Arith_coding.pdf
3. The abstract concept is slick, and reasonably easy to understand. But I'm still absorbing it and thinking through how it works mechanically, in particular when it comes to multiples that aren't powers of 2 (i.e. when the rescaling (multiplication) step isn't a trivial bit shift).
4. I've written a Huffman-like encoder before. I don't have the code released anywhere, but I used it to store the small and bold fonts in this screenshot, which netted a modest 59% compression ratio -- rather less once the decoder is included, which is rather complex and also probably in need of optimization. So, I understand Huffman coding, and I also understand how it can be seen as (is equivalent to) a restricted case of arithmetic coding. Which almost implies that it should just drop right in...
I'm not sure yet how a base-2^N encoding is dramatically different from a Huffman encoding; I can see that, for very extreme probabilities, a given state can persist between bits, i.e., a rate of <1 bit/symbol is possible, or equivalently, a bit can encode repeated symbols. I'm not so sure how to encode/decode that... probably this becomes obvious once limited precision is solved. (A first stab at the mechanics is sketched below.)
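To make the mechanics concrete (and as that first stab at the limited-precision question), here's a minimal sketch of the integer interval update, after the classic Witten/Neal/Cleary scheme (CACM 1987). This is not my released Huffman code, and put_bit and the toy main are placeholders only:

```c
#include <stdint.h>
#include <stdio.h>

#define PRECISION 16
#define TOP      ((1uL << PRECISION) - 1)     /* 0xFFFF */
#define HALF     (1uL << (PRECISION - 1))     /* 0x8000 */
#define QUARTER  (1uL << (PRECISION - 2))     /* 0x4000 */

static uint32_t low = 0, high = TOP;
static unsigned pending = 0;            /* deferred "underflow" bits */

static void put_bit(int b)              /* placeholder: prints bits as text */
{
    putchar(b ? '1' : '0');
    while (pending) { putchar(b ? '0' : '1'); pending--; }
}

/* Narrow [low, high] to the symbol's slice [cum_lo, cum_hi) out of total.
   The multiply/divide here is the non-power-of-2 rescale in question.
   total must stay <= QUARTER so every symbol keeps a nonzero slice and
   range * cum_hi fits in 32 bits. */
static void encode(uint32_t cum_lo, uint32_t cum_hi, uint32_t total)
{
    uint32_t range = high - low + 1;
    high = low + range * cum_hi / total - 1;
    low  = low + range * cum_lo / total;

    for (;;) {                          /* renormalize */
        if (high < HALF) {
            put_bit(0);                 /* top bit settled as 0 */
        } else if (low >= HALF) {
            put_bit(1);                 /* top bit settled as 1 */
            low -= HALF; high -= HALF;
        } else if (low >= QUARTER && high < HALF + QUARTER) {
            pending++;                  /* straddles the midpoint: defer */
            low -= QUARTER; high -= QUARTER;
        } else {
            break;                      /* interval still wide; a skewed
                                           symbol can emit nothing at all,
                                           i.e. < 1 bit/symbol */
        }
        low  = 2 * low;                 /* zoom back in by a factor of 2 */
        high = 2 * high + 1;
    }
}

static void flush(void)                 /* pin down the final interval */
{
    pending++;
    put_bit(low >= QUARTER);
}

int main(void)
{
    /* toy model: symbol A has count 15, symbol B count 1 (total 16) */
    for (int i = 0; i < 20; i++)
        encode(0, 15, 16);              /* twenty copies of the likely A */
    flush();
    putchar('\n');
    return 0;
}
```

The toy main is the <1 bit/symbol case in action: twenty copies of a 15/16-probable symbol should come out to a handful of bits, because most symbols never trip the renormalization loop at all.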
FWIW, these are the statistics for Lucidia 8pt, code points 32-127, the graphics data I'm working with right now. Antialiased, 16 color palette, 0 = black, 15 = white. Understandably, white is massively more common than the rest, black has a modest showing, and the remainder are pretty evenly distributed. I've added the cumulative probability because it has something to do with arithmetic encoding, though I haven't solved exactly what yet either (my current guess follows the table).
Index  Value  Count  Probability (%)  Cum. Prob. (%)
    0      0    824            13.35            0
    1     23     92             1.49        13.35
    2     36    116             1.88        14.84
    3     54    110             1.78        16.72
    4     68     88             1.43        18.50
    5     85     73             1.18        19.93
    6    100     57             0.92        21.11
    7    119     53             0.86        22.03
    8    134     49             0.79        22.89
    9    151     76             1.23        23.68
   10    165     96             1.56        24.91
   11    183     73             1.18        26.47
   12    197    113             1.83        27.65
   13    213     93             1.51        29.48
   14    229    111             1.80        30.99
   15    255   4147            67.20        32.79

Sum of listed probabilities: 99.99 (rounding +/-0.01)
Indices 1-14: 1200 counts, 19.45% (avg. 1.39% ea.)
Total: 6171 counts, 100%
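If I'm reading the references right, that cumulative column is exactly what the coder consumes, just as integer counts instead of percentages: symbol s owns the slice [cum[s], cum[s+1]) out of cum[16] = 6171, and the decoder inverts that by finding which slice the scaled code value lands in. A sketch using the counts above (decode_symbol is my guess at the lookup, not tested):

```c
#include <stdint.h>

/* Cumulative counts for the Lucidia 8pt table above:
   cum[s] = sum of counts of indices below s; cum[16] = total = 6171.
   6171 < 2^14, so it fits the precision limit in the encoder sketch. */
static const uint16_t cum[17] = {
       0,  824,  916, 1032, 1142, 1230, 1303, 1360,
    1413, 1462, 1538, 1634, 1707, 1820, 1913, 2024, 6171
};

/* Encoding index s would then be encode(cum[s], cum[s + 1], cum[16]).
   Decoding scales the code value back into [0, total) and scans for
   the slice it fell in; a linear scan is fine for 16 symbols. */
static int decode_symbol(uint32_t scaled_value)
{
    int s = 0;
    while (cum[s + 1] <= scaled_value)
        s++;
    return s;
}
```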
https://www.seventransistorlabs.com/Images/LucidiaGuides.png
Here's some sample graphics, including red dots marking the start of each character box (it's a variable width font, after all). Note the 90 degree rotation mentioned below; it's also mirrored because I never flipped that bit on the ST7735, so I just encoded the font flipped instead. Whatever.

Part of my motivation behind this: the original (Huffman) encoder I wrote is column oriented. That works fine on the ST7735 display I was playing with at the time, which has write modes in all cardinal directions. But now I'm playing with something that only has row-major writes, so I can't really do that. So the half-assed, simplest port of the Huffman source draws the text at a 90 degree angle.

I suspect that arithmetic encoding can provide not only better compression overall, but better compression even for very short sequences (like single rows of a character). I think what I want is to encode the characters one row at a time, so that a string can be drawn one row at a time, selecting each segment of that row from the respective characters, before moving to the next row and translating/decoding, etc. (This saves the most display I/O -- setting the draw window is relatively expensive, and it can only be set in horizontal multiples of 8 pixels. So, I could also draw whole block characters at a time, but that's slow.) I can also add a higher level of compression, like eliminating redundant (identical) rows of characters (sketched below); that should help modestly for characters with lots of verticals (1, l, I, etc.), but I'll have to run some tests to see how much it does overall; it might not actually be worthwhile.
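For that redundant-row idea, the cheapest version I can think of is a one-bit "same as the row above" flag per row, ahead of the entropy-coded data. Hypothetical names throughout, and the flag could itself be arithmetic-coded, but as a sketch:

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define ROW_BYTES 4   /* e.g. 8 pixels of packed 4-bit data per glyph row */

/* Stubs standing in for the real bit sink / entropy coder. */
static void put_flag(bool b)  { printf("flag %d\n", b); }
static void encode_row(const uint8_t *row, int n)
{
    (void)row;
    printf("encode %d bytes\n", n);
}

/* One-bit "repeat previous row" pre-pass before the entropy coder. */
static void encode_glyph(const uint8_t rows[][ROW_BYTES], int nrows)
{
    for (int y = 0; y < nrows; y++) {
        bool repeat = y > 0 &&
                      memcmp(rows[y], rows[y - 1], ROW_BYTES) == 0;
        put_flag(repeat);
        if (!repeat)
            encode_row(rows[y], ROW_BYTES);
    }
}

int main(void)
{
    /* toy glyph: a vertical stroke, rows 0 and 1 identical */
    const uint8_t g[3][ROW_BYTES] = {
        { 0xFF, 0x0F, 0xFF, 0xFF },
        { 0xFF, 0x0F, 0xFF, 0xFF },
        { 0xFF, 0xFF, 0xFF, 0xFF },
    };
    encode_glyph(g, 3);
    return 0;
}
```

Whether the flag bit pays for itself depends on how often rows actually repeat across a whole font; that's the test to run.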
- Why not just use X?
I'll get to zlib and friends sooner or later. The compression rate is probably much better, too. And the method is pretty much what I just hinted at, i.e. referencing a dictionary of variously sized blocks. But as long as I don't know arithmetic coding, I'd like to address that gap first, and having a motivating application is the best way.

- Why not buffer the graphics before writing?
AVR platform, teeny RAM. Graphics are 24 bit. I can't do worse than packed 4-bit bitmaps -- which is what I'd be using... if I hadn't already written a compression routine. But I did, so obviously I want even more!
An aside, the memory allocation on this project is ludicrous... MCU has 16k SRAM and 256k Flash, video controller has dedicated (note, no bus read access) 16MB SDRAM, and a 4GB SD card (or more). Just personal yak shaving, no need for sensible design.

Tim