Author Topic: Best library/algorithm for low RAM embedded data compression?  (Read 2889 times)

0 Members and 1 Guest are viewing this topic.

Offline Martin F

  • Regular Contributor
  • *
  • Posts: 128
  • Country: dk
Best library/algorithm for low RAM embedded data compression?
« on: October 01, 2019, 06:08:52 pm »
Hi all,

We're looking to implement streamed data compression on our data logger.

The data logger records vehicle data at very high frequency (generating 1 MB/minute of time series data), which is stored on the internal SD card. Data includes raw CAN bus form of time series data like vehicle speed, RPM, etc. We log this data in a binary format - see a sample here: https://uploadfiles.io/khxs13bt

Currently we're looking to implement embedded compression on our device - and we're trying to find the best suitable libraries/starting points.

Algorithm requirements (UPDATED):
- Implemented on ARM Cortex-M7 running 300 MHz
- Algorithm cannot use heap (malloc)
- Support for fast shutdown, max ~50 ms (likely not working if large output / working buffers are used)
- Flash usage <100kb (the flash could potentially store a static dictionary)
- Ram usage <30kb
- Target compression rate > 40% on test data file
- System is currently running at 30% idle time. Target idle time with compression >20%
- Entire file zipped
- Lossless compression

We're particularly interested in suggestions for:
- Suitable compression libraries (we've e.g. looked at LZ4)
- Good articles/literature on e.g. benchmarks of different libraries or new promising techniques/algorithms
- Specific compression concepts/methods that could be particularly relevant for this type of embedded compression
- Experts e.g. from academia that could potentially provide a bit of sparring
- Libraries & concepts related to the use of dictionaries

Your inputs would be highly appreciated - thanks!

Martin
« Last Edit: October 02, 2019, 07:55:39 am by Martin F »
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 2182
  • Country: fi
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #1 on: October 01, 2019, 07:20:37 pm »
Consider zlib, very well known, very widely used. Open source, tools available. I think you can get the memory footprint somewhere around 20-30kB with compile-time parameters limiting the search window. You can stream it, you need a buffer to hold chunks, but you can decide the chunk size yourself.

You can approximate the zlib-like generic data compression ratio by simply zipping your current dataset to give a first order approximation (zlib with reduced memory footprint and reduced compression level setting (to save CPU) will perform somewhat worse, but the ballpark will be the same). If your data contains a lot of sensor noise, your expectation of 30-40% compression is valid. If your data has a lot of repeat, similar values, only small amount of noise, you can easily have compression ratios well over 90%. Test it.

What about worst case? Can you accept the fact that sometimes your compression ratio sucks, or even slightly increases the size? Or are you just looking to maximize the service time between SD card readouts, so only average matters and isn't critical.

 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 4225
  • Country: fr
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #2 on: October 01, 2019, 07:28:50 pm »
Consider zlib, very well known, very widely used. Open source, tools available. I think you can get the memory footprint somewhere around 20-30kB with compile-time parameters limiting the search window.

Are you sure? You may be right, but remembering having to use zlib a couple years ago, and deal with its source code, my first impression would be that I'd really doubt that... or maybe it needs to be seriously stripped down and configured.

 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 18554
  • Country: nl
    • NCT Developments
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #3 on: October 01, 2019, 07:30:43 pm »
What I have done in the past for vehicle data (albeit at a much lower rate but for long periods) is to store data as text in an XML-ish format and only fill a field when it has changed. Otherwise it is a comma. This way I got to much less data compared to binary data. I don't recall the compression rate exactly but I is significant (like 50% IIRC). Because you only store numbers you could compress the ASCII into 5 bits instead of 8.
« Last Edit: October 01, 2019, 07:33:18 pm by nctnico »
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 4225
  • Country: fr
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #4 on: October 01, 2019, 07:46:49 pm »
Well, for text data, using just a simple Huffman algorithm (very lightweight) can get you something in the 30% to 50% on average. Takes almost nothing both in code and data. Could be enough. Now if it's pure binary data, that will really depend on the content.

Of course, for logged data, a simple approach can be to leverage the fact that successive measurements may be very close to one another? So you could devise your own compression based on this. Data loggers that log physical measurements rarely yield completely random data...
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 2182
  • Country: fi
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #5 on: October 01, 2019, 08:02:03 pm »
Consider zlib, very well known, very widely used. Open source, tools available. I think you can get the memory footprint somewhere around 20-30kB with compile-time parameters limiting the search window.

Are you sure? You may be right, but remembering having to use zlib a couple years ago, and deal with its source code, my first impression would be that I'd really doubt that... or maybe it needs to be seriously stripped down and configured.

I'm referencing to this:

Quote
The memory requirements for compression depend on two parameters, windowBits and memLevel:

    deflate memory usage (bytes) = (1 << (windowBits+2)) + (1 << (memLevel+9))

For the default values of 15 and 8, respectively, this is 256 KB. Both windowBits and memLevel can be set to lower values at compile time via the MAX_WBITS and MAX_MEM_LEVEL macros, but only at a cost in compression efficiency.

The memory requirements for decompression depend only on windowBits, but this is, in a sense, a harsher limitation: whereas data streams compressed with a smaller window will merely be a bit larger than they would have otherwise, a reduced window size for decompression means that streams compressed with larger windows cannot be decompressed at all. Having said that:

    inflate memory usage (bytes) = (1 << windowBits) + 1440*2*sizeof(int)
( https://zlib.net/zlib_tech.html )

Assuming decompression isn't needed on-board, 20-30kB of RAM should be possible. I have no idea how detrimental setting windowBits=8 and memLevel=5, for example, would be for compression ratio. It's worth trying. These parameters can be adjusted runtime, example code to test it on a workstation is approx. 20 lines of code.

Zlib is one of the easiest libraries I have ever seen. First ever implementation from zero to a working solution was less than two hours. This is why it may be worth trying before going into custom.

The point is, if the data is raw CAN frames, it's going to contain the same repeating IDs, likely a lot of always-zero MSbs, flag fields and whatnot, and it's going to compress really well using a generic algorithm (like zlib), even with limited search window and limited chunk size (since the CAN frames are small, and a vehicle CAN system won't have data from gazillion different sources, only a dozen max). Any custom solution would be either reimplementing some classical generic compression, or adding a lot of runtime interpretation and parsing of the data to store it more efficiently.

If you do parse all of the data anyway, then reorganizing it might be the best bet.
« Last Edit: October 01, 2019, 08:09:10 pm by Siwastaja »
 

Offline Martin F

  • Regular Contributor
  • *
  • Posts: 128
  • Country: dk
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #6 on: October 01, 2019, 08:13:35 pm »
Hi again,

Thanks a lot for all the great inputs already! One challenge in our setup is that we're unable to use heap and hence malloc.
I think zlib uses heap from what we've gathered, but let me know if I'm wrong on this account.

Thanks again!
« Last Edit: October 01, 2019, 08:17:36 pm by Martin F »
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 18554
  • Country: nl
    • NCT Developments
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #7 on: October 01, 2019, 08:20:44 pm »
Well, for text data, using just a simple Huffman algorithm (very lightweight) can get you something in the 30% to 50% on average. Takes almost nothing both in code and data. Could be enough. Now if it's pure binary data, that will really depend on the content.
I'm not talking about compressing text. I'm talking about using text as a way to reduce the amount of data. In binary a 4 byte number will always take 4 bytes. Even if it doesn't change then it depends entirely on the data surrounding it whether it can be compressed or not. Say you have a binary record with 6 4 byte numbers (24 bytes in total) The text format I used works as follows:
Code: [Select]
1234;56473;587596;226;5859;492|
;;;;;|
1236;56472;;;;|
The first record is complete. In the second record nothing has changed, in the third record only the first 2 fields changed. In total 52 characters are used instead of 72 bytes. Shove the 52 characters into 4 bits (0..9 and 2 extra characters as field seperators and record seperators) and you need 26 bytes which is more than a 60% compression ratio. Like any compression algorithm the actual compression will depend on the data but if you need something with a small footprint and low processing overhead then this is a simple & effective way.

@Martin F: I think an existing compression library will be hard to use on a microcontroller since you'll need lots of memory (both flash and RAM) and a heap. I have been down that road trying to compress FPGA images.
« Last Edit: October 01, 2019, 08:24:08 pm by nctnico »
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 2182
  • Country: fi
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #8 on: October 01, 2019, 08:36:46 pm »
Well, for text data, using just a simple Huffman algorithm (very lightweight) can get you something in the 30% to 50% on average. Takes almost nothing both in code and data. Could be enough. Now if it's pure binary data, that will really depend on the content.
I'm not talking about compressing text. I'm talking about using text as a way to reduce the amount of data. In binary a 4 byte number will always take 4 bytes. Even if it doesn't change then it depends entirely on the data surrounding it whether it can be compressed or not. Say you have a binary record with 6 4 byte numbers (24 bytes in total) The text format I used works as follows:
Code: [Select]
1234;56473;587596;226;5859;492|
;;;;;|
1236;56472;;;;|
The first record is complete. In the second record nothing has changed, in the third record only the first 2 fields changed. In total 52 characters are used instead of 72 bytes. Shove the 52 characters into 4 bits (0..9 and 2 extra characters as field seperators and record seperators) and you need 26 bytes which is more than a 60% compression ratio. Like any compression algorithm the actual compression will depend on the data but if you need something with a small footprint and low processing overhead then this is a simple & effective way.

@Martin F: I think an existing compression library will be hard to use on a microcontroller since you'll need lots of memory (both flash and RAM) and a heap. I have been down that road trying to compress FPGA images.

You are taking advantage of a peculiar feature of your dataset of it mostly containing small numbers, but having some odd large outliers. Sure, in that case, the extra overhead of using ASCII may be less than the compression you achieve.

You can achieve the same idea in many other ways, without using ASCII, in binary, and then the result will be even smaller. For example, this:
If the value is < 65535, write it out in 16 bits (2 bytes).
If the value is >= 65535, first write 65535 (an identifier that 4 more bytes will follow), then all the 4 bytes.

As a result, a typical "median" case will take 2 bytes (instead of 5 as in your ASCII solution), and the worst case takes 5 bytes (as opposed to 11 bytes in your ASCII version).

Your first 6-item dataset is 24 bytes in full 32-bit numbers, 30 bytes in your ASCII representation (it has grown, not shrunk as you seem to think), and 15 bytes in my proposed encoding.

Your data size reduction comes from your "not changed" compression. Binary solution for the same is obviously even smaller.

Your idea of applying temporal compression by only saving changed fields is sound and widely used, but I struggle to understand how you somehow attribute it to using ASCII, and especially XML-ism (which your encoding obviously doesn't have, thank god) ???

Do note that in presence of noisy sensor readings, such simple scheme is only useful if it's about some sensors sampling more often than others, and your logging scheme forcing you to do equitemporal writes. In reality, a logger which can list what values are present and what are not (instead of always comma-separating everything) would likely be even smaller. Especially if you could fit the data ID into 8 bits.

Usually in vehicle data logging, I'd guesstimate, the sensor measurements fall within a much smaller range, so that 32-bit values are not needed. And here lies the clue to the idea of rearranging: if you parse the data anyway, you may be able to fit RPM, for example, to 14 bits. If you happen to have a 8-bit wide flag field which only contains 1 or 2 bits, you can just put them to the always-zero MSbs of the 16-bit RPM value. But if you want to just dump raw CAN data stream to the card, having to parse everything to rearrange it may be an unnecessary burden that even a simple generic compression algorithm would solve for you without much custom work.
« Last Edit: October 01, 2019, 08:49:03 pm by Siwastaja »
 

Offline Martin F

  • Regular Contributor
  • *
  • Posts: 128
  • Country: dk
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #9 on: October 01, 2019, 08:42:57 pm »
One extra note: We do recognize that an existing library may be a challenge due to us being unable to use heap and due to the 20-30 kB RAM limit — as such were also interested in basic algorithms as some have suggested.

Ie if there’s a base algorithm that would make sense for our type of data and setup, as well as perhaps some suggestions for concepts to combine with this in order to achieve what we’re looking for - then we’d probably be able to code this up from scratch in C in relatively short time.

In particular it would be interesting for us if there’s some frontier techniques and algorithms that might be particularly suited for this type of application. We’ve tried to look to academia as well for inputs, though it’s also a jungle and difficult to separate the good stuff from the rest.
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 18554
  • Country: nl
    • NCT Developments
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #10 on: October 01, 2019, 08:54:20 pm »
Do note that in presence of noisy sensor readings, such simple scheme is only useful if it's about some sensors sampling more often than others
Not really. Data from noisy sensors won't compress either way. After all you can't compress noise. What all compression algortihms have in common is that you assign the shortest 'code' to the most occuring data pattern. However in some cases using an algorithm tailored to the data is more effective than a standard compression algorithm. A good example is Xilinx configuration data. If you pull this through a generic compression algorithm you won't get much compression. Use an algorithm which is optimised and you get much better results; even if the algorithm in itself isn't very complicated. Data from sensors is a similar situation. You can make use of the fact that the data consists of rows. A generic compression algorithm won't really care about this and just sees it as a blob of random data.
« Last Edit: October 01, 2019, 08:59:51 pm by nctnico »
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 2182
  • Country: fi
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #11 on: October 01, 2019, 08:59:15 pm »
Do note that in presence of noisy sensor readings, such simple scheme is only useful if it's about some sensors sampling more often than others
Not really. Data from noisy sensors won't compress either way. After all you can't compress noise.

This is stupid. Data like:
1000
1001
999
1005

compresses fairly well. How well, it depends on the algorithm, but unless it totally sucks, it will compress significantly.

But it doesn't compress through a simple "is changed?" algorithm.

You are right if your sensor only gives 100% white noise. That can't be compressed.

 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 18554
  • Country: nl
    • NCT Developments
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #12 on: October 01, 2019, 09:02:01 pm »
Do note that in presence of noisy sensor readings, such simple scheme is only useful if it's about some sensors sampling more often than others
Not really. Data from noisy sensors won't compress either way. After all you can't compress noise.

This is stupid. Data like:
1000
1001
999
1005

compresses fairly well. How well, it depends on the algorithm, but unless it totally sucks, it will compress significantly.
Try it and you'll see it isn't as easy to come up with a generic algorithm. I have taken some courses on compression algorithms BTW.

The first step is to analyse the data and see how it can be compressed.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 2182
  • Country: fi
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #13 on: October 01, 2019, 09:03:19 pm »
Code: [Select]
hrst@jorma ~/Desktop $ xxd -p 7f34b296_00000009_00000008.mf4 | grep "ff" -o | wc -l
1049531
hrst@jorma ~/Desktop $ xxd -p 7f34b296_00000009_00000008.mf4 | grep "00" -o | wc -l
2518089

Taking a quick look at the data, it seems zero and 255 bytes are taking approx. 40% of your data.
 

Offline Martin F

  • Regular Contributor
  • *
  • Posts: 128
  • Country: dk
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #14 on: October 01, 2019, 09:06:38 pm »
That’s correct, 00 and FF are very common in the data so there might be some option to utilize that in the compression
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 2182
  • Country: fi
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #15 on: October 01, 2019, 09:13:34 pm »
What all compression algortihms have in common is that you assign the shortest 'code' to the most occuring data pattern. However in some cases using an algorithm tailored to the data is more effective than a standard compression algorithm. A good example is Xilinx configuration data. If you pull this through a generic compression algorithm you won't get much compression. Use an algorithm which is optimised and you get much better results; even if the algorithm in itself isn't very complicated.

Can confirm this is spot on, and I recommend looking at the data if it is easily available. (By available, I mean, you parse the CAN packets on that MCU, and read the data in meaningful variables like rpm, speed...)

OTOH, if you need just to log all the CAN traffic and cater for future messages you don't know about, parsing and understanding everything may be too tedious. If you need to look at all the data anyway, then by all means compress it in a specifically tailored way. Simple restructuring (in binary, not in ASCII nor XML) without actual compression may already solve the case. By restructuring, I mean inefficiently chosen bit widths and data fields that can be combined or even stripped away if you can be sure they will be of no interest.

This dataset has a header from some sort of CAN logging device, so I'm kinda assuming they are after "raw" logging. If the data contains full CAN packet headers, and timestamps with massive resolution, the obvious thing to do is to remove most of the header and reduce the number of timestamp bits, even allowing it to wrap around (you can count the number of wrap arounds during decode, assuming there are no long silent periods).
« Last Edit: October 01, 2019, 09:17:47 pm by Siwastaja »
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 6909
  • Country: us
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #16 on: October 01, 2019, 09:41:05 pm »
Have you thought about run length encoding?  If the data doesn't change, it isn't written.  When it is written, there is a length count ahead of the actual data value.
 
The following users thanked this post: Kilrah

Offline jhpadjustable

  • Frequent Contributor
  • **
  • Posts: 295
  • Country: us
  • Salt 'n' pepper beard
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #17 on: October 01, 2019, 09:42:21 pm »
One extra note: We do recognize that an existing library may be a challenge due to us being unable to use heap and due to the 20-30 kB RAM limit — as such were also interested in basic algorithms as some have suggested.
You can verify by visual inspection that zlib only uses dynamic memory allocation when building and tearing down the state object and buffers at the start and end of the (in|de)flate functions, does not allocate more while the (in|de)flation is in progress, and does not reallocate at all. You could modify the code to replace the dynamic allocations with static buffers quite easily if your coding standards so require. It's extremely permissively licensed (public-domain) so nobody needs to see what you did to it. ;)

The zlib deflate algorithm is documented in an IETF RFC if you still prefer not to use zlib code.
"There are more things in heaven and earth, Arduino, than are dreamt of in your philosophy."
 

Online Mechatrommer

  • Super Contributor
  • ***
  • Posts: 9391
  • Country: my
  • reassessing directives...
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #18 on: October 01, 2019, 10:12:52 pm »
best? i dont know but this is another https://liblzg.bitsnbites.eu/ i think to know whats best, you  have to try few schemes, including your own custom made specific to your data pattern. dictionary based is the famous general purpose to give better average compression rate but the code is not so simple. to know how good is the scheme, you may try to compare with well known application such as winzip, your 8MB data compressed to 2MB using winzip here, so that can be used as benchmark. the 8MB compressed to about 5MB using my super crappy simple (FIN) compressor 37% rate so within your spec range ;D iirc its 256 bytes sliding table, you may improve by increasing the table size... http://www.darkridge.com/~jpr5/mirror/alg/node175.html but sometime "best" is not dictated by compression rate alone, some other aspect such as program or memory size required, time needed to do the compression, is data recoverable even one byte corrupted? etc... ymmv.
if something can select, how cant it be intelligent? if something is intelligent, how cant it exist?
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 1333
  • Country: us
  • Formerly SiFive, Samsung R&D
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #19 on: October 01, 2019, 10:44:49 pm »
lz4, which you say you've already looked at, is still the best low CPU low memory use compressor.  You're not going to come close to it using "a simple huffman" compressor. zlib can get a  little better compression but at the cost of a LOT of CPU and memory.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 1333
  • Country: us
  • Formerly SiFive, Samsung R&D
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #20 on: October 01, 2019, 11:06:54 pm »
lz4, which you say you've already looked at, is still the best low CPU low memory use compressor.  You're not going to come close to it using "a simple huffman" compressor. zlib can get a  little better compression but at the cost of a LOT of CPU and memory.

lz4 -1 reduced your 8660615 byte file to 4004159 bytes in 0.025s.
lz4 -9 reduced your 8660615 byte file to 3047573 bytes in 0.057s.
gzip reduced your 8660615 byte file to 2839987 bytes in 0.328s.
gzip -9 reduced your 8660615 byte file to 2661840 bytes in 2.454s.
bzip2 reduced your 8660615 byte file to 2465104 bytes in 0.556s.

lz4 can reduce your file size by 50% to 60% very quickly. Yes, other algorithms can get an extra 7% to 12% compression over lz4 -9 or 20% over lz4 -1, but at the cost of a lot of time and memory.

I didn't experiment with the lz4 library, but you can decide how much RAM you want it to use. Using 20-30 kb is not a problem. Using less memory means you might pass up some opportunity to reuse a string from further back in the history, but I don't think it will be a big effect with your data.

 

Offline Martin F

  • Regular Contributor
  • *
  • Posts: 128
  • Country: dk
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #21 on: October 02, 2019, 07:14:05 am »
Hi again,

LZ4 could also be an option. We've also looked at e.g. Zstandard and c-blosc2, but we find it a bit difficult to identify the best option.

Note also that if we use a rolling dictionary, we're limited to the 20-30 kb RAM usage. However, an alternative might be to use a static dictionary, which we would then train based on data samples. The advantage of a static dictionary would be that we could utilize the device flash, in which we'd be able to allocate 100-150 kb - i.e. a significantly larger dictionary could be used.

If we go down the static dictionary route, do you have recommendations for what would be the best libraries for this - would it still be e.g. LZ4/Zstandard or similar, or something else? Also, do you know of suggested tweaks/optimizations to optimally utilize e.g. a 100 kb static dictionary on an embedded device for real-time streamed compression purposes?

Thanks again,
Martin
 

Online hamster_nz

  • Super Contributor
  • ***
  • Posts: 2196
  • Country: nz
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #22 on: October 02, 2019, 07:27:15 am »
Generic compression gets generic results. Data-aware compression can usually transform the raw data into something better,

For example, GPS location data compresses better if you convert it into a stream of an occasional reference points and lots of deltas.

When the GPS receiver is moving the deltas are more likely to be the same (and compress better) than a stream of constantly shifting lat-long-alt breadcrumbs.
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Online Mechatrommer

  • Super Contributor
  • ***
  • Posts: 9391
  • Country: my
  • reassessing directives...
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #23 on: October 02, 2019, 09:33:28 am »
If we go down the static dictionary route, do you have recommendations for what would be the best libraries for this - would it still be e.g. LZ4/Zstandard or similar, or something else? Also, do you know of suggested tweaks/optimizations to optimally utilize e.g. a 100 kb static dictionary on an embedded device for real-time streamed compression purposes?
those dynamically changing dictionary based algorithms work hard on producing optimal code for the continually evolving dictionary/table from previous input data in real time, sort of dynamic huffman code generation, but will not produce optimal code with drastically changing data pattern, unless you give it pretty big memory to store. furthermore you cant rebuild the same table during decode if the encoded data bits changed even the slightest, the decoded data afterward will be beyond recognition. if you can build static table/library, that will be a lot better you wont need complex algorithm such as LZx or what, you only need the precalculated optimal huffman code for your static library, so you'll have a static huffman code beforehand, and then all is needed is 1 to 1 map on your encoded output, and 1 to 1 remap back (reverse) during decode, pretty fast O(1) operation, no one algorithm can beat. and the corrupted encoded data can still be reconstructed to some degree on receiver end. the problem is when you change revision or upgrade to different data pattern, and needing different static table/dictionary to avoid suboptimal compression rate, decoder need to be aware of which revision the data is using. still you can have the fastest encoder, but slightly bloated decoder. but since system usually put much effort on encoder efficiency, so decoder is not much of a problem. ymmv.
« Last Edit: October 02, 2019, 09:37:03 am by Mechatrommer »
if something can select, how cant it be intelligent? if something is intelligent, how cant it exist?
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 4225
  • Country: fr
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #24 on: October 02, 2019, 12:10:43 pm »
Just a note about sensor noise. Whether it really hinders compression depends on the compression scheme and, of course, on the bandwidth of the noise. Wideband noise is kind of a worst case - it's getting close to completely random data, but you rarely get that from sensors - but a small amount of noise, which you'll often encounter, is not a big problem if you use some kind of compression scheme that only encode value differences. The key will just be the noise amplitude, and not the fact there is noise or not. Of course some schemes like RLE won't work well for noisy data, even if the noise amplitude is low, but other schemes, like just encoding differences, do. In the vein of ADPCM and the likes.

You can devise a compression scheme that combines several methods of course.

Finally, if you can get either zlib or LZ4 to yield >40% compression, and fit within 30KB RAM and 100KB code (including possible predefined tables), That'll be interesting. Don't hesitate to post figures if you can. For RAM usage, it seems possible from what has been noted above, but I still wonder whether this is not optimistic as to the real total RAM usage, including all local variables etc... but the thing I wonder about the most is code size. So, it'll be interesting to see.
 

Offline ebclr

  • Super Contributor
  • ***
  • Posts: 1997
  • Country: 00
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 3124
  • Country: us
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #26 on: October 02, 2019, 09:01:33 pm »
IIRC, the "original" LZW compression algorithms were designed to be implemented IN HARDWARE, using relatively small fixed memories (of the sort that you could find ~1983)  They use a compression table of 2n bytes to produce an n-bit output symbol, and utilities like "compress" let you specify "n", directly controlling the memory used.  "compress -b14" (should use 16k of memory for its table) yields about 44% compression on your sample.
(and it should be expandable using standard "uncompress.")
I'm not sure how fast it is; compression is typically much slower than decompression.

That would perhaps be "good enough", and it should be trivial to play with...

You probably won't find an existing unix-y library that doesn't use dynamic memory allocation.  Some should be trivial to convert to static arrays.

Beware licensing issues.  The story is that L,Z, and W patented what they thought was a hardware technique applicable to disk drives and such, and were merely amused when someone implemented the "compress" software utility.  Entities started getting less amused when the software algorithm started popping up all over, in devices like modems (which looked a lot like hardware...)  Implementing V.42bis (the compression used in last-gen dialup modems, based on LZW) and selling it required separate licenses from three different companies.  Not horribly expensive licenses by corporate standards (~$15k one-time?  Each or all together, I don't remember.)  But annoying, and maybe beyond the reach of smaller companies.
 

Offline I wanted a rude username

  • Frequent Contributor
  • **
  • Posts: 296
  • Country: au
  • ... but this username is also acceptable.
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #27 on: October 02, 2019, 10:24:08 pm »
lz4, which you say you've already looked at, is still the best low CPU low memory use compressor.  You're not going to come close to it using "a simple huffman" compressor. zlib can get a  little better compression but at the cost of a LOT of CPU and memory.

You're absolutely right for OP's application, but there's at least one use case in which Huffman is a solid choice: packing plaintext strings into a low-end microcontroller.

Huffman and gzip -1 both provided 53% compression on a corpus of 2 KiB of all-lowercase strings (and Huffman uses practically no RAM). Otherwise I would have used the lz4.org implementation.
 

Online hamster_nz

  • Super Contributor
  • ***
  • Posts: 2196
  • Country: nz
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #28 on: October 02, 2019, 10:35:59 pm »
So the bulk of the data appears to be signed data in 36-byte repeating structure:
Code: [Select]
c4 74 7f a2 42 47 ff ef 98 08 08 18 54 02 00 00 00 00 00 02 08 00 00 00 f2 20 18 18 00 00 ff ff 01 00 40 f0
 cd 74 7f a2 42 47 6e fe 88 08 08 24 54 02 00 00 00 00 00 02 08 00 00 00 ff ff ff ff 05 02 ff ff 01 00 60 91
 d5 74 7f a2 42 07 ff ef 98 08 08 30 54 02 00 00 00 00 00 02 08 00 00 00 f0 0d ff ff ff ff ff ff 01 00 20 b9
 de 74 7f a2 42 03 fe ff 8c 08 08 3c 54 02 00 00 00 00 00 02 08 00 00 00 8e 7c af cc d0 05 ff e5 01 00 e0 e0
 e7 74 7f a2 42 19 f8 ff 98 08 08 48 54 02 00 00 00 00 00 02 08 00 00 00 05 21 1a 3c 00 00 00 00 01 00 00 82
 ef 74 7f a2 42 26 31 e6 9c 08 08 54 54 02 00 00 00 00 00 02 08 00 00 00 a8 ea 03 00 00 00 00 00 01 00 60 30
 fa 74 7f a2 42 19 f8 ff 98 08 08 60 54 02 00 00 00 00 00 02 08 00 00 00 08 ff ff ff ff ff ff ff 01 00 00 8b
 2c 75 7f a2 42 c6 fe ff 8c 08 08 6c 54 02 00 00 00 00 00 02 08 00 00 00 ea 54 0a f5 09 00 ff ff 01 00 c0 b2
 35 75 7f a2 42 c6 fb ff 8c 08 08 78 54 02 00 00 00 00 00 02 08 00 00 00 f2 3f ff 00 00 00 00 00 01 00 80 da
 3e 75 7f a2 42 c6 fb ff 8c 08 08 84 54 02 00 00 00 00 00 02 08 00 00 00 f2 4f ff 00 00 00 00 00 01 00 00 5f
 5d 75 7f a2 42 00 03 f0 8c 08 08 90 54 02 00 00 00 00 00 02 08 00 00 00 ff fe 0c ff ff ff ff ff 01 00 c0 86
 66 75 7f a2 42 00 ff ef 8c 08 08 9c 54 02 00 00 00 00 00 02 08 00 00 00 64 06 55 7d ff ff ff ff 01 00 e0 27
 6e 75 7f a2 42 00 04 f0 8c 08 08 a8 54 02 00 00 00 00 00 02 08 00 00 00 f1 ff 88 e4 1c ff ff ff 01 00 a0 4f
 77 75 7f a2 42 00 23 f0 98 08 08 b4 54 02 00 00 00 00 00 02 08 00 00 00 00 00 f2 ff ff 72 ff ff 01 00 c0 cd
 ec 75 7f a2 42 06 00 ef 84 08 08 c0 54 02 00 00 00 00 00 02 08 00 00 00 64 15 17 f1 5a 2e 77 0e 01 00 80 f5
 f5 75 7f a2 42 47 f1 fe 98 08 08 cc 54 02 00 00 00 00 00 02 08 00 00 00 ff 05 02 ff ff ff ff ff 01 00 a0 96
 fd 75 7f a2 42 03 ff ef 8c 08 08 d8 54 02 00 00 00 00 00 02 08 00 00 00 6d 1a 00 10 e0 1c f2 03 01 00 60 be
 06 76 7f a2 42 31 ff ff 8c 08 08 e4 54 02 00 00 00 00 00 02 08 00 00 00 5c f1 ff 1e 00 00 ff fc 01 00 80 5f
 0e 76 7f a2 42 03 ff ef 8c 08 08 f0 54 02 00 00 00 00 00 02 08 00 00 00 6b 39 f9 08 dc 03 8d 2c 01 00 40 87
 17 76 7f a2 42 31 ff ff 8c 08 08 fc 54 02 00 00 00 00 00 02 08 00 00 00 5e 0d fc ff f3 4f f0 0f 01 00 60 28
 1f 76 7f a2 42 47 fc fe 98 08 08 08 55 02 00 00 00 00 00 02 08 00 00 00 ff 98 ff ff ff ff ff ff 01 00 c0 d6
 29 76 7f a2 42 05 22 f0 8c 08 08 14 55 02 00 00 00 00 00 02 08 00 00 00 31 02 ff ff ff ff ff 20 01 00 80 fe
 32 76 7f a2 42 05 fe ff 8c 08 08 20 55 02 00 00 00 00 00 02 08 00 00 00 8b ff ff ff ff ff fa ff 01 00 40 26
The data has two parts, text and binary.

The binary is flicking between FFs and 00s are caused by values that are slightly positive and slightly negative.

By biasing these values by 0x80000000 before you log them (or whatever size the field value is) you could make even more of the data zeros, aiding compression.

The real question is, if I do write spend time writing a data-aware heapless compression system for you, what is the reward?  ;D

If you want to play at home. run the datafile (renamed to "to_compress.dat") through and watch the patterns:

Code: [Select]
#include <stdio.h>

int main(int argc, char *argv[]) {
   int i = 0;
   int c;
   FILE *f = fopen("to_compress.dat","rb");

   c = getc(f);
   while(c != EOF) {
      printf(" %02x", c);
      if(i == 35) {
         putchar('\n');
         i = 0;
      } else {
         i++;
      }
      c = getc(f);
   }
}
« Last Edit: October 02, 2019, 10:38:29 pm by hamster_nz »
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 4225
  • Country: fr
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #29 on: October 02, 2019, 11:27:12 pm »
Obvious note: we talked a lot about footprint, but not much about execution time. If you have time constraints, some algorithms could simply be out of the question. That would certainly be something to check almost first. (And yes, I also think, as I hinted earlier, that a custom compression could be the answer here...)
 

Online hamster_nz

  • Super Contributor
  • ***
  • Posts: 2196
  • Country: nz
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #30 on: October 03, 2019, 12:06:07 am »
So...

Most common bytes:
Code: [Select]
  83785  fe
  94219  f0
 174505  8c
 246123  42
 248959  a2
 266717  01
 272894  02
 734713  08
1035297  ff
2449890  00

XORing with the character 36 characters ago produces mostly zeros:

Code: [Select]
17 fd 10 e3 00 00 c0 28 1b 00 00 00 00 00 00 00 00 00 00 0c 00 00 00 00 00 00 00 00 00 00 00 00 02 71 fb 08
28 f3 10 e3 00 00 40 f2 2e 00 00 00 00 05 fb 1f 00 00 00 14 00 00 00 00 00 00 00 00 00 00 00 00 9f df 85 0a
dd 0e 00 00 00 00 60 86 4b 00 00 00 00 8c fa 0f 00 00 00 7c 00 00 00 00 00 00 00 00 00 00 00 00 7d 05 14 f1
e6 00 00 b0 00 00 40 38 0f 00 00 00 00 00 01 00 00 00 00 14 00 00 00 00 00 00 00 00 00 00 00 00 d9 87 ee 86
86 82 00 b0 00 00 c0 e9 fa 00 00 00 00 00 00 00 00 00 00 0c 00 00 00 00 00 00 00 00 00 00 00 00 c2 82 82 00
82 82 00 82 00 00 e0 e0 50 00 00 00 00 cb b6 01 00 00 00 34 00 00 00 00 00 00 00 00 00 00 00 00 68 00 00 82
00 00 00 82 00 00 60 56 37 00 00 00 00 00 27 00 04 00 00 1c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
fa fd 00 00 00 00 40 78 0b 00 00 00 00 44 90 01 04 00 00 f4 01 00 00 00 00 00 00 00 00 00 00 00 71 83 50 33
d5 07 00 1b 00 00 a0 52 1f 00 00 00 00 2b 00 00 10 00 00 0c 00 00 00 00 00 00 00 00 00 00 00 00 29 2c ae cc
2f fa 00 1b 00 00 e0 9d c2 03 00 00 00 ee 00 00 10 00 00 14 00 00 00 00 00 00 00 00 00 00 00 00 4d 05 0b f8
f6 fb 00 00 00 00 60 c4 73 00 00 00 00 36 01 10 14 00 00 3c 00 00 00 00 00 00 00 00 00 00 00 00 1a 5c f5 07
f9 fb 00 00 00 00 c0 b3 22 00 00 00 00 f0 fc 1f 14 00 00 14 00 00 00 00 00 00 00 00 00 00 00 00 0f f7 f3 00
0f 00 00 00 00 00 20 61 08 00 00 00 00 00 fc 1f 00 00 00 0c 00 00 00 00 00 00 00 00 00 00 00 00 9b f8 46 82
00 00 00 00 00 00 40 38 19 00 00 00 00 00 fb 1f 00 00 00 74 00 00 00 00 00 00 00 00 00 00 00 00 95 f9 cd 75
e2 00 00 00 00 00 c0 68 f7 00 00 00 00 23 fb 0f 14 00 00 1c 00 00 00 00 00 00 00 00 00 00 00 00 d6 fe 7b f7
e2 00 00 00 00 00 60 17 1a 00 00 00 00 07 00 00 00 00 00 34 00 00 00 00 00 00 00 00 00 00 00 00 55 cd fd f5
36 00 00 36 00 00 c0 3f 39 00 00 00 00 f1 e5 01 14 00 00 0c 00 00 00 00 00 00 00 00 00 00 00 00 0f b1 f1 35
36 00 00 36 00 00 80 72 5c 00 00 00 00 d3 1a 11 08 00 00 14 00 00 00 00 00 00 00 00 00 00 00 00 19 68 ea ce
30 d2 88 f1 00 00 20 a3 f9 01 00 00 00 05 ff 00 08 00 00 fc 00 00 00 00 00 00 00 00 00 00 00 00 09 0f 1a e1
c7 30 85 0d 00 00 a0 f2 17 00 00 00 00 31 00 00 00 00 00 14 00 00 00 00 00 00 00 00 00 00 00 00 89 e5 87 ef
f7 e2 0d 03 00 00 20 a1 09 00 00 00 00 31 00 00 00 00 00 0c 00 00 00 00 00 00 00 00 00 00 00 00 8f e6 7e f7
33 fc 72 2c 00 00 40 e8 3a 00 00 00 00 06 01 10 00 00 00 34 00 00 00 00 00 00 00 00 00 00 00 00 e0 e6 06 f7
33 fc 77 d3 00 00 60 a3 0b 00 00 00 00 37 01 10 14 00 00 1c 00 00 00 00 00 00 00 00 00 00 00 00 7a e2 00 bd

Now makes the statistics:
Code: [Select]
  40975  1c
  45391  1f
  55494  82
  58589  0f
  63518  10
  67533  0c
  75553  40
  83274  c0
  85332  01
 152862  14
 169547  ff
5618071  00

Second set of encoding - use an 8-bit value to indicate as a bit mask of "these are zeros" and then follow the mask with with the non-zero byte.

Code: [Select]
e2 00 00 00 00 00 60 17 1a 00 00 00 00 07 00 00 00 00 00 34 00 00 00 00 00 00 00 00 00 00 00 00 55 cd fd f5 36 00 00 36

Broken into 8-byte words

Code: [Select]
e2 00 00 00 00 00 60 17
1a 00 00 00 00 07 00 00
00 00 00 34 00 00 00 00
00 00 00 00 00 00 00 00
55 cd fd f5 36 00 00 36

With the mask on front and zeros removed:

Code: [Select]
83 e2 60 17
84 1a 07
10 34
00
F9 55 cd fd f5 36 36

which becomes
Code: [Select]
83 e2 60 17 84 1a 07 10 34 00 F9 55 cd fd f5 36 36

40 bytes to 17 bytes = compression to ~42% of original size

You might need to reset the "last 36" characters array on a block boundary (e.g. every 8192 bytes of output) so you can recover from a partial file corruption.
« Last Edit: October 03, 2019, 12:16:55 am by hamster_nz »
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 
The following users thanked this post: I wanted a rude username

Offline ebclr

  • Super Contributor
  • ***
  • Posts: 1997
  • Country: 00
« Last Edit: October 03, 2019, 12:23:10 am by ebclr »
 

Online hamster_nz

  • Super Contributor
  • ***
  • Posts: 2196
  • Country: nz
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #32 on: October 03, 2019, 04:35:01 am »
Running with this far more than I should...

Here is a filter to take STDIN and compress it to STDOUT:

Code: [Select]
#include <stdio.h>

static char last[36];
static char emitting[8];
static int emit_count = 0;
static int emit_mask = 0;
static int emit_data_count = 0;

static void emit(char c) {
  int i;
  if(c) {
     emit_mask |= 1<<emit_count;
     emitting[emit_data_count++] = c;
  }
  emit_count++;

  if(emit_count != 8)
     return;

  putchar(emit_mask);
  for(i = 0; i < emit_data_count; i++)
     putchar(emitting[i]);

  emit_data_count = 0;
  emit_count = 0;
  emit_mask = 0;
}

static void emit_flush(void) {
  int i;
  // Zeros that are explicitly added at the
  // end of file, not by the "zero mask" are padding
  while(emit_count != 8)
  {
    emit_mask |= 1<<emit_count;
    emitting[emit_data_count++] = 0x00;
    emit_count++;
  }

  putchar(emit_mask);
  for(i = 0; i < emit_data_count; i++)
     putchar(emitting[i]);

  emit_data_count = 0;
  emit_count = 0;
  emit_mask = 0;
}

int main(int argc, char *argv[]) {
   int i = 0, c;
   c = getchar();

   while(c != EOF) {
      emit((c^last[i]) &0xFF);
      last[i] = c;
      if(i == sizeof(last)-1) {
         i = 0;
      } else {
         i++;
      }
      c = getchar();
   }
   emit_flush();
}

And here is the opposite to expand STDIN to STDOUT:

Code: [Select]
#include <stdio.h>

char last[36];
int last_i;
void emit(char c) {
   c ^= last[last_i];
   putchar(c);
   last[last_i] = c;

   if(last_i == sizeof(last)-1)
       last_i = 0;
   else
       last_i++;
}

int main(int argc, char *argv[]) {
  int mc;
  int mask;
  mc = getchar();
  while(mc != EOF) {
     int c;
     mask = 0x1;
     while(mask != 0x100) {
       if((mc&mask) == 0) {
         emit(0);
       } else {
         c = getchar();
         if(c != EOF && c != 0)
           emit(c);
       }
       mask = mask << 1;
     }
     mc = getchar();
  }
}

Very simple, and compresses the test 8,660,615 byte file to 4,125,122 bytes (47% of original size).


Algorithm requirements (UPDATED):
- Implemented on ARM Cortex-M7 running 300 MHz  not tested - assume yes
- Algorithm cannot use heap (malloc) Tick!
- Support for fast shutdown, max ~50 ms (likely not working if large output / working buffers are used) Tick!
- Flash usage <100kb (the flash could potentially store a static dictionary) Tick!
- Ram usage <30kb Tick!
- Target compression rate > 40% on test data file Tick!
- System is currently running at 30% idle time. Target idle time with compression >20% not tested - assume yes
- Entire file zipped Tick!
- Lossless compression Tick!

« Last Edit: October 03, 2019, 04:40:17 am by hamster_nz »
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 2182
  • Country: fi
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #33 on: October 03, 2019, 07:58:13 am »
Good work.

All the data isn't in 36-byte records; there will be jumps like this:
Code: [Select]
00 02 08 00 00 00 8b ff ff ff ff ff fa ff 01 00 80 56 b8 82 81 a2 42 00 ca fe 98 06 06 12 a1 03 00 00 00 00
00 02 06 00 00 00 00 ff 00 00 00 00 01 00 a0 f7 bf 82 81 a2 42 05 ff ef 8c 08 08 1c a1 03 00 00 00 00 00 02

Where you can see the 82 81 pattern, which is clearly a part of a time stamp, jump by 2 bytes - so that particular frame is only 34 bytes. But they happen surprisingly rarely, so most messages are of the same length - probably 36 bytes is the maximum possible, since all jumps happen to the left.

The proposed "look-back-36-bytes" compression works so well because the jumps are so rare.

Code: [Select]
08 00 00 00 ff 0e 61 ff 07 ff ff ff 01 00 80 75 2c 95 81 a2 42 06 ff ea 98 03 03 08 ac 03 00 00 00 00 00 02
03 00 00 00 e5 fe 00 01 00 20 a4 8f 95 81 a2 42 06 00 ef 84 08 08 0f ac 03 00 00 00 00 00 02 08 00 00 00 64

This frame shows a 5-byte jump (pattern 95 81), i.e., it's a 31-byte record.


Why I looked it like this, I'm trying to guesstimate how many bits there is in the timestamp. It clearly has at least 4 bytes (you can see an increasing value pattern on 4 bytes) , but I'd guess there are more: one more on the LSB side, which is basically a random value due to the timestamp having enough resolution, and another on the MSB side, which is just fixed in such short capture. So maybe 6 bytes?

I guess it's this:

Code: [Select]
20 01 00 e0    9b b5 47 83 a2 42    06 fb ff 98 08 08 31 ce 04 00 00 00 00 00 02 08 00 00 00 00 05 85 d8 ff ff ff
ff 01 00 a0    c3 be 47 83 a2 42    05 fe ff 8c 08 08 3d ce 04 00 00 00 00 00 02 08 00 00 00 8b ff ff ff ff ff fa
ff 01 00 c0    64 c6 47 83 a2 42    05 ff ef 8c 08 08 49 ce 04 00 00 00 00 00 02 08 00 00 00 6c 51 84 09 00 00 21
1d 01 00 80    8c cf 47 83 a2 42    05 ff ef 8c 08 08 55 ce 04 00 00 00 00 00 02 08 00 00 00 6e 29 02 00 c0 f1 ff
ff 01 00 c0    41 34 48 83 a2 42    00 04 f0 8c 08 08 61 ce 04 00 00 00 00 00 02 08 00 00 00 f1 ff 8f 8c 1c ff ff
ff 01 00 80    69 3d 48 83 a2 42    00 df fe 98 08 08 6d ce 04 00 00 00 00 00 02 08 00 00 00 82 20 1d ff ff ff ff
00 01 00 e0    4c 54 48 83 a2 42    8c fe ff 8c 08 08 79 ce 04 00 00 00 00 00 02 08 00 00 00 63 ff ff fc ff ff ff

0x42 has to be the MSB - bytes after it change and can't be a part of the timestamp.
e0, a0, c0, 80, c0, 80, e0 clearly indicate some flags or similar, and are not the LSB of the timestamp.

I would guess these are raw CAN frames with added logging header, which obviously has a timestamp. With 6 bytes in timestamp, and raw extended frame being max 16 bytes, there are still 14 more bytes in the logger-generated header.

Getting rid of excess "extra" data, whatever it is in the headers, would help a lot in data reduction.

Timestamp can be reduced to 2 or 3 bytes likely (if you can guarantee no long silent breaks, and just let it wrap and reconstruct the MSBs later), and maybe you can just throw away everything else on that logger header.

It seems, the actual CAN frames only use 16/36 = 44% of the data - 56% is logger headers.



 

Offline Martin F

  • Regular Contributor
  • *
  • Posts: 128
  • Country: dk
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #34 on: October 03, 2019, 08:21:58 am »
Hi again,

Thanks for all the great inputs!

I'll try to provide some more detail on the specific format soon. Note that the sample I included is indeed raw CAN data from a tractor, aka J1939. In most J1939 cases there are 8 data bytes - but in other cases the data length may be lower, i.e. 1-8 data bytes. It can also be CAN FD, which is up to 64 data bytes. So in short, the data bytes will vary. There might be some optimization along your lines that could be implemented for specific cases (e.g. a J1939 specific compression), but for the general case the data length will vary.

Best,
Martin
 

Offline Jeroen3

  • Super Contributor
  • ***
  • Posts: 3399
  • Country: nl
  • Embedded Engineer
    • jeroen3.nl
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #35 on: October 03, 2019, 10:05:13 am »
LZ4 seems to have:
Code: [Select]
/*
 * LZ4_HEAPMODE :
 * Select how default compression functions will allocate memory for their hash table,
 * in memory stack (0:default, fastest), or in memory heap (1:requires malloc()).
 */
So your heap requirement is not a problem.
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 487
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #36 on: October 03, 2019, 11:19:26 am »
Just curious what the underlying need for the compression happens to be. It would seem in any case once you get to a certain point, you either get a bigger sd card or use some form of compression, and those two choices follow you all the way up until sd card size limitations kick in (hardware or software). If you eliminate compression, you just move yourself 1/2 step forward in upgrading sd card sizes. I doubt the price of sd cards is a factor (if talking about tractors), so must be something else (not that there has to be a reason).

One thing maybe to consider in any compression methods, is how easy it is to deal with any corruption that may occur in the path of writing/storing/reading. Maybe its too uncommon to worry about though, although if it did happen and the data was important I suppose dealing with something more like raw data would be easier to recover. I'm not sure I would want to be on the cutting edge of compression techniques, and later wish I had just used a bigger sd card instead and avoided any problems.

I guess I'm just wondering when compression is a good idea, and when it is not.
 
The following users thanked this post: Kilrah

Offline Martin F

  • Regular Contributor
  • *
  • Posts: 128
  • Country: dk
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #37 on: October 03, 2019, 01:30:10 pm »
Hi cv007, your inputs make a lot of sense. The challenge here is that the data will be periodically offloaded from the SD card via WiFi or 3G/4G. The period in which a WiFi hotspot is available for the offload can be limited, so the more throughput we can achieve per minute, the better. This gives a secondary purpose to the compression beyond only saving SD capacity. For 3G/4G uploading at scale, it has direct cost implications that can be relatively impactful in business cases utilizing the device. But you're completely right that for the pure SD card aspect, it would probably not be worth it alone.
 

Online mariush

  • Super Contributor
  • ***
  • Posts: 3909
  • Country: ro
  • .
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #38 on: October 03, 2019, 03:09:19 pm »
Let me throw some suggestions around.

Have you considered using some binary serialization format, like MessagePack or BSON?

Have you investigated using something super simple like plain LZ77 or the Palmdoc version of LZ77?
See https://wiki.mobileread.com/wiki/PalmDOC and https://en.wikipedia.org/wiki/LZ77_and_LZ78  and sourcecode for PalmDOC compression and compression : http://bazaar.launchpad.net/~kovid/calibre/trunk/view/head:/src/calibre/ebooks/compression/palmdoc.c

The PalmDOC format uses 4K blocks independent of each other and everything's in 2 bytes so it's fast to compress and uses very little memory.

I believe Crush ( https://sourceforge.net/projects/crush/files/CRUSH/1.00/crush100src.zip/download ) uses a LZ77 variant ... i tested on the 8 MB sample and compressed it down to 3,122,328 bytes

If you have multi-byte numbers but most values are small, maybe see if it's worth encoding numbers as variable length?
ex
* like UTF-8 encoding ( https://en.wikipedia.org/wiki/UTF-8#Description ) or
* just plain bit set to 1 means more bytes follow, 0 means it's last byte (so anything less than 128 would use 1 byte, anything that uses less than 14 bits can be stored in 2 bytes and so on)


« Last Edit: October 03, 2019, 03:13:08 pm by mariush »
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 2182
  • Country: fi
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #39 on: October 03, 2019, 03:35:58 pm »
Hi again,

Thanks for all the great inputs!

I'll try to provide some more detail on the specific format soon. Note that the sample I included is indeed raw CAN data from a tractor, aka J1939. In most J1939 cases there are 8 data bytes - but in other cases the data length may be lower, i.e. 1-8 data bytes. It can also be CAN FD, which is up to 64 data bytes. So in short, the data bytes will vary. There might be some optimization along your lines that could be implemented for specific cases (e.g. a J1939 specific compression), but for the general case the data length will vary.

Best,
Martin

Hi Martin,

What is the purpose for the logging? Is it to prove something legally, or for "unknown" future purposes, so that you need to store every header bit-by-bit, and add enough ancillary information such as precise timestamps?

If not, how about just getting rid of the large headers, and just saving the raw CAN packet, possibly with a minimized timestamp (think about a 3 or 4 byte header, instead of 20 bytes you seem to now have). 1 byte would signify the total CAN frame len, and two bytes for timestamp.

As for CAN data, why not drop everything except ID, data, CRC and possibly the RTR & ACK bits? You could go from 128 bits (assuming extended frame with 8 bytes payload) to about 110 bits, (from 16 bytes to to 14 bytes.

It would be kinda elegant to fit a full 8 bytes of payload to 16 bytes, with the timestamp. This would be a larger reduction (-56%) than you specified, without any compression and not much trickery.
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 487
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #40 on: October 03, 2019, 09:27:34 pm »
>The challenge here is that the data will be periodically offloaded from the SD card via WiFi or 3G/4G

I thought the data transfer rate going out would be a prime suspect, but dismissed that I as was only thinking of some direct method via pc (card reader, usb, etc). Wireless had not crossed my mind, although just like compression its all around me and largely goes unnoticed.

I guess you could still log to the sd card in 'raw', then use compression of some form on the upload. Maybe it does not matter when it is done, but the option to move the process to only when its needed may be useful. Maybe not.

Sorry for the slight detour.
 

Online Mechatrommer

  • Super Contributor
  • ***
  • Posts: 9391
  • Country: my
  • reassessing directives...
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #41 on: October 04, 2019, 12:32:44 am »
the 1MB/minute (16KBps) is hardly categorized as high speed in Wifi world (lowest CAN bus speed spec), except if using china generic 433MHz radio, cheapest price internet subscription plan or interplanetary communication. but some other aspects may call for the compression scheme, such as to maximize data storage in a specified capacity storage device. to some, it matters if a DSO has 12 vs 24Mpts capacity, but with compression, the spec can be upped using 12Mpts HW to store 24Mpts data, so the DSO can be (BOM) cheaper at same the spec as more expensive HW device and hence win. DSO example maybe not suitable as its quite difficult to implement real time compression at GSps rate unless extra FPGA/MCU power (price) is provided but OP has demonstrated the need in his case earlier. if we have more/extra processing power at $0 cost, it will be better to be able to store double the capacity given a storage device size, we can also transmit double the speed in the air from what it can in uncompressed state, fwiw.
if something can select, how cant it be intelligent? if something is intelligent, how cant it exist?
 

Offline Martin F

  • Regular Contributor
  • *
  • Posts: 128
  • Country: dk
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #42 on: October 04, 2019, 05:01:43 am »
To clarify, the device logs data to an SD card - it may be without WiFi for days or weeks. When it reaches WiFi, it may have e.g. 10 minutes to offload all the data - sometimes gigabytes. We considered compression during the transfer, but the challenge with this is that the data also needs to be encrypted in real-time on the SD card, as some users may have highly sensitive data. The encryption can only happen after the compression - hence our setup.
 

Offline Berni

  • Super Contributor
  • ***
  • Posts: 2672
  • Country: si
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #43 on: October 04, 2019, 05:55:40 am »
The most effective compression algorithms will always be the ones that are "content aware". This is why JPEG, PNG, GIF, FLAC, MPEG4 etc... exist

But that being said developing the custom algorithm for your application can take a LOT of work. Noisy data is quite annoying to lossless compression since it has a lot of "detail" that has to be perfectly captures yet it completely unpredictable. For waveform data this is usually dealt with applying lossy compression, taking the compression error and saving it along the data(But this data might only need 8bits of depth). But if lossy compression is acceptable then you can get really staggering compression ratios that get the data size bellow 10% of its original size while looking pretty much the same to a human looking at it.

This approach also makes progressive data transfer possible. Where you first send the lossy compressed data only, but then start sending the fine correction waveform to fix the compression errors and make it lossless again. That way the first part of the transfer already delivers usable data if it gets cut off mid transfer. A similar idea to this is used by progressive mode JPEG where the image in a browser can already be displayed entirely in low quality after the browser only receives the first 1/10th of the image file, the data that comes afterwards contains the rest of the data that is used to take the image to full quality.
 

Offline Harjit

  • Regular Contributor
  • *
  • Posts: 79
  • Country: us
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #44 on: October 08, 2019, 04:37:23 pm »
I need to compress log data on an M7 device with similar requirements. The links I've stored away for when I do the work are below. If you happen to benchmark these, I'd be very grateful if you shared the results.

https://github.com/lz4/lz4
https://github.com/atomicobject/heatshrink
https://github.com/richgel999/miniz
https://www.segger.com/products/compression/emcompress/emcompress-togo/

I happen to know Rich (richgel999 link above) and asked him about dynamic allocation and his response is below. Also, Rich is really into compression and has been for years.

From: Rich Geldreich
Sent: Monday, October 08, 2018 8:32 AM
To: harjit
Subject: Re: Hello
 
It's possible to disable malloc/free in miniz by #defining MINIZ_NO_MALLOC. You can't use any of the allocation related API's though, just the API's that take pointers to buffers. The tdefl and tinfl functions don't use malloc directly.
 
The following users thanked this post: Jeroen3

Online hamster_nz

  • Super Contributor
  • ***
  • Posts: 2196
  • Country: nz
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #45 on: October 10, 2019, 04:05:25 am »
Did you get anywhere with this?

Last night I played with a very simple sliding window compression....
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline Martin F

  • Regular Contributor
  • *
  • Posts: 128
  • Country: dk
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #46 on: October 10, 2019, 06:18:52 am »
Hi again,

We will be looking into this over the next weeks.

We'll be investigating 2 options to start with, Zstandard (zstd) and emCompress by Segger. We're also investigating some more custom routines on the side, but the aim is to get an initial idea of the performance of those two first. It's worth noting that the dataset we shared was probably overly simplistic - there can be variations in many cases, e.g. instead of a fixed 8 data byte set, we can have between 1 and 64 data bytes, as well as regular/extended length IDs.

We looked at LZ4 and zstd and for now we'll try zstd first as we found the decoding simpler, while zstd also more prominently seems to feature the option of using static dictionaries. While we don't yet know if this will be the ideal route, it's nice to have the option.

As for emCompress, it seems quite tailor fit for the purpose and we've used SEGGER products in the past with good experiences. The price tag is not the main issue, so it's more a matter of whether we can use the decoding code as e.g. a *.dll file and keep it with the rest of our open source code on github. We're getting this clarified hopefully this week.

Martin
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 1333
  • Country: us
  • Formerly SiFive, Samsung R&D
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #47 on: October 10, 2019, 11:01:01 am »
"we'll try zstd first as we found the decoding simpler"

You've got to be kidding.

lz4 has an incredibly simple format, which can be decoded with about a dozen lines of C. zstd format is extremely complex, and you need to juggle nested data blocks at several different levels, huffman trees .. all kinds of stuff.

Both lz4 and zstd were created by the same person, but with very different aims. lz4 is *fast*. Very fast. Compression is significantly faster than zstd and decompression is about 3x faster. But zstd has a much higher compression ration -- one of the best available.

Unoptimized lz4 decompression is as simple as this (I just tested it on an lz4 file with the initial header removed -- it's correct)

Code: [Select]
char *decompress(FILE *f, char *outPtr){
  int tok, n;
  while (tok = fgetc(f)){
    int litLen = (tok>>4)&0xf, copyLen = tok&0xf;
    if (litLen == 15) do {n=fgetc(f); litLen += n;} while (n == 255);
    for (int i=0; i<litLen; ++i) *outPtr++ = fgetc(f);
    int copyOff = fgetc(f) + fgetc(f)*256;
    if (copyLen == 15) do {n=fgetc(f); copyLen += n;} while (n == 255);
    char *copySrc = outPtr-copyOff;
    for (int i=0; i<copyLen+4; ++i) *outPtr++ = *copySrc++;
  }
  return outPtr;
}
 

Online mariush

  • Super Contributor
  • ***
  • Posts: 3909
  • Country: ro
  • .
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #48 on: October 10, 2019, 12:42:30 pm »
I mentioned the PalmDOC compression a few posts above...

I wrote a code to compress and decompress using this algorithm, written in PHP... see attachment.
It's less than 200 lines of actual code, and you'll need less than 10 KB of RAM to compress 4 KB chunks (less if you use 2 KB chunks)
Well technically, you could compress with less than 1 KB of ram used, but it would be very slow... I'm using ram as a "dictionary" to get speed
You can encode and decode each chunk separately and you can append the compressed blocks without any problems as everything is done in bytes.



 

Offline Marco

  • Super Contributor
  • ***
  • Posts: 4597
  • Country: nl
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #49 on: October 10, 2019, 03:30:16 pm »
No idea about the (code) quality, but since there's no prediction based codec mentioned yet there's also Sprintz. An IoT codec for uniformly sampled time series data.

https://github.com/dblalock/sprintz
 
The following users thanked this post: I wanted a rude username

Offline Martin F

  • Regular Contributor
  • *
  • Posts: 128
  • Country: dk
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #50 on: October 18, 2019, 06:50:20 pm »
Hi all, thanks again for a lot of good inputs.

We ended up evaluating SEGGER's emCompress vs. Heatshrink. Both were potentially valid, but for our particular combination of memory and speed requirements, it looks like Heatshrink may be the better option (and open source). Thanks again for the great suggestions!

Martin

 
The following users thanked this post: nctnico

Offline Harjit

  • Regular Contributor
  • *
  • Posts: 79
  • Country: us
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #51 on: October 18, 2019, 11:48:49 pm »
Thank you for letting us know where you ended up. If you have measurements you can share, I would love to see them.
 

Offline Martin F

  • Regular Contributor
  • *
  • Posts: 128
  • Country: dk
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #52 on: October 20, 2019, 11:17:45 am »
Hi again, we’re getting between 50-70% reduction in the size, which is consistent with what we’re aiming for. For us it was in particular the writing speed that was critical in ensuring lossless logging at the same time.
 
The following users thanked this post: I wanted a rude username

Offline Harjit

  • Regular Contributor
  • *
  • Posts: 79
  • Country: us
Re: Best library/algorithm for low RAM embedded data compression?
« Reply #53 on: October 20, 2019, 04:26:00 pm »
Thanks.

Same for me - writing speed is important because I want to do this on the MCU which it is doing other real time tasks.
 


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf