EEVblog Electronics Community Forum

Products => Computers => Programming => Topic started by: HwAoRrDk on January 22, 2021, 10:30:51 pm

Title: LZ4 decompression
Post by: HwAoRrDk on January 22, 2021, 10:30:51 pm
I've been writing some C code to implement LZ4 block decompression. I have a working implementation, but there is one aspect of the 'spec' (if you can call it that) that I am unsure of.

It concerns the scenario where the match offset plus the match length is greater than the current output position - that is, there is an overlap. The block format documentation (https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md) has this to say about it:

Quote
In some cases, matchlength is larger than offset. Therefore, match_pos + matchlength > current_pos, which means that later bytes to copy are not yet decoded. This is called an "overlap match", and must be handled with special care. A common case is an offset of 1, meaning the last byte is repeated matchlength times.

I don't understand how this should be handled. There is no further elaboration.

On this blog post (http://ticki.github.io/blog/how-lz4-works/) I found suggestion that this is somehow some kind of run-length-encoding. But again, no elaboration.

I have looked at several open-source implementations of LZ4 block decoding logic, and none of them seem to handle this 'overlap' scenario. Is this perhaps something that is not actually required because no extant compressors/encoders actually produce blocks like this?

Any help from anyone with knowledge would be most appreciated. :-+
Title: Re: LZ4 decompression
Post by: Nominal Animal on January 22, 2021, 11:17:33 pm
Let's say the current buffer contents (from lower to higher addresses) is abcdef, and you have a match at d of length 8.  The end result is abcdefdefdefde.

This is common, as this way the encoder only needs to have the first repeating sequence in the buffer, and it can be repeated any number of times.

This does need some care to handle correctly, because the standard C library does not have a function to do this; it is the opposite of memmove() with respect to memcpy(), and I like to call it memrepeat().  If you use an open-coded loop to fill/emit the data, there is no problem (and no special handling is needed), because the destination-source delta stays constant.
Title: Re: LZ4 decompression
Post by: HwAoRrDk on January 22, 2021, 11:58:55 pm
Oh, I think I understand now! :) So with your example, a simple copy loop (not using stdlib mem*() functions) would go like the following, automatically handling the situation:

abcdef
   ^
abcdefd
    ^
abcdefde
     ^
abcdefdef
      ^
abcdefdefd
       ^
abcdefdefde
        ^
abcdefdefdef
         ^
abcdefdefdefd
          ^
abcdefdefdefde


So if my code is currently as follows, it should already handle this, right?

Code: [Select]
const uint8_t *copy_src = out - copy_off;

while(copy_len-- > 0) *out++ = *copy_src++;

Thanks! :-+
Title: Re: LZ4 decompression
Post by: ejeffrey on January 23, 2021, 12:19:12 am
Yep.  This is a pretty standard way to handle repetitive strings in dictionary based compression algorithms.

You could run into trouble if you tried to optimize by copying multi-byte words, or if you used memcpy which has an unspecified overlap behavior.

Thats probably why it looked like the implementations you saw didn't handle this: they were doing it fine but no special care is needed except to avoid doing something tricky.