Author Topic: you have raw bytes: how to understand the used lossless algorithm?  (Read 1464 times)

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
So, yesterday I found a source sending two packets of 500K raw bytes each every 208 seconds.
I am 100% sure that
1) they are not encrypted
2) they are lossless images
3) they use 24-bit color, 3 bytes per pixel
which is good, but ...

how can I identify the lossless algorithm that was used?  :o :o :o
 

Offline AndyBeez

  • Regular Contributor
  • *
  • Posts: 208
  • Country: gb
  • The Neon Finger
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #1 on: June 14, 2022, 04:52:31 pm »
Any more clues? A RAW image stream off a CCD is lossless. So too is Windows BMP. In what context is the image data? Is it a TS transport stream?
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #3 on: June 14, 2022, 05:11:16 pm »
Code: [Select]
0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0xFF, 0x00, 0x01, 0x00,
0x01, 0x01, 0x00, 0x00, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x05, 0x01,
0x01, 0x03, 0xFF, 0x00, 0x00, 0x03, 0x01, 0x00, 0x01, 0x01, 0x00,
0xFF, 0x00, 0x00, 0x01, 0xFF, 0x01, 0xFF, 0x01, 0x00, 0x00, 0x02,
0xFF, 0x00, 0x00, 0x00, 0x01, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00,
0x01, 0x01, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0xFF, 0x01, 0x00,
0xFF, 0xFF, 0x01, 0x01, 0x01, 0x00, 0xFF, 0xFF, 0x00, 0x00, 0x03,
0x00, 0x00, 0x00, 0x01, 0xFF, 0xFE, 0x00, 0x00, 0x01, 0xFF, 0x01,
0x00, 0x01, 0x00, 0xFF, 0xFF, 0x01, 0x00, 0x00, 0x02, 0xFF, 0x01,
0xFF, 0x00, 0x01, 0x00, 0x01, 0x00, 0x00, 0x01, 0xFF, 0xFF, 0x00,
0xFE, 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x01, 0x00, 0xFF, 0xFF,
0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x01, 0x00, 0xFF, 0xFF, 0x00,
0x01, 0x00, 0xFF, 0x01, 0xFE, 0x00, 0xFF, 0x00, 0x00, 0x02, 0x01,
0x00, 0x00, 0x00, 0xFF, 0xFF, 0x01, 0x00, 0x00, 0x05, 0xFF, 0xFF,
0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0xFF, 0x00, 0xFF,
0x01, 0x01, 0x00, 0x00, 0x01, 0x00, 0xFF, 0x00, 0x01, 0x01, 0x00,
0x00, 0x00, 0x01, 0xFF, 0x01, 0x00, 0x01, 0x01, 0x00, 0x00, 0xFF,
0x01, 0xFF, 0x00, 0x00, 0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00,
0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0xFF, 0x01, 0x00, 0x00, 0x00,
0xFF, 0x00, 0x00, 0x00, 0x01, 0xFF, 0x00, 0x00, 0x00, 0xFF, 0x00,
0x00, 0x00, 0xFE, 0x01, 0x00, 0xFF, 0x00, 0x01, 0x00, 0xFF, 0x00,
0x01, 0x00, 0x01, 0x00, 0x01, 0x01, 0x00, 0x00, 0x00, 0x02, 0x01,
0x00, 0xFF, 0x00, 0x00, 0x01, 0xFF, 0xFF, 0x00, 0x00, 0xFF, 0x00,
0x00, 0x05, 0x02, 0x01, 0xFF, 0x01, 0xFF, 0xFF, 0x00, 0x00, 0x00,
0x01, 0x01, 0x01, 0x00, 0xFF, 0x00, 0x00, 0x00, 0xFF, 0x00, 0x01,
0x01, 0x00, 0xFF, 0xFF, 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00,
0x00, 0xFF, 0xFE, 0x00, 0x00, 0x01, 0xFF, 0x00, 0xFF, 0x00, 0x00,
0x00, 0xFF, 0x00, 0x00, 0x02, 0xFF, 0x00, 0x01, 0x01, 0x00, 0x00,
0x00, 0x03, 0x01, 0xFF, 0x00, 0x01, 0xFF, 0x01, 0x00, 0xFF, 0x01,
0x00, 0x00, 0x00, 0x01, 0x00, 0xFF, 0x00, 0xFF, 0x00, 0x00, 0x04,
...

This is how it looks  :o :o :o
 

Offline AndyBeez

  • Regular Contributor
  • *
  • Posts: 208
  • Country: gb
  • The Neon Finger
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #4 on: June 14, 2022, 05:43:19 pm »
Strange indeed. All I can 'suspect' is the 0xFF is some kind of control byte that terminates a data group.
Syntax wise: { [byte]... 0xFF }...

But how it maps to pixels, I have no idea.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #5 on: June 14, 2022, 06:53:22 pm »
Umm, it seems the compression is done by a small FPGA, something like a Xilinx Spartan3 400.

So, the next question is: which are the best lossless image compression algorithms that you can implement in a small FPGA?
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #6 on: June 14, 2022, 06:58:05 pm »
Or, which lossless data compression algorithms are most widely used in hardware implementations?

(Then I will try applying their decoders and see ... if something matches.)
 

Offline mariush

  • Super Contributor
  • ***
  • Posts: 4543
  • Country: ro
  • .
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #7 on: June 14, 2022, 07:32:11 pm »
Can you upload the whole 500 KB, or however much is sent?

There may be a color palette at the end, or something like that.

Curious whether it's always 0xFF... I was thinking that byte signifies something: for example, if a bit is 1 then it's a reference to a color palette entry, and 0 means something else (an actual color definition, such as new color information to add to the palette).
 
Too many 0x00 bytes in a row for this to be actual compression, imho... at most it could be some RLE scheme.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #8 on: June 14, 2022, 08:32:20 pm »
attached a binary image with two consecutive packets
[attachurl=1]
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #9 on: June 14, 2022, 11:15:56 pm »
I wonder ... MJPEG (Motion JPEG)?  :o :o :o

Implementing MJPEG in a small FPGA like the 3S400 should be doable
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #10 on: June 14, 2022, 11:25:09 pm »
Yesterday I hacked the Kodak PixCam used in Palm m100 era.

Nothing special, it's a 320x240 pixel, 24-bit color(1) camera that sends images over a serial line, but the format used was unusual, because it adds a 4-byte marker every 14440 bytes.

Stupid, but easy to get.
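
A minimal sketch of removing such periodic markers, in Python. It assumes each 4-byte marker follows a block of 14440 payload bytes; the post does not say whether the 14440 count includes the marker itself, so `payload_len` and `marker_len` are illustrative parameters that may need adjusting.

```python
def strip_markers(stream: bytes, payload_len: int = 14440, marker_len: int = 4) -> bytes:
    """Drop a fixed-size marker that follows every payload_len bytes of data.

    Assumption: layout is [payload][marker][payload][marker]...; the last
    payload block may be shorter and may lack a trailing marker.
    """
    out = bytearray()
    pos = 0
    while pos < len(stream):
        out += stream[pos:pos + payload_len]   # keep the pixel data
        pos += payload_len + marker_len        # skip over the marker
    return bytes(out)
```

If the output still shows a visible seam every few rows when displayed, the block length or the marker position is probably off by a few bytes.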

Unfortunately it's not the case here  :-//


(1) True false colors: the first byte is always 0x0, so it really only uses 4 bits even though it pretends to store every R, G, B channel in one byte. I tried every single official software, no dice, always the same.
 

Online golden_labels

  • Frequent Contributor
  • **
  • Posts: 715
  • Country: pl
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #11 on: June 15, 2022, 12:00:31 am »
The data has low symbol-to-symbol entropy: you can tell by it being easily compressible by 75% with any modern algorithm. That excludes MJPEG, as Huffman coding would prevent that. Same goes for any reasonable compression, obfuscation or encryption algorithm.
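
A rough way to check this on a captured packet, as a Python sketch. Note that `byte_entropy` is only the zero-order entropy of the byte histogram, which is a simpler measure than symbol-to-symbol entropy, and `zlib` (DEFLATE) stands in here for "any modern algorithm"; both function names are made up for this example.

```python
import math
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Zero-order Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

def zlib_ratio(data: bytes) -> float:
    """Compressed size divided by original size, using DEFLATE at level 9."""
    return len(zlib.compress(data, 9)) / len(data)
```

Well-compressed or encrypted data should give an entropy close to 8 bits per byte and a ratio close to (or slightly above) 1.0. A ratio around 0.25 would correspond to the "compressible by 75%" figure.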

If that stores data as separate pixels, you may pick an arbitrary fragment and try displaying it using different row lengths and bits per pixel. If you see some patterns, adjusting those two parameters will at some point produce the desired image, and that will tell you what the actual geometry is.
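
One way to try this without a special viewer is to dump the fragment as a greyscale PGM file at several candidate widths and look for the width where diagonal streaks turn into a stable picture. The sketch below is only an illustration: `to_pgm` is a made-up helper, it supports whole bytes per pixel only (not arbitrary bit depths), and it averages the bytes of each pixel into one grey value.

```python
def to_pgm(data: bytes, width: int, bytes_per_pixel: int = 1) -> bytes:
    """Render a fragment as a binary PGM (P5) image, one grey value per pixel.

    Each pixel's grey value is the average of its bytes; trailing bytes
    that do not fill a complete row are dropped.
    """
    row_bytes = width * bytes_per_pixel
    height = len(data) // row_bytes
    pixels = bytearray()
    for i in range(0, height * row_bytes, bytes_per_pixel):
        px = data[i:i + bytes_per_pixel]
        pixels.append(sum(px) // bytes_per_pixel)
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(pixels)
```

Typical use would be a loop over guesses, e.g. writing `to_pgm(fragment, w, 3)` to `guess_<w>.pgm` for each candidate `w`, then flipping through the files in any image viewer that reads PGM.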
You are grounded! — said mom to pin 11 of an LM324 op-amp
Worth watching: Calling Bullshit — protect your friends and yourself from bullshit!
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 3087
  • Country: nz
  • Formerly SiFive, Samsung R&D
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #12 on: June 15, 2022, 12:25:38 am »
If that's compressed it's pretty badly compressed!  gzip cuts that down from 330 bytes to 150. Even lz4 reduces it to 227. They would both do much better on the full file if it continues like that.
 

Offline DavidAlfa

  • Super Contributor
  • ***
  • Posts: 2875
  • Country: es
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #13 on: June 15, 2022, 01:00:31 am »
If the receiver knows the resolution beforehand, it could be just raw RGB data.
If the packets are always the same size, I'd rule out any compression.
Do you know the resolution of the device?
Attach a full packet, so we can play with it...
Hantek DSO2x1x            Drive        FAQ          DON'T BUY HANTEK! (Aka HALF-MADE)
Stm32 Soldering FW      Forum      Github      Donate
 

Online Mechatrommer

  • Super Contributor
  • ***
  • Posts: 10664
  • Country: my
  • reassessing directives...
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #14 on: June 15, 2022, 01:46:00 am »
I wonder ... MJPEG (Motion JPEG)?  :o :o :o
Definitely not. RLE at best... or just a custom format. The MSB (sign bit), or maybe the LSB, may tell you something.
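
To follow up on the sign-bit idea: the posted dump is dominated by 0x00, 0x01 and 0xFF, with an occasional 0xFE or 0x02 to 0x05, which is what small signed numbers look like in two's complement. The Python sketch below reads the bytes that way and, purely as a guess, sums them up as if they were differences between neighbouring samples. Nothing in the thread confirms that the stream is delta-coded; `as_signed` and `undo_delta` are names invented for this example.

```python
from itertools import accumulate

def as_signed(data: bytes) -> list:
    """Interpret each byte as a two's-complement int8 (0xFF -> -1, 0xFE -> -2)."""
    return [b - 256 if b >= 128 else b for b in data]

def undo_delta(data: bytes, start: int = 0) -> list:
    """Running sum of the signed deltas, wrapped to the range 0..255.

    Assumption: each byte is the difference from the previous sample,
    and the sample before the first one has the value `start`.
    """
    sums = accumulate(as_signed(data), initial=start)
    return [v & 0xFF for v in sums][1:]   # drop the initial value itself
```

If the running sum drifts slowly and stays within a plausible brightness range, that supports the delta hypothesis; if it wanders off or wraps constantly, it does not.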
Nature: Evolution and the Illusion of Randomness (Stephen L. Talbott): Its now indisputable that... organisms “expertise” contextualizes its genome, and its nonsense to say that these powers are under the control of the genome being contextualized - Barbara McClintock
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 10169
  • Country: fr
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #15 on: June 15, 2022, 02:03:58 am »
It's very likely a form of RLE. You'd need to analyze the data for given known frames with simple content to be able to figure it out in a reasonable amount of time.
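
For reference, the simplest RLE variant looks like the Python sketch below. The (count, value) byte-pair layout is a textbook example chosen for illustration, not the format of the device in this thread, which is still unknown; it only shows the kind of decoder one would try against a captured frame of known, simple content.

```python
def rle_decode_pairs(data: bytes) -> bytes:
    """Decode a stream of (count, value) byte pairs. Hypothetical layout."""
    if len(data) % 2:
        raise ValueError("expected an even number of bytes")
    out = bytearray()
    for i in range(0, len(data), 2):
        count, value = data[i], data[i + 1]
        out += bytes([value]) * count
    return bytes(out)

def rle_encode_pairs(raw: bytes) -> bytes:
    """Matching encoder: runs longer than 255 are split into several pairs."""
    out = bytearray()
    i = 0
    while i < len(raw):
        run = 1
        while i + run < len(raw) and raw[i + run] == raw[i] and run < 255:
            run += 1
        out += bytes([run, raw[i]])
        i += run
    return bytes(out)
```

With a solid-colour test frame, a scheme like this would produce an output that is tiny and highly repetitive, which makes the count and value fields easy to spot.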
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #16 on: June 15, 2022, 11:31:20 am »
I am 100% sure that the source is a 640x480 image with 1 byte per color channel, so 24-bit color.

I will try to capture a frame compressed at the highest compression ratio. I noticed that if I congest the link, I get two packets of 40-50 Kbytes each instead of two packets of 400-500 Kbytes.

This probably means that the compression ratio is "adjustable" from "low" to "extreme"  :-//
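
A quick sanity check on these numbers, as a Python sketch. It assumes that the two packets together carry exactly one frame and that "K" means 1000 bytes; neither is confirmed in the thread, and `compression_ratio` is a name made up for this example.

```python
WIDTH, HEIGHT, BYTES_PER_PIXEL = 640, 480, 3
RAW_FRAME_BYTES = WIDTH * HEIGHT * BYTES_PER_PIXEL   # 921600 bytes uncompressed

def compression_ratio(packet_bytes: int, packets_per_frame: int = 2) -> float:
    """Raw frame size divided by the bytes actually sent for one frame.

    Assumption: packets_per_frame packets of packet_bytes each make up a frame.
    """
    return RAW_FRAME_BYTES / (packet_bytes * packets_per_frame)
```

Under those assumptions, two 500 Kbyte packets add up to more than the 921600 bytes of a raw frame (ratio below 1), two 400 Kbyte packets give a ratio of only about 1.15, and two 45 Kbyte packets give roughly 10. So the large packets would be barely compressed, if at all, unless they carry more than one frame or extra data.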
 

Offline dietert1

  • Super Contributor
  • ***
  • Posts: 1397
  • Country: br
    • CADT Homepage
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #17 on: June 15, 2022, 01:08:12 pm »
Somehow this reminds me of a 2009 project, description here: http://cadt.de/notes/DesktopVI.pdf. Image compression happened on a Digilent SpartanII board. It had lossless compression by a factor of about 100 in realtime with a throttle algorithm. The idea was to get a live color display from a scope with a grayscale screen. The pdf includes a link to a youtube video that is still available.

These things depend a lot on the context, e.g. in the case of the LeCroy, a fixed palette and an almost empty screen. Is the image that belongs to the data packets posted above available?

Regards, Dieter
 

Offline emece67

  • Frequent Contributor
  • **
  • Posts: 616
  • Country: 00
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #18 on: June 15, 2022, 01:30:02 pm »
.
« Last Edit: August 19, 2022, 05:36:58 pm by emece67 »
 

Offline mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 13035
  • Country: gb
    • Mike's Electric Stuff
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #19 on: June 15, 2022, 01:51:04 pm »
If possible, looking at how the compressed image size varies with content may give some clues.
e.g. show it a limited range of colours, large/small amounts of detail in X & Y, mono vs. colour images etc.
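
A few such test frames can be generated as raw RGB buffers with the Python sketch below. It assumes 640x480 pixels at 3 bytes per pixel in R, G, B order; the function names are invented for this example, and how the frames would actually be fed to the device is not covered here.

```python
import random

WIDTH, HEIGHT = 640, 480   # assumed frame geometry, 3 bytes per pixel

def solid(r: int, g: int, b: int) -> bytes:
    """Single colour: the least possible detail."""
    return bytes([r, g, b]) * (WIDTH * HEIGHT)

def horizontal_ramp() -> bytes:
    """Grey level changes along X only; every row is identical."""
    row = b"".join(bytes([x * 255 // (WIDTH - 1)]) * 3 for x in range(WIDTH))
    return row * HEIGHT

def vertical_ramp() -> bytes:
    """Grey level changes along Y only; every pixel in a row is identical."""
    return b"".join(bytes([y * 255 // (HEIGHT - 1)]) * 3 * WIDTH
                    for y in range(HEIGHT))

def noise(seed: int = 0) -> bytes:
    """Reproducible pseudo-random pixels: the most detail, hardest to compress."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(WIDTH * HEIGHT * 3))
```

Comparing the output sizes for the two ramps is a cheap way to see whether the encoder works along rows, along columns, or in blocks.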
Youtube channel:Taking wierd stuff apart. Very apart.
Mike's Electric Stuff: High voltage, vintage electronics etc.
Day Job: Mostly LEDs
 

Offline m k

  • Super Contributor
  • ***
  • Posts: 1101
  • Country: fi
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #20 on: June 15, 2022, 04:08:35 pm »
A 1028-byte-wide editor would be nice.
(cut 1st 0x200 off)

E,
counter of sorts is indicating a base color/line.
« Last Edit: June 15, 2022, 05:39:29 pm by m k »
Aneng-Appa-AVO-Fluke-General Radio-Heathkit-Herbert Arnold-HP-Kaise-Kyoritsu-Mastech-Simpson-Tektronix-YFE
(plus lesser brands from the work shop of the world)
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #21 on: June 16, 2022, 09:24:35 am »
I finally captured a packet containing an image compressed with level=5, which should be "extreme" compression.
[attachurl=1]
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #22 on: June 16, 2022, 10:08:45 am »
It's very likely a form of RLE. You'd need to analyze the data for given known frames with simple content to be able to figure it out in a reasonable amount of time.

What if one day we will receive a data-packet from a radio telescope capturing things from outer-space?
You don't know the encoding algorithm, you don't know the compressing algorithm, ...
Code: [Select]
                           ________________
                          |                |
 A L I E N                |                |
unknown message ------> x |    blackbox    | y --------> message received
                          |    y = h(x)    |
                          |                |
                          |________________|
Code: [Select]
                           ________________
                          |                |
 H U M A N                |                |
unknown message ------> x |    blackbox    | y --------> message received
                          |    y = f(x)    |
                          |                |
                          |________________|

Code: [Select]
                           ________________
                          |                |
 H U M A N                |                |
known message --------> x |    blackbox    | y --------> message received
                          |    y = f(x)    |
                          |                |
                          |________________|

In this case we are lucky because I can force a known-image into the black-box and capture the output  :o :o :o

[attachurl=1]
[attachimg=2]
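
With a known input x and a captured output y, candidate encoders can be tested mechanically: apply each guess f to x and see whether f(x) equals y. The Python sketch below shows the idea with three simple guesses. The candidate list is purely illustrative and says nothing about what the device really does; all names are invented for this example.

```python
def delta_encode(raw: bytes) -> bytes:
    """Difference between each byte and its predecessor, modulo 256."""
    prev = 0
    out = bytearray()
    for b in raw:
        out.append((b - prev) & 0xFF)
        prev = b
    return bytes(out)

def delta_encode_per_channel(raw: bytes, channels: int = 3) -> bytes:
    """Difference between each byte and the same channel of the previous pixel."""
    out = bytearray(raw[:channels])   # first pixel is stored as-is
    for i in range(channels, len(raw)):
        out.append((raw[i] - raw[i - channels]) & 0xFF)
    return bytes(out)

# Hypothetical guesses for the black box f; extend with more candidates.
CANDIDATES = {
    "identity": lambda raw: raw,
    "byte delta": delta_encode,
    "per-channel delta": delta_encode_per_channel,
}

def matching_candidates(known_input: bytes, captured: bytes) -> list:
    """Names of all candidate encoders that reproduce the captured output."""
    return [name for name, f in CANDIDATES.items() if f(known_input) == captured]
```

An exact match on a whole frame is a strict test; when nothing matches, comparing only the first few hundred bytes (after any header) is often more informative.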
 

Online Mechatrommer

  • Super Contributor
  • ***
  • Posts: 10664
  • Country: my
  • reassessing directives...
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #23 on: June 16, 2022, 11:00:57 am »
What if one day we will receive a data-packet from a radio telescope capturing things from outer-space?
You don't know the encoding algorithm, you don't know the compressing algorithm, ...
you call linguists, cryptologists, experts in encoding schemes, scientists etc etc... decoding unknown and complex data communication is not a one-man job. on earth alone, several disciplines are needed just to decode, say, Egyptian hieroglyphs.. btw, your level 5 lossless compression doesn't look like compression at all (see attached).. maybe the next test is to send the blackbox an incompressible (random noise) image and ask it to do maximum compression regardless...
« Last Edit: June 16, 2022, 11:31:45 am by Mechatrommer »
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
Re: you have raw bytes: how to understand the used lossless algorithm?
« Reply #24 on: June 16, 2022, 11:43:42 am »
you call linguists, cryptologists, experts in encoding schemes, scientists etc etc... decoding unknown and complex data communication is not a one-man job. on earth alone, several disciplines are needed just to decode, say, Egyptian hieroglyphs

Hollywood makes first contact with aliens a breeze, but it's not.

I'm trying to train myself to decode everything I can access just in case ... and it's so damn hard, as you can see here.

Four bloody weeks just to sniff the USB protocol, figure out where frames start and end, and still don't understand how they are compressed.

I have a purpose for that device, it's not just a game, but it's also a challenge to understand how complex certain things are in reality.

I mean, I love comics, I've read tons of comics and watched hundreds of movies, and in a Superman movie you see Lex Luthor work out how to decode an alien algorithm to unlock the door of an alien spaceship and immediately access the entire Kryptonian database to create the worst villain ever, Doomsday, whose birth has only one purpose: Superman's death.

If you watch the movie, in less than 48 hours Lex Luthor was also able to decode the super complex Kryptonian technology of a bio-weapon classified as "only generals can access it", and he easily bypassed the security level and got it...

Insanely bright! Is he a super-ultra-sapiens human being? The next step in human evolution, hence more intelligent, clever and smart than Einstein?

Crazy Hollywood! This part sucks because it's too far from reality  :-// :-// :-//

your level 5 lossless compression doesn't look like compression at all (see attached)
how come a finger print encoded into a skull is something you've not told us...

No, they are two different images. The one with the fingerprint is the only frame I can capture with a known input image.
The skull one comes from a different source, which I cannot control.

But if it's not compressed, then it's interesting, because it means that sometimes it sends an uncompressed frame  :o :o :o

So, I have to look at the control frames to see if there is a special command to request an uncompressed shot.

I have to improve the C source that tries to capture frames (based on libusb, on Linux). It's not easy, there are a lot of control packets in the middle.
But images are only transported in bulk frames (up to 500 Kbytes), and I cannot simply ignore everything else because I have to check the compression level, which usually precedes the bulk frame.


Ctrl frame: it sometimes reports the compression level, a number from 1 to 5
bulk frame: 1 of 4, sometimes 1 of 2, sometimes 1 of 1, depending on the compression level and on the image size
 

