Digital FPV video for drone racing
hexahedron:
Hey all, first time poster here. I have a pretty cool project I've been thinking about for a few years; I started working on it a few months ago, and I'd like to share my progress so far. If I'm posting in the wrong area, or if there are any other problems, please let me know! I don't want a bad rep right off the bat!
So, what is this project? Well, currently all competitive drone racers use an "FPV" (first person view) system to pilot their drones at breakneck speeds. It is an all-analog system based on analog television broadcast standards, operating in the 5.8 GHz band, and when as few as 4 pilots are flying at once (there can be up to 8 in a race!) the signals start to interfere with each other, resulting in poor visibility. In many competitions, the pilots are flying with a mostly black-and-white picture that is riddled with static. My project intends to solve this problem.
The plan is to create a system that will compress a 480p, 60 fps video stream into a signal that fits within a bandwidth of 6 MHz. Why 6 MHz? MultiGP (the largest drone racing league) only allows a select few video transmitters to be used for a drone to be legal. As these transmitters are designed for an NTSC video signal input, they all have low-pass filters on the video input, which will kill any higher frequencies (correct me if I'm wrong). MultiGP does NOT restrict the camera that can be used, however, which means that my system should be legal.
How can we compress the video stream into that bandwidth? The answer is that it isn't easy. To start, instead of using a 1-bit data stream, we can send 3 bits (8 levels) per symbol, which triples our data rate from 6 Mbits/sec to 18 Mbits/sec. Let's get some math out of the way: 640 x 480 x 60 is ~18.4 Mpixels/sec, meaning that for every pixel in the video stream we can transmit just under 1 bit of information without going over our limit; see the quick check below.
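Here is that budget as a quick Python sanity check; the one-symbol-per-hertz figure is an assumption of mine, chosen to match the 6 to 18 Mbits/sec numbers above:

--- Code: ---
symbol_rate = 6_000_000                    # ~6 Msymbols/sec in 6 MHz (assumed)
bits_per_symbol = 3                        # 8 levels per symbol
bit_rate = symbol_rate * bits_per_symbol   # 18,000,000 bits/sec
pixel_rate = 640 * 480 * 60                # 18,432,000 pixels/sec
print(bit_rate / pixel_rate)               # ~0.977 bits per pixel
--- End code ---

I'll get into exactly how this can be accomplished in my next post, but for now I want to see how this community responds to this post.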
LaserSteve:
I'm an amateur radio operator. Could you please design your rig so emissions stay clear (60 dB down) of the 5.7 GHz weak-signal calling frequencies? Our emissions are narrow FM voice and SSB voice at these frequencies, and some of our operators are doing well if they are within 1 MHz of center.

ITU Regions:
5,668.2 MHz  Region 1 Calling Frequency 1
5,760.1 MHz  Region 2 Calling Frequency
5,760.2 MHz  Region 1 Calling Frequency 2


Thanks....

Steve
hexahedron:
Will do. Most people using an FPV link don't go over 25 mW.
hexahedron:
Well, I guess almost no response is better than a negative one, moving on!
This post will be about the concepts of how this will work going forward, and the next will be about my progress so far. Feel free to skip this post if you are inclined to do so.
So, how is it possible to compress the video stream in such a way that each pixel takes less than a single bit? The answer is found everywhere on the internet, in the form of JPEG compression! I will try my best to explain the process, but if you are still confused, check out https://en.wikipedia.org/wiki/JPEG , it's a great article.
For now, let's assume we just have a single image to compress. The first step is to convert the color space from RGB to YCbCr, which separates the brightness of the image from its color. Why do this? Our eyes are not as good at sensing changes in color as they are at sensing changes in brightness, so we can reduce the resolution of the two color channels with no perceptible difference. In our example, we will reduce their resolution by a factor of 2 in both the X and Y dimensions.

Next, we split each channel into 8x8 pixel blocks and apply the Discrete Cosine Transform (DCT) to each block. Explaining what the DCT does is very difficult; I would recommend checking out https://en.wikipedia.org/wiki/Discrete_cosine_transform . In essence, it turns each 8x8 block of pixels into another 8x8 block of coefficients where the different spatial frequencies of the image are separated, with the low frequencies in the top left and the high frequencies in the bottom right. Why do this step? Well, our eyes are not as good at perceiving high-frequency patterns in images as low-frequency ones, so we can throw away much of the high-frequency data without there being a perceivable difference, to an extent.

That step is called "quantization": we divide each (now DCTified) block element-by-element by an 8x8 quantization table. This table determines the level of compression in the image, and is generally a gradient of numbers whose values increase toward the lower right corner. This leaves us with 8x8 blocks that have largish numbers in the upper left corner, and values mostly less than 1 everywhere else. We then round each value to the nearest integer, resulting in an 8x8 block that mostly consists of zeros, with the remaining numbers in a range of about -127 to 127 (a single signed byte). The next step is called zig-zag encoding, where we simply read the 8x8 block out into a string of 64 bytes in a zig-zag pattern, see image.
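To make those steps concrete, here is a rough Python/NumPy sketch of the per-block pipeline; the quantization table is invented for illustration, not a standard JPEG table:

--- Code: ---
import numpy as np
from scipy.fftpack import dct

# Illustrative 8x8 quantization table: values grow toward the
# bottom-right (high-frequency) corner.
Q = np.array([[16 + 8 * (x + y) for x in range(8)] for y in range(8)])

def compress_block(block):
    """DCT, quantize, and round one 8x8 pixel block (values 0..255)."""
    block = block.astype(np.float64) - 128              # center around zero
    coeffs = dct(dct(block, axis=0, norm='ortho'),      # 2-D DCT-II
                 axis=1, norm='ortho')
    return np.round(coeffs / Q).astype(int)             # mostly zeros now

def zigzag(block):
    """Read an 8x8 block out as 64 values in zig-zag order."""
    order = sorted(((y, x) for y in range(8) for x in range(8)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [block[y, x] for y, x in order]
--- End code ---

Feeding zigzag(compress_block(...)) into the run-length step below gives the string of bytes we actually transmit.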
We can now apply a compression algorithm called "run-length encoding", which stores each run of repeated values (here, the long runs of zeros) as the value plus a repeat count. The wikipedia article https://en.wikipedia.org/wiki/Run-length_encoding explains it much better than I can, so go read that; a tiny sketch is below.
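This is a toy [value, count] encoder in Python, just to illustrate the idea; real JPEG actually uses a more elaborate zero-run plus Huffman coding stage:

--- Code: ---
def rle_encode(values):
    """Collapse runs of repeated values into [value, count] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1        # extend the current run
        else:
            runs.append([v, 1])     # start a new run
    return runs

# The long tail of zeros from the zig-zag scan collapses nicely:
# rle_encode([53, -7, 2, 0, 0, 0, 0, 0]) -> [[53, 1], [-7, 1], [2, 1], [0, 5]]
--- End code ---

And we're done! Now, how effective was this? Well, let's do some math.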

The OV7725 camera module (the one I will be using) uses 16 bits to represent each pixel.
hres   vres   fps   bits
640  x  480  x  60  x  16 = ~295 Mbits/sec

In order to fit 295 Mbits/sec into an 18 Mbits/sec data stream, we need to reduce the data by a factor of roughly 16:1 (quick check below). If we look at the JPEG wikipedia article, we can see that its "high quality" example corresponds to a compression ratio of 15:1. I have written some programs that implement this in Python, and I can confirm that ratios of 16:1 are indeed possible with no visible difference unless you compare against the original image!
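Checking that ratio with the numbers above (plain Python):

--- Code: ---
raw_rate = 640 * 480 * 60 * 16    # ~294.9 Mbits/sec out of the OV7725
link_rate = 18_000_000            # 18 Mbits/sec channel budget
print(raw_rate / link_rate)       # ~16.4, so roughly 16:1
--- End code ---

The next post will be about what I have done to implement this algorithm on an FPGA.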
StillTrying:

--- Quote from: hexahedron on January 18, 2019, 08:57:29 pm ---Well, I guess almost no response is better than a negative one, moving on!
--- End quote ---

Well I don't think it can be done, but I hope I'm wrong. :)
Any gap in a digital compressed stream will leave a much bigger hole than in an analogue stream.