A High-Performance Open Source Oscilloscope: development log & future ideas
tom66:
Another challenge I am working on is how to do the rendering all on the FPGA.
This would free up the CPUs of the Pi and the GPU could be used for e.g. FFTs and 2D acceleration tasks.
The real challenge is that waveforms are stored linearly, but every X pixel on the display needs a different Y coordinate for a given waveform value, so it is not conducive to bulk write operations at all (e.g. AXI bursts). The 'trivial' improvement is to rotate the buffer 90 degrees (which is what my SW renderer does) so that your accesses tend to hit the same row at least and are more likely to be sitting in the cache, but this is still a non-ideal solution. So the problem has to be broken down into tiles or slices: the Zynq should read, say, 128 waveform values (fits nicely into a burst), repeat for every waveform (with appropriate translations applied), write all the pixel values for that slice into BRAM (~12 bits x 128 x 1024 for a 1024-pixel-high canvas with 12-bit intensity grading = ~1.5 Mbit, or about half of all available block RAM), and then write that back into DDR, so burst operations are used as much as possible for the best performance.
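Roughly, the dataflow I have in mind looks like this as a behavioural C model (not HDL - the 128-sample burst, 1024-pixel height and 12-bit grading are the figures from above, everything else is purely illustrative):

--- Code: ---
/* Behavioural sketch of the tile/slice renderer described above (not HDL).
 * Assumes 8-bit samples, a 1024-pixel-high canvas, 12-bit intensity
 * grading and 128-sample bursts; only the memory access pattern matters. */
#include <stdint.h>
#include <string.h>

#define BURST    128      /* samples per AXI-style burst */
#define HEIGHT   1024     /* canvas height in pixels     */
#define MAX_HIT  0x0FFF   /* 12-bit saturating intensity */

/* One slice lives in "BRAM": 128 columns x 1024 rows of 12-bit counts. */
static uint16_t slice[BURST][HEIGHT];

void render_slice(const uint8_t *waves, int n_waves, int wave_len,
                  int x0, uint16_t *ddr_out /* wave_len*HEIGHT, column-major */)
{
    memset(slice, 0, sizeof(slice));

    /* Accumulate every waveform's samples for this 128-pixel-wide slice. */
    for (int w = 0; w < n_waves; w++) {
        const uint8_t *burst = &waves[w * wave_len + x0];  /* one burst read */
        for (int i = 0; i < BURST; i++) {
            int y = (burst[i] * HEIGHT) >> 8;               /* sample -> Y    */
            if (slice[i][y] < MAX_HIT)
                slice[i][y]++;                              /* intensity++    */
        }
    }

    /* Write the finished slice back to DDR as long sequential bursts. */
    for (int i = 0; i < BURST; i++)
        memcpy(&ddr_out[(size_t)(x0 + i) * HEIGHT], slice[i],
               HEIGHT * sizeof(uint16_t));
}
--- End code ---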
It implies a fairly complex core, and that's without considering multiple channels, which introduce even more complexity: do you handle each as a separate buffer, or accumulate each with a 'key value', or ...? The complication is that the ADC multiplexes samples, so in 1ch mode the samples are A0 .. A7, but in 2ch mode they are A0 B0 A1 B1 .. A3 B3, which means you need to think carefully about how you read and write data. You can try to unpack the data with small FIFOs on the acquisition side, but then you need to reassemble the data when you stream it out.
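For what it's worth, the un-multiplexing itself is trivial to state in C; the hard part is doing it at line rate in the fabric (illustrative sketch only, two channels):

--- Code: ---
/* Sketch: undoing the 2ch ADC sample order described above
 * (A0 B0 A1 B1 ...) into per-channel buffers. In the FPGA this would be
 * small FIFOs / lane steering rather than a loop; this only shows the
 * data movement that has to happen somewhere. */
#include <stdint.h>

void deinterleave_2ch(const uint8_t *muxed, int n_pairs,
                      uint8_t *ch_a, uint8_t *ch_b)
{
    for (int i = 0; i < n_pairs; i++) {
        ch_a[i] = muxed[2 * i];        /* A0, A1, A2, ... */
        ch_b[i] = muxed[2 * i + 1];    /* B0, B1, B2, ... */
    }
}
--- End code ---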
This is essentially solving the rotated polygon problem that GPU manufacturers solved 20 years ago, but solving it in a way that can fit in a relatively inexpensive FPGA and doing it at 100,000 waves/sec (60 Mpoints/sec plotted). And then doing it with vectors or dots between points - ArmWave is just dots for now though there is a prototype slower vector plotter I have written somewhere.
If you look at Rigol DS1000Z then you can see a fairly hefty SRAM chip attached to the FPGA, in addition to a regular DDR2/3 memory device. It is almost certain that the DDR memory is used just for waveform acquisition and that the waveform is rendered into the SRAM buffer and then streamed to the i.MX processor (possibly over the camera port like I am using.) Whether the FPGA colourises the camera data or whether Rigol use the i.MX's ISP block to do that is unknown to me. Rigol likely chose an expensive SRAM because it allows for true random access with minimal penalty in jumping to random addresses.
For anyone curious, here's the current source code for ArmWave, the rendering engine presently in use:
https://github.com/tom66/armwave/blob/master/armwave.c
This is about as fast as you will get an ARM rendering engine using just one core, and it has been profiled to death and back again. Four cores would make it faster, although some of the limitation comes from memory bus performance. It's at about 20 cycles per pixel plotted right now.
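For a sense of scale, 60 Mpoints/sec plotted at ~20 cycles per point works out to roughly 1.2 Gcycles/sec, which is essentially a full core on the Pi before anything else gets done - hence the single-core ceiling.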
Fungus:
--- Quote from: james_s on November 16, 2020, 06:38:53 pm ---I loathe touchscreens, I tolerate one on my phone because of obvious constraints with the form factor but ... roughly the same price will get me a 4 channel Siglent in a nice molded housing with real buttons and knobs and support
--- End quote ---
Trust me: The knobs are OK for things like adjusting the timebase but a twisty, pushable, multifunction knob is not better for navigating menus, choosing options, etc.
e.g. Look at the process of enabling a bunch of on-screen measurements on a Siglent. Does that seem like the best way?
https://youtu.be/gUz3KYp_5Tc?t=2925
tautech:
--- Quote from: Fungus on November 16, 2020, 06:56:14 pm ---
Look at the process of enabling a bunch of on-screen measurements on a Siglent. Does that seem like the best way?
--- End quote ---
Best is accurate:
https://www.eevblog.com/forum/testgear/testing-dso-auto-measurements-accuracy-across-timebases/
sb42:
--- Quote from: Fungus on November 16, 2020, 06:56:14 pm ---
--- Quote from: james_s on November 16, 2020, 06:38:53 pm ---I loathe touchscreens, I tolerate one on my phone because of obvious constraints with the form factor but ... roughly the same price will get me a 4 channel Siglent in a nice molded housing with real buttons and knobs and support
--- End quote ---
Trust me: The knobs are OK for things like adjusting the timebase but a twisty, pushable, multifunction knob is not better for navigating menus, choosing options, etc.
e.g. Look at the process of enabling a bunch of on-screen measurements on a Siglent. Does that seem like the best way?
https://youtu.be/gUz3KYp_5Tc?t=2925
--- End quote ---
Also, with a USB port it might be possible to design something around a generic USB input interface like this one:
http://www.leobodnar.com/shop/index.php?main_page=product_info&cPath=94&products_id=300
nctnico:
--- Quote from: tom66 on November 16, 2020, 06:41:11 pm ---Another challenge I am working on is how to do the rendering all on the FPGA.
This would free up the CPUs of the Pi and the GPU could be used for e.g. FFTs and 2D acceleration tasks.
--- End quote ---
I'm not saying it can't be done but you also need to address (literally) shifting the dots so they match the trigger point.
IMHO you are at a crossroads where you either choose a high update rate but poor analysis features and few people being able to work on it (coding HDL), or a lower update rate with lots of analysis features and many people able to work on it (using OpenCL or even Python extensions). Another advantage of a software / GPU architecture is that you can move to higher-performance hardware simply by taking the software to a different platform. Think about the NVidia Jetson / Xavier modules, for example: a Jetson TX2 module with 128 Gflops of GPU performance starts at $400, and more GPU power automatically translates to a higher update rate. This is also how the LeCroy software works; look at how LeCroy's WavePro oscilloscopes perform and how a better CPU and GPU drastically improve the performance.
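To give an idea of what that looks like in practice, the core of a GPU dot-density renderer is just an atomic histogram; something along these lines in OpenCL (kernel, names and sizes purely illustrative, one work-item per sample, enqueued once per captured wave):

--- Code: ---
/* Illustrative OpenCL kernel only: accumulate one captured wave into a
 * column-major intensity map. Host-side setup (context, buffers, NDRange)
 * is omitted. */
__kernel void plot_dots(__global const uchar *samples,   /* wave_len 8-bit samples    */
                        __global uint        *intensity, /* wave_len*height counters  */
                        const uint            wave_len,
                        const uint            height)
{
    uint x = get_global_id(0);                  /* one work-item per sample / X pixel */
    if (x >= wave_len)
        return;

    uint y = ((uint)samples[x] * height) >> 8;  /* map 8-bit sample to a Y coordinate */
    atomic_inc(&intensity[x * height + y]);     /* hit count; atomics keep it correct */
}
--- End code ---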