A High-Performance Open Source Oscilloscope: development log & future ideas
Fungus:

--- Quote from: tom66 on November 16, 2020, 06:41:11 pm ---If you look at the Rigol DS1000Z, you can see a fairly hefty SRAM chip attached to the FPGA, in addition to a regular DDR2/3 memory device.  It is almost certain that the DDR memory is used just for waveform acquisition, and that the waveform is rendered into the SRAM buffer and then streamed to the i.MX processor (possibly over the camera port, as I am doing).  Whether the FPGA colourises the camera data or whether Rigol use the i.MX's ISP block to do that is unknown to me.  Rigol likely chose an expensive SRAM because it allows true random access with minimal penalty when jumping to random addresses.

--- End quote ---

I believe the Rigol main CPU can only "see" a window of 1200 samples at a time, as decimated by the FPGA. This is why all the DS1054Z measurements are done "on screen", etc.

1200 samples is twice the screen width (600 pixels).

tom66:

--- Quote from: nctnico on November 16, 2020, 07:35:36 pm ---IMHO you are at a crossroads where you either choose to implement a high update rate with poor analysis features and few people able to work on it (coding HDL), or a lower update rate with lots of analysis features and many people able to work on it (using OpenCL or even Python extensions). Another advantage of a software / GPU architecture is that you can move to higher-performance hardware simply by taking the software to a different platform. Think about the NVidia Jetson / Xavier modules, for example: a Jetson TX2 module with 128 GFLOPS of GPU performance starts at $400, and more GPU power automatically translates to a higher update rate. This is also how the LeCroy software works; look at how LeCroy's WavePro oscilloscopes work, and how a better CPU and GPU drastically improve their performance.

--- End quote ---

I agree, although there's no reason you can't do both; I had always intended for the waveform data to be read out by the main application software in a pipeline separate from the render pipeline.  In a very early prototype, I did that by changing the Virtual Channel ID of the data set, so you could set up two simultaneous receiving engines.

What this means is that although the render engine might be complex HDL, you'll still be able to read linear waveform data in every case.  I'd like this to interface well with NumPy arrays and Python slices, as well as a fast C API for reading the data.
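
As a rough sketch of what that Python-side readout could look like (all the names here, like read_waveform, are hypothetical, not a real API yet):

--- Code: ---
import numpy as np

def read_waveform(channel: int, start: int, count: int) -> np.ndarray:
    """Fetch `count` raw 8-bit samples of `channel` starting at sample `start`."""
    # A real implementation would call down into the fast C API (ctypes/cffi)
    # and DMA-read the acquisition buffer; this stub just returns zeros and
    # ignores `channel` and `start`.
    raw = bytes(count)
    return np.frombuffer(raw, dtype=np.int8)

wave = read_waveform(channel=1, start=0, count=1_000_000)
window = wave[250_000:250_100]            # ordinary NumPy slicing
print(float(window.mean()), len(window))  # analysis happens on the host
--- End code ---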

But it would be good to ask: do people really, genuinely benefit from 100 kwaves/sec?  I regard intensity grading as a "must have", so the product absolutely will have it, but is 30 kwaves/sec "good enough" for almost all uses, such that potential users would not notice the difference?  I have access to a Keysight DSOX2012A right now, and I wouldn't say its intensity grading function is that much more useful than my Rigol DS1074Z's, despite the Keysight scope having an on-paper spec of ~8x that of the Rigol.
 
Certainly, a more useful function would (in my mind) be the rolling history function combined with >900 Mpts of sample memory, so you can go back up to ~90 seconds in time and see what the scope was showing at that moment.  I also find the Rigol's ~24 Mpt memory far more useful than the ~100 kpt memory of the Keysight.
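
(For the record, the arithmetic behind those figures, assuming an effective rate of 10 MSa/s, which is my assumption to make the numbers line up:)

--- Code: ---
# Seconds of rolling history for a given memory depth and effective sample rate.
def history_seconds(depth_pts: float, rate_sa_s: float) -> float:
    return depth_pts / rate_sa_s

print(history_seconds(900e6, 10e6))   # 90.0 s  (this design's target)
print(history_seconds(24e6, 10e6))    #  2.4 s  (Rigol-class memory)
print(history_seconds(100e3, 10e6))   #  0.01 s (Keysight-class memory)
--- End code ---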

Shifting the dots is computationally simple, even with sin(x)/x interpolation (which is not yet implemented).  It's just offsetting a read pointer plus a 64-bit rotate by a multiple of 8 bits: practically perfect FPGA territory.  In the present implementation I simply read 0..3 dummy words from the FIFO, then rotate two words to get the final byte offset.
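
In software terms, the alignment step looks something like this (a Python model of the idea, assuming 8 samples packed little-endian into each 64-bit word; the FPGA does the same job with a barrel shifter):

--- Code: ---
def aligned_words(words: list[int], sample_offset: int):
    """Realign 64-bit words so the output starts at an arbitrary sample."""
    word_skip, byte_rot = divmod(sample_offset, 8)   # 8 samples per word
    w = words[word_skip:]                            # coarse: skip dummy words
    for lo, hi in zip(w, w[1:]):
        pair = (hi << 64) | lo                       # fine: 128-bit window...
        yield (pair >> (8 * byte_rot)) & (2**64 - 1) # ...shifted by whole bytes

# Sample n lives in byte n; request an offset of 3 samples:
words = [0x0706050403020100, 0x0F0E0D0C0B0A0908, 0x1716151413121110]
print([f"{w:016x}" for w in aligned_words(words, 3)])
# ['0a09080706050403', '1211100f0e0d0c0b']  -> output now starts at sample 3
--- End code ---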
tom66:

--- Quote from: Fungus on November 16, 2020, 08:13:24 pm ---
--- Quote from: tom66 on November 16, 2020, 06:41:11 pm ---If you look at the Rigol DS1000Z, you can see a fairly hefty SRAM chip attached to the FPGA, in addition to a regular DDR2/3 memory device.  It is almost certain that the DDR memory is used just for waveform acquisition, and that the waveform is rendered into the SRAM buffer and then streamed to the i.MX processor (possibly over the camera port, as I am doing).  Whether the FPGA colourises the camera data or whether Rigol use the i.MX's ISP block to do that is unknown to me.  Rigol likely chose an expensive SRAM because it allows true random access with minimal penalty when jumping to random addresses.

--- End quote ---

I believe the Rigol main CPU can only "see" a window of 1200 samples at a time, as decimated by the FPGA. This is why all the DS1054Z measurements are done "on screen", etc.

1200 samples is twice the screen width (600 pixels).

--- End quote ---

Yes, it seems likely to me that it is transmitted as an embedded line in whatever interface carries the video data.  The window is about 600 pixels across, so it makes sense that they would use e.g. the top eight lines of the frame for this data, two per channel.  It is also clear that Rigol use a 32-bit data bus instead of my 64-bit bus, as their holdoff/delay counter step is half of mine (my holdoff setting has 8 ns resolution due to the 125 MHz clock; theirs is 4 ns at 250 MHz).  They use a Spartan-6 with fewer LUTs than my 7014S, so perhaps there's a trade-off there.
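
The arithmetic behind that deduction, as a quick sketch (assuming 8-bit samples and that the counter ticks once per fabric clock):

--- Code: ---
# Bus width determines the fabric clock needed to sustain 1 GSa/s of 8-bit
# samples, and the holdoff/delay counter step is one clock period.
def holdoff_step_ns(sample_rate_hz: float, bus_width_bits: int) -> float:
    samples_per_clock = bus_width_bits // 8        # 8-bit samples per word
    clock_hz = sample_rate_hz / samples_per_clock
    return 1e9 / clock_hz

print(holdoff_step_ns(1e9, 64))   # 8.0 ns -> 125 MHz clock (this design)
print(holdoff_step_ns(1e9, 32))   # 4.0 ns -> 250 MHz clock (Rigol, inferred)
--- End code ---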

I am almost certain (though I have not physically confirmed it) that the Rigol does all the render work on the FPGA.  Perhaps they use the i.MX CPU for the anti-alias mode, which gets very slow on longer timebases as it appears to render more (all?) of the samples.

The Rigol also does not decimate the data when rendering waveforms, so you can get aliasing, although these are fairly infrequent corner cases.
nctnico:

--- Quote from: tom66 on November 16, 2020, 08:13:55 pm ---
--- Quote from: nctnico on November 16, 2020, 07:35:36 pm ---IMHO you are at a crossroads where you either choose to implement a high update rate with poor analysis features and few people able to work on it (coding HDL), or a lower update rate with lots of analysis features and many people able to work on it (using OpenCL or even Python extensions). Another advantage of a software / GPU architecture is that you can move to higher-performance hardware simply by taking the software to a different platform. Think about the NVidia Jetson / Xavier modules, for example: a Jetson TX2 module with 128 GFLOPS of GPU performance starts at $400, and more GPU power automatically translates to a higher update rate. This is also how the LeCroy software works; look at how LeCroy's WavePro oscilloscopes work, and how a better CPU and GPU drastically improve their performance.

--- End quote ---

I agree, although there's no reason you can't do both; I had always intended for the waveform data to be read out by the main application software in a pipeline separate from the render pipeline.  In a very early prototype, I did that by changing the Virtual Channel ID of the data set, so you could set up two simultaneous receiving engines.

What this means is that although the render engine might be complex HDL, you'll still be able to read linear waveform data in every case.  I'd like this to interface well with NumPy arrays and Python slices, as well as a fast C API for reading the data.

But it would be good to ask: do people really, genuinely benefit from 100 kwaves/sec?  I regard intensity grading as a "must have", so the product absolutely will have it, but is 30 kwaves/sec "good enough" for almost all uses, such that potential users would not notice the difference?  I have access to a Keysight DSOX2012A right now, and I wouldn't say its intensity grading function is that much more useful than my Rigol DS1074Z's, despite the Keysight scope having an on-paper spec of ~8x that of the Rigol.
 
Certainly, a more useful function would (in my mind) be the rolling history function combined with >900 Mpts of sample memory, so you can go back up to ~90 seconds in time and see what the scope was showing at that moment.  I also find the Rigol's ~24 Mpt memory far more useful than the ~100 kpt memory of the Keysight.

Shifting the dots is computationally simple, even with sin(x)/x interpolation (which is not yet implemented).  It's just offsetting a read pointer plus a 64-bit rotate by a multiple of 8 bits: practically perfect FPGA territory.  In the present implementation I simply read 0..3 dummy words from the FIFO, then rotate two words to get the final byte offset.

--- End quote ---
Personally I don't have a real need for high waveform update rates. Deep memory is useful (either as a continuous record or as a segmented / history buffer; segmented and history are very much the same thing). But with deep memory also comes the requirement to be able to process it quickly.

Nearly two decades ago I embarked on a similar project where I tried to cram all the real-time & post-processing into the FPGAs. In the end you only need to fill the width of a screen, which is practically 2000 pixels. This greatly reduces the bandwidth towards the display section, but it needs a huge effort on the FPGA side. The design I made could go through 1 Gpts of 10-bit data within 1 second and (potentially) produce multiple views of the data at the same time. The rise of cheap Asian oscilloscopes made me stop the project. If I were to take on such a project today I'd go the GPU route and do as little as possible inside an FPGA; I think creating trigger engines for protocols and special signal shapes will be challenging enough already.
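
To illustrate the kind of reduction I mean (a minimal NumPy sketch, not my original design): min/max peak-detect decimation collapses an arbitrarily deep record into ~2000 screen columns while keeping glitches that plain stride decimation (record[::N]) would miss, and it maps naturally onto a per-column GPU reduction.

--- Code: ---
import numpy as np

def peak_detect(record: np.ndarray, columns: int = 2000) -> np.ndarray:
    """Reduce a deep record to (columns, 2) min/max pairs, one per screen column."""
    usable = len(record) - len(record) % columns   # drop the ragged tail
    chunks = record[:usable].reshape(columns, -1)  # one row per screen column
    return np.stack([chunks.min(axis=1), chunks.max(axis=1)], axis=1)

record = np.random.randint(-512, 512, 10_000_000, dtype=np.int16)  # fake 10-bit data
envelope = peak_detect(record)
print(envelope.shape)   # (2000, 2): enough for a 2000-pixel-wide trace
--- End code ---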
asmi:

--- Quote from: tom66 on November 16, 2020, 01:18:52 pm ---The bandwidth of this interface is less critical than it sounds: for an 8 Gbit/s ADC (1 GSa/s, 8-bit), just 10 LVDS pairs are needed.  A modern FPGA has 20+ on a single bank, and on the Xilinx 7-series parts each pin has an independent ISERDESE2/OSERDESE2, which means you can deserialise and serialise as needed on the fly on each pin.  There are routing and timing considerations, but I've not had an issue with the current block running at 125 MHz; I think I might run into issues trying to get it above 200 MHz with a standard -3 grade part.

--- End quote ---
Once you get into the gigasample range, ADCs quickly become JESD204B-only, which is itself a separate big can of worms. And many of them will happily send 12 Gbps per lane or even more; for that you will need something more recent than the 7 series (or a Virtex-7: I think those can go that high, though I have no personal experience).
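
For a sense of scale, here is the lane arithmetic for both the LVDS figures quoted above and a JESD204B link (a rough sketch; the x1.25 factor is the standard 8b/10b coding overhead of JESD204B, and the exact lane rates depend on the part):

--- Code: ---
import math

def lanes_needed(sample_rate_hz: float, bits: int,
                 lane_rate_bps: float, coding_overhead: float = 1.0) -> int:
    """Minimum serial lanes to carry the raw ADC payload."""
    payload_bps = sample_rate_hz * bits * coding_overhead
    return math.ceil(payload_bps / lane_rate_bps)

# 1 GSa/s x 8 bit = 8 Gbit/s over LVDS pairs at ~800 Mb/s each:
print(lanes_needed(1e9, 8, 800e6))          # 10 pairs, as in the quote
# The same payload over JESD204B (8b/10b coding) at 12.5 Gb/s per lane:
print(lanes_needed(1e9, 8, 12.5e9, 1.25))   # 1 lane
--- End code ---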