An analog scope only presents a comb of frequencies determined by the timebase settings. A digital storage scope acquires a bandwidth determined by the sampling rate. In typical usage a DSO displays only the same narrow range of frequencies that an analog scope displays; it is only in single-shot mode that you see the full bandwidth the DSO acquired. The rest of the time the display is dominated by the periodic component.
What a DSO displays in "normal" mode is a set of boxcars in time set by the sweep rate, display width, and trigger rate. In frequency, that is a set of sinc functions. This is basic Wiener-Shannon-Nyquist mathematics. Relative to the Fourier spectrum set by the sample rate, that signal may or may not be "sparse" in the sense compressive sensing requires, but overwhelmingly the odds are that it is. Donoho and Candes have presented rigorous proofs; should you want that rigor, you must read their papers. But be warned: after spending 15 pages proving a single theorem, Donoho remarks that doubtless the reader will be relieved to know that the proofs of theorems 2 and 3 are much shorter. In fact, both are 2-3 sentences.
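As a quick numerical check of the boxcar-to-sinc correspondence (my own toy illustration, not from F&R; the record length and window width are arbitrary), the magnitude spectrum of a rectangular acquisition window matches the sinc-shaped Dirichlet kernel exactly:

```python
import numpy as np

N = 1024                       # record length in samples
width = 64                     # boxcar (acquisition window) width in samples

x = np.zeros(N)
x[:width] = 1.0                # rectangular "boxcar" in time

X = np.abs(np.fft.rfft(x))     # magnitude spectrum of the boxcar
f = np.fft.rfftfreq(N)         # normalized frequency, cycles per sample

# Closed form: |sin(pi f W) / sin(pi f)|, the discrete-time sinc (Dirichlet kernel)
with np.errstate(divide="ignore", invalid="ignore"):
    sinc_like = np.abs(np.sin(np.pi * f * width) / np.sin(np.pi * f))
sinc_like[0] = width           # limit at f = 0

print(np.max(np.abs(X - sinc_like)))   # ~0 (machine precision): boxcar in time <-> sinc in frequency
```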
In the figures I posted from F&R, there are 64 frequencies, but only 5 of them have non-zero coefficients. There is a very sharp threshold between sparse (and solvable) and not sparse (and not solvable). In the latter case one must revert to Wiener-Shannon-Nyquist.
If you know nothing about the signal, you can only acquire it with a single-shot sweep of sufficient duration to capture the entire signal. DSOs have made things much easier, but consider an analog storage scope. Do you really expect to see a 1 ns pulse on a 1 ms sweep? That pulse is one millionth of the sweep length. A modern low-end DSO will acquire 10 million samples at 1 GS/s, but the display is typically less than 1000 samples wide. Zoom mode might let you find the pulse in a single-shot record, but the only way you will see it in normal mode is if it is repetitive.
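To put rough numbers on that (a back-of-the-envelope sketch using the round figures above; the 1000-sample display width is an assumption about a typical screen):

```python
sweep = 1e-3            # 1 ms sweep
pulse = 1e-9            # 1 ns pulse
sample_rate = 1e9       # 1 GS/s
record = 10_000_000     # 10 Msample acquisition
display = 1000          # assumed display width in samples

print(pulse / sweep)            # 1e-06 -> the pulse is one millionth of the sweep
print(record / sample_rate)     # 0.01  -> the full record spans 10 ms
print(pulse * sample_rate)      # 1.0   -> the pulse occupies roughly one sample
print(record / display)         # 10000 -> samples collapsed into each display column
```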
In practice, the signal of interest usually dominates the spectrum and is sparse; the transform coefficients at the other frequencies are low-amplitude noise. This is the basis of all the lossy compression algorithms such as JPEG, MP3, etc. Compressive sensing simply merges the sampling and the compression into a single operation.
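A toy illustration of that point (my own example, assuming a couple of tones riding on low-level noise; the frequencies and 1% threshold are arbitrary): almost all of the FFT coefficients are negligible, which is exactly what a lossy coder throws away and what compressive sensing exploits at acquisition time.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4096
t = np.arange(N)

# Two tones, chosen to land on exact FFT bins, plus low-level noise.
signal = (np.sin(2 * np.pi * 50 / N * t)
          + 0.5 * np.sin(2 * np.pi * 175 / N * t)
          + 0.01 * rng.standard_normal(N))

spectrum = np.abs(np.fft.rfft(signal))
threshold = 0.01 * spectrum.max()       # keep anything above 1% of the peak
print(np.count_nonzero(spectrum > threshold), "of", spectrum.size,
      "coefficients carry essentially all of the signal")
```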
DSOs display thousands of waveforms per second. Is your eye able to discern that? No. Even with a probability density display, a single spike will be undetectable.
I am not in the same league with Donoho and Candes. So I cannot explain this as well as they can. Both write well and provide all the rigor you can stand. Unless you *really* care about the fine print, I suggest reading the introductions and skipping the proofs.
http://statweb.stanford.edu/~donoho/reports.html
https://statweb.stanford.edu/~candes/publications.html
There are references on their home pages, one level up, to lay press discussion of the subject.
Yes, we were taught that all this is wrong, which is why, when I ran into it by accidentally doing it, I *had* to know why. It cost me a few thousand hours of effort over most of 5-6 years. If, after you have read their papers, you still think it's wrong, take it up with them. I've cited numerous peer-reviewed papers.
To repeat an earlier comment: the gist of the matter is solving Ax = y using an L1 criterion, where y is a randomly sampled time series and x is the vector of positive-frequency Fourier transform coefficients, then back transforming using the inverse FFT. A 5 year old probably would not understand that, but it's as simple as I know how to make it. A search on "compressive sensing" will turn up a very large number of examples in the form of graphs and images. The big breakthrough was showing that L1 has *very* different properties from L2.
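Here is a minimal numerical sketch of that, not the method from any particular paper: 64 frequency bins with 5 non-zero coefficients (the same sizes as the F&R figures), a random subset of the time samples, and the L1 problem solved with plain iterative soft thresholding (ISTA) rather than the basis-pursuit solvers used in the literature. The sample count M, penalty lam, and iteration count are my own arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 64          # frequency bins (as in the F&R example)
K = 5           # non-zero frequencies (the sparsity)
M = 24          # random time samples actually acquired

# Construct a signal that is K-sparse in the frequency domain.
coeffs = np.zeros(N, dtype=complex)
support = rng.choice(N, size=K, replace=False)
coeffs[support] = rng.standard_normal(K) + 1j * rng.standard_normal(K)
signal = np.fft.ifft(coeffs)                        # full time record, length N

# y is a random subset of the time samples; A maps coefficients to those samples.
sample_idx = np.sort(rng.choice(N, size=M, replace=False))
y = signal[sample_idx]
A = np.fft.ifft(np.eye(N), axis=0)[sample_idx, :]   # partial inverse-DFT matrix

# Solve min 0.5*||A c - y||^2 + lam*||c||_1 by iterative soft thresholding (ISTA).
lam = 1e-4
step = 1.0 / np.linalg.norm(A, 2) ** 2              # step <= 1/L, L = ||A||_2^2
c = np.zeros(N, dtype=complex)
for _ in range(20000):
    z = c - step * (A.conj().T @ (A @ c - y))       # gradient step on the data term
    mag = np.abs(z)
    c = z * np.maximum(1.0 - step * lam / np.maximum(mag, 1e-12), 0.0)  # soft threshold

recovered = np.fft.ifft(c)                          # back transform with the inverse FFT
print("recovered support:", np.sort(np.flatnonzero(np.abs(c) > 1e-3)))
print("true support:     ", np.sort(support))
print("max time-domain error:", float(np.max(np.abs(recovered - signal))))
```

With enough samples relative to the sparsity, the recovered support should match the true one and the back-transformed record should agree with the original; cut M well below that and the L1 solution falls apart, which is the sharp threshold mentioned above.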
If you're a computer geek and know what NP-hard is, then I strongly recommend Donoho's 2004 papers on the equivalence of the L0 and L1 solutions. To the best of my knowledge, that is a major milestone. I presume the computational complexity crowd has been working feverishly on this, but I've not looked into the matter. So far as I know, it is the first and only instance in which large NP-hard problems have been solved.