Your requirements seem a bit vague.
To get any meaningful suggestions, you would at least need to specify the longest and shortest desired intervals between two consecutive samples, as well as the required resolution and accuracy.
As you are probably aware, ADC clock circuits are usually designed to MINIMIZE jitter, not to add to it intentionally.
Most high speed, high resolution converters have pipeline architectures and rely on each stage getting the same amount of time.
If you mess with that, the conversion results will no longer be linear.
Furthermore, the data outputs are usually phase-locked to the sample clock. A little bit of jitter there is OK, but if you are looking for big cycle-to-cycle variations, the data transmission will no longer work.
I would also be interested to hear what type of signals you are trying to work with.
I started reading through the Foucart / Rauhut book you linked to and while I don't pretend to understand the math involved, some of the applications seem very interesting...
Yes. ADCs are designed that way because all the classical mathematics based on Wiener's "Extrapolation, Interpolation and Smoothing of Stationary Time Series" requires regular sampling. The monograph appeared during WW II as a classified report bound in yellow to denote its classified status and was popularly called "the yellow peril" because of the heavy math it contains. It's kindergarten level relative to Foucart and Rauhut.
The spatial regularization of seismic data was a perennial topic at the Society of Exploration Geophysicists annual meeting for many years. There have been a great many MS & PhD theses written on it and innumerable proprietary algorithms. Then one day we woke up to discover that irregularly sampled data is not a problem, but a virtue. But we were heavy sleepers and didn't wake up for 5 or 6 years.
For anyone interested in this who lacks a high threshold for mathematical pain, I strongly suggest reading the introductions of the papers on David Donoho's and Emmanuel Candes' websites from 2004 to 2009. Just skip the math proofs. Search generally on "compressive sensing" and read the non-mathematical sections of the papers. I read Foucart and Rauhut twice, and Mallat's "A Wavelet Tour of Signal Processing" 3rd ed., because I wanted to understand why something I *knew* was impossible was actually possible.
For many years I thought that you could regularize data by taking a discrete Fourier transform and inverse transforming onto a regular grid. But when I actually got around to trying it, I discovered that it doesn't work because of the L2 (least-squares) assumption built into the definition.
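A minimal numpy sketch of the failure I'm describing (my own toy example, not the data I actually worked with): take a signal that is just 3 cosines, keep a handful of irregularly placed samples, and fit the full cosine series with the minimum-norm least-squares solution. The fit matches the samples perfectly, but the sparse spectrum comes back smeared across all the coefficients.

```python
import numpy as np

# Toy example: a 128-point signal built from 3 cosines out of 65 possible,
# observed at only 16 randomly chosen time points.  The minimum-L2-norm
# solution consistent with those samples does NOT recover the sparse
# spectrum -- it spreads the energy over the whole coefficient vector.
rng = np.random.default_rng(0)
N, m = 128, 16
K = N // 2 + 1                               # number of cosine coefficients
n = np.arange(N)

c_true = np.zeros(K)                         # sparse cosine-series spectrum
c_true[[5, 12, 37]] = [1.0, 0.8, 0.6]

basis = np.cos(2 * np.pi * np.outer(n, np.arange(K)) / N)
x = basis @ c_true                           # full time-domain trace

keep = np.sort(rng.choice(N, m, replace=False))
A, y = basis[keep], x[keep]                  # 16 irregular samples

c_l2 = np.linalg.pinv(A) @ y                 # minimum-norm least-squares fit
print("fits the samples:", np.allclose(A @ c_l2, y))
print("L2 spectrum error:", np.linalg.norm(c_l2 - c_true))
```

The min-norm solution is the projection of the true coefficient vector onto the 16-dimensional row space of `A`, so most of the spectrum's energy is simply lost.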
As applicable to a DSO, what is being done is a forward Fourier transform using an L1 (least summed absolute error) algorithm instead of the L2 (least squares) implicit in the DFT and then doing a normal inverse FFT. What Donoho and Candes discovered is that the L1 solution has magical properties.
The figures are from Foucart and Rauhut. The discrete Fourier spectrum shown at the top of Fig 1.2 is evaluated in the time domain in the bottom part, and then 16 of the 128 samples forming the time domain trace are randomly chosen. The top part of Fig 1.3 is the result of attempting to recover the spectrum using a DFT, as I had done. The bottom is the spectrum recovered by solving an L1 problem. The example is noiseless, but it demonstrates an 8x reduction in sampling: the 128-coefficient series is exactly recovered from just 16 samples. A sinc interpolator was used for the figure in the bottom part of Fig 1.2.
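The L1 idea can be sketched in a few lines of Python by posing basis pursuit (minimize the L1 norm of the coefficients subject to matching the samples) as a linear program. This is my own illustration, not the book's code: I use a real cosine basis instead of the complex DFT so that plain `scipy.optimize.linprog` applies, and the sizes (40 of 128 samples, 3 active frequencies) are my choices, more generous than the book's 16-of-128 experiment.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
N, m = 128, 40
K = N // 2 + 1                               # number of cosine coefficients
n = np.arange(N)

c_true = np.zeros(K)                         # sparse spectrum to recover
c_true[[5, 12, 37]] = [1.0, 0.8, 0.6]

basis = np.cos(2 * np.pi * np.outer(n, np.arange(K)) / N)
keep = np.sort(rng.choice(N, m, replace=False))
A, y = basis[keep], (basis @ c_true)[keep]   # m irregular time samples

# Basis pursuit:  min ||c||_1  s.t.  A c = y,  rewritten as an LP over
# variables [c, t]:  min sum(t)  s.t.  c - t <= 0,  -c - t <= 0,  A c = y.
I = np.eye(K)
res = linprog(
    c=np.concatenate([np.zeros(K), np.ones(K)]),
    A_ub=np.block([[I, -I], [-I, -I]]), b_ub=np.zeros(2 * K),
    A_eq=np.hstack([A, np.zeros((m, K))]), b_eq=y,
    bounds=[(None, None)] * K + [(0, None)] * K,
)
c_l1 = res.x[:K]
print("L1 spectrum error:", np.max(np.abs(c_l1 - c_true)))
```

Unlike the least-squares fit, the L1 solution lands exactly on the sparse spectrum, which is the "magical property" Donoho and Candes proved.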
I now feel pretty confident that what is needed is to generate a clock with Gaussian-distributed, zero-mean jitter from a PRNG and use that to clock the ADC. Because of the ADC conversion time, increasing the BW is probably more difficult than decreasing the data rate. In concrete terms, the shortest interval to the next sample must be no less than the mean interval minus two standard deviations. My inference that the BW is governed by the granularity of the clock timing may not be entirely correct. I've never had an opportunity to discuss this with anyone else who is familiar with it. A good friend who is familiar with it won't wrestle with the math unless he's getting paid.
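A quick sketch of the jitter generation I have in mind, with the interval floored at the mean minus two standard deviations. The period and sigma are placeholders, and numpy's generator stands in for whatever PRNG would drive the hardware:

```python
import numpy as np

# Jittered sample clock: intervals drawn from a Gaussian with mean T and
# sigma = T/8, floored at T - 2*sigma so the ADC's minimum conversion
# time is respected.  T and sigma here are arbitrary placeholders.
rng = np.random.default_rng(42)              # stands in for the hardware PRNG
T, sigma, n_samples = 1.0, 0.125, 10_000
t_min = T - 2 * sigma

intervals = np.clip(rng.normal(T, sigma, n_samples), t_min, None)
sample_times = np.cumsum(intervals)          # irregular ADC trigger instants

print("min interval:", intervals.min(), "mean interval:", intervals.mean())
```

Clipping only the lower tail nudges the mean interval slightly above T; for sigma = T/8 the shift is well under 1%, so the average data rate is essentially unchanged.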
The first chapter of F&R is easy going as it just discusses the motivation and applications. It's really grim after that. In practice it's like the FFT. You just run the program once you have the software.
Search on "single pixel camera" for another application. It's the work of Mark Davenport, Richard Baraniuk et al at Rice and *very* cool. The math in F&R is also the key element in the algorithm that won the Netflix prize for predicting what movies people might like.
Explaining the details of the concept in response to people's comments has been a huge help. Without it I would have blundered down several blind alleys.
I started the thread thinking about ADCs in the manner of an MCU, where you can set a timer to go off and trigger the ADC sample. However, the DDR mode of the ADC08DL502 samples on both edges of the clock. Hence what's needed is a clock with a lot of deterministic (PRNG-generated, known in advance) jitter. As @nctnico pointed out, the AXI DMA stream to DRAM will need to be fed by a FIFO that buffers the irregularly clocked output from the ADC.
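A toy occupancy model of that FIFO: an irregular producer (the jittered ADC clock) feeding a consumer (the DMA side) that drains at the mean sample rate after the FIFO is primed half full. The priming depth, rates, and jitter figures are my own assumptions, not anything measured:

```python
import numpy as np

# FIFO between the irregularly clocked ADC (producer) and a fixed-rate
# DMA reader (consumer).  The consumer starts after 'prime' samples have
# accumulated and then reads at the mean sample period T.
rng = np.random.default_rng(7)
T, sigma, n = 1.0, 0.125, 2000
dt = np.clip(rng.normal(T, sigma, n), T - 2 * sigma, None)
t_prod = np.cumsum(dt)                       # when each sample enters the FIFO

prime = 64                                   # samples buffered before draining
n_reads = n - prime
t_cons = t_prod[prime - 1] + T * np.arange(1, n_reads + 1)

produced = np.searchsorted(t_prod, t_cons, side="right")
occupancy = produced - np.arange(1, n_reads + 1)   # level after each read

print("FIFO occupancy: min", occupancy.min(), "max", occupancy.max())
```

Because producer and consumer share the same mean rate, the occupancy is a zero-drift random walk around the priming level; the walk's excursions grow only as the square root of the run length, which is what sets the required FIFO depth.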