SerDes headache on Xilinx FPGA

Alex:
Do you have any experience debugging SerDes inputs? Apologies for the lengthy post, this is proving a particularly difficult bug.  |O

This concerns a high speed parallel DAQ.  Right now we are getting erroneous output data when sampling a sinewave (attached image).

We have 14 bits coming in from the ADC, serial, DDR at 580 Mb/s. Xilinx SerDes IP cores implement 1:7 decoding, we then implement 7:14 demultiplexing. The data from this is often stable, outputting correct portions of the input sinusoid (see attached).

The system 'bitslips' until the framing signal is correctly locked, i.e. '01111111000000' is bitslipped by one to obtain '11111110000000'. This works and remains locked for more than 10 minutes.
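
A rough software model of that framing search, as a sketch only: `bitslip` and `slips_to_lock` are hypothetical names, and one slip is assumed to be a plain one-bit left rotation, which may not match how the hardware bitslip reorders bits on a given device.

```python
FRAME_TARGET = "11111110000000"  # 14-bit framing word expected once locked


def bitslip(word: str) -> str:
    """Assumed model of one bitslip: rotate the word left by one bit."""
    return word[1:] + word[0]


def slips_to_lock(word: str, target: str = FRAME_TARGET) -> int:
    """Return how many slips are needed to match target, or -1 if none do."""
    for n in range(len(word)):
        if word == target:
            return n
        word = bitslip(word)
    return -1


print(slips_to_lock("01111111000000"))  # one slip, as in the example above
```

Logging the slip count at every lock would show whether the framing ever re-locks at a different offset during a long capture.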

The user demultiplexing appears to work fine, as we get portions of the sinusoid and the boundary between the top nibble (7 bits) and bottom nibble (7 bits) seems to also work fine.

In the output sinusoid, we get periodic amplitude jumps that happen to be exactly 2^10, i.e. 1024 from an erroneous bit toggle on the 10th bit. Rather than being a single sample glitch, this lasts for approx 50 samples. Meanwhile the other bits correctly switch, allowing the shape of the sinusoid to be correct within that 50 sample period. The output sinusoid therefore looks to have 2^10 amplitude jumps within it (see attached).
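
To locate those jumps in a capture rather than by eye, something like the following could be run over the FIFO dump. It is a minimal sketch: `find_bit_jumps` is a hypothetical helper, and the tolerance of 64 counts is an assumed allowance for the normal sample-to-sample slope of the sinusoid.

```python
def find_bit_jumps(samples, bit=10, tolerance=64):
    """Return (index, delta) pairs where consecutive samples differ by
    roughly +/- 2**bit, i.e. where that bit appears to toggle by itself."""
    step = 1 << bit
    hits = []
    for i in range(1, len(samples)):
        delta = samples[i] - samples[i - 1]
        if abs(abs(delta) - step) <= tolerance:
            hits.append((i, delta))
    return hits


# Made-up capture: bit 10 sticks high for three samples, then releases.
capture = [100, 110, 120, 1154, 1164, 1174, 160, 170]
print(find_bit_jumps(capture))
```

The spacing between the reported indices gives the period and duration of the fault, which can then be compared against anything else in the system that runs at that rate.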

Now, the 10th bit is in the middle of the top 7-bit nibble. As the other bits of that nibble are correct, it is clearly not an issue with the 7-bit to 14-bit time demultiplexing. Likewise, if it were PCB trace routing delays, it would be more random rather than periodic and would probably affect other bits. It would also change with different physical constraints and firmware re-builds as logic is moved around on the FPGA, which it doesn't. I've checked the bitslip functionality and it appears to be fine. The ADCs are definitely configured, and their clock (160 MHz) is stable.

As many of the bits are correct, and indeed bit 10 is often correct for 100s of samples, I don't think it is SerDes signalling (timing or voltages). I'm using a test widget to pretend it is the synchronisation module, allowing the RAW FIFO to be set up and to grab some data. With test data this is perfect. The test widget starts RAW acquisition 800-ish clock cycles after the widget is enabled from the embedded software, and crucially grabs data into the FIFO just once, hence no problem with data being written while we are reading the FIFO etc.

Now, there are other things going on. For example the input sinusoid has an amplitude of 200mVac and frequency 200kHz with signal generator DC bias set to the same as the ADC's natural DC biasing point. However with no change in hardware, if I increase amplitude to 400mVac, I get almost a flat-line with +/-200 DN noise, no sinusoid!

The issue doesn't seem to go away if we reduce the ADC speed; however, I suspect it is not LVDS SerDes signalling anyway, but somewhere in the 1:7:14 deserialisation process. I've re-coded the 7:14 user demultiplexers for more robust timing, demultiplexing in time rather than using two 7-bit registers with asynchronous multiplexers. That is, the time allowed for the demultiplexing action increases from tclk/2, in the case of asynchronous multiplexer CLBs, to the full tclk for synchronous 7:14 demultiplexing.
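
The 7:14 step is easy to model in software, which allows captured 7-bit words to be checked against the 14-bit output. A minimal sketch, assuming the first 7-bit word of each pair carries the upper bits; `combine_7_to_14` and its `msb_first` flag are hypothetical, and the real ordering depends on the ADC's serial format.

```python
def combine_7_to_14(first: int, second: int, msb_first: bool = True) -> int:
    """Join two consecutive 7-bit words into one 14-bit sample."""
    if not (0 <= first < 128 and 0 <= second < 128):
        raise ValueError("both inputs must be 7-bit values (0..127)")
    hi, lo = (first, second) if msb_first else (second, first)
    return (hi << 7) | lo


# The locked framing word is seven ones followed by seven zeros.
print(format(combine_7_to_14(0b1111111, 0b0000000), "014b"))
```

Under that ordering assumption, bit 10 of the sample is bit 3 of the upper 7-bit word, so `combine_7_to_14(0b0001000, 0)` gives 1024.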

Now, there are other issues, one being single sample glitches, the other being another periodic bit issue with the 14-bit sign bit (MSB), hence the offsets that go from positive to negative etc. But I expect those are symptoms of the same issue, i.e. fix the issue with bit 10 and bit 13 is also likely to be fixed.

Any pointers would be much appreciated. This is truly doing my head in...  :-//

hamster_nz:
Any chance of seeing the raw data?

External causes:
- Signal integrity on the serial data
- Intersymbol interference - too many '1's or '0's in a row cause the value to bleed into the following bit.

Maybe map out the raw data, as it is on the wire, and then 'corrected' raw data, and see what the differences look like.
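
One way to do that comparison is to XOR each raw word with its corrected value and count which bit positions differ. A minimal sketch, assuming both captures are already available as lists of 14-bit integers (`bit_error_histogram` is a hypothetical helper, and the values below are made up):

```python
from collections import Counter


def bit_error_histogram(raw, expected, width=14):
    """Count, per bit position, how many samples differ from the expected value."""
    mask = (1 << width) - 1
    counts = Counter()
    for r, e in zip(raw, expected):
        diff = (r ^ e) & mask
        for bit in range(width):
            if (diff >> bit) & 1:
                counts[bit] += 1
    return dict(counts)


raw = [0x0400, 0x0001, 0x2400]
expected = [0x0000, 0x0001, 0x0000]
print(bit_error_histogram(raw, expected))
```

If the errors pile up on one or two bit positions, that points at framing or bit ordering; errors spread across all bits look more like signal integrity or sampling phase.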

Internal causes:
- Bitslip - does it really work as you are expecting? I have never trusted or understood bitslip, and have sometimes implemented my own logic to find the correct framing.
- Sampling phase - if not already doing so, try adding an IDELAY2 and some option to set the delay. That will allow you to fine-tune when the bits are sampled. This can look a lot like a signal integrity issue.

In 7-series parts you can sort of automatically detect and adjust sampling phase, but you can't use 'wide' Serdes. You use two 4-bit Serdes, one sampling on the inverted clock, and then tune the IDELAY so that for one of the serdes blocks the transitions are picked up 50% of the time.

At 580 Mb/s you could also consider sampling twice as fast at 1:7, combining that into an oversampled 1:14 frame, and then tuning the delays so that you are sure that the transition occurs close to the odd bits, then use the even bits as good data.
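
The even/odd split can be tried on captured bits before committing it to logic. This is a sketch under assumptions: the frame is a string of '0'/'1' samples at twice the bit rate, even indices are taken as the mid-bit samples, and both function names are hypothetical.

```python
def split_oversampled(bits: str):
    """Split a 2x-oversampled bit string into (data, edge) samples.
    Even indices are assumed to sit mid-bit, odd indices near the transitions."""
    return bits[0::2], bits[1::2]


def edge_samples_consistent(bits: str) -> bool:
    """Where two neighbouring data samples agree there is no transition,
    so the edge sample between them should carry the same value."""
    data, edge = split_oversampled(bits)
    return all(
        edge[i] == data[i]
        for i in range(len(data) - 1)
        if data[i] == data[i + 1]
    )


good = "1100110000"  # data 1,0,1,0,0 with every bit sampled twice
bad = "1100110100"   # edge sample flips between two equal data bits
print(split_oversampled(good), edge_samples_consistent(good), edge_samples_consistent(bad))
```

A failed check means a sample changed where no transition should exist, which suggests the delay setting is putting the even samples too close to the edges.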

hamster_nz:
There also seems to be some sort of pattern of errors at a finer scale too.  maybe at the 256 or 128 counts.

langwadt:
I'm thinking that the timing or signal integrity is marginal, so occasionally an odd and an even bit get swapped/duplicated

can you set up the ADC to output a test pattern?

BrianHG:

--- Quote from: hamster_nz on June 21, 2017, 10:28:45 pm ---There also seems to be some sort of pattern of errors at a finer scale too.  maybe at the 256 or 128 counts.

--- End quote ---
If the serial bits are being shifted to the left or right by 1, the MSB may be mapped into the previous or next word's LSB. You could then see the type of waveform you are getting, since you see only the 7 LSBs on bits 7 through 1, with what should have been the MSB (bit 7) on the next or previous LSB (bit 0). This may create that type of chop in the waveform when your signal crosses the level which triggers the true MSB of the ADC.

Otherwise the output of your ADC is 2's complement AC and you are reading it as a straight linear signal, or vice versa.  This means your serdes is good, you just need to invert the 7 LSBs when the MSB is set, or change your graphing from an UNSIGNED BYTE to a SIGNED BYTE, or vice versa.
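
Both interpretations can be checked quickly on the captured words. A minimal sketch for the 14-bit samples in this thread; the function names are hypothetical, and which format the ADC actually outputs has to come from its datasheet or register settings.

```python
def twos_complement_to_signed(raw: int, width: int = 14) -> int:
    """Interpret a raw word as two's complement: the MSB carries weight -2**(width-1)."""
    raw &= (1 << width) - 1
    if raw & (1 << (width - 1)):
        return raw - (1 << width)
    return raw


def offset_binary_to_signed(raw: int, width: int = 14) -> int:
    """Interpret a raw word as offset binary: mid-scale (MSB set, rest clear) is zero."""
    return (raw & ((1 << width) - 1)) - (1 << (width - 1))


for word in (0x0000, 0x1FFF, 0x2000, 0x3FFF):
    print(hex(word), twos_complement_to_signed(word), offset_binary_to_signed(word))
```

Plotting the same capture through both functions should show which one gives a continuous sinusoid. If neither does, the fault is not just a number-format mismatch.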
