Topic: Which Cyclone PLL implementation is better?

Offline Miti (Topic starter)

  • Super Contributor
  • ***
  • Posts: 1324
  • Country: ca
Which Cyclone PLL implementation is better?
« on: April 22, 2020, 07:35:09 pm »
I’m trying to upscale a two-level PAL video signal from an HP SA to VGA 1024x768. For that, I want to genlock the FPGA clock to H-sync. The VGA clock must be as clean as possible, with low phase noise.

So here comes the question: in an Altera Cyclone, does the PLL output clock have lower or higher jitter than the input clock? Which is the better implementation in my attachment?

Thanks

Edit: Just a quick correction, the second implementation would require MK9173-01, not -15
« Last Edit: April 22, 2020, 08:47:04 pm by Miti »
Fear does not stop death, it stops life.
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 7720
  • Country: ca
Re: Which Cyclone PLL implementation is better?
« Reply #1 on: April 23, 2020, 02:06:17 am »
The wide output frequency range of the 'MK9173-01' may exceed the Cyclone PLL's +/- 10% input range setting while sync locking takes place, since the PAL video signal may come on and off.  To be safe, especially if you are using the Cyclone PLL to clock other peripherals like DDR RAM, you will need to attend to this.

I personally would use a cheap NTSC/PAL video decoder IC.  It will decode a clean 27MHz clock reference for you with HS, VS, and field ID, plus give you a 10-bit YUV decoded, sampled picture which you may genlock overlay or underlay on your VGA picture as well, all at 1.8V to 3.3V IO.  These ICs readily provide the 27MHz even if the source video is lost, without any large clock glitch, since the 27MHz has a narrow tuning window due to it being crystal generated.  If the source video is lost or scrambled, they will also give you a default blue screen, with an HS & VS they generate internally, so your output and your FPGA clocking system won't go corrupt and die.

(Bonus:  The decoded HS from these ICs will not have the double pulsing that PAL and NTSC signals carry during the VS and the stabilization period before and after it, which may mess up your PLL.)
(Some of the newer ones also provide a sync & clock cleaning function at the end of each video field when given a non-time-base-corrected VHS source, where a giant tear in HS is present as the final lines of video switch heads: one head leaves the bottom of the video tape and the next head begins at the top of the tape for the next field.)

From the video decoder IC's 27MHz, you may generate the 55.125MHz by setting the PLL to multiply by 49 and divide by 24.
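The ratio is easy to check with exact rational arithmetic. The following is only a sketch of the ideal relation f_out = f_in * M / N; the helper name `pll_output_mhz` is made up for illustration, and it ignores the VCO range limits a real Cyclone PLL also has to satisfy.

```python
from fractions import Fraction

def pll_output_mhz(f_in_mhz, multiply, divide):
    """Ideal PLL output: f_in * M / N (hypothetical helper, no VCO-range check)."""
    return Fraction(f_in_mhz) * multiply / divide

# 27 MHz reference from the video decoder, multiply by 49, divide by 24.
f_out = pll_output_mhz(27, 49, 24)
print(float(f_out))  # 55.125
```

The same function shows that a 21MHz input would reach the same 55.125MHz with multiply by 21 and divide by 8.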

As for your initial question: either way is OK.  However, having a lower frequency outside the FPGA may help with EMI and with the range of the external PLL IC.
« Last Edit: April 23, 2020, 03:17:41 am by BrianHG »
 

Offline Miti (Topic starter)

Re: Which Cyclone PLL implementation is better?
« Reply #2 on: April 24, 2020, 01:52:22 am »
I don't understand what the MK9173-01 has to do with the Cyclone PLL range. The Cyclone is used only as a divider in the MK9173-01 PLL loop, that's it. The output from that chip is then fed to the FPGA PLL.
On the other hand, I can't use 27MHz. In the HP8590 series, the pixel clock is 10.5MHz, divided by 2 from a 21MHz oscillator. So I need that 21MHz, well, technically 10.5MHz, for the video buffer input. The video is monochrome, with low intensity for the grid and menus and high intensity for the trace.
Fear does not stop death, it stops life.
 

Offline BrianHG

Re: Which Cyclone PLL implementation is better?
« Reply #3 on: April 24, 2020, 02:48:17 am »
In your block diagram, you illustrate that the FPGA's internal PLL would be used in conjunction with your external PLL clock generator, i.e. taking in 21MHz and outputting 55.125MHz to the rest of the FPGA.  (I am speaking specifically of the Cyclone's internal PLL, not your custom VCO feedback divider, which isn't using the Cyclone PLL.)  So be warned that when you set up the Cyclone's internal PLL to take in 21MHz and output 55.125MHz, that VCO setup will only guarantee a stable lock within a narrow range.  In the past, I've gone up to +/-10% of my Cyclone's specified source clock range without issue.  All I'm saying is that since you are using the MK9173-01 as your 21MHz clock source, which the Cyclone's internal PLL is up-clocking to 55.125MHz, and the MK9173-01 can output much higher and much lower frequencies depending on its phase inputs (i.e. a video signal switching off and on), you need to make sure this won't have a detrimental effect on your design.  The Cyclone does have a PLL lock indicator for its internal PLLs, but no direct means of telling whether that source clock is, for example, between 20MHz and 22MHz.
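One way to get such a window check is to count edges of the incoming clock during a gate timed by a free-running reference oscillator. The sketch below only models the arithmetic in Python; the 50MHz reference, the 1ms gate and the function name are assumptions for illustration, and in the FPGA this would be two counters and a comparator.

```python
def clock_in_window(edge_count, gate_cycles, ref_hz, lo_hz, hi_hz):
    """Estimate a clock's frequency from the edges counted during a gate of
    `gate_cycles` reference-clock periods, then test it against a window.
    Hypothetical helper: models the counter arithmetic only."""
    measured_hz = edge_count * ref_hz / gate_cycles
    return lo_hz <= measured_hz <= hi_hz, measured_hz

# Assumed numbers: 50 MHz reference, gate of 50_000 reference cycles (1 ms).
ok, measured = clock_in_window(21_000, 50_000, 50e6, 20e6, 22e6)
print(ok, measured)  # True 21000000.0
```

A result outside the window could then hold the video logic in reset until the source clock settles, in addition to watching the PLL lock output.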
 

Offline BrianHG

Re: Which Cyclone PLL implementation is better?
« Reply #4 on: April 24, 2020, 03:12:17 am »
Take a look here:
https://www.eevblog.com/forum/microcontrollers/fpga-video-format-conversion/

He used the TVP7002 as a digitizer and as a clock generator to convert a CRT scope to a scaled VGA output.  Though he went through a number of schematic entry and PCB assembly mistakes, he eventually got it to work.

Trick: you can manipulate the TVP7002 sampler's PLL (i.e. oversample the HP's video to 1024 pixels) and the X & Y open window coordinates (the TVP can generate an active video enable window with the syncs) while using its built-in selectable sampling low-pass filter to simulate raster scaling on the 'X' axis for feeding your 1024x768 LCD.  This means you only need a few 'Y' line buffers to scale the 'Y' axis from your HP8590 video source to the 768 lines.

If your HP video is 60Hz out and so is your LCD, you shouldn't need any frame buffer RAM, just a bunch of line memories.
« Last Edit: April 24, 2020, 03:14:20 am by BrianHG »
 

Offline Miti (Topic starter)

Re: Which Cyclone PLL implementation is better?
« Reply #5 on: April 24, 2020, 10:49:10 am »
BrianHG, thanks for your answers! I’m just learning FPGAs so I genuinely didn’t understand some things.
Initially I was thinking of tapping into the 21 MHz clock of the SA and calling it a day, and I may even do it if the clock recovery solution doesn’t work, but I want to reduce the number of taps if possible.
I will however tap into the digital video, which is 2 bits as I said: 00 for black, 01 for low intensity and 11 for high intensity. That’s why I don’t need a digitizer, it’s already digital.
My frames are V-synced, so indeed I need only a line buffer (or a few), and since I intend to use a Cyclone 4 in TQFP, I have enough M4K RAM to make a simple 2-port buffer for half the frame and fill it twice.
The HP image specs are 512x254 visible, and this would fit perfectly (in my mind at least) into VGA 1024x768 with 2x the pixels and 3x the lines. Since I’ll be using a 4:3 LCD module, the aspect ratio should be maintained.
The part that I don’t understand is why the video signal would turn on and off, and why the MK9173 would output a different frequency as long as the loop is locked. This circuit goes inside the HP SA and replaces the CRT display module; I think this is the part that wasn’t clear.
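The 2x / 3x mapping described above can be written down as a small address calculation. This is just a sketch: it assumes the 762 scaled lines are centered in the 768-line frame (3 blank lines at the top and bottom), which the post does not specify, and `source_pixel` is a made-up name.

```python
SRC_W, SRC_H = 512, 254      # visible HP image, from the post
OUT_W, OUT_H = 1024, 768     # VGA target
X_SCALE, Y_SCALE = 2, 3      # 2x the pixels, 3x the lines

USED_H = SRC_H * Y_SCALE             # 762 output lines carry picture
TOP_BORDER = (OUT_H - USED_H) // 2   # 3 blank lines on top (centering is an assumption)

def source_pixel(x_out, y_out):
    """Return (x_src, y_src) for an output pixel, or None in the blank border."""
    y = y_out - TOP_BORDER
    if not (0 <= x_out < OUT_W) or not (0 <= y < USED_H):
        return None
    return x_out // X_SCALE, y // Y_SCALE

print(source_pixel(0, 3))       # (0, 0)
print(source_pixel(1023, 764))  # (511, 253)
print(source_pixel(0, 0))       # None
```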

I really appreciate your help, thank you!
Fear does not stop death, it stops life.
 

Offline BrianHG

Re: Which Cyclone PLL implementation is better?
« Reply #6 on: April 24, 2020, 11:32:28 am »
You won't have a problem, since this isn't a 'PAL' signal and your video sync signal is always there.  Also, you may want to adjust your feedback divider so that the clock for the FPGA data input is at 21, 42 or 84 MHz, and have a software-selectable phase at which you sample the 10.5MHz pixels, to avoid data errors on transition edges.
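As a rough illustration of the phase selection: with a 42MHz clock there are 4 fast-clock samples per 10.5MHz pixel, and only one of them is kept. The capture below is invented for the example ('x' marks a sample taken while the data was changing), and `pick_phase` is a hypothetical name; in the FPGA this would be a small counter compared against a phase register.

```python
def pick_phase(samples, ratio, phase):
    """Keep one of every `ratio` fast-clock samples, starting at index `phase`."""
    if not 0 <= phase < ratio:
        raise ValueError("phase must be in range(ratio)")
    return samples[phase::ratio]

# 10.5 MHz pixels captured with a 42 MHz clock (ratio 4), hypothetical data.
# Pixel values 0, 1, 3 stand for the 2-bit codes 00, 01, 11.
captured = ['x', 0, 0, 0, 'x', 1, 1, 1, 'x', 3, 3, 3]

print(pick_phase(captured, 4, 0))  # ['x', 'x', 'x']  (phase sits on the transitions)
print(pick_phase(captured, 4, 2))  # [0, 1, 3]        (phase sits mid-pixel)
```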

Remember to use the 'fast input' and 'fast output' assignments for everything in Quartus and 'D-latch' (register) all your inputs and outputs.  Also, the clock input from the MK9173-01 should be on a dedicated FPGA clock input.  This will guarantee the cleanest timing.
 

Offline Miti (Topic starter)

Re: Which Cyclone PLL implementation is better?
« Reply #7 on: April 24, 2020, 02:30:07 pm »
Quote from: BrianHG
You won't have a problem, since this isn't a 'PAL' signal and your video sync signal is always there.  Also, you may want to adjust your feedback divider so that the clock for the FPGA data input is at 21, 42 or 84 MHz, and have a software-selectable phase at which you sample the 10.5MHz pixels, to avoid data errors on transition edges.

Take a look at the attached scope screenshots. I think that between 21 and 10.5MHz, I can find the right edge to sample.
The problem is that the MK9173-01 is only available from Digikey and such, and is kinda expensive, so I ordered an MK9173-15 from eBay (a US vendor though). I hope it is a genuine chip.
The MK9173-15 cannot do more than 37.5MHz from 15kHz. If you look at this product from Simmconn Labs, http://simmconnlabscom.ipage.com/tech/NewScope-0Jr_OM.pdf, he's using the same -15 chip. If it works for him, it should work for me.

Quote from: BrianHG
Remember to use the 'fast input' and 'fast output' assignments for everything in Quartus and 'D-latch' (register) all your inputs and outputs.  Also, the clock input from the MK9173-01 should be on a dedicated FPGA clock input.  This will guarantee the cleanest timing.

For development I'm using a Terasic DE1 Cyclone II dev board. The clock is applied at the external clock input, and that goes to pin L1 (CLK0/LVDSCLK0p). I'm trying to understand what fast input/output does.

I have a couple of questions about the 2-port RAM; the Intel documentation is confusing for a noob like me.
1. I understand that the rising edge of wrclk latches the data and address in, and the falling edge writes it to the RAM. Is that correct? So my video sampling clock should basically be the rising edge of wrclk?
2. Can I avoid using wren and rden and just tie them to Vcc? I could point to an address outside the visible space when the counter is outside the visible area.

Edit: In the screen shots, CH1 is clock, CH3 is one pixel. I've added one more screen shot that shows the two digital signals (PIX and INT) that make the analog video (VID).
Edit1: I corrected a mistake in question #1, wrclk instead of wren.
« Last Edit: April 25, 2020, 07:37:34 am by Miti »
Fear does not stop death, it stops life.
 

Offline BrianHG

Re: Which Cyclone PLL implementation is better?
« Reply #8 on: April 24, 2020, 11:26:24 pm »


Use the dual-port RAM in dual-clock mode.  Don't worry about when the write takes place, just that the data will be in the RAM by the time your next clock rises.  In other words, if the maximum theoretical clock for the Cyclone RAM is 200MHz and you are running at 200MHz, the byte will be written by the next clock.  If you are running the RAM at 1MHz, just rest assured that the data will be in the RAM within 5ns after the rising clock.  (This doesn't have anything to do with the IO pins' input setup and hold times...  I expect you are D-latching those and using the 'fast input register' assignment for the best performance.)

Just make note that if you are reading an address at the same time as you are writing to that same address, this is when you will need to worry about timing and memory contents.

Instead, make your design so that the video output is at least 1 line of video behind the source video coming in, and you will never have to worry about a write-to-read collision, since you will never be reading the same address you may be writing to.  In other words, avoid the problem altogether.
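The "stay one line behind" rule can be modelled with a circular line buffer. This is a single-threaded Python sketch, not the dual-clock RAM itself; the 8-line depth, the class and the `collides` helper are assumptions chosen for illustration.

```python
LINES = 8    # buffer depth in video lines (assumed)
WIDTH = 512  # pixels per source line

class LineBuffer:
    """Software model of a circular line buffer: source line n lives in slot n % LINES."""
    def __init__(self, lines=LINES, width=WIDTH):
        self.lines = lines
        self.mem = [[0] * width for _ in range(lines)]

    def write(self, src_line, x, value):
        self.mem[src_line % self.lines][x] = value

    def read(self, src_line, x):
        return self.mem[src_line % self.lines][x]

def collides(write_line, read_line, lines=LINES):
    """True if the reader and the writer are working in the same line slot."""
    return write_line % lines == read_line % lines

buf = LineBuffer()
buf.write(0, 10, 3)      # writer fills source line 0
print(buf.read(0, 10))   # reader fetches it later: 3
print(collides(5, 4))    # False: reader is one line behind the writer
print(collides(8, 0))    # True: a lag of exactly LINES lines wraps onto the writer
```

With the reader kept between 1 and LINES - 1 lines behind the writer, the two never touch the same slot.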

Now, if you want to do linear filtering on the vertical stretching of the source video, you need access to adjacent lines of video to blend together in the appropriate, variable proportions.  This is how I would do it:

a) Make 2 line buffer banks with enough storage for multiple lines of video in each.  (8 lines of video in each buffer should work, where you keep filling the lines in circles in the buffer as the picture continues to come in.  8 lines is good if you have a clean ratio of input lines to output lines in the LCD display timing.  You may need to increase this buffer if the beginning and ending input and output times from source to LCD stretch out to an odd fraction of time.)

b) On the source video sampling side, incrementally write only the even source video lines to 1 bank and use the other line buffer bank for the odd lines of source video.

c) On the video output side, read the appropriate odd and even line buffers in parallel, proportionally selecting the right blending mix in real time.  This renders a smooth, filtered Y transition as you draw multiple output lines of the LCD module's 768 lines for every pair of sampled source lines.
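Steps a) to c) can be sketched as follows. Because two adjacent source lines always have opposite parity, one sits in the even bank and the other in the odd bank, so both can be read in the same clock. The 3x scale comes from earlier in the thread; the function names and the exact weight mapping are assumptions, not a statement of how any particular scaler does it.

```python
def source_pair(y_out, scale=3):
    """For output line y_out, return (upper_src_line, lower_src_line, weight_of_lower).
    Assumed mapping: the weight grows linearly across the `scale` output lines."""
    upper, frac = divmod(y_out, scale)
    return upper, upper + 1, frac / scale

def blend_line(line_a, line_b, weight_b):
    """Linear mix of two source lines; weight_b in [0, 1] is the share of line_b."""
    return [(1 - weight_b) * a + weight_b * b for a, b in zip(line_a, line_b)]

upper, lower, w = source_pair(4)
print(upper, lower)                          # 1 2
print(upper % 2 != lower % 2)                # True: one line per bank
print(blend_line([0, 2], [2, 2], 0.5))       # [1.0, 2.0]
```

For a 2-bit source the blend produces intermediate intensities, so the output path needs more than 2 bits per pixel if the filtering is to be visible.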

Hard wiring the read enable is valid.
To help visualize how a typical dual port ram may look, see here:
(This memory has the clocked/latched address inputs and latched data outputs feature on.  It is the fastest and cleanest configuration of memory for FPGA offering the fastest clock performance)


However, in your design, you will configure 2 of these, each with 1 write port and 1 read port, so that one of the 2 will store exclusively the even lines of source video and the other will store only the odd lines.  (Only if you want to do linear/cubic Y filtering; otherwise, you only need one of these RAM buffers, which will store both the even and odd lines of the source video.)

For the write side, its clock, address, data & write enables will operate exclusively on your source video's sampling clock.

For the read side, its clock & address generator will operate exclusively on your VGA 1024x768 55MHz output clock.

When rendering the VGA output, remember to start the picture output only after the first 2-3 lines of video have already been captured.  This means that to vertically center your output picture on the LCD, you may need to artificially shift the V-sync output, depending on the LCD interface specifications.  If there is a really odd fraction of timing between the 2 displays from the beginning to the end of a frame, you will need to further increase this lag, in video lines, between the source video and output video.

« Last Edit: April 25, 2020, 01:23:18 am by BrianHG »
 

Offline BrianHG

Re: Which Cyclone PLL implementation is better?
« Reply #9 on: April 25, 2020, 02:36:58 am »
Quote from: Miti
The problem is that MK9173-01 is only available from Digikey and such but is kinda expensive so I ordered MK9173-15 from ebay, US vendor though, I hope it is a genuine chip.
MK9173-15 cannot do more than 37.5MHz from 15KHz. If you look at this product from Simmconn labs

Can you not just tap the 10.5MHz or 21MHz oscillator clock within the HP8590 and forgo the external PLL?
To minimize RFI, just use a series resistor of something like 220 ohm at the HP oscillator output and then feed a clk input line into the Cyclone with a wire and a 1k terminator on the Cyclone side, or, if you are a little worried about noise, you may use a small coax wire.
« Last Edit: April 25, 2020, 02:39:34 am by BrianHG »
 

Offline Miti (Topic starter)

Re: Which Cyclone PLL implementation is better?
« Reply #10 on: April 25, 2020, 03:12:53 am »

To answer your question: yes, I can, but I want to do minimally invasive surgery.  :-DD
If I’m not happy with the PLL, I will tap the clock.
Regarding the VGA stretching, blending, filtering, my intention is to generate a VGA 1024x768 output and use one of those LCD modules that have VGA input and let the LCD driver chip do the blending.
I know that the clock should be 65MHz for a standard VGA output, but I tried 2 different LCD modules (with VGA, not bare LCD) and they have no problem auto adjusting to my signal. A PC monitor doesn’t work though.
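For reference, the 65MHz figure follows from pixel clock = total pixels per line x total lines per frame x refresh rate. The sketch below uses the commonly published VESA totals for 1024x768 at 60Hz (1344 x 806); those totals are not given anywhere in this thread, so treat them as an assumption, and the function name is made up.

```python
def refresh_hz(pixel_clock_hz, h_total, v_total):
    """Frame rate implied by a pixel clock and the total (active + blanking) frame size."""
    return pixel_clock_hz / (h_total * v_total)

# Assumed standard totals for 1024x768 @ 60 Hz: 1344 pixels per line, 806 lines.
H_TOTAL, V_TOTAL = 1344, 806

rate = refresh_hz(65e6, H_TOTAL, V_TOTAL)
print(round(rate, 2))  # 60.0
```

The same function can be fed the 55.125MHz clock with whatever line and frame totals the design actually uses, to see the frame rate the monitor is being asked to accept.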
Fear does not stop death, it stops life.
 

Offline OwO

  • Super Contributor
  • ***
  • Posts: 1250
  • Country: cn
  • RF Engineer.
Re: Which Cyclone PLL implementation is better?
« Reply #11 on: April 25, 2020, 03:16:48 am »
Quote from: Miti
For development I'm using a Terasic DE1 Cyclone II dev board, the cock is applied at the external clock input and that goes to Pin L1 CLK0/LVDSCLK0p. I'm trying to understand what fast input/output does.
;)
Email: OwOwOwOwO123@outlook.com
 

Offline BrianHG

Re: Which Cyclone PLL implementation is better?
« Reply #12 on: April 25, 2020, 03:59:58 am »
Quote from: Miti
I know that the clock should be 65MHz for a standard VGA output, but I tried 2 different LCD modules (with VGA, not bare LCD) and they have no problem auto adjusting to my signal. A PC monitor doesn’t work though.

Many PC monitors need a good strong 5V TTL drive on the sync wires, or they may go haywire.

The 'Fast Input' and 'Fast Output' assignments in Quartus tell the compiler, when a signal feeds or comes from an IO pin, to use the flipflop closest to that pin on the silicon die.  Though not essential, this will force the cleanest possible output and input timings.  Otherwise, a synthesized net may be generated or registered on one side of the FPGA, then routed throughout the FPGA fabric before it reaches the IO pin's transistor drivers.  Using the flipflop right by the IO pin's transistor drivers, with that flipflop clocked from its special global clock network, generates the cleanest pin-to-pin IO timing and drive at that IO pin.  That is, unless you want to learn about setting up all the timing constraints for timing-driven synthesis/fitting and let the compiler route anything anywhere, as long as the end results are within your constraints.
« Last Edit: April 25, 2020, 04:15:29 am by BrianHG »
 

Offline Miti (Topic starter)

Re: Which Cyclone PLL implementation is better?
« Reply #13 on: April 25, 2020, 07:40:20 am »
Quote from: OwO
Quote from: Miti
For development I'm using a Terasic DE1 Cyclone II dev board, the cock is applied at the external clock input and that goes to Pin L1 CLK0/LVDSCLK0p. I'm trying to understand what fast input/output does.
;)

Good catch! That would have guaranteed a lot of jitter ... :-DD
Fear does not stop death, it stops life.
 

Offline BrianHG

Re: Which Cyclone PLL implementation is better?
« Reply #14 on: April 26, 2020, 04:24:45 am »
Quote from: Miti
For development I'm using a Terasic DE1 Cyclone II dev board, the clock is applied at the external clock input and that goes to Pin L1 CLK0/LVDSCLK0p. I'm trying to understand what fast input/output does.

The dedicated clock inputs on the Cyclone (which may be called fast input, I don't remember) are input-only pins with dedicated global wiring within the FPGA fabric, so that the signals fed to those pins reach the PLL inputs and a set of dedicated clocking traces throughout the FPGA, for minimal delay and timing consistency to all the logic cells.

Photographic block illustrations of how the FPGA's IOs and logic cells are wired internally are in the Cyclone II handbook.  Looking there may help you see what's going on inside the FPGA.
« Last Edit: April 26, 2020, 05:35:17 am by BrianHG »
 

Offline Miti (Topic starter)

Re: Which Cyclone PLL implementation is better?
« Reply #15 on: April 26, 2020, 01:45:01 pm »
I'm trying to simulate a very simple design, just to understand how simulation works. For that, I use a stripped-down Terasic example where I added a simple lpm_counter that counts to 3 to make a frequency divider. It compiles fine and the block diagram in the RTL Viewer looks fine, but when I go to simulation I get "Instantiation of 'lpm_counter' failed. The design unit was not found." Now, I find all kinds of solutions (add the libraries to the project, compile the libraries, add a path to the libraries), but nothing seems to work for me or, most likely, I'm doing something wrong. I use Quartus II 13.0 SP1, the latest that supports Cyclone II from what I see.

A step-by-step how-to for adding/compiling those libraries would be a real help for me.

Thanks!
Fear does not stop death, it stops life.
 

