Ok, so the rules are simple: a soft serializer for 'DVI' output. Meaning no differential drivers, even though DVI needs differential signals, so each pair is faked with 2 ordinary outputs, one true and one inverted. At least a 25MHz pixel clock for VGA output. Using a -C8 IC. 8 outputs are required: 2 for the pixel clock, 2 for red, 2 for green and 2 for blue. To get the best timing, all 8 need to be on 1 IO bank, and if your FPGA has IO banks suited for high-speed DDR IOs, use one of those.
I set up this code in Quartus with a Cyclone III -C8 FPGA to see what happens.
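module sw_serialize (
    input  wire       reset,       // not used by this test module
    input  wire       sclk,        // serial bit clock, expected to be 10x pclk
    input  wire       pclk,        // pixel clock
    input  wire [9:0] r, g, b,     // 10-bit words for each color channel
    output reg  [7:0] serial_dvi   // [0]/[1]=clk +/-, [2]/[3]=r, [4]/[5]=g, [6]/[7]=b
);

    // Pixel clock pattern, shifted out LSB first: five 1s, then five 0s.
    wire [9:0] c;
    assign c = 10'b0000011111;

    reg [9:0] ser_out [0:7];       // one shift register per output pin
    reg [9:0] c_reg, r_reg, g_reg, b_reg;
    reg       last_pclk, pclk_trigger;
    integer   i;

    // Latch the parallel words in the pixel clock domain.
    always @ (posedge pclk) begin
        r_reg <= r;
        g_reg <= g;
        b_reg <= b;
        c_reg <= c;
    end

    always @ (posedge sclk) begin
        // Detect the rising edge of pclk by sampling it with sclk.
        last_pclk    <= pclk;
        pclk_trigger <= ~last_pclk && pclk;

        if ( pclk_trigger ) begin
            // Load each word plus its inverted copy (the fake differential pair).
            ser_out[0] <= c_reg;
            ser_out[1] <= ~c_reg;
            ser_out[2] <= r_reg;
            ser_out[3] <= ~r_reg;
            ser_out[4] <= g_reg;
            ser_out[5] <= ~g_reg;
            ser_out[6] <= b_reg;
            ser_out[7] <= ~b_reg;
        end else begin
            // Shift right by one bit, so the LSB goes out first.
            for ( i=0 ; i<8 ; i=i+1 ) ser_out[i][8:0] <= ser_out[i][9:1];
        end

        // Output register stage, one flip-flop per pin.
        for ( i=0 ; i<8 ; i=i+1 ) serial_dvi[i] <= ser_out[i][0];
    end

endmodule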
With the top diagram looking like this:
It compiled with an FMAX of 402MHz. Whoa, hey, I think this is great. Let's do a functional simulation:
Ok, the reference clock on 's_dvi[0]' & 's_dvi[1]' works, the red and green on 's_dvi[2..5]' hold true to their 10-bit alternating 1010... input patterns, and so does the deliberately inverted clock pattern I fed to blue on 's_dvi[6..7]'.
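If you want to repeat the functional simulation outside of Quartus, a bare-bones testbench along these lines should be enough to get the same kind of waveform. This is only a sketch under my own assumptions: the 250MHz/25MHz clock pair, the 1ns offset on pclk, the exact test patterns and the VCD file name are illustrative choices, not taken from the attached project.

```verilog
`timescale 1ns/1ps

// Minimal waveform-dump testbench for sw_serialize (illustrative sketch).
module sw_serialize_tb;

    reg        sclk = 1'b0;
    reg        pclk = 1'b0;

    // Assumed test patterns: alternating bits on red and green,
    // an inverted copy of the clock pattern on blue.
    reg  [9:0] r = 10'b1010101010;
    reg  [9:0] g = 10'b1010101010;
    reg  [9:0] b = 10'b1111100000;

    wire [7:0] serial_dvi;

    sw_serialize dut (
        .reset      (1'b0),
        .sclk       (sclk),
        .pclk       (pclk),
        .r          (r),
        .g          (g),
        .b          (b),
        .serial_dvi (serial_dvi)
    );

    // 4 ns period = 250 MHz serial clock (10x the pixel clock).
    always #2 sclk = ~sclk;

    // 40 ns period = 25 MHz pixel clock, offset by 1 ns so its edges
    // never coincide with an sclk rising edge in simulation.
    initial begin
        #1;
        forever #20 pclk = ~pclk;
    end

    // Dump everything for a waveform viewer and stop after 10 pixels.
    initial begin
        $dumpfile("sw_serialize_tb.vcd");
        $dumpvars(0, sw_serialize_tb);
        #400;
        $finish;
    end

endmodule
```

Each even/odd pair of 'serial_dvi' bits should show up as complements of each other, with every 10-bit word coming out LSB first.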
Ok, let's do a timing simulation:
Whoa, what the crap... My vertical grid is 1ns and everything is crapped out of time by up to 2ns. Now, I realize that I did not specify any IO timing constraints, but I just want to see what's possible here.
Next step, some secret magic sauce. In Quartus's 'Assignment Editor', I add an assignment to the 's_dvi[]' output group. The assignment is called 'Fast Output Register'. It tells the Fitter in Quartus that the flip-flop driving each IO pin must be one of the 2 or 4 located right at that IO pin, and whatever logic generates the output is forced to feed that particular flip-flop. (Those 2-4 flip-flops at the IO pin, using 'fast-output-enable' on a 180 degree phased clock, are how the DDR IO system works if used; the other 2 flip-flops are for the DDR input.)
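For those who prefer to keep this kind of setting in the source instead of the Assignment Editor, my understanding is that Quartus also accepts it through its 'altera_attribute' synthesis attribute. The fragment below is a sketch, not what the attached project does: the module and signal names are made up for illustration, and you should check in the Fitter report that the register really got packed into the IO cell.

```verilog
// Hypothetical example module: one output register per pin, tagged so
// that Quartus is asked to place it in the IO cell.
module fast_out_sketch (
    input  wire       sclk,
    input  wire [7:0] d,
    // Assumption: same effect as the 'Fast Output Register' assignment
    // made in the Assignment Editor.
    (* altera_attribute = "-name FAST_OUTPUT_REGISTER ON" *)
    output reg  [7:0] q
);

    // The last flip-flop before the pin; nothing but a register here,
    // so there is no logic between it and the pad.
    always @ (posedge sclk) q <= d;

endmodule
```

Other tools will simply ignore the attribute, so the code stays portable.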
Now, after running a timing simulation, I get this:
Wow, a crap load better. For VGA 640x480, this is well timed enough to drive DVI directly. Here is a closeup with a 100ps time grid:
Our pin-to-pin error is now within 50ps, i.e. +/-25ps timing error, and this is on a -C8 Cyclone III. Let me just re-check the FMAX.
Still 402MHz. However, there are some hold timing violations where 'pclk' feeds last_pclk and pclk_trigger, which are clocked by 'sclk'. To fix this, I had to delay the PLL's pclk output by 1ns.
This is actually really good, as commercial broadcast HDMI 480P with digital audio requires 270MHz and we clear it no problem, without any DDR trick. The same FPGA in a -C6 gives me an FMAX of 437MHz. Though the outputs seem to come out 700ps sooner, the +/-25ps error between outputs is still a match.
Now, I know everyone says, where are you going to get an HDMI core with digital audio... Well, here is a Verilog open core of exactly that:
https://www.eevblog.com/forum/fpga/fpga-vga-controller-for-8-bit-computer/msg2783700/#msg2783700
Well, should I try for higher frequencies, like 720p? We would need 743MHz serial out or, using the DDR trick, 372MHz. The -C8 just might pull it off; however, I would need to read the data sheet to make sure.
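To show what I mean by the DDR trick, here is a rough behavioral sketch for a single output: the 10-bit word is split into even and odd bits, both halves shift at 5x the pixel clock, and one bit goes out on each clock phase. All names here ('sclk5', 'load', the module itself) are hypothetical, and the final clock-controlled mux only models the idea for simulation. In a real design that last stage would be the FPGA's dedicated DDR output cell, not fabric logic.

```verilog
// Behavioral sketch of a 10:1 DDR serializer for one pin (LSB first).
module ddr_serialize_sketch (
    input  wire       sclk5,   // assumed 5x pixel clock
    input  wire       load,    // assumed one-cycle strobe, once per pixel
    input  wire [9:0] d,       // word to send
    output wire       q        // 2 bits per sclk5 cycle
);

    reg [4:0] even, odd;       // bits 0,2,4,6,8 and bits 1,3,5,7,9
    reg       q_h, q_l_pre, q_l;

    always @ (posedge sclk5) begin
        if ( load ) begin
            even <= {d[8], d[6], d[4], d[2], d[0]};
            odd  <= {d[9], d[7], d[5], d[3], d[1]};
        end else begin
            even <= {1'b0, even[4:1]};
            odd  <= {1'b0, odd[4:1]};
        end
        // Capture both bits of the pair on the same rising edge.
        q_h     <= even[0];
        q_l_pre <= odd[0];
    end

    // Re-time the odd bit to the falling edge.
    always @ (negedge sclk5) q_l <= q_l_pre;

    // High phase sends the even bit, low phase sends the odd bit.
    assign q = sclk5 ? q_h : q_l;

endmodule
```

The point of capturing the odd bit on the rising edge first is that the shift registers have already moved on by the time the falling edge arrives.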
(Note: I did not use serdes IOs, or LVDS, or differential, as you can see from my Verilog code. This was 100% the 2.5V IO output setting in Quartus.)
Additional LOL, I just added the assignment of maximum current strength, and the timing error between pins has improved around another 50%.
I've attached a copy of the Quartus project. It just needs a re-compile to build everything.