Wow - thank you so much!!
Okay, so I've got some documentation to do and some work on the Z80 interface (the mux with the RS232_debugger interface is nowhere near done yet), and then it's done.
Merry Christmas to you too!! (and anyone else reading this!)
(Attachment Link)
(Attachment Link)
In the image above, the white fade-out on the right-side of the Z80 image isn't a photographic artefact - could that be caused by capacitance in the output wires?
Strange. Since I'm using a real DAC, I see nothing but a perfect image on my end. Maybe something happened with the IO current strength settings, or there is too much on screen for your analog DAC method.
When you have time, see if you can get a scope shot.
BTW, I now have all 15 layers working on my side, but your Cyclone IV is too small. Maybe I will come up with an idea. One Cyclone size larger and 15 layers would be no problem on your side too; as it stands you have run out of logic cells, and the RAM needs to shrink to fit the design as well.
Check your IOs, as I have made the outputs 24 bits instead of 12 bits. You only want the upper 4 bits [7:4] of each color channel (12 bits in total) wired to the DAC pins. Also check the IO assignments, as you may be using the lower IOs, or new adjacent IOs may now be interfering with the video.
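For anyone following along, here is a minimal sketch of the wiring described above. The module and port names are made up for illustration and are not from the actual project; the only part taken from the post is that bits [7:4] of each 8-bit channel go to the DAC.

```verilog
// Sketch only: module and port names are hypothetical, not from the project.
module dac_upper_bits (
    input  wire [7:0] red_in,    // 8-bit channels of the 24-bit video output
    input  wire [7:0] green_in,
    input  wire [7:0] blue_in,
    output wire [3:0] dac_r,     // 4 bits per channel = 12 DAC pins in total
    output wire [3:0] dac_g,
    output wire [3:0] dac_b
);
    // Keep only the upper nibble [7:4] of each channel.
    // The lower nibble [3:0] is not routed to any DAC pin.
    assign dac_r = red_in[7:4];
    assign dac_g = green_in[7:4];
    assign dac_b = blue_in[7:4];
endmodule
```

If the lower bits [3:0] ended up on the DAC pins instead, the image would presumably be driven by the least significant bits of each channel, so the pin assignments are worth checking against something like this.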
Hmm.. I knew this Cyclone IV board was cheap, but it's got the smallest Cyclone IV imaginable on it!
Well, I'm guessing the Lattice LFE5U won't have these issues.
Hmm.. will go check this in a bit.
Well, 15 layers as the current project stands takes 11K logic gates and your Cyclone IV has 6K.
To turn transparency back on between a number of layers, add 1K-2K more gates, as it is all a bit table of 3x3x3 multiply-adds.
With the addition of a $1 DDR ram chip on your PCB, you can reserve 1 MAGGIE channel for that ram controller and get a 32 bit graphics layer with 16mb of addressable space. Making a DDR ram controller which only has to deal with a single 27 million pixels a second read and a Z80 at 2-8 million transactions a second is a joke compared to all the cross-reads going between the other 14 MAGGIEs, which read random memory locations every which way, as I designed the core to do.
Basically you would treat the onboard 220kb as texture and sprite memory, while the bulk DDR ram would be for backgrounds and for swapping large chunks of additional animation graphics.
The rest of the FPGA logic would be 75% empty for a cool hardware accelerated drawing engine.
Random question - would there be any mileage in a hardware scrolling capability for text mode? Specifically, I was thinking of using the sub-pixel offset capabilities to scroll text upwards (or downwards) when a new line is written to a single-line buffer which would be 'off the top or bottom of screen' and not visible? Or having a 'viewport sub-pixel' setting, to allow the entire screen (of text) to be smoothly scrolled by offsetting the viewport up to a single line of text? Too much work for little payoff?
Latest memory layout.
24576 = size of current GPU ram
0 - 511 = All HW_Regs shared with generic GPU RAM.
00 thru 07 = H&V triggers for 4 yellow test cursors
12..15 = H&V triggers, which are the H&V reset coordinates for all 15 MAGGIE_Layer#s
16..19 + 2*MAGGIE_Layer# = H&V Top left edge of each MAGGIE_Layer# window.
96 + 16*MAGGIE_Layer# = 16 byte controls for each of the 15 MAGGIE layers.
512 - 4607 = Default IBM VGA 8x16 Font
4608 - 7007 = Default ASCII text buffer, 80 characters x 30 lines (MAGGIE 0&1)
7008 - 8207 = Color text buffer, 2 bytes per character, 40 characters x 15 lines. (MAGGIE 2&3)
13056 - 24575 = 16 color Z80 CPU graphic image. (Rendered at 3 sizes across 3 different palettes, MAGGIE 4,5,6)
31744 - 32255 = Primary palette, ARGB 4444 style.
32256 - 32767 = Secondary palette, RGB 565 style.
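As a quick cross-check of the layout above, here is a small sketch that computes the first control byte for a given MAGGIE layer and restates the font and text buffer boundaries as localparams. The module name, port names, and localparam names are hypothetical; only the numbers come from the list.

```verilog
// Sketch only: all names are hypothetical, the numbers are from the layout list.
module maggie_reg_addr (
    input  wire [3:0]  layer,      // MAGGIE layer number, 0..14
    output wire [15:0] ctrl_base   // address of the first of its 16 control bytes
);
    localparam MAGGIE_CTRL_BASE = 96;    // "96 + 16*MAGGIE_Layer#"
    localparam MAGGIE_CTRL_SIZE = 16;    // 16 control bytes per layer

    localparam FONT_BASE = 512;                   // 8x16 font, 16 bytes per character
    localparam FONT_SIZE = 256 * 16;              // 4096 bytes, so it ends at 4607
    localparam TEXT_BASE = FONT_BASE + FONT_SIZE; // 4608
    localparam TEXT_SIZE = 80 * 30;               // 2400 bytes, so it ends at 7007

    // Layer 0 starts at 96; layer 14 starts at 96 + 16*14 = 320 and ends at 335,
    // which stays inside the 0 - 511 HW_Regs area.
    assign ctrl_base = MAGGIE_CTRL_BASE + MAGGIE_CTRL_SIZE * layer;
endmodule
```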
New GPU project parameters:
NUM_LAYERS = 2 through 15 = 2 layers through 15 layers.
PALETTE_ADDR = Sets the base address for the 2 palettes.
This one is automatically set to (2**ADDR_SIZE - 1024), so the palettes are the last 1024 bytes.
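A minimal sketch of how that default works out. The value ADDR_SIZE = 15 is my assumption (it is the value that matches the palette addresses in the memory layout), and the module and localparam names are made up for illustration.

```verilog
// Sketch only: ADDR_SIZE = 15 is an assumed value, names are hypothetical.
module palette_addr_example #(
    parameter ADDR_SIZE    = 15,
    parameter PALETTE_ADDR = (2**ADDR_SIZE) - 1024   // last 1024 bytes of the address space
) ();
    localparam PRIMARY_PALETTE   = PALETTE_ADDR;         // ARGB 4444 palette, 512 bytes
    localparam SECONDARY_PALETTE = PALETTE_ADDR + 512;   // RGB 565 palette, 512 bytes

    initial begin
        // With ADDR_SIZE = 15 this prints: primary=31744 secondary=32256
        $display("primary=%0d secondary=%0d", PRIMARY_PALETTE, SECONDARY_PALETTE);
    end
endmodule
```

Changing ADDR_SIZE moves both palettes with it, so nothing else should need editing when the memory size parameter changes.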
Ok, here is the GPU, with the core ram at 250MHz. Though I selected the -C7 FPGA, in -C8 it is so close to 250MHz that it should still work fine. (My Cyclone III -C8 works fine, and it compiles with an even slightly slower FMAX (220MHz) than the Cyclone IV -C8 (235MHz). The -C7 compiles with a 270MHz FMAX.)
Though you lost a little ram because of the all 16bit core and the memory allocation size parameter, you now have 7 active layers. And with this final code, all you need to do once you get a larger FPGA is change the parameter setting on the block diagram to 15 to get 15 layers, and increase the memory size. As the HW Regs are now at the bottom of the ram, it will always power up to the right default settings.
The only thing missing is the semi-translucent layer feature which I will finish off tomorrow. However, turning the feature on may eat more logic cells, lowering your layer count since now you are at 94% utilization with 7 layers.
In person, the CRT obviously looks best!
I'm trying to make some progress on the data_mux to gate the RS232 and Z80_bridge I/O to the GPU RAM. I'm making baby steps, and testing as I go, but something really odd is going on.
In the attached project, the RS232_debugger should NOT be able to read or write to the GPU RAM. I've only attempted to implement reading anyway, but the data_mux should be disabled, yet when I test the project it works normally (reading AND writing)...
Note that this isn't the latest version of the project, but it is the latest version of the data_mux work.
You have too many cross versions of everything going around....
You will eventually need to upgrade everything to my 7-layer MAGGIE version as it is not backwards compatible as you have things now...
This means re-editing the top .bdf as well.
A better approach is to make 2 wires, run_portA and run_portB.
Assign a set of rules for each wire with simple Boolean terms.
And use those wires to drive the 1-2 if statements and read/write request flags...
Remember to set and clear the busy status for each port in the 'if's, use those as status in the Boolean selection, and make sure the Z80 gets first choice/priority over the RS232.
Does this help any?
wire run_r_porta, run_r_portb, run_w_porta, run_w_portb;
assign run_r_porta = rd_req_a && ~portb_bsy;
assign run_w_porta = wr_ena_a && ~portb_bsy;
assign run_r_portb = rd_req_b && ~porta_bsy && ~run_r_porta && ~run_w_porta;
assign run_w_portb = wr_ena_b && ~porta_bsy && ~run_r_porta && ~run_w_porta;
Note that there is no latching or latching logic here. Everything is realtime.
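To see the priority rule on its own, here is a small self-checking sketch that wraps the four assigns in a module and pokes it from a testbench. The module name, testbench name, and test cases are my own; only the assign expressions are taken from the post.

```verilog
// Sketch only: module/testbench names and the test cases are hypothetical.
module port_priority (
    input  wire rd_req_a, wr_ena_a, rd_req_b, wr_ena_b,
    input  wire porta_bsy, portb_bsy,
    output wire run_r_porta, run_w_porta, run_r_portb, run_w_portb
);
    // Port A only waits for port B to finish; port B also yields to any port A request.
    assign run_r_porta = rd_req_a && ~portb_bsy;
    assign run_w_porta = wr_ena_a && ~portb_bsy;
    assign run_r_portb = rd_req_b && ~porta_bsy && ~run_r_porta && ~run_w_porta;
    assign run_w_portb = wr_ena_b && ~porta_bsy && ~run_r_porta && ~run_w_porta;
endmodule

module port_priority_tb;
    reg  rd_req_a, wr_ena_a, rd_req_b, wr_ena_b, porta_bsy, portb_bsy;
    wire run_r_porta, run_w_porta, run_r_portb, run_w_portb;

    port_priority dut (
        .rd_req_a(rd_req_a), .wr_ena_a(wr_ena_a),
        .rd_req_b(rd_req_b), .wr_ena_b(wr_ena_b),
        .porta_bsy(porta_bsy), .portb_bsy(portb_bsy),
        .run_r_porta(run_r_porta), .run_w_porta(run_w_porta),
        .run_r_portb(run_r_portb), .run_w_portb(run_w_portb)
    );

    initial begin
        // Both ports request a read, nobody busy: port A must win.
        {rd_req_a, wr_ena_a, rd_req_b, wr_ena_b, porta_bsy, portb_bsy} = 6'b101000;
        #1 if (run_r_porta !== 1'b1 || run_r_portb !== 1'b0) $display("FAIL: case 1");

        // Only port B requests a read, nobody busy: port B runs.
        {rd_req_a, wr_ena_a, rd_req_b, wr_ena_b, porta_bsy, portb_bsy} = 6'b001000;
        #1 if (run_r_portb !== 1'b1) $display("FAIL: case 2");

        // Port B requests a read while port A is still busy: port B must wait.
        {rd_req_a, wr_ena_a, rd_req_b, wr_ena_b, porta_bsy, portb_bsy} = 6'b001010;
        #1 if (run_r_portb !== 1'b0) $display("FAIL: case 3");

        $display("done");
        $finish;
    end
endmodule
```

In a simulator this should print only "done" if the priority behaves as described, with port A standing in for the Z80 and port B for the RS232 debugger.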
always @ (posedge clk) begin
    gpu_wr_ena <= run_w_porta || run_w_portb;
    porta_bsy  <= run_r_porta || run_w_porta;
    portb_bsy  <= run_r_portb || run_w_portb;
    if (run_r_portb) begin
        rd_sequencer_b[9:0] <= { rd_sequencer_b[8:0], 1'b1 };
        gpu_address         <= address_b;
    end else begin
        rd_sequencer_b[9:0] <= { rd_sequencer_b[8:0], 1'b0 }; // this line must always run no matter any other state
    end
    if (run_w_portb) begin
        gpu_address  <= address_b;
        gpu_data_out <= data_in_b;
    end
end
Does it work?
always @ (posedge clk) begin
    // *** UPDATE GPU_WR_ENA AND FLAGS EACH CLOCK ***
    gpu_wr_ena <= run_w_porta || run_w_portb;
    porta_bsy  <= run_r_porta || run_w_porta;
    portb_bsy  <= run_r_portb || run_w_portb;
    // *** HANDLE PORT A READ REQUESTS AND SEQUENCER ***
    if (run_r_porta) begin
        rd_sequencer_a[9:0] <= { rd_sequencer_a[8:0], 1'b1 };
        gpu_address         <= address_a;
    end else begin
        rd_sequencer_a[9:0] <= { rd_sequencer_a[8:0], 1'b0 }; // this line must always run no matter any other state
    end
    // *** HANDLE PORT A WRITE REQUESTS ***
    if (run_w_porta) begin
        gpu_address  <= address_a;
        gpu_data_out <= data_in_a;
    end
    // *** HANDLE PORT B READ REQUESTS AND SEQUENCER ***
    if (run_r_portb) begin
        rd_sequencer_b[9:0] <= { rd_sequencer_b[8:0], 1'b1 };
        gpu_address         <= address_b;
    end else begin
        rd_sequencer_b[9:0] <= { rd_sequencer_b[8:0], 1'b0 }; // this line must always run no matter any other state
    end
    // *** HANDLE PORT B WRITE REQUESTS ***
    if (run_w_portb) begin
        gpu_address  <= address_b;
        gpu_data_out <= data_in_b;
    end
end
So next, wire in the Z80,...