I have a couple of questions on the project so far:
- Fmax has dropped again (see red above) - any tips/advice on how to find bottlenecks?
- The pixel collision counters - why are they 8-bit and are they for detecting sprite collisions and/or something else?
1. Give me a sec. For now, those figures are functional. (YOU MODIFIED THE COMPILER SETTINGS...)
2a. I can switch them down to 1 bit (no collision, or yes, a hit) like a normal 8-bit computer if you like.
2b. We can also remove these collision counters completely, since 8-bit computers can't detect collisions when writing pixels in the first place. They have player-missile and sprite collision detectors. We have yet to add any collision detectors to the MAGGIE layers. These are different. Each layer will have a 16-bit register in which each bit, from 0 up to the total number of layers, goes from 0 to 1 when that layer collides with a pixel of the layer associated with that bit number. Once again, if you do not like this, we can give each layer only a single-bit collision detector telling you whether it collided with anything at all, like any 8-bit computer or the Amiga.
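To make the per-layer register idea concrete, here is a minimal sketch of how such collision flags could be latched. It is not the actual MAGGIE code: the module name, the pixel_opaque input, the clear strobe and the choice to make the bits sticky until cleared are all assumptions for illustration.

```verilog
// Hypothetical sketch only: sticky per-layer collision flags.
// None of these names come from the real MAGGIE design.
module layer_collision #(
    parameter LAYERS = 16                        // assumed layer count, must be >= 1
) (
    input  wire                     clk,
    input  wire                     clear,        // assumed strobe: resets every flag
    input  wire [LAYERS-1:0]        pixel_opaque, // bit n = layer n has a visible pixel here
    output reg  [LAYERS*LAYERS-1:0] collide       // bits [n*LAYERS +: LAYERS] = register of layer n
);
    localparam [LAYERS-1:0] ONE = 1;
    integer n;

    always @(posedge clk) begin
        for (n = 0; n < LAYERS; n = n + 1) begin
            if (clear)
                collide[n*LAYERS +: LAYERS] <= {LAYERS{1'b0}};
            else if (pixel_opaque[n])
                // Set bit m when layer m is also visible at this pixel.
                // Layer n's own bit is masked out so it never hits itself.
                collide[n*LAYERS +: LAYERS] <= collide[n*LAYERS +: LAYERS]
                                             | (pixel_opaque & ~(ONE << n));
        end
    end
endmodule
```

The single-bit option from 2a would then just be the OR-reduction of one layer's register, for example |collide[n*LAYERS +: LAYERS].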
I have no preference, really. I'll go with whatever you think is best. They sound handy - I was just asking as I wasn't sure if they were for detecting sprite collisions or something else that I couldn't grasp.
Here is the updated .zip.
Don't worry about how close the FMAX is. Everything is set to optimise for area, and you still have good clearance for the geometry unit, which is the most mathematically intense part of your entire design.
I also patched the address generator back a revision as a simple 2 layer carry buffer was better than a FIFO.
The running plot line example I asked you for was meant to send as many different line commands as possible. The Z80 is 8-bit, so to produce as many different X and Y coordinates as possible, a simple add with 5 different add figures (1 for each coordinate and 1 for the color) would plot a lot of different lines very quickly compared to generating 5 different random numbers for each line drawn. This test would at least get you close to the speed of drawing geometry from CPU RAM.
Ready to go on the filled triangle.
Tell me what you have learned so far on the triangle.
Take a look at the basic code and recount the line generators.
Separate out the steps involved in setting up and running each of the line generators.
D'oh. There's only two. One is re-used for the third line.
1) Order the lines so that the two line generators run on the first and third lines, with one line stopping and that line generator then drawing the second line.
2) Each line is set up by calculating its bounding box's width and height
3) Signs are calculated for the stepping increments according to the direction of the line
4) Key values are assigned to magic, errd and is_done to allow the line generator to run
5) Line generators are run, filling in horizontal raster lines between the two lines being drawn if fill mode is on
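Steps 2 to 4 above could look roughly like the following for a single line generator. This is only a sketch under assumptions: the module and port names and the coordinate width W are made up, and dy is stored negated in the usual Bresenham fashion so that the initial error term is dx + dy, which matches the errd <= dx + dy seen in the existing line function. The magic and is_done values are left out because their definitions are not given here.

```verilog
// Hypothetical sketch of the set-up values for one line generator.
// Names and widths are assumptions, not the project's actual module.
module line_setup #(
    parameter W = 12                            // assumed coordinate width in bits
) (
    input  wire signed [W-1:0] x0, y0, x1, y1,  // line start and end points
    output wire signed [W:0]   dx,              // +|x1 - x0| : bounding box width
    output wire signed [W:0]   dy,              // -|y1 - y0| : negated bounding box height
    output wire signed [1:0]   xdir,            // +1 or -1 step along X
    output wire signed [1:0]   ydir,            // +1 or -1 step along Y
    output wire signed [W+1:0] errd_init        // initial error term, dx + dy
);
    // One extra bit so the difference of two W-bit signed values cannot overflow.
    wire signed [W:0] xdiff = x1 - x0;
    wire signed [W:0] ydiff = y1 - y0;

    assign dx        = (xdiff < 0) ? -xdiff : xdiff;
    assign dy        = (ydiff < 0) ?  ydiff : -ydiff;
    assign xdir      = (xdiff < 0) ? -2'sd1 : 2'sd1;
    assign ydir      = (ydiff < 0) ? -2'sd1 : 2'sd1;
    assign errd_init = dx + dy;
endmodule
```

For example, a line from (2,10) to (7,4) would give dx = 5, dy = -6, xdir = +1, ydir = -1 and an initial error of -1.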
I'm working on setting up your IO timing rules. It's a mess, as some buses cross IO banks and that is f---ng things up. I need to know which output pins you have assigned to led_rxd and led_txd.
When going to a higher pin count FPGA, you will need to read the datasheet and choose the IOs wisely, otherwise you won't get anywhere near good performance with external RAM without a crap load of carry chains destroying the read turn-around times. And carry chains will not work if you have IO on opposite sides of the FPGA silicon or use any one of the 4 slower IO banks out of the 8.
My biggest headache is with 4 IOs which are dual-purpose pins, as they can also serve as 'VREF' inputs.
Ok, this one took all day. (Poor documentation from Altera/Intel)
I've set up your timing constraints. To do so, I had to add additional delay pipe chains in the Z80 bridge, video out and a few other places. However, Quartus is now respecting the IO timings and core clock frequency, and the Fast IO assignments now appear to function correctly.
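For anyone wondering what a delay pipe chain looks like in its simplest form, here is a generic sketch. It is not the code that was actually added to the Z80 bridge or video output: the module name, WIDTH and STAGES are assumptions. It just registers a bus a fixed number of times, which gives the fitter room to place the last register close to the IO pin.

```verilog
// Hypothetical sketch of a plain register delay pipe.
// Not taken from the project; all names are illustrative.
module delay_pipe #(
    parameter WIDTH  = 8,   // assumed bus width
    parameter STAGES = 2    // number of register stages, must be >= 1
) (
    input  wire             clk,
    input  wire [WIDTH-1:0] d,
    output wire [WIDTH-1:0] q   // d delayed by STAGES clocks
);
    reg [WIDTH-1:0] pipe [0:STAGES-1];
    integer s;

    always @(posedge clk) begin
        pipe[0] <= d;
        for (s = 1; s < STAGES; s = s + 1)
            pipe[s] <= pipe[s-1];
    end

    assign q = pipe[STAGES-1];
endmodule
```

Every signal on the same interface needs the same number of stages, otherwise the bus goes out of step with its strobes.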
It should now compile quickly & achieve the timing requirements filling the FPGA to only 76%.
If it weren't for those 4 IOs on the VREF pins, all your outputs would be within 300ps of each other. Those 4 IOs are 1.3 ns slower than your slowest output. Though, your DAC clk output is the fastest output at 25MHz.
You'll need to test everything. On my side, my DAC on my scaler wasn't designed to operate down to 25MHz, so I have a shaded bar on the left hand side of the screen. Make sure you do not see this.
Let me know if everything works.
I had a look around at various packages and models in the Cyclone range last night. I had pretty much decided on the EP4CE22F17 as the next step - though I'm wondering if the jump straight to an EP4CE40 might be worth it. That's an FBGA-484 package, though - perhaps that should wait until I've developed some skill with the FBGA-256 first.
I'm not going to cry over 1.3ns unless it causes issues later, but this design has outgrown the EP4CE10 a lot faster than I thought it would - it really does appear that BGA is the way to go.
Ok, let's see your triple line drawing engine.
Hmmm well, my first question in writing this function is that as it uses two line drawing engines, how best do I utilise the existing line-drawing engine in the geometry_xy_plotter module? Do I need to create it as a sub-module and instantiate it for use in the triangle function, or can I somehow re-use the existing function?
4'd1 : begin // draw line from (x[0],y[0]) to (x[1],y[1])
    case (geo_sub_func1) // during the draw line, we have multiple sub-functions to call 1 at a time
        4'd0 : begin
            errd <= dx + dy;
            geo_sub_func1 <= 4'd1; // set line sub-function to plot line.
        end // geo_sub_func1 = 0 - setup for plot line
        4'd1 : begin
            draw_cmd_func <= CMD_OUT_PXWRI[3:0]; // Set up command to pixel plotter to write a pixel,
            draw_cmd_data_color <= geo_color; // ... in geo_colour,
            draw_cmd_data_word_Y <= geo_y ; // ... at Y-coordinate,
            draw_cmd_data_word_X <= geo_x ; // ... and X-coordinate.
            if ( ( geo_x >= 0 && geo_x <= max_x ) && (geo_y>=0 && geo_y<=max_y) )
                draw_cmd_tx <= 1'b1; // send command if geo_X&Y are within valid drawing area
            else
                draw_cmd_tx <= 1'b0; // otherwise turn off draw command
            if ( geo_x == x[1] && geo_y == y[1] ) geo_shape <= 4'd0; // last pixel - step to last sub_func1 stage, allowing time for this pixel to be written
            // On the next clock, end the drawing-line function
            // increment x,y position
            if ( ( errd << 1 ) > dy ) begin
                geo_x <= geo_x + geo_xdir;
                if ( ( ( errd << 1 ) + dy ) < dx ) begin
                    geo_y <= geo_y + geo_ydir ;
                    errd <= errd + dx + dy ;
                end else begin
                    errd <= errd + dy ;
                end
            end else if ( ( errd << 1 ) < dx ) begin
                errd <= errd + dx ;
                geo_y <= geo_y + geo_ydir ;
            end
        end // geo_sub_func1 = 1 - plot line
    endcase // sub functions of draw line
end // geo_shape - draw line
I guess I was just wondering what's in the ball-park of realistic home-DIY soldering (and PCB design)? Is an FBGA-484 too much? Then there's the Cyclone V range. The smallest one appears to have over 150 KB of RAM and 40% lower power usage, and I can get it in FBGA-256 (5CEBA4F17 - https://uk.rs-online.com/web/p/fpgas/8303565/).
I am talking about you making a module which takes in X[0..3] or Y[0..3] and returns all the points which are equal, greater than and less than in 3 arrays. These can easily be checked to select which linegen should be used (i.e. if all X & Y coordinates on the 3 vertices are equal, no linegen will be used, just draw a point), to pick each linegen's source and destination X[?] & Y[?] coordinates, and to select whether to add or subtract the x & y counter (x & y dir) in each linegen module's arithmetic.
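One possible shape for such a compare module is sketched below. It is only an illustration: the module name, the flattened v bus, the widths and the [i*N+j] bit layout of the three result arrays are all assumptions, and the coordinates are assumed to be signed. One instance would be used for the X values and a second for the Y values.

```verilog
// Hypothetical sketch: compare every coordinate against every other one.
// Names, widths and bit layout are assumptions for illustration.
module coord_compare #(
    parameter N = 4,                 // number of coordinates, i.e. X[0..3] or Y[0..3]
    parameter W = 12                 // assumed coordinate width in bits
) (
    input  wire [N*W-1:0] v,         // packed coordinates, v[i] sits in bits [i*W +: W]
    output reg  [N*N-1:0] eq,        // eq[i*N+j] = (v[i] == v[j])
    output reg  [N*N-1:0] gt,        // gt[i*N+j] = (v[i] >  v[j])
    output reg  [N*N-1:0] lt         // lt[i*N+j] = (v[i] <  v[j])
);
    integer i, j;
    reg signed [W-1:0] a, b;         // temporaries, treated as signed values

    always @* begin
        for (i = 0; i < N; i = i + 1) begin
            for (j = 0; j < N; j = j + 1) begin
                a = v[i*W +: W];
                b = v[j*W +: W];
                eq[i*N+j] = (a == b);
                gt[i*N+j] = (a >  b);
                lt[i*N+j] = (a <  b);
            end
        end
    end
endmodule
```

With this layout, the "just draw a point" case for vertices 0 to 2 would be eq[0*N+1] && eq[1*N+2] being true on both the X and the Y instance, and a linegen's direction could be taken straight from the lt bit between its source and destination indices.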
The '5CEBA2F17C8N' may actually be identical to the 5CEBA4F17 just like the EP4CE6 is identical to the EP4CE10...
Potentially the only difference in the chips is the model identifier or some fuse setting? Interesting.