Author Topic: FPGA VGA Controller for 8-bit computer  (Read 41214 times)


Offline nockieboy

  • Frequent Contributor
  • **
  • Posts: 879
  • Country: gb
Re: FPGA VGA Controller for 8-bit computer
« Reply #1125 on: June 29, 2020, 04:48:04 pm »
Hmm.. I have to call it on this ellipse business.   :palm:

Should be easy in theory - I should be able to tweak the circle function to do it, but I'm having trouble finding anything reliable on ellipses that I can translate to Free Basic, and the function I've got isn't working because... well, I can't get it to work.  :-//
 

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 4003
  • Country: ca
Re: FPGA VGA Controller for 8-bit computer
« Reply #1126 on: June 29, 2020, 04:52:38 pm »
Remember, we don't want to make 2 codes in the FPGA for circles and ellipses.  When you call a circle, the Verilog will just copy and use the X1 coordinate as the X&Y radius while calling the ellipse.  When you want an ellipse, the same ellipse will be called using the separate X&Y figures stored in X1&Y1.  However, this will not allow you to rotate the ellipse.  The calculation speed in the FPGA for the ellipse will be just as fast as the optimized circle algorithm.

May I suggest. You can create just a simple drawer which draws an arbitrary curve given three points. Such curves can then be used to draw any sort of lines - straight, circles, ellipses, true-type fonts etc. Then you can organize everything into a pipeline:

drawing commands (circle, line etc.) ---> FIFO ---> parser which dissects shapes into simple curves ---> FIFO --> curve drawer ->

This way you need only a single drawing algorithm and save some logic. Later you can add other drawers, such as area fillers or blitters in parallel with the curve drawer, each with its own FIFO queue from the parser.
Yes, this is what I am looking for, as a circle just has those 3 points set to the center coordinates, +/- the radius as a perfect circle.  With a curved line algorithm, with the 4 source coordinates being used as 4 center quadrant points on a circle, you can draw the ellipse rotated to any angle.

I believe it is called a 'Rational Bézier curve'.

For simple circle & ellipse, the all integer 'Bresenham's Ellipse' algorithm is fine.
It also comes in a filled variant, though the filled variant is nothing more than a horizontal 'for X' loop drawing from the left half to the right half of the circle.
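
For anyone following along, here is a minimal Python sketch of an all-integer ellipse plotter in the style of the Bresenham/midpoint algorithm mentioned above. It is only an illustration, not the FPGA or Free Basic code from this thread: the function name `ellipse_points` and the `filled` flag are made up for the example, and it assumes an axis-aligned ellipse with center (xm, ym) and radii a and b.

```python
def ellipse_points(xm, ym, a, b, filled=False):
    """Return the set of (x, y) pixels of an axis-aligned ellipse.

    Integer-only error-term stepping (Bresenham/midpoint style).
    Hypothetical helper for illustration; a and b are the X and Y radii.
    """
    pixels = set()

    def span(x_left, x_right, y):
        # Outline: only the two end pixels. Filled: the whole horizontal run.
        if filled:
            for px in range(x_left, x_right + 1):
                pixels.add((px, y))
        else:
            pixels.add((x_left, y))
            pixels.add((x_right, y))

    x, y = -a, 0                  # start at the left-most point
    dx = (1 + 2 * x) * b * b      # error increment for a step in x
    dy = x * x                    # error increment for a step in y
    err = dx + dy                 # error of the first step

    while True:
        span(xm + x, xm - x, ym + y)   # upper half (x is <= 0 here)
        span(xm + x, xm - x, ym - y)   # lower half
        e2 = 2 * err
        if e2 >= dx:                   # step in x
            x += 1
            dx += 2 * b * b
            err += dx
        if e2 <= dy:                   # step in y
            y += 1
            dy += 2 * a * a
            err += dy
        if x > 0:
            break

    # Very flat ellipses can stop early; finish the top and bottom tips.
    while y < b:
        y += 1
        pixels.add((xm, ym + y))
        pixels.add((xm, ym - y))
    return pixels


outline = ellipse_points(0, 0, 3, 2)
solid = ellipse_points(0, 0, 3, 2, filled=True)
```

Calling `ellipse_points(0, 0, r, r)` gives a circle, and `filled=True` replaces each left/right pixel pair with a horizontal span, which is the 'for X' loop idea.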
__________
BrianHG.
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 1135
  • Country: ca
Re: FPGA VGA Controller for 8-bit computer
« Reply #1127 on: June 29, 2020, 04:55:49 pm »
Yes, this is accurate, plus another 10 IOs max to fill out the buffer (level converter) direction controls for the command and address buses as well, so that I have the option of being able to run the whole show from the FPGA with a soft-core CPU if I (or anyone else) wanted to go that route.
Looks like whoever designed their pinouts was on something serious. Check out the pinout diagram of their 381 pin package in attachment. To me that matrix looks easier to break out than the 256 ball one. Even Lattice themselves show an example pinout of that package with only 4 layers, 4 mil traces and 7 mil drills (that might be a problem).

Online BrianHG

« Reply #1128 on: June 29, 2020, 06:19:30 pm »
Hmm.. I have to call it on this ellipse business.   :palm:

Should be easy in theory - I should be able to tweak the circle function to do it, but I'm having trouble finding anything reliable on ellipses that I can translate to Free Basic, and the function I've got isn't working because... well, I can't get it to work.  :-//

Ok, hang on.  Look here for notes on drawing an ellipse inside a rectangle: http://members.chello.at/easyfilter/bresenham.html

Too bad that algorithm uses floats.  But, it is easy to modify if you want to fill the ellipse by adding a for x loop when drawing between the left and right side.

I need to look a little more tonight when I wake up.
For now, start a new dummy Quartus project to simulate the Geometry_xy_plotter.sv.
Code:
You will need inputs:
clk        // system clock
reset      // force reset
cmd_ready  // load the 16 bit data command
cmd_data[15:0] // data bus
draw_busy      // when high, the pixel writer is busy, so the geometry plotter will pause before sending any new pixels

You will need outputs:

load_cmd       //  output high when ready to receive the next cmd_data[15:0] input

draw_cmd_rdy   //  Pulsed high when the data on the draw_cmd[35:0] is ready to send to the pixel writer module.
draw_cmd[35:0] //  bits [35:32] hold an aux function number 0 through 15.
               //  When AUX=0,  do nothing
               //  When AUX=1,  write pixel,                             bits 31:24 hold the color, bits 23:12 hold the Y coordinates, 11:0 hold the X coordinates
               //  When AUX=2,  write pixel with color 0 mask,           bits 31:24 hold the color, bits 23:12 hold the Y coordinates, 11:0 hold the X coordinates
               //  When AUX=3,  write from read pixel,                   bits 31:24 ignored,        bits 23:12 hold the Y coordinates, 11:0 hold the X coordinates
               //  When AUX=4,  write from read pixel with color 0 mask, bits 31:24 ignored,        bits 23:12 hold the Y coordinates, 11:0 hold the X coordinates
               //  When AUX=6,  read source pixel,                       bits 31:24 ignored,        bits 23:12 hold the Y coordinates, 11:0 hold the X coordinates
               //  When AUX=7,  set truecolor pixel color,               bits 31:24 hold the 8 bit alpha blend mix value, bits 23:0 hold the RGB 24 bit color.
               //                                                        Use function AUX 3/4 to draw this color; only works if the destination is set to 16 bit true-color mode

               //  When AUX=14, set destination mem address,  bits 31:24 hold the bitplane mode and bits 23:0 hold the destination base memory address for write pixel.
               //  When AUX=15, set source mem address,       bits 31:24 hold the bitplane mode and bits 23:0 hold the source base memory address for read source pixel.

When copying, yes, having a different bitplane mode for source and destination will work as the pixel writer will have basic conversion logic.
Can you think of anything else?
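
To make the bit layout above concrete, here is a small Python sketch that packs and unpacks a 36-bit draw_cmd word for the pixel functions. The helper names `pack_pixel_cmd` and `unpack_pixel_cmd` are hypothetical, and the sketch assumes the fields are packed exactly as listed: AUX in bits 35:32, color in 31:24, Y in 23:12 and X in 11:0.

```python
def pack_pixel_cmd(aux, color, y, x):
    """Pack one pixel command into a 36-bit integer (hypothetical helper).

    Assumed layout: [35:32] AUX, [31:24] color, [23:12] Y, [11:0] X.
    """
    if not 0 <= aux <= 0xF:
        raise ValueError("aux must fit in 4 bits")
    if not 0 <= color <= 0xFF:
        raise ValueError("color must fit in 8 bits")
    if not 0 <= y <= 0xFFF or not 0 <= x <= 0xFFF:
        raise ValueError("x and y must fit in 12 bits each")
    return (aux << 32) | (color << 24) | (y << 12) | x


def unpack_pixel_cmd(word):
    """Split a 36-bit draw_cmd word back into (aux, color, y, x)."""
    aux = (word >> 32) & 0xF
    color = (word >> 24) & 0xFF
    y = (word >> 12) & 0xFFF
    x = word & 0xFFF
    return aux, color, y, x


# AUX=1 (write pixel), color 0xAB, at Y=0x123, X=0x456
cmd = pack_pixel_cmd(1, 0xAB, 0x123, 0x456)
print(hex(cmd))
```

The 12-bit X and Y fields give a 4096 x 4096 coordinate range, and the range checks stand in for what would simply be fixed bus widths in Verilog.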

« Last Edit: June 29, 2020, 06:22:53 pm by BrianHG »
__________
BrianHG.
 

Online BrianHG

« Reply #1129 on: June 29, 2020, 06:43:05 pm »
Yes, this is accurate, plus another 10 IOs max to fill out the buffer (level converter) direction controls for the command and address buses as well, so that I have the option of being able to run the whole show from the FPGA with a soft-core CPU if I (or anyone else) wanted to go that route.
Looks like whoever designed their pinouts was on something serious. Check out the pinout diagram of their 381 pin package in attachment. To me that matrix looks easier to break out than 256 ball one. Even Lattice themselves showing an example pinout of that package with only 4 layers, 4 mil traces and 7 mil drills (that might be a problem).
:scared:  WTF????????????!?!?!!?!?!!?!?!?!?!?!?!!?!?!?
That's some scary shit.
__________
BrianHG.
 

Online asmi

« Reply #1130 on: June 29, 2020, 07:27:33 pm »
:scared:  WTF????????????!?!?!!?!?!!?!?!?!?!?!?!!?!?!?
That's some scary shit.

I think I figured it out. That big-ass ground section at the bottom is where SERDESes would be bonded out in UM devices. So it kind of make sense. Now I'm actually curious if it's possible to fully break it out using JLCPCB's 4 layer process (3.5 mil traces, 0.2/0.4 mm drills). Their example uses 4 mil traces and 7 mil drills (no via pad size is given).

Online BrianHG

« Reply #1131 on: June 29, 2020, 11:27:47 pm »
I figure you are asking what low-end JLCPCB/PCBWay manufacturing constraints you can get away with for a 0.8mm BGA.  See attached photo for what I use.  It boils down to a 7.9mil drill, 15.8mil via (better to read & properly set these in metric), 4mil trace between BGA pads, 6mil trace for PWR/GND, 17.7mil BGA pad.

Photo has metric unit scale listed:
[attach=1]

If you are going to copy, use the metric figures as they are exact.  If I remember, JLCPCB considers the vias as an 8mil drill, just squeezing into their lower tier PCB fab.  Also, that IC has Deep Color HDMI inputs and I had no problem with its LVDS.  Still, inspect PCBs before assembly.  When I received 12 PCBs from an order of 10, 2 PCBs had a slight drill alignment error.  They were still functional, but the other 10 had dead center drilling.

To begin placement, I place the BGA on a 0.4mm metric grid (well, I work with a 0.1mm electrical grid), then paint a complete grid of vias offset at 45 degrees by 0.4x0.4mm.  Add a grid copy of top layer tracks at 45 degrees to all vias, pointing outward from the center of the IC.  Then work backwards by erasing the vias & tracks which are not needed.  I also group resize the top PWR & GND tracks from 4mil to 6mil.  Then finish fanning out the top layer before going to the bottom layer.

For power decoupling caps, I use 0402s (imperial) compact footprint on the bottom layer with 0.45mm (17.7mil) tracks across the PWR vias.  The 0402 allows for 1 4mil data trace in-between the pads.

Here is the bottom layer.  It's a little messed up as this IC has 6 separate power supply voltages (PVDD, CVDD, TVDD, AVDD, DVDD, DVDDIO) plus GND, but this was a 6 layer PCB.
[attach=2]
The FPGA should be much neater with fewer caps and only 2 or 3 supply voltages.
Those caps will most likely sit in a square under the innermost row of IOs.
« Last Edit: June 30, 2020, 01:06:41 am by BrianHG »
__________
BrianHG.
 

Online BrianHG

« Reply #1132 on: June 29, 2020, 11:46:56 pm »
Oooops, forgot the clearance rules, see below:

[attach=1]

Basically, 4mil between tracks, 10mil for power plane and polygon fills.
Once again, use the Metric figures in my attached photo.
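
As a sanity check on these numbers, the following Python sketch works out how much room is left around one trace routed between two adjacent BGA pads, and between a pad and a via offset by 0.4x0.4mm. It uses only the figures quoted in this thread (0.8mm pitch, 17.7mil pad, 15.8mil via, 4mil trace, 4mil clearance). The function names are invented for the example, and a real fab's rules may include extra terms such as solder mask or annular ring limits, so treat it as rough arithmetic only.

```python
import math

MM_PER_MIL = 0.0254  # 1 mil = 1/1000 inch = 0.0254 mm


def mm_to_mil(mm):
    """Convert millimetres to mils."""
    return mm / MM_PER_MIL


def trace_clearance_between_pads(pitch_mil, pad_mil, trace_mil):
    """Copper-to-copper gap on each side of one trace centered between two pads."""
    gap = pitch_mil - pad_mil          # free space between neighbouring pads
    return (gap - trace_mil) / 2


def pad_to_via_clearance(offset_mil, pad_mil, via_mil):
    """Gap between a round pad and a via offset diagonally by (offset, offset)."""
    center_distance = math.hypot(offset_mil, offset_mil)
    return center_distance - pad_mil / 2 - via_mil / 2


pitch = mm_to_mil(0.8)                                   # BGA ball pitch
side = trace_clearance_between_pads(pitch, 17.7, 4.0)    # 4mil trace between pads
via_gap = pad_to_via_clearance(mm_to_mil(0.4), 17.7, 15.8)

print(f"trace-to-pad clearance: {side:.2f} mil")
print(f"pad-to-via clearance:   {via_gap:.2f} mil")
```

Both results land a little above the 4mil rule (roughly 4.9mil and 5.5mil), which matches the idea that a single 4mil trace between 0.8mm pitch pads just fits a cheap fab's limits.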

With a 4 layer board and the LFE5U-45 caBGA-256, you won't be able to route around 30 of the 197 available IOs, resulting in 167 routed IOs.  With a 6 layer PCB, you will have access to every IO.

You might reclaim some of those final difficult to route 30 IOs with a few short traces on the GND polygon layer, but they will go through additional vias, so use them for signals in the 25MHz and below range.  I would usually place the bprom configuration & reset signals here.
« Last Edit: June 30, 2020, 01:13:00 am by BrianHG »
__________
BrianHG.
 

Online asmi

« Reply #1133 on: June 30, 2020, 01:19:10 am »
With a 4 layer board, and the LEF5U45/85 caBGA-256, you wont be able to route around 30 IOs of the available 197 resulting in 167 available routed IOs.  With a 6 layer PCB, you will have access to every IO.
No, I'm more interested in caBGA-381 as it seems to me that it's possible to fully break it out on 4 layer board.

You might reclaim some of those final difficult to route 30 IOs with a few short traces on the GND polygon layer, but they will go through additional vias, so use them for signals in the 25MHz and below range.  I would usually place the bprom configuration & reset signals here.
The fab I usually use can do 3/3 mil (75um) tracks and 0.15/0.3 mm vias (and the quality of their boards is on a whole new level compared to the likes of JLCPCB, who try to cut every corner in the process to shave another cent off production cost, even if quality suffers), so for me 2 tracks between pads/vias is not a problem. But I'm genuinely intrigued by Lattice's higher pin count packages (caBGA-554 and 756) because they are not full matrices and seem to be designed for low layer count boards, while providing a ton of IO pins. If I am to take Lattice's word for it, they promise full breakout of the 756 ball package (365 user IO!) on just 2 signal layers! I can't think of any other FPGA which would allow such a feat. Too bad the fabric in those devices is about two times slower than in 7 series, otherwise it would be one hell of a device!

Online BrianHG

« Reply #1134 on: June 30, 2020, 02:08:52 am »
The fab I usually use can do 3/3 mil (75um) tracks and 0.15/0.3 mm vias (and the quality of their boards is a whole new level compared to the likes of JLCPCB who tries to cut every corner in the process to shave off another cent off production cost, even if quality suffers).
Now, for this project, nockieboy wants the cheapest functional PCB that anyone can get quickly, as it is an open-source project where online users will most likely default their orders to JLCPCB/PCBWay.  Make no mistake, for some of my mass production designs, we have below 2 mil traces with laser drilled vias and 22GB/s LVDS.  I do not use JLCPCB for this work.  My recommendations offer a sweet spot for these cheap fab houses while still trying to route as many of the BGA IOs as possible with 4 layers.

Yes, with a finer fab, you can push this to a 2 layer board, but you won't really save on fab cost until you mass produce PCBs.

« Last Edit: June 30, 2020, 02:13:18 am by BrianHG »
__________
BrianHG.
 

Online asmi

« Reply #1135 on: June 30, 2020, 02:40:43 am »
Now, for this project, nockieboy wants the cheapest functional PCB which will be quickest time to anyone as it is an open-source project where online users will most likely default their orders to JLPCB/PCBway.  Make no mistake, for some of my mass production designs, we have below 2 mil trace with lazer drill vias and 22GB/s LVDS.  I do not use JLPCB for this work.  My recommendations offer a sweet spot for these cheap fab houses while still trying to route as many of the BGA IOs as possible with 4 layers.
The fab I use is not that much more expensive than JLCPCB. So I'm not talking about some kind of high-end fab, just the one which cares about quality. And I (and my clients) certainly appreciate that and are prepared to pay a bit of premium.

Yes, with a finer FAB, you can push this to a 2 layer board, but you wont really save on FAB cost until you mass produce PCBs.
2 layer board is going to be garbage because of signal integrity no matter what.

Online BrianHG

« Reply #1136 on: June 30, 2020, 02:55:09 am »
LOL, a 4 megabyte (36 megabit) ZBT ram for 15$.  Too bad you need to purchase 800 of them.
https://www.verical.com/pd/integrated-silicon-solution-sram-IS61NVF102436B-6-5TQL-TR-1194781?utm_campaign=verical_findchips_2019&utm_currency=&utm_medium=aggregator&utm_source=findchips&utm_content=inv_listing

That would be every ounce of the Z80's accessible 4 megabytes, all directly addressable as picture, program and blitter sprite memory.  With the 133MHz speed, you are talking 3 MAGGIES which can access all 4 megabytes (4 MAGGIES with a little more effort), plus the Z80 & geometry engine with potentially 0 wait-state access for a 75MHz Z80.  You would still have access to the 15 other MAGGIES channels as well, which would have around 192kb for their address range.  The 3-4 ZBT MAGGIES can still drive the 15 FPGA MAGGIES and vice-versa.  You would have a hell of a lot of room for audio now as well.  It would be well worth placing an accelerated multichannel MP3 decoder/encoder in the FPGA on top of the FM synthesizer, so you can play back and mix in compressed high quality sound effects for games.

Here is a 2 megabytes ZBT, 16$ for 1, however, this one supports 200MHz: (7 MAGGIES)
https://www.mouser.com/ProductDetail/ISSI/IS61NLP51236B-200TQLI?utm_term=IS61NLP51236&qs=rnuWwpjhhVu7EtaUBW4FOg%3D%3D&utm_campaign=IS61NLP51236B-200TQLI&utm_medium=aggregator&utm_source=findchips&utm_content=ISSI

These guys require 32 bits for the data bus, not 16.  But, they are more available than the 18bit versions.

2 Megabyte, 18 bit version, fewer available: (7 MAGGIES, you can always place 2 of them for 4 megabytes)
https://www.mouser.com/ProductDetail/ISSI/IS61NLP102418B-200TQLI?utm_term=IS61NLP102418&qs=UxK%2FDTs6dB6rHPCxdOLQcw%3D%3D&utm_campaign=IS61NLP102418B-200TQLI&utm_medium=aggregator&utm_source=findchips&utm_content=ISSI

In quantity, the 2 megabyte versions go below 10$.

Even Micron makes the 2 megabyte ZBT Srams (36 bit version):
https://www.rocelec.com/part/CYPMCNMT55L512Y32PT-6?utm_medium=buyNow&utm_source=findChips
18 bit version:
https://www.rocelec.com/part/CYPMCNMT55L1MY18PT-6?utm_medium=buyNow&utm_source=findChips
« Last Edit: June 30, 2020, 03:06:23 am by BrianHG »
__________
BrianHG.
 

Online asmi

« Reply #1137 on: June 30, 2020, 03:23:04 am »
LOL, a 4 megabyte (36 megabit) ZBT ram for 15$.  Too bad you need to purchase 800 of them.
I think you guys will need to modify your video core to work with longer latency memory. Modern video cards use GDDR6 memory, which has hundreds of cycles of latency, and that doesn't stop video cores from saturating these interfaces despite the insane bandwidth (nowadays measured in hundreds of gigabytes per second) they afford. One thing to think about: video card resources don't have to be stored in video memory as flat contiguous chunks, but can have any structure which facilitates the algorithms used by the core.

Online BrianHG

« Reply #1138 on: June 30, 2020, 03:43:33 am »
I need to read up on GSI Technologies '18Mb Pipelined and Flow Through Synchronous NBT SRAM'.  They are more available, a bit cheaper and supposed to have a mode similar to ZBT as well as up to 333MHz speed.  Your 15$ FPGA has a 200MHz limit on the IOs in SDR, but in DDR, maybe with a proper layout, you will achieve 11-12 MAGGIES on all 2 megabytes.

36Mb Pipelined and Flow Through Synchronous NBT SRAM at 200MHz is available at under 27$ for 1 while the 18Mb versions are around 14$ for 1.
__________
BrianHG.
 

Online BrianHG

« Reply #1139 on: June 30, 2020, 04:09:19 am »
LOL, a 4 megabyte (36 megabit) ZBT ram for 15$.  Too bad you need to purchase 800 of them.
I think you guys will need to modify your video core to work with longer latency memory. Modern videocards use GDDR6 memory which have hundreds of cycles-long latency, and that doesn't stop video cores to saturate these interfaces despite insane bandwidth(which nowadays is measured in 100's of gigabytes per second) afforded by these interfaces. One thing to think about - video card resources don't have to be stored in video memory as flat contiguous chunks of memory, but can have any structure which facilitates algorithms utilized by the core.
     Those cards have megabytes of smart cache just for the RAM controller on their GPU die.  All sprites and overlays are now software rendered and stored in separate chunks of RAM.  Now, if we were to redesign the GPU for all-software rendering of windows from multiple source RAM images to a destination screen display RAM, with a software configurable rendering display manager and a full instruction set for the GPU to perform all these tasks with a software stack, then yes.  With 1 beginner developer, a few years from now, we will have re-engineered a semi-modern obsolete 2D video card accelerator.  A lot of tricks were designed in to give him hardware video windows or sprites (the MAGGIES) which can be configured any way whatsoever, with multiple palette chunks assigned to each layer, each one with different bits/color and different resolutions.  And there is only a little bit of code left to add to get hardware pixel collision through each layer, identifying if any 2 pixels which don't contain the transparent color 0 sit on top of each other.  It has really been designed to be something like the ultimate 8-bit CPU gaming display engine, where complete character animations and background scrolls can be achieved by updating only 4-8 bytes per display element & reading 2 bytes per display element/layer to determine 16 possible collisions.
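
The pixel collision idea can be sketched in a few lines of Python. This is only a software model under stated assumptions, not the Verilog in this project: the function name `collision_masks` is invented, each layer is modeled as one row of pixel values with 0 meaning transparent, and the result is one bitmask per layer where bit j set means "this layer overlapped layer j somewhere". With up to 16 layers, such a mask would fit in the 2 bytes per layer mentioned above, but the actual register layout is a guess.

```python
def collision_masks(layers):
    """Return one collision bitmask per layer.

    layers: list of equal-length pixel rows, one row per layer,
            where a value of 0 is the transparent color.
    Bit j of masks[i] is set if layers i and j both had a
    non-transparent pixel at the same screen position.
    """
    masks = [0] * len(layers)
    for column in zip(*layers):            # same screen position in every layer
        opaque = [i for i, px in enumerate(column) if px != 0]
        if len(opaque) >= 2:               # at least two layers overlap here
            bits = sum(1 << i for i in opaque)
            for i in opaque:
                masks[i] |= bits & ~(1 << i)   # do not flag a layer against itself
    return masks


layers = [
    [0, 5, 0, 0],   # layer 0
    [0, 7, 0, 3],   # layer 1
    [1, 0, 0, 2],   # layer 2
]
print([bin(m) for m in collision_masks(layers)])
```

In hardware the same test would run once per displayed pixel as the layers are mixed, with the masks held in sticky registers that the Z80 reads and clears.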

nockieboy just wants to tie in something and go.  I think the 2-4$ 512kb ram chip is fine for an 8 bit 8MHz Z80 while it requires 0 verilog coding experience to get it to work.

Remember, most old Z80 type 8-bit personal computers only had to worry about 320x240 at 15kHz (140ns per pixel) with 1 bit color per pixel, or 160x240 at 4 colors in graphics mode.  Maybe 16 colors in text mode.  This GPU does 15 parallel superimposed windows at 640x480 @ 32kHz x 16 bits per pixel, or 8 bit palette color, with 0 wait states on the Z80 access, plus now a hardware accelerated geometry engine with blitter copy on a 15$ FPGA.  If I wanted only 1 screen with 640x480 @ 1 bit color, or 16 color text like nockieboy's original spec, or even just a 32 bit color 640x480 graphics screen with a geometry engine, then yes, a single DRAM chip would be easy with the dumbest DRAM controller you could think of.  But it wouldn't have some of the goodies of the old days' 8-bit microcomputer style graphics, now with full color and hi-res output.
« Last Edit: June 30, 2020, 04:20:26 am by BrianHG »
__________
BrianHG.
 

Online asmi

« Reply #1140 on: June 30, 2020, 06:17:41 am »
Those cards have megabytes of smart cache just for their ram controller on their GPU die.  All sprites and overlays are now software rendered and stored in separate chunks of ram. 
These cards also render billions of triangles per second. So let's scale it back by a factor of 1000 - can you render millions of triangles per second with kilobytes of "smart" cache (whatever that is)?

Now if we were to redesign the GPU for all software rendering windows form multiple source ram images to a destination screen display ram, with a software configurable rendering display manager, with a full instruction set for the GPU to perform all these tasks with a software stack, yes.  With 1 beginner developer, a few years from now, we will have re-engineered a semi modern obsolete 2D video card accelerator.  A lot of tricks were designed in to give him hardware video windows or sprties (the MAGGIES) which can be configured anyway what-so-ever with multiple palette chunks assigned to each layer, each one with different bits/color and different resolutions.  And there is only a little bit of code left to add to get hardware pixel collision through each layer identifying if any 2 pixels which don't contain the transparent color 0 sit on top of each other.  It has really been designed to be something like the ultimate 8bit CPU gaming display engine where complete character animations and background scrolls can be achieved only by updating 4-8 bytes per display element & reading 2 bytes per display element/layer to determine 16 possible collisions.
That is a lot of hand-waving. Let's consider facts - now you have a fully obsolete 2D video card accelerator. So devising just a "semi-obsolete" one will be a big step ahead ;) And no, I don't think it will take years. And even if it does - who cares, as long as it's fun to work on? ;) I, for one, would follow such a project really closely, and would find a way to contribute, because I have great interest in the subject of 2D/3D rendering using modern approaches, even if they are implemented on a much smaller scale than what modern video cards can do. I'm not interested in using ancient techniques though. I have some experience developing 3D engines using 3D GFX APIs, so I have some knowledge of the subject, though of course it's mostly on the software side, which is why I have a keen interest in what's going on on the HW side of these APIs.

nockieboy just wants to tie in something and go. 
I can't read his mind, but I think this thread is proof positive that that is absolutely not his intention. The way I see it, it's a typical hobby project where the process is just as important as the result, so once you achieve some goal, you invent another one, move the goal post and keep going. In fact, if my memory serves me, most of the features you listed above were not in the original list of requirements, so they were added on the go, which again is perfectly fine for a "permanently in-progress" hobby project.

I think the 2-4$ 512kb ram chip is fine for an 8 bit 8MHz Z80 while it requires 0 verilog coding experience to get it to work.
That is a classic manifestation of sunk cost fallacy. When your solution becomes inadequate to the task at hand, you don't design crutches for it, instead you re-engineer it so it becomes adequate, and if that doesn't work - you throw it away and start from the ground up. The fact that you need an expensive memory to scale your solution up from measly 640x480 is a proof that existing solution is hitting the wall, and needs to be either significantly re-engineered, or just thrown away and re-designed using a different approach which would scale better.

Remember, most old Z80 type 8bit personal computers only has to worry about 320x240 at 15Khz (140ns per pixel), 1 bit color per pixel, 160x240 at 4 colors in graphics mode.  Maybe 16 colors in text mode.  This GPU does 15 parallel superimposed windows at 640x480 @ 32Khz x 16 bits per pixel, or 8 bits palette color, with 0 wait states on the Z80 access, plus now a hardware accelerated geometry engine with blitter copy on a 15$fpga.  If I wanted only 1 screen with 640x480 @ 1 bit color, or 16 color text like nockieboy's original spec, or even just a graphics 32 bit color 640x480 with a geometry engine, yes, a single dram chip would be easy with the dumbest dram controller you could think of.  But it wouldn't have some of the goodies of of the old days 8-bit microcomputer style graphics, but now with full color and hi-res output.
That ship sailed a loooooong time ago in case you hadn't noticed, as Z80 systems never had anything even remotely like what you have now. In fact, if it's anything like my hobby projects, I absolutely wouldn't be surprised if tomorrow nockieboy comes around and says "you know what - that Z80 thingy is becoming too much of a limitation, so let's get rid of it and replace it with some 32 bit CPU, like RISC-V!" :) This kind of evolution is completely normal, and there is no need to hang on to existing solutions, as long as the process is still fun for those involved. At the end of the day, hobby projects are all about fun first and foremost. I don't think using this Z80 system is nearly as much fun as designing it :D
« Last Edit: June 30, 2020, 06:29:31 am by asmi »
 

Online BrianHG

« Reply #1141 on: June 30, 2020, 07:37:55 am »
@asmi, I think you are going off the deep end here.  If you want a 2D/3D accelerated GPU, you may start a thread on the subject and I will contribute.  A decade ago I touched on the subject matter, and there is a lot to do and set up just to draw 100 million triangles a second.  Each triangle will need to be filled, each having potentially, say, an average of 4000 pixels, and adding some intelligence like read-ahead cache and omitting hidden triangles beneath others still requires a degree of planning and design.

This project started with a wish to have 640x480 with 1 bit color, or 320x240 by 16 colors.  Being a beginner project for nockieboy, we played around with a sync generator, then a 1 bit font and text.  Then, with what a small FPGA could do, we created a standardized method to address a few layers of video.  If nockieboy wants to abandon this project and begin anew with a simple 32 bit, large RAM, page flip-able display GPU, with no MAGGIE but a GPU accelerated programmable core that renders a double buffered display from strings of stored geometry data structures all on its own like a modern video card, he is free to do so.  (Not to mention he will need to create his own compiler to construct the geometric code running in the video card.)  Remember, his current CPU upgrade wish is to add support for a 2-4MHz 6502, not a 100MHz 486, hence the thread title 'FPGA VGA Controller for 8-bit computer'.

Now, hooking DDR RAM to the current design is doable and can still offer everything that comes with large memory, and the limitations are minor as the existing core still offers additional internal RAM for effects layers.  But it will take some time to get it right unless I program the whole thing myself.  I am offering trade-off solutions and options as my free time is shrinking and nockieboy will have to do all the new development on his own.

Quote
"smart" cache (whatever that is)?
Smart cache reads ahead, does branch prediction, holds extra content which will be used after plotting a line or row, or does simple write-back when writing into the same memory space before sending it back to RAM.  In fact, our 'pixel writer' coming after the geometry unit will have a smart cache, except it is only 2 words long for the pixel reader and 2 words long for the pixel writer.  It will speed up any horizontally drawn lines and all fills up to 8 fold.  Most of our objects only draw 1 byte, or even parts of a byte, written one after the other, and 1 word can contain up to 16 pixels.  The 2 word cache will allow these types of reads and writes to be done multiple times immediately, and only offload the data to or from the RAM once a new, completely different word of memory is required.  With DDR2/3 or Hyperbus RAM, these 2 words would be extended to the burst size x 2, so 16 - 32 bytes each.  Still doable, with a lot more work.
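
A toy Python model may help show why such a tiny cache pays off. It is a simplification under stated assumptions, not the planned pixel writer: the class name `PixelWriteCache` is invented, it holds a single 16-bit word rather than 2, pixels are packed least significant bits first (a guess), and RAM is just a dictionary with access counters.

```python
class PixelWriteCache:
    """One-word write-back cache in front of a word-addressed RAM (toy model)."""

    def __init__(self, bits_per_pixel=1, word_bits=16):
        self.bpp = bits_per_pixel
        self.pixels_per_word = word_bits // bits_per_pixel
        self.ram = {}            # word address -> word value
        self.addr = None         # address of the cached word, if any
        self.word = 0
        self.dirty = False
        self.ram_reads = 0
        self.ram_writes = 0

    def flush(self):
        """Write the cached word back to RAM if it was modified."""
        if self.addr is not None and self.dirty:
            self.ram[self.addr] = self.word
            self.ram_writes += 1
        self.dirty = False

    def _load(self, addr):
        """Make sure the cached word is the one at addr."""
        if addr != self.addr:
            self.flush()
            self.word = self.ram.get(addr, 0)
            self.ram_reads += 1
            self.addr = addr

    def write_pixel(self, index, color):
        """Write one pixel, touching RAM only when the word address changes."""
        addr, slot = divmod(index, self.pixels_per_word)
        self._load(addr)
        shift = slot * self.bpp
        mask = ((1 << self.bpp) - 1) << shift
        self.word = (self.word & ~mask) | ((color << shift) & mask)
        self.dirty = True


cache = PixelWriteCache(bits_per_pixel=1)
for px in range(32):             # a 32 pixel horizontal run at 1 bit per pixel
    cache.write_pixel(px, 1)
cache.flush()
print(cache.ram_reads, cache.ram_writes)
```

In this model the 32 single-pixel writes collapse into 2 word reads and 2 word writes, while a vertical line would still touch a new word for every pixel, which is why the gain is quoted for horizontal lines and fills.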

There is obviously more to this to improve design and efficiency.  I guess this is why CPUs have a L1 cache, then a L2 cache, then a L3 cache, then system memory.

However, there is one thing I can agree upon.  If nockieboy wants the engineering headache, place 2 independently wired Hyperbus Rams for 32/64 megabytes.  It will take a lot longer to get it to function, but he will have a ton of memory space whether he makes the memory interface efficient and extra fast or basically good enough to function.
« Last Edit: June 30, 2020, 08:01:41 am by BrianHG »
__________
BrianHG.
 

Offline nockieboy

« Reply #1142 on: June 30, 2020, 10:58:30 am »
Ok, hang on.  Look here for notes on drawing an ellipse inside a rectangle: http://members.chello.at/easyfilter/bresenham.html

Too bad that algorithm uses floats.  But, it is easy to modify if you want to fill the ellipse by adding a for x loop when drawing between the left and right side.

I like that site, wish I'd found that earlier! :)  Have attached latest Geo.bas and compiled drawing.bin to show the working ellipse function.

I need to look a little more tonight when I wake up.
For now, star a new dummy Quartus project to simulate the Geometry_xy_plotter.sv.
...
When copying, yes, having a different bitplane mode for source and destination will work as the pixel writer will have basic conversion logic.

Okay, test project started in Quartus II with the inputs/outputs you've specified.  Nothing really to show yet, but I've attached geometry_xy_plotter.sv for info.

Can you think of anything else?

I'm hoping that's not a trick question, because right now I can't think of anything else.  ???

EDIT: Just a thought, but what about a general FILL function?  Specify a point coordinate and a colour and off it goes, filling in all pixels of the colour at the point coordinate with the new colour, until bounded by other colours?  Or would that be too far?

I guess as an alternate to this, I'm thinking about a filled polygon function?  Perhaps based off of the filled square, but being able to specify all four corner positions?
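
To pin down what I mean by a general FILL, here is a rough Python sketch of a stack-based flood fill. It is only a software model under my own assumptions (4-connected neighbours, the screen as a list of rows); nothing here reflects how the FPGA would do it, and the function name is made up.

```python
def flood_fill(screen, x, y, new_colour):
    """Recolour the connected region containing (x, y); return pixels changed.

    screen is a list of rows of colour values. Only 4-connected neighbours
    with the same colour as the starting pixel are filled.
    """
    height, width = len(screen), len(screen[0])
    old_colour = screen[y][x]
    if old_colour == new_colour:
        return 0                      # nothing to do, and avoids looping forever
    stack = [(x, y)]
    filled = 0
    while stack:
        px, py = stack.pop()
        if not (0 <= px < width and 0 <= py < height):
            continue                  # off screen
        if screen[py][px] != old_colour:
            continue                  # boundary, or already filled
        screen[py][px] = new_colour
        filled += 1
        stack.extend([(px + 1, py), (px - 1, py), (px, py + 1), (px, py - 1)])
    return filled


screen = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(flood_fill(screen, 0, 0, 5))    # fills the region outside the box
for row in screen:
    print(row)
```

Note that every step has to read a pixel back before deciding whether to write, and the stack can grow with the size of the region, which is part of why I'm asking whether it would be too far.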


I need to read up on GSI Technologies '18Mb Pipelined and Flow Through Synchronous NBT SRAM'.  They are more available, a bit cheaper and supposed to have a mode similar to ZBT as well as up to 333MHz speed, though your 15$ FPGA has a 200MHz limit on the IOs in SDR, but in DDR, maybe with a proper layout, you will achieve 11-12 MAGGIES on all 2 megabytes.

36Mb Pipelined and Flow Through Synchronous NBT SRAM at 200MHz is available at under 27$ for 1 while the 18Mb versions are around 14$ for 1.

I know you guys have discussed this all already whilst I've been sleeping, but (for this project at least) $27 (or even $14) for a RAM chip is a bit expensive and I'd need to be convinced of the benefits before pulling the trigger on that.

nockieboy just wants to tie in something and go.  I think the 2-4$ 512kb ram chip is fine for an 8 bit 8MHz Z80 while it requires 0 verilog coding experience to get it to work.

Sounds good to me. Don't forget that I already consider the design and assembly of this card to be pretty advanced for a beginner - minimising the number of (relatively) expensive components that could be difficult to solder seems like a good strategy.


nockieboy just wants to tie in something and go. 
I can't read his mind, but I think this thread is proof positive that that is absolutely not his intention. The way I see it, it's a typical hobby project where the process is just as important as the result, so once you achieve some goal, you invent another one, move the goal post and continue going. In fact, if my memory serves me, most of the features you listed above were not in the original list of requirements, so they were added on the go, which again, is perfectly fine for a "permanently in-progress" hobby project.

To some degree you're right, asmi, but I do have a specific goal in mind and I outlined that near the start of this thread.  In more general terms, I wanted to free my little DIY computer from the host PC, breaking away from the serial console and giving it its own video output and keyboard interface so I could use it as a standalone system.  I don't recall if I ever admitted this, but my holistic aim was to be able to create a computer, from scratch, that I could plug into the living-room TV and play Pong (or my home-coded version of it) with the family.  That was my gold-standard end goal.

Now, technically I could write Pong for it at the moment and that would be that, but you're right in that it's a typical hobby project and I have become quite addicted to the process as much as the end result.  Feature creep is a very significant factor, as is learning about this world of technicalities and skills I never, ever, thought I'd wind up learning about and developing.  Originally the design for my computer was all through-hole components.  I'm now soldering TSSOPs, 0603 passives and QFP-144s without breaking a sweat and seriously planning for the move to BGA.  I'm not an electronics engineer.  I'm not a programmer.  I have no formal education in either domain.  But I do like to learn - and if I can build something fun and practical whilst I'm doing that, so long as the 'next step' isn't insurmountable, then I'll always want to push further as long as you kind folks are willing to put up with my ignorance and stupidity.  :)


Remember, his current cpu upgrade wish is to add support for a 2-4MHz 6502, not a 100MHz 486, hence the thread title 'FPGA VGA Controller for 8-bit computer'.

Err.. a 6502 isn't an upgrade from a Z80, it's more of a downgrade.  Just sayin'. ;)  There is a chance in the future that I'll move up to the 16-bit Motorola 68010 as I have one sitting in a box somewhere, but that's still a way off.

« Last Edit: June 30, 2020, 11:58:59 am by nockieboy »
 

Offline nockieboy

  • Frequent Contributor
  • **
  • Posts: 879
  • Country: gb
Re: FPGA VGA Controller for 8-bit computer
« Reply #1143 on: June 30, 2020, 11:36:42 am »
Have just added filled ellipses.  Updated Geo.bas, Drawing.asm and drawing.bin below.
« Last Edit: June 30, 2020, 11:59:12 am by nockieboy »
 

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 4003
  • Country: ca
Re: FPGA VGA Controller for 8-bit computer
« Reply #1144 on: June 30, 2020, 03:47:12 pm »
Have just added filled ellipses.  Updated Geo.bas, Drawing.asm and drawing.bin below.
Nice.  Optimizing it by converting the longs to ints, like in the line drawing algorithm, shouldn't be a problem.

We were missing a few controls, so I added in the geo.bas: (Full files attached below.)
Code: [Select]
if func1<8 then
REM *************************************************************
REM **** Set 24 bit screen memory registers ************
REM *************************************************************
if func2=127 and func3=0   then destmem = x0*4096 + y0: Rem set 24 bit destination screen memory pointer for plotting
if func2=127 and func3=1   then destmem = x1*4096 + y1: Rem set 24 bit destination screen memory pointer for plotting
if func2=127 and func3=2   then destmem = x2*4096 + y2: Rem set 24 bit destination screen memory pointer for plotting
if func2=127 and func3=3   then destmem = x3*4096 + y3: Rem set 24 bit destination screen memory pointer for plotting
if func2=127 and func3=4   then srcmem  = x0*4096 + y0: Rem set 24 bit source screen memory pointer for blitter copy
if func2=127 and func3=5   then srcmem  = x1*4096 + y1: Rem set 24 bit source screen memory pointer for blitter copy
if func2=127 and func3=6   then srcmem  = x2*4096 + y2: Rem set 24 bit source screen memory pointer for blitter copy
if func2=127 and func3=7   then srcmem  = x3*4096 + y3: Rem set 24 bit source screen memory pointer for blitter copy

if func2=127 and func3=8   then destmem = x0*4096 + y0:srcmem  = x1*4096 + y1: Rem both source and destination pointers for blitter copy
if func2=127 and func3=9   then destmem = x1*4096 + y1:srcmem  = x0*4096 + y0: Rem both source and destination pointers for blitter copy
if func2=127 and func3=10  then destmem = x2*4096 + y2:srcmem  = x3*4096 + y3: Rem both source and destination pointers for blitter copy
if func2=127 and func3=11  then destmem = x3*4096 + y3:srcmem  = x2*4096 + y2: Rem both source and destination pointers for blitter copy

if func2=127 and func3=12  then dest_width   = x2  : Rem Sets the number of bytes per horizontal line in the destination raster
if func2=127 and func3=13  then src_width    = y2  : Rem Sets the number of bytes per horizontal line in the source raster
if func2=127 and func3=14  then dest_width   = x3  : Rem Sets the number of bytes per horizontal line in the destination raster
if func2=127 and func3=15  then src_width    = y3  : Rem Sets the number of bytes per horizontal line in the source raster

REM *************************************************************
REM **** Set screen width and height limits ************
REM *************************************************************

if func2=126 and func3=0  then max_x = x0: max_y = y0 : REM set the maximum width and height of the screen
if func2=126 and func3=1  then max_x = x1: max_y = y1 : REM set the maximum width and height of the screen
if func2=126 and func3=2  then max_x = x2: max_y = y2 : REM set the maximum width and height of the screen
if func2=126 and func3=3  then max_x = x3: max_y = y3 : REM set the maximum width and height of the screen

if func2=126 and func3=16 then draw_collision = 0     : REM Clear the pixel drawing collision counter
if func2=126 and func3=17 then blit_collision = 0     : REM Clear the blitter copy pixel collision counter
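
The `x*4096 + y` expressions above join two 12 bit coordinate registers into one 24 bit memory pointer, with the X register supplying the upper 12 bits. A small Python sketch of the same arithmetic, in case it helps when writing the Z80 side; the helper names are made up for illustration.

```python
def make_address24(x_reg, y_reg):
    """Combine two 12-bit registers into a 24-bit pointer (x*4096 + y)."""
    if not (0 <= x_reg < 4096 and 0 <= y_reg < 4096):
        raise ValueError("both registers must fit in 12 bits")
    return (x_reg << 12) | y_reg      # same value as x_reg * 4096 + y_reg


def split_address24(addr):
    """Split a 24-bit pointer back into the (x, y) register pair."""
    return addr >> 12, addr & 0xFFF


addr = make_address24(0x012, 0x345)
print(hex(addr), split_address24(addr))
```

So to point `destmem` at a given address, the upper 12 bits go into the X register and the lower 12 bits into the Y register before issuing the func2=127 command.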

And the additions to the 'geometry_xy_plotter.sv':
Code: [Select]
module geometry_xy_plotter (

    input clk,              // System clock
    input reset,            // Force reset
    input cmd_ready,        // 16-bit data command ready signal
    input [15:0] cmd_data,  // 16-bit data command bus
    input draw_busy,        // HIGH when pixel writer is busy, so geometry plotter will pause before sending any new pixels
   
    output load_cmd,        // HIGH when ready to receive next cmd_data[15:0] input
    output draw_cmd_rdy,    // Pulsed HIGH when data on draw_cmd[35:0] is ready to send to the pixel writer module
    output [35:0] draw_cmd, // Bits [35:32] hold AUX function number 0-15:
                            //  AUX=0  : Do nothing
                            //  AUX=1  : Write pixel,                             : 31:24 color         : 23:12 Y coordinates : 11:0 X coordinates
                            //  AUX=2  : Write pixel with color 0 mask,           : 31:24 color         : 23:12 Y coordinates : 11:0 X coordinates
                            //  AUX=3  : Write from read pixel,                   : 31:24 ignored       : 23:12 Y coordinates : 11:0 X coordinates
                            //  AUX=4  : Write from read pixel with color 0 mask, : 31:24 ignored       : 23:12 Y coordinates : 11:0 X coordinates
                            //  AUX=6  : Read source pixel,                       : 31:24 ignored       : 23:12 Y coordinates : 11:0 X coordinates
                            //  AUX=7  : Set Truecolor pixel color                : 31:24 8 bit alpha blend mix value : bits 23:0 hold RGB 24 bit color
                            //                                                    Use function Aux3/4 to draw this color, only works if the destination is set to 16 bit true-color mode

                            //  AUX=10 : Resets the Write Pixel collision counter
                            //  AUX=11 : Resets the Write from read pixel collision counter
                            //  AUX=12 : Set destination raster width in bytes    : 15:0 holds destination raster image width in #bytes so the proper memory address can be calculated from the X&Y coordinates
                            //  AUX=13 : Set source raster width in bytes,        : 15:0 holds source raster image width in #bytes so the proper memory address can be calculated from the X&Y coordinates
                            //  AUX=14 : Set destination mem address,             : 31:24 bitplane mode : 23:0 hold destination base memory address for write pixel
                            //  AUX=15 : Set source mem address,                  : 31:24 bitplane mode : 23:0 hold the source base memory address for read source pixel

    output [7:0] Write_col, // An 8 bit saturating counter which counts the number of pixel write collisions
    output [7:0] Copy_col,  // An 8 bit saturating counter which counts the number of blit write from read pixel write collisions
    output idle             // an output which goes high when the geometry plotter is finished and is doing nothing
);
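
As a cross-check of the `draw_cmd` bit layout in the comments above, here is a short Python sketch that packs and unpacks the 36 bit word for the pixel commands (AUX 1 to 6). It only models the field positions listed in the port comments; the function names are made up, and it says nothing about timing or handshaking.

```python
def pack_draw_cmd(aux, colour, y, x):
    """Pack AUX[35:32], colour[31:24], Y[23:12] and X[11:0] into one word."""
    return ((aux & 0xF) << 32) | ((colour & 0xFF) << 24) | ((y & 0xFFF) << 12) | (x & 0xFFF)


def unpack_draw_cmd(word):
    """Return (aux, colour, y, x) from a 36-bit draw_cmd word."""
    return (word >> 32) & 0xF, (word >> 24) & 0xFF, (word >> 12) & 0xFFF, word & 0xFFF


# AUX=1 (write pixel), palette colour 9, at X=300, Y=350.
word = pack_draw_cmd(1, 9, 350, 300)
print(hex(word), unpack_draw_cmd(word))
```

The same packing helper can be used to generate test vectors when simulating the module.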

Now I think you are ready to begin coding the 'geometry_xy_plotter.sv'.
« Last Edit: June 30, 2020, 03:49:05 pm by BrianHG »
__________
BrianHG.
 

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 4003
  • Country: ca
Re: FPGA VGA Controller for 8-bit computer
« Reply #1145 on: June 30, 2020, 04:42:28 pm »
EDIT: Just a thought, but what about a general FILL function?  Specify a point coordinate and a colour and off it goes, filling in all pixels of the colour at the point coordinate with the new colour, until bounded by other colours?  Or would that be too far?
You only call the function twice in a row...
EG,


Code: [Select]
; *************************************************************************************
; ** Draw a filled ellipse (175,175)-(300,350) with palette color 9
; *************************************************************************************
set_x 0,d'175' ; set x0 register to 175 - The ellipse's top-left X position
set_y 0,d'175' ; set y0 register to 175 - The ellipse's top-left Y position
set_x 1,d'300' ; set x1 register to 300 - The ellipse's bottom-right X position
set_y 1,d'350' ; set y1 register to 350 - The ellipse's bottom-right Y position
plot_circle_fill d'9' ; plot a filled ellipse with palette color 9
plot_circle d'15'     ; plot the outline of the same ellipse with palette color 15

Remember, my commands holding their screen coordinates were optimized to do this sort of thing.
You may make such a command in your Z80 graphics driver, but, it would just execute the above code.
The Geometry unit will draw between 25 million all the way up to 125 million pixels a second.  It won't break a sweat drawing the extra few pixels surrounding the edge of an ellipse, or rectangle, a second time.

Give it a test in geo.bas.
Now, truly begin the 'geometry_xy_plotter.sv'.  Start with being able to load its internal storage registers with the command input port.

Use the maggie.sv and rs232_DEBUGGER.v to get an idea of how to associate/assign labels from the source command & how to store those labels into the memory registers.  The labels and registers should be almost identical to the geo.bas labels.  Except for the x# & y#, make them 2 dimensional 12 bit registers for ease of design.  (I should have done that anyways within geo.bas.  Sorry, my bad.)

« Last Edit: June 30, 2020, 05:05:00 pm by BrianHG »
__________
BrianHG.
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 1135
  • Country: ca
Re: FPGA VGA Controller for 8-bit computer
« Reply #1146 on: June 30, 2020, 05:12:14 pm »
@asmi, I think you are going off the deep end here.  If you want a 2D/3D accelerated GPU, you may start a thread on the subject and I will contribute.  A decade ago I touched on the subject matter and there is a lot to do and set up just to draw 100 million triangles a second.  Each triangle will need to be filled, each having potentially, say, an average of 4000 pixels, and adding some intelligence like a read-ahead cache and omitting hidden triangles beneath others still requires a degree of planning and design.
I will do so at some point, but first I need to figure out what kind of hardware I will need, and design and build such HW. I do have quite a few boards with HDMI out, but they all need to overclock the FPGA in order to get 1080p out, so I plan to design a board with DisplayPort out so that I will be free to output just about any resolution I want or need, and it can also output sound at the same time (which might become useful at some point). I know GPU cores require crazy amounts of memory bandwidth, but the best I can do while keeping costs reasonable is 400 MHz DDR3; the question is whether it's better to have a single wide interface, or several smaller ones. The latter approach allows for some cool tricks - like taking advantage of large capacity by cloning the same resource (say, a texture) into all memory devices, and effectively getting the ability to have multi-port read access to several different memory locations at once.

However, there is one thing I can agree upon.  If nockieboy wants the engineering headache, place 2 independently wired Hyperbus Rams for 32/64 megabytes.  It will take a lot longer to get it to function, but he will have a ton of memory space whether he makes the memory interface efficient and extra fast or basically good enough to function.
If I were him, I'd go for DDR3 as it's a general purpose memory, and so learning how to work with it will be useful in other projects in the future, while HyperRAMs are pretty specialist parts and only suitable in limited scenarios. Or implement a two-tier memory architecture by connecting relatively small (but fast) SRAM chip and DDR3 at the same time, so SRAM can be used as off-chip cache for much larger DDR3 memory array.

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 4003
  • Country: ca
Re: FPGA VGA Controller for 8-bit computer
« Reply #1147 on: June 30, 2020, 06:01:39 pm »
If I were him, I'd go for DDR3 as it's a general purpose memory, and so learning how to work with it will be useful in other projects in the future, while HyperRAMs are pretty specialist parts and only suitable in limited scenarios. Or implement a two-tier memory architecture by connecting relatively small (but fast) SRAM chip and DDR3 at the same time, so SRAM can be used as off-chip cache for much larger DDR3 memory array.
It took me almost 2 months full time to get DDR3/2 working perfectly for my scaler.  However, let it be known that I designed that ram controller from scratch; it has 8 asynchronous read and 8 asynchronous write ports (with adaptive priority) with read request post in advance commands and write cache on each channel, and it had to be as compact and efficient as possible operating on the slowest CycloneIII with 128bits where Altera's had the DQS port limit on a CycloneIII at 96 bits on the higher speed upper and lower IO banks.  This does not include reading every single nuance of the FPGA data sheet and DDR2/3 datasheets to make sure I made 0 errors in wiring the devices and that I would achieve the clocking goals before the design was complete.

Now I do not see this project requiring such bandwidth or such a complex design, but unless nockieboy were to use an off the shelf DDR3 controller and carefully read the LFE5U's DDR3 ram implementation guide, he will get stuck.  The DDR3 will eat up 2 IO banks on his FPGA due to operating at a lower voltage.  I would use a 1GB reduced latency DDR3 as the price is only around 1$ more than the regular one but the memory controller will have a little less latency.  Maybe 2 of them for 32 bits is allowed by the LFE5U within 2 adjacent higher speed IO banks.  The slowest LFE5U45-6 at 15$ can run the ram at tops 312MHz, or, 624MTPS.  (Can DDR3 be run that slow?)  This is not too bad considering that 1080p requires a continuous stream of 150 million pixels a second with a 32 bit ram bus, or 300MHz for a 16 bit ram bus.  The LFE5U without hardware serdes will just squeeze out 720p; actually, the -7 variant wouldn't have a problem, and it would also increase the DDR3 clk speed.  That video mode cuts the 150MHz @ 32 bit bandwidth requirement to 75MHz.
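
The rounded 150 million and 75 million pixel figures can be checked with a few lines of Python. This is a back-of-envelope sketch only: it assumes the standard 60 Hz frame totals including blanking (2200x1125 for 1080p, 1650x750 for 720p), 4 bytes per pixel, and a theoretical DDR peak with no allowance for refresh, row activation or bus turnaround. The function names are made up for illustration.

```python
# Total (active + blanking) frame sizes for the standard 60 Hz timings.
# Assumption: these totals are the ones behind the rounded 150M / 75M figures.
MODES = {
    "1080p60": (2200, 1125, 60),
    "720p60": (1650, 750, 60),
}


def pixel_rate(mode):
    """Pixels per second the display must be fed, blanking included."""
    h_total, v_total, refresh = MODES[mode]
    return h_total * v_total * refresh


def display_bytes_per_second(mode, bytes_per_pixel=4):
    """Read bandwidth for scan-out at the given pixel size."""
    return pixel_rate(mode) * bytes_per_pixel


def ddr_peak_bytes_per_second(clock_hz, bus_bits):
    """Theoretical peak: two transfers per clock across the whole bus."""
    return clock_hz * 2 * bus_bits // 8


for mode in MODES:
    need = display_bytes_per_second(mode)
    for bus_bits in (16, 32):
        peak = ddr_peak_bytes_per_second(312_000_000, bus_bits)
        print(f"{mode}: needs {need / 1e6:.0f} MB/s, "
              f"{bus_bits}-bit DDR3 @ 312 MHz peaks at {peak / 1e6:.0f} MB/s "
              f"({100 * need / peak:.0f}% of peak)")
```

Whatever share of the peak the display takes is unavailable to the geometry unit, and real sustained bandwidth will sit below the peak, so these percentages are optimistic.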

     If I started from day one making a GPU with the 15$ LFE5U, I would target 720P with 1 or 2 1gb DDR3s.  Run a 32 bit (ARGB) true-color display and make everything else drawn by a geometry engine every frame with optional edge anti-aliasing using the 'A' alpha channel when drawing the graphics.  The memory space with full true color would mean fully accelerated anti-aliased tiles/fonts/rendering including soft edges, since there is an 8 bit translucency stencil alpha channel.  But an 8 bit CPU would never really be able to take advantage of the horsepower under the hood.  However, there is also enough room for a softcore 32bit - 68K type of cpu or simple arm-risc which can be placed inside the LFE5U, as well as a 64 channel full MIDI wavetable audio system with accelerated DCT for compressed audio streams.
« Last Edit: June 30, 2020, 06:26:33 pm by BrianHG »
__________
BrianHG.
 

Online asmi

  • Super Contributor
  • ***
  • Posts: 1135
  • Country: ca
Re: FPGA VGA Controller for 8-bit computer
« Reply #1148 on: June 30, 2020, 06:28:58 pm »
It took me almost 2 months full time to get DDR3/2 working perfectly for my scaler.  However, let it be known that I designed that ram controller from scratch; it has 8 asynchronous read and 8 asynchronous write ports (with adaptive priority) with read request post in advance commands and write cache on each channel, and it had to be as compact and efficient as possible operating on the slowest CycloneIII with 128bits where Altera's had the DQS port limit on a CycloneIII at 96 bits on the higher speed upper and lower IO banks.  This does not include reading every single nuance of the FPGA data sheet and DDR2/3 datasheets to make sure I made 0 errors in wiring the devices and that I would achieve the clocking goals before the design was complete.
Again, this depends on the goal. If the goal indeed is to make it work somehow and call it a day - then yea, do the bare minimum to get it going, and call it good. However if the goal is to learn how to work with the most "commodity" memory available with the aim to use this knowledge and experience in the future projects - then this is a great opportunity to do so.

Now I do not see this project requiring such bandwidth or such a complex design, but unless nockieboy were to use an off the shelf DDR3 controller and carefully read the LFE5U's DDR3 ram implementation guide, he will get stuck.  The DDR3 will eat up 2 IO banks on his FPGA due to operating at a lower voltage.  I would use a 1GB reduced latency DDR3 as the price is only around 1$ more than the regular one but the memory controller will have a little less latency.  Maybe 2 of them for 32 bits is allowed by the LFE5U within 2 adjacent higher speed IO banks.  The slowest LFE5U45-6 at 15$ can run the ram at tops 312MHz, or, 624MTPS.  (Can DDR3 be run that slow?)  This is not too bad considering that 1080p requires a continuous stream of 150 million pixels a second with a 32 bit ram bus, or 300MHz for a 16 bit ram bus.  The LFE5U without hardware serdes will just squeeze out 720p; actually, the -7 variant wouldn't have a problem.  That mode cuts the 150MHz requirement to 75MHz.
I just checked the Micron 2 Gbit datasheet, and the minimum frequency is set at 303 MHz, unless the DLL is disabled - in this case it can be as low as 128 kHz (yes, you read it right - kilohertz!) - it mentions 7800 ns as the max clock period.

If I started from day one making a GPU with the 15$ LFE5U, I would target 720P with 1 or 2 1gb DDR3s.  Run a 32 bit (ARGB) true-color display and make everything else drawn by a geometry engine every frame with optional edge anti-aliasing using the 'A' alpha channel when drawing the graphics.  The memory space and full true color would mean fully accelerated anti-aliased tiles/fonts/rendering including soft edges, since there is an 8 bit translucency stencil alpha channel.  But an 8 bit CPU would never really be able to take advantage of the horsepower under the hood.  However, there is also enough room for a softcore 32bit - 68K type of cpu or simple arm-risc V which can be placed inside the LFE5U, as well as a 64 channel full MIDI wavetable audio system.
This was part of the reason why I suggested implementing command lists that can be stored in video memory - this will completely decouple GPU performance from the CPU, as the latter can take its merry time preparing the next list while the GPU works with the existing one. This is how the DirectX 11/12 API works. Those command lists can have external parameters, so that the CPU can update things like object positions etc. without changing the actual command list. In DX parlance they are called constant buffers. The CPU can update these buffers without changing the command list, and then these are fed into the rendering pipeline automatically when the command list references said buffer. All of that will make CPU performance almost a non-factor, as pretty much all graphical stuff will be offloaded onto the GPU.
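
To illustrate the idea (not the DirectX API itself, and not anything in the existing core), here is a toy Python model of a command list whose commands pull their coordinates from named parameter buffers at execution time. Every name in it (`Cmd`, `execute`, the buffer keys) is invented for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cmd:
    """One stored drawing command; its position comes from a named buffer."""
    op: str        # e.g. "rect" or "ellipse"
    buf: str       # name of the parameter buffer holding x and y
    colour: int


def execute(command_list, constant_buffers):
    """Resolve each command against the current buffer contents."""
    resolved = []
    for cmd in command_list:
        params = constant_buffers[cmd.buf]
        resolved.append((cmd.op, params["x"], params["y"], cmd.colour))
    return resolved


# Built once by the CPU and left in video memory.
command_list = (Cmd("rect", "bat", 15), Cmd("ellipse", "ball", 9))
buffers = {"bat": {"x": 10, "y": 100}, "ball": {"x": 160, "y": 120}}

print(execute(command_list, buffers))   # frame 1
buffers["ball"]["x"] += 4               # CPU only touches the small buffer
print(execute(command_list, buffers))   # frame 2, same list, new position
```

Per frame the CPU writes a couple of coordinates rather than re-issuing every drawing command, which is the decoupling described above.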
Also I wonder if your existing core would be much simpler had it worked with full color objects and surfaces as opposed to packed-color ones, as I assume you will need additional HW to pack and unpack pixel information from data bytes stored in memory, while in the case of full color this will be a trivial memory address calculation without any need to somehow transform the data you get from memory.
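
The difference can be sketched in Python. This is only an illustration of the two addressing styles, assuming byte-wide memory, the leftmost pixel in the most significant bits of a packed byte, and 4 bytes per full color pixel; the function names are made up and nothing here is taken from the existing core.

```python
def packed_pixel_location(base, width_bytes, x, y, bpp):
    """Return (byte_address, shift, mask) for a packed pixel, bpp in 1, 2, 4, 8."""
    pixels_per_byte = 8 // bpp
    byte_addr = base + y * width_bytes + x // pixels_per_byte
    shift = (pixels_per_byte - 1 - x % pixels_per_byte) * bpp
    mask = (1 << bpp) - 1
    return byte_addr, shift, mask


def write_packed(mem, base, width_bytes, x, y, bpp, colour):
    """Read-modify-write: only the bits of the addressed pixel change."""
    addr, shift, mask = packed_pixel_location(base, width_bytes, x, y, bpp)
    mem[addr] = (mem[addr] & ~(mask << shift) & 0xFF) | ((colour & mask) << shift)


def full_colour_address(base, width_bytes, x, y, bytes_per_pixel=4):
    """Full color: one multiply-add, no shifting or masking of the data."""
    return base + y * width_bytes + x * bytes_per_pixel


mem = bytearray(256)
write_packed(mem, 0, 80, 4, 2, 4, 0xA)   # 4 bpp, left pixel of byte 162
write_packed(mem, 0, 80, 5, 2, 4, 0x3)   # right pixel of the same byte
print(hex(mem[162]), full_colour_address(0, 2560, 5, 2))
```

The packed path needs a read before every write plus the shift and mask logic; the full color path trades that for 4 to 32 times the memory and bandwidth per pixel.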
« Last Edit: June 30, 2020, 06:31:55 pm by asmi »
 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 2087
  • Country: ca
Re: FPGA VGA Controller for 8-bit computer
« Reply #1149 on: June 30, 2020, 07:51:30 pm »
in this case it can be as low as 128 kHz (yes, you read it right - kilohertz!) - it mentions 7800 ns as max clock period.

You won't have enough time to refresh everything at that speed.
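
A rough Python check of the numbers: 7800 ns per clock is indeed about 128 kHz. For the refresh side, this sketch assumes the usual DDR3 requirement of 8192 refresh commands every 64 ms, which is not quoted in this thread and should be confirmed against the datasheet. The function names are made up for illustration.

```python
def clock_khz(period_ns):
    """Clock frequency in kHz for a period given in nanoseconds."""
    return 1e6 / period_ns


def refresh_interval_ns(window_ms=64, refresh_commands=8192):
    """Average time allowed between refresh commands (assumed DDR3 figures)."""
    return window_ms * 1e6 / refresh_commands


def clocks_per_refresh(period_ns, window_ms=64, refresh_commands=8192):
    """How many clock cycles fit between two consecutive refresh commands."""
    return refresh_interval_ns(window_ms, refresh_commands) / period_ns


print(f"{clock_khz(7800):.1f} kHz")
print(f"{refresh_interval_ns():.1f} ns between refreshes")
print(f"{clocks_per_refresh(7800):.3f} clocks per refresh interval")
```

Under those assumptions there is roughly one clock cycle per refresh interval, so nearly every command slot would have to be a refresh and almost nothing would be left for reads and writes.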
 

