See the 45 degree angle hatching...
Also notice the letters are larger...
Remember my line: 0.707:1, or 1:1.41421... (See if you (or anyone else here) could figure out why this magic down-sample figure is really important...)
Remember this: A² + B² = C²...
The checker board issue, the letters being larger, my fancy numbers, that Pythagorean theorem, and the right scaling settings can all be put together to solve the 45 degree issue perfectly.
Ah, of course... If the image being blitted has dimensions AxB, when it's rotated 45 degrees its side A will be 1.41421 times longer due to good old Pythagoras and his squared hypotenuse. Okay, no worries, I'll just need to downscale the blits by 0.707 to 1 to remove the chequer-boarding and make them less overweight.
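For anyone following along, a quick back-of-the-envelope check of that figure. The N:4096 scaler steps are assumed from the scaler discussion later in this thread, so treat the exact register value as illustrative only:

```c
#include <math.h>
#include <stdio.h>

/* Sketch: the down-scale factor that cancels the sqrt(2) growth of a
   45-degree rotated blit. The 4096 division steps are assumed from the
   scaler discussion elsewhere in the thread. */
int main(void)
{
    double shrink = 1.0 / sqrt(2.0);            /* 0.7071... = 0.707:1       */
    int    steps  = 4096;                       /* assumed scaler divisions  */
    int    n      = (int)(shrink * steps + 0.5);

    printf("down-scale ratio : %.5f : 1\n", shrink);   /* 0.70711 : 1        */
    printf("scaler setting   : %d : %d\n", n, steps);  /* 2896 : 4096        */
    return 0;
}
```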
Hitting the sack now - will hopefully have more time tomorrow.
Still getting blit corruption if I draw the quad before I run the blit tests. The character blits are fine after the quad is drawn now, but the half-screen blit is still badly corrupted.
Could this be due to an error in my test code, or something still needs to be tied up with the quad function in HDL?
That corruption looks like a bad screen address & bitmap width setting.
Remember, reading a font or copying a half screen uses a different source address and bitmap width + a copy width and height. Like before when the text was corrupt, you may have missed or got one of these settings backwards.
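To make that checklist concrete, here is a rough sketch - every field name and number below is made up for illustration, not the GPU's real register set - of what has to change between the font blit and the half-screen copy:

```c
#include <stdint.h>

/* Illustration only: hypothetical settings, not the GPU's actual API.
   The point is that each blit carries its own source address, source bitmap
   width, and copy width/height - reusing the font settings for a half-screen
   copy (or vice versa) produces exactly this kind of corruption. */
struct blit_setup {
    uint32_t src_addr;       /* where the source pixels live in GPU ram  */
    uint16_t src_bmp_width;  /* width of that source bitmap, in pixels   */
    uint16_t copy_width;     /* pixels per line to copy                  */
    uint16_t copy_height;    /* lines to copy                            */
    uint32_t dst_addr;       /* destination address                      */
    uint16_t dst_bmp_width;  /* width of the destination bitmap          */
};

/* Font character: small source bitmap, small copy window.               */
const struct blit_setup font_blit = {
    .src_addr = 0x0000, .src_bmp_width = 8,   .copy_width = 8,   .copy_height = 16,
    .dst_addr = 0x8000, .dst_bmp_width = 640
};

/* Half-screen copy: the source IS the screen, so the widths must match it. */
const struct blit_setup half_screen_blit = {
    .src_addr = 0x8000, .src_bmp_width = 640, .copy_width = 640, .copy_height = 240,
    .dst_addr = 0x8000 + 640u * 240u, .dst_bmp_width = 640
};
```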
Just had another little play - looks like sending command 0x0900 isn't turning off scaling?
When I run the test initially, all is fine. The X scales, program exits, great. When I run the test again immediately after, the character blits are all scaled by 25x, despite turning off scaling before the test program exits, and when the test program starts... (command 0x0900). Any ideas?
No, 0x0900 does nothing. You need to send 0x0903 so that the xy[0/1] regs take effect: put the 0's into the xy[0/1] regs, then send a 0x0903 so both regs [0/1] are passed to the scaler controls.
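In case it helps the next person who hits this, here is the sequence sketched in C. The helper names are placeholders for whatever the test program already uses to load an xy reg and push a command word; only the 0x0903 value and the order come from the reply above:

```c
/* Hypothetical helpers standing in for the test program's own port writes. */
static void gpu_set_xy(int reg, int value)     { (void)reg; (void)value; }
static void gpu_send_command(unsigned int cmd) { (void)cmd; }

void scaling_off(void)
{
    gpu_set_xy(0, 0);            /* 0 = scaling off / 1:1, per the post      */
    gpu_set_xy(1, 0);            /* ...and the same for xy[1]                */
    gpu_send_command(0x0903);    /* not 0x0900: 0x0903 passes both xy[0/1]   */
                                 /* regs through to the scaler controls      */
}
```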
Right at the bottom of the code, you need to change the 'drawLine_arc' from the current drawline into the Bézier curve just like we did before, but make it so that it can be easily translated into Verilog. The final code should look almost identical to the existing draw line, just with the added +/- arc curve correction.
Touch up the 'Geoarc.bas' and post it back.
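While we wait for Geoarc.bas, here is the flavour of arithmetic that translates well, sketched in C rather than FreeBASIC. To be clear, this is not the +/- arc correction bolted onto the existing drawline described above - it is a plain forward-difference quadratic Bézier, and the names, step count and fixed-point width are my own choices - but it shows the add-only per-step work a Verilog linegen can absorb:

```c
#include <stdio.h>

/* plot() stub - swap in the tester's pixel write */
static void plot(int x, int y) { printf("%d,%d\n", x, y); }

/* Quadratic Bezier from (x0,y0) to (x2,y2), pulled toward control point
   (x1,y1), walked with integer forward differences: after setup it is just
   two adds per axis per step. STEPS and the fixed-point scale are arbitrary. */
#define STEPS 256                            /* 2^8 points along the curve    */
#define ONE   65536L                         /* 16 bit fixed-point fraction   */

void draw_bezier_fd(int x0, int y0, int x1, int y1, int x2, int y2)
{
    /* polynomial form B(t) = a*t^2 + b*t + c, per axis */
    long ax = (long)x0 - 2L * x1 + x2,  bx = 2L * (x1 - x0);
    long ay = (long)y0 - 2L * y1 + y2,  by = 2L * (y1 - y0);

    /* running value, first difference, constant second difference */
    long fx  = x0 * ONE,                fy  = y0 * ONE;
    long dx  = bx * (ONE / STEPS) + ax * (ONE / STEPS / STEPS);
    long dy  = by * (ONE / STEPS) + ay * (ONE / STEPS / STEPS);
    long ddx = 2L * ax * (ONE / STEPS / STEPS);
    long ddy = 2L * ay * (ONE / STEPS / STEPS);

    for (int i = 0; i <= STEPS; i++) {
        plot((int)(fx / ONE), (int)(fy / ONE));
        fx += dx;  dx += ddx;                /* add-only update per step */
        fy += dy;  dy += ddy;
    }
}

int main(void)
{
    draw_bezier_fd(0, 0, 60, 0, 60, 60);     /* rough quarter-arc */
    return 0;
}
```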
As my province is now back on emergency quarantine, I have a little time to finish the ellipse, so I'm setting up a FreeBASIC tester for the current linegen to render a diamond based on mouse coordinates, with the framework to allow you to play with a second parallel linegen which will be used to generate the arcs of the ellipse. Give me 30-45 minutes to upload it.
Again, the coding will need to easily translate to the current Verilog linegens. Once done, there won't be much left to do within the limitations of your current FPGA.
Though, you can drop the MAGGIE layers from 6 down to 5 or 4 to release a bunch of free space.
I'm thinking you should change the command structure into 2 byte, 3 byte and 4 byte commands.
Commands 0-127 would send 2 byte controls like now.
Commands 128-191 (3 bytes) would directly change a 16 bit integer setting, directly feed any control, or set the x/y[#] regs instead of piping everything through the XY regs. The XY regs, with room for up to 64 of them, will be 16 bit each instead of 12 bit. (IE the Z80 sends 3 bytes total.)
Commands 192-223 would send 24 bit integer commands. (IE the Z80 sends 4 bytes total.)
Commands 224-255 would send 32 bit integer commands. (IE the Z80 sends 5 bytes total.)
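A rough Z80-side view of that packing. The range split follows the list above; the byte order, the 8 bit payload on the 2 byte commands, and the send_byte stub are my assumptions:

```c
#include <stdint.h>
#include <stdio.h>

static void send_byte(uint8_t b) { printf("%02X ", (unsigned)b); }  /* stub port write */

/* Emit one variable-length command: command byte first, then the data bytes
   implied by the command's range. */
void gpu_command(uint8_t cmd, uint32_t value)
{
    int data_bytes;

    if      (cmd < 128) data_bytes = 1;   /* 2 byte control, like now          */
    else if (cmd < 192) data_bytes = 2;   /* 3 byte command: 16 bit integer    */
    else if (cmd < 224) data_bytes = 3;   /* 4 byte command: 24 bit integer    */
    else                data_bytes = 4;   /* 5 byte command: 32 bit integer    */

    send_byte(cmd);
    for (int i = data_bytes - 1; i >= 0; i--)        /* assume MSB first */
        send_byte((uint8_t)(value >> (8 * i)));
}

int main(void)
{
    gpu_command(0x05, 0x12);          /* 2 byte control                      */
    gpu_command(0x90, 0xFFFE);        /* 16 bit write straight into a reg    */
    gpu_command(0xE0, 0x12345678);    /* full 32 bit integer                 */
    printf("\n");
    return 0;
}
```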
You will no longer need to convert all your 16 bits into 12 bits and shift part of the remainder into the command's LSBs, or into the next Y register for the 24 bit ints.
This way, you may increase all the 12 bit regs to 16 bit, making your new XY coordinates go from -32768 to 32767, as well as a scaler with a 16 bit scale of N:65536, 16x the current 4096 division steps.
Also, you will now have access to many more x/y regs than just 4 if you need them.
With room for 64 x 16 bit integer regs, or even 32 x 32 bit integers, it is now feasible to implement a 32 bit ALU with full multiply/divide & add/sub between the regs. With so many spare commands in the first 128 to direct things like holding offset and scale factors for the line drawing engines, high quality accelerated geometry graphics becomes possible.
The 32 x 32 bit integers will allow for true floating point accelerated geometry, though we are getting into the realm of how a Z80 can feed all of this. Then again, you can feed this command pipe from the memory contents of the GPU ram itself.
Maybe just use 2 byte and 4 byte command modes since the FIFO pipe and GPU ram is organized in 16 bits.
To send a 32 bit word, you would need to transmit 2x 4 byte commands, or we can scrap the 32 bit ALU and just use 24 bit, limiting us to +/- 8 million range integers.
So which is better - the TLV62130A or the AOZ1284? I'd be using them to provide a 5V and 3.3V rail to the entire uCOM system, as well as the GPU - which would have more chips supplying the 1.2 and 2.5V rails from the 5V rail. The uCOM draws about 120mA without the GPU.
The TI has far more suppliers, though it's double the price of the AOZ1284. Use it only for the 3.3v & VCC 1.8v core. For the 2.5v PLL you may use a cheap linear 50ma regulator in a SOT23 package. For the VCC analog for the DAC you may also use a 100ma regulator. The TI part doesn't require a diode on the output and its inductor is far cheaper and smaller at 2.2uH 4 amp instead of the AOZ1284's required 10uH 5 amp inductor.
If I were you, I would look into the latest KiCad. It's free and open source and it may already have the Cyclone V component in its library, or an online library. This is the biggest hurdle for you if you are worried about mistakes.
The Cyclone V schematics I sent you will tell you how to hook up the JTAG/Active serial programming ports & configuration lines and filters for the analog PLL core voltages, unless you can find an existing online Cyclone V project which you can load and edit.
Okay, I've spent a little time researching and designing the power rails for the Cyclone V E board using the TI part you recommended. If you have five seconds spare, would really appreciate your thoughts. I'll make a start on the clock, configuration and JTAG/AS interface next.
I've never got on with KiCAD - found its UI to be a brick wall to learning more about how to use it, so I started out with DipTrace which spoiled me, really. It's free for up to 500 pin designs, so I quickly outgrew it and moved on to EasyEDA (as you know), which is my go-to design tool now (and probably not too dissimilar from KiCAD, so I'm aware of the irony!) It has Cyclone V parts on it already, fortunately.
Sounds like a big improvement in capability, especially as we're going to be able to leverage that extra capability when I get a Cyclone V board up and running.
Funny you should mention an ALU. I've been wondering about an FPU core. At the moment I have no idea how an FPU integrates with and is utilised by the old processors like the 68020 etc, but I wonder if it would be a benefit to the Z80? There were a couple of 3D (wireframe) games for the old 8-bit computers (Elite and Starglider, to name but two), and they managed that all on the Z80 processor. Having hardware that could perform the floating point maths and matrix transformations would surely be a big boost?
No issue with sending 32-bit words, but I'm probably going to have to move from the current method of communicating with the GPU via IO ports to loading the commands/data into GPU RAM and letting the GPU off the leash to work on the command list. Interacting with the GPU via IO is expensive time-wise due to the extra WAIT-state the Z80 inserts and it can only send a byte at a time. A memory interface would speed things up slightly, I guess.
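Something like this is what I have in mind, very loosely - the window address, list layout and "go" register below are all placeholders I've invented for the sketch, nothing here is the GPU's actual format:

```c
#include <stdint.h>

/* Hypothetical command-list submission: the Z80 assembles a batch of command
   words in GPU ram, then pokes one "go" register so the GPU fetches the list
   itself instead of being spoon-fed a byte at a time through the IO port
   (and its extra WAIT state). Addresses and layout are made up. */
#define GPU_RAM_BASE   ((volatile uint16_t *)0xC000)  /* assumed shared window */
#define GPU_LIST_START 0x0100                         /* word offset of list   */
#define GPU_GO_REG     0x0000                         /* word offset of "go"   */

void submit_list(const uint16_t *cmds, uint16_t count)
{
    volatile uint16_t *ram = GPU_RAM_BASE;

    for (uint16_t i = 0; i < count; i++)              /* copy the batch        */
        ram[GPU_LIST_START + 1 + i] = cmds[i];
    ram[GPU_LIST_START] = count;                      /* word count header     */
    ram[GPU_GO_REG]     = GPU_LIST_START;             /* kick the GPU          */
}
```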
So far so good. Remember, the C8 variant is the cheapest, see here:
https://lcsc.com/product-detail/CPLD-FPGA_Altera-5CEBA2F17C8N_C568996.html
Just remember, reserve the high-speed IO banks for wiring to the DDR3 ram & for directly driving the HDMI out. Everything else is almost do-as-you-please, but I would still sector off sections for the Z80 and the analog VGA (make it close to the high speed IO bank containing the HDMI outputs). Look in the data sheet: the 2 top & 2 bottom IO banks are the high speed ones and the 2 left and 2 right ones are the slower ones. DDR3 ram requires its IO bank to run at a lower supply voltage, and that bank needs a dedicated PLL differential output in it to drive the DDR3 clock.
Instead of a command port & a command structure, we can make every word and control a memory address, but then you will no longer have an input FIFO. Every sent command will need to wait for the last draw command to complete before you can touch any variables. This will slow down the Z80 when drawing multiple large screen elements and blits.
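The cost of dropping the FIFO, spelled out as a sketch - the status address and busy bit are invented for illustration:

```c
#include <stdint.h>

/* With every control mapped to a memory address and no FIFO, the Z80 must
   poll a busy/status flag before touching any drawing variable. */
#define GPU_STATUS  (*(volatile uint8_t *)0xFF00)   /* assumed status byte   */
#define GPU_BUSY    0x01                            /* assumed "drawing" bit */

void wait_gpu_idle(void)
{
    while (GPU_STATUS & GPU_BUSY)
        ;   /* spin - dead Z80 time on every large screen element or blit */
}
```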
That's not a bad price at all - I've been looking at sourcing the A4 variant from Mouser, which is over £37 and not something I want to buy for an untested board or process (would be an expensive way to learn to solder BGA if I'm going to make mistakes!)
I suppose the A2/A4 versions of the Cyclone V are interchangeable on the same board if it's designed with that in mind? (EDIT: Yes, they are.) Thinking back to previous comments about the Cyclone IV CE6/CE10, I wonder if the A2/A4 are physically identical dies, just marketing spin?
Okay, so DDR3 memory (I was looking at DDR for some reason) - could I stick a chip on the back of the PCB somewhere?
What about this ADV7513, though? It seems like it would be an excellent replacement for the TFP410 for video output, and it states it does HDMI, which incorporates audio into the data stream. That would be a big win for me if I could use one of those (I could scrub the audio DACs from the BOM and at least 10 IOs, as I could just output audio to the TV via I2S) - but the datasheet is short on information regarding using it; do I need to get a licence for HDMI, or is the licence more to do with tx-ing protected data (which I have no interest in doing)?