Author Topic: Retro Z80 project: Memory Layout and software management (Read 52472 times)

Ian.M · « **Reply #25 on:** December 13, 2015, 02:03:40 am »

If you go smaller there are too many page registers to write to remap stuff. 4K is tolerably convenient, as you only need 32 bytes of dual port RAM (or RAM with separate Din and Dout, at some cost in the complexity of the I/O mapped load circuit) to implement full independent read/write remapping of a 1MB address space. Done right, you can remap the whole 64K using OTDR to load the mapping table in 331 T cycles. (Use the B register, which appears on A8:15, to address which page register to update, and C on A0:7, decoded to three addresses for update write mode, read mode or both together.) If you've got a larger dual port RAM, you can implement multiple maps you can 'flip' between. Startup requires disabling the mapping after Reset so the CPU can execute from ROM until a particular I/O address is touched, which gives the startup code a chance to load the map.
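
The sizing and timing figures above can be checked with a little arithmetic. This is only a rough Python sketch, not part of the design itself: the function names are made up for illustration, and the OTDR cost used (21 T states per iteration while B is non-zero, 16 for the final one) is the standard documented Z80 timing.

```python
# Sketch: page-table size and OTDR load time for a 4K-page mapper.
# Names here are illustrative, not taken from any real tool or library.

Z80_SPACE = 64 * 1024        # logical address space seen by the Z80
PHYS_SPACE = 1024 * 1024     # 1MB physical address space
PAGE = 4 * 1024              # 4K page size


def table_entries(separate_read_write=True):
    """Number of page registers: one per 4K page, doubled for R/W maps."""
    pages = Z80_SPACE // PAGE                 # 16 logical pages
    return pages * (2 if separate_read_write else 1)


def entry_bits():
    """Bits needed per entry to pick any physical 4K page."""
    return (PHYS_SPACE // PAGE - 1).bit_length()   # 256 pages -> 8 bits


def otdr_t_states(count):
    """OTDR costs 21 T states per iteration while B != 0, 16 for the last."""
    return 21 * (count - 1) + 16


print(table_entries())        # 32 one-byte entries
print(entry_bits())           # 8
print(otdr_t_states(16))      # 331
```

Sixteen transfers to the 'both together' address cover all 16 pages, which is where 331 T states comes from; at 20MHz that would be roughly 16.6us.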

obiwanjacobi · « **Reply #26 on:** December 13, 2015, 09:03:56 am »

@C:

thanx!
I don't like to reissue the CPU memreq - sounds all very complicated. Thanx for bringing this to my attention, I did not realize it until now.
If I understand you correctly then the 'very fast ram chip' is only used to translate an IO operation (OUT (C),A) into an extended memory address (A16 and higher). Why would you need 256kx8 for that? I must say I don't think I got everything you are trying to tell me here... sorry.

@Bruce Abbott: Vectored graphics was just a brain fart. I thought it would be cool to be able to scale stuff easily. I am thinking of doing three display modes: 1) character mode (simple to start with), 2) VGA graphics mode, 3) hi res monochrome mode.
I like the idea of using a different data bus width on the VGA side, although I don't see how you can let the CPU access the other bytes in the meantime. To my mind it would simply mean that the VGA controller is done sooner.
Yes, the 'x1' is one data byte for indexing the palette. The palette can be whatever color-depth you want (24 would be nice)

@Ian.M: I am confused. One says use big blocks (32k), you say use smaller blocks (4k), another says go even smaller (1k/2k)... So with these smaller blocks you're going to have to maintain page tables for each program, right? - in order to make sure that they all 'stay together'. What would be the benefit of that if the CPU does not support extended addresses natively like in the x86 and the first PC architectures (segmented/paged/virtual memory)? I can only think of more efficient memory usage at the cost of a more complex management. Is that the purpose?

I have been thinking a little more on using a double buffered video memory. While it is excellent from a hardware perspective because it would practically eliminate contention, from a software perspective it is not so good. It would require the software to redraw the entire display in the video memory each time (because after the memory is swapped it contains the previous scene) it wants to change something. It may not be a problem but it feels sub-optimal. It would take extra time and memory and add program (or bios) complexity.

Can someone explain to me how these dual port chips work? And if the 150ns/40ns is fast enough for a 20MHz clock?

bitslice · « **Reply #27 on:** December 13, 2015, 06:04:20 pm »

Quote from: obiwanjacobi on December 12, 2015, 09:27:10 am

I know about memory bank switching

Has anyone used the 74LS610 memory mapping chips?

Personally for a Z80 design, I'd be taking advantage of the Z80 DMA chip to communicate with a video display.

Maybe offload graphics commands to a separate processor running a v9958.
One could blat out graphic primitives from the Z80 via a FIFO, so the DMA chip was occupying the bus for as little time as possible.

C · « **Reply #28 on:** December 13, 2015, 06:24:16 pm »

Back in the 80's Radio Shack/Tandy produced two lines of computers.

The TRS-80 model 1 series
You had the Model I, Model III, Model IV, Model 4P
All have dual port video memory
The Model I & Model III used logic to create video out.
The Model IV & Model 4P used a 6845. 80x24 is hard to do in logic; with a 6845 it is a lot easier.
All could have a memory map starting at 0 of 12k rom, 4k I/O (video & keyboard in here), 48k ram.
The 4 & 4P could change memory map to run CP/M and had option for 128k.
The 4P did not have the 12k rom and instead loaded a disk file to act like a rom.
Some of these used 16kx1 dram, so 3x8 chips. When the cost of 64kx1 dropped, it was cheaper to use 64k chips and not use 16k.
Cheap, low cost logic was used in these, with PALs used in later versions.

The model II series
Model II, Model 12
The Model 16b was a model II with added cards containing a 68000.
The Model 6000 was a Model 12 with 68000 cards.
These were built using the add-on Z80 chips (DMA, SIO, PIO).
Used a card cage to add more boards and could have added memory boards.
Not a built-down-to-a-cost system.

http://www.classiccmp.org/cpmarchives/trs80/Library/Manuals/Hardware/

The Service Manual or Technical Reference Manual shows the circuits and describes how the sections work.

The LOBO MAX-80 could run most Model III & Model IV software and CP/M-3.
128k ram, 4 x 5.25" floppies, 4x 8" floppies. (single or double sided and single or double density). SCSI-1 interface. Only one rom in system the 512 byte boot rom.
A ram chip was used instead of a character rom, so a second block of dual port memory.
The manual for this talks about using the Z80 with a banked memory.

A little reading of above could save you a lot of time and help you build something better.
The 6845 data sheet is not bad about explaining character based video and even using this chip for graphics displays. Newer designs use a micro controller or fpga.

Fixing the timing of signals like /MREQ is not hard. Putting a signal through two inverters in series gives you a time delay. There are time delay chips where you have taps that have a fixed delay. You can also use clocked logic to do this.
In the dynamic ram examples there is a 2 to 1 buss switch on the address lines, while new delayed signals (/RAS, MUX, /CAS) are generated from /MREQ. If you look at examples using the 64kx1 dram you will see a switch on A7 to extend the Z80's 7-bit refresh address to 8 bits.

When you add bank switched memory or a memory mapper, both take time to function. The greater the delay, the faster the downstream memory has to be to keep the Z80 at the same speed.

The 256kx8 chip is cheap & fast. nothing says you need to use all the space.

To have 32k blocks of memory you are replacing the Z80's A15 with a new version. With one 8 bit output port you could have 4 bits to specify the address of the lower 32k of memory and 4 bits to specify the address of the upper 32k of memory. This would let you have 32k blocks from 512k of memory. To this you would need to add a way to have some common memory. You could use two 8 bit output ports and you would gain 4 more bits of addressing for each bank, at the cost of needing two I/O writes instead of one to change both banks. The Z80's A15 would be controlling a 2 to 1 switch like the 157. This adds two chips for one 8-bit port and four for the two 8-bit ports. You would be adding the 157's time delay for switching inputs to the Z80's memory access time.
If you change to 16k blocks you need more output ports and need to use a 4 to 1 switch.
With 8k blocks an 8 to 1 and with 4k blocks a 16 to 1 (74150) switch.
More and more chips are needed.

The ram chip is the 8-bit output ports and the 16 to 1 switch in one package.
The difference is the time delay of the 16 to 1 vs ram delay and saving a lot of chips.
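
As a rough illustration of the 32k scheme with one 8-bit port, here is a small Python model. It is only a sketch: which nibble serves which half of memory is an assumption made for the example, and the function names are invented.

```python
# Sketch: 32K bank switching with one 8-bit output port.
# Low nibble = bank for 0x0000-0x7FFF, high nibble = bank for 0x8000-0xFFFF.
# (Which nibble serves which half is an assumption for this example.)


def physical_address(z80_addr, bank_port):
    """Replace the Z80's A15 with a 4-bit bank number chosen by A15."""
    if z80_addr & 0x8000:
        bank = (bank_port >> 4) & 0x0F     # upper 32K uses the high nibble
    else:
        bank = bank_port & 0x0F            # lower 32K uses the low nibble
    return (bank << 15) | (z80_addr & 0x7FFF)


def mux_inputs(block_size):
    """How many inputs the address multiplexer needs for a block size."""
    return 0x10000 // block_size


port = 0xF3   # upper half -> bank 15, lower half -> bank 3
print(hex(physical_address(0x1234, port)))   # 0x19234
print(hex(physical_address(0x8000, port)))   # 0x78000
for size_k in (32, 16, 8, 4):
    print(size_k, "K blocks:", mux_inputs(size_k * 1024), "to 1 switch")
```

The highest address this can produce is 0x7FFFF, i.e. 512k, and the multiplexer widths come out as 2, 4, 8 and 16 to 1 for the four block sizes.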

Now in place of using a small ram for this you can use a larger ram. You could strap the extra address lines making large ram small or you could connect these to a 8 bit latch. This 8 bit port could switch between 1 of 256 different memory maps with one I/O output. Changing between banks like in picture is just one I/O output.
On power up you could load this ram one time. To get a 4k common at top of memory you just make the same entry for that block in each map.
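
A minimal Python sketch of the multiple-map idea, assuming 4k pages, an 8-bit map select latch driving the upper address lines of the mapping ram, and a common block at the top of memory. The table contents and the physical page used for the common block are hypothetical values chosen only to make the example concrete.

```python
# Sketch: a mapping RAM holding 256 complete maps of sixteen 4K pages.
# The layout (map number in the high address bits) is an assumption.

PAGES_PER_MAP = 16
COMMON_PAGE = 15             # logical page 0xF000-0xFFFF
COMMON_PHYS = 0xFF           # hypothetical physical page for common memory

mapping_ram = bytearray(256 * PAGES_PER_MAP)


def load_maps():
    """Fill every map once at power up, sharing one entry for the top 4K."""
    for map_no in range(256):
        base = map_no * PAGES_PER_MAP
        for page in range(PAGES_PER_MAP - 1):
            # Arbitrary example contents, never equal to COMMON_PHYS.
            mapping_ram[base + page] = (map_no + page) % 255
        mapping_ram[base + COMMON_PAGE] = COMMON_PHYS


def translate(z80_addr, map_latch):
    """Look up the physical page for an address under the selected map."""
    page = z80_addr >> 12
    phys_page = mapping_ram[map_latch * PAGES_PER_MAP + page]
    return (phys_page << 12) | (z80_addr & 0x0FFF)


load_maps()
print(hex(translate(0xF123, 0)))     # 0xff123
print(hex(translate(0xF123, 200)))   # 0xff123, same common block
print(hex(translate(0x0000, 1)))     # 0x1000
```

Switching maps is then just a change of `map_latch`, the equivalent of one I/O output, while the top 4k stays put because every map carries the same entry for it.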

You have one very fast ram. One 244 to connect the Z80 data buss to the data inputs of the ram chip while loading the maps. Using /IORQ to write the map prevents the expanded memory from responding as expanded memory uses /MREQ.
On power up the contents of this ram is unknown. If power-up prevents ram /OE and ram data lines have pull-up resistors, the first 4k of the Z80's memory will be mapped to the highest 4k of expanded memory.

To increase the expanded memory address by 8 bits would require one additional very fast ram chip and a small change while loading maps.

C · « **Reply #29 on:** December 14, 2015, 08:40:12 am »

Reading some of your thoughts in Z80 Memory, some of the problems are not easy to fix.

While a user program is running on a Z80 the user program has access to all it knows about or can find out. It is easy for a user program to go wrong.

The 68000 that was designed with multi-users, multi-tasking & many 68000's on a buss could still get in trouble that the processor could not work around. The cure was to have a second 68000 fix the problem. The 68010 fixed the need for a second processor fix and added the capability to run virtual 68010s.
You might want to think of two z80's, one for the user and second for the system.

There was an operating system called TurboDos that could run most CP/M programs. Unlike Digital Research's MP/M that tried to put many users on one Z80, TurboDos gave each user a Z80. There could be many Z80s on a buss and/or networked to more. The simplest user Z80 was a Z80 card with memory that plugged into a system buss. Most added a simple interface to improve communications with other Z80s. Another Z80 on the system buss would load the initial program into ram and then release the Z80 Reset. The manuals and software are on the net. The bios interface is described in the manual. It is very simple and powerful.

Remember that hardware designs you find on the net are built around the Z80A or Z80B (6MHz) and will need to be adjusted as Z80 clock speed is increased. Logic timing becomes very important. Good practice is to buffer signals going off board. The Address buss & data buss would each have two buffers adding a time delay. Consider the time needed for a memory read of memory on a second board.

grumpydoc · « **Reply #30 on:** December 14, 2015, 09:45:36 am »

A 20MHz Z80 has been mentioned

But, apart from "VGA" not much regarding what graphics is intended.

Are you aiming at 640x480x16 colours or would you be happy with less resolution?

Have you a graphics controller in mind?

The MSX graphic chips (Yamaha V9938 & V9958) have been mentioned - probably a good solution but the video RAM is not shared between CPU and graphics. Was achieving shared memory important to your design?

westfw · « **Reply #31 on:** December 14, 2015, 09:50:05 am »

Quote

too many page registers to write to remap stuff.

There's a big difference between implementing a bank-switching scheme typical of the z80 era of computing, and implementing some sort of "real" paging/segmentation scheme.
The OP needs to figure out which they're doing.

I don't think that there are too many examples of fancy z80 memory management schemes (not counting the "modern z80s" that I already mentioned), because by the time it was practical to put more than 64k of RAM on a system, there were cpu chips that directly addressed more than 64k of memory. The original IBM-PC shipped with as little as 16k of RAM, even though the 8088 could address 1MB, and the 68000/68010 (addressing 16MB) was available in the same timeframe. (And the Z8000, too, I guess, but it flopped?)
You could look at whatever MP/M did, although it was a lot less popular than CP/M (which had very minimal support for memory beyond 64k.)

obiwanjacobi · « **Reply #32 on:** December 14, 2015, 11:16:12 am »

@C: I am planning to write a bios that contains basic service routines to work with the hardware. The user program does not have to be isolated and can use the bios to perform video output and memory bank switching (details not yet understood).
The higher speed is indeed a point of focus. I carefully picked out my sram chips to be fast enough. That is also why I do not want too many glue-logic chips.

I am also trying to understand FPGAs (VHDL) because the initial idea was to write the VGA controller (memory dumps) in a FPGA. Any glue logic will probably also be housed in the FPGA (should be fast enough). I envision the main board with the FPGA, Z80 CPU and some core mem. Then a central bus to extend and connect external modules (stacked pcbs). But nothing is set in stone except using the Z80. I aim to make a blend between the retro z80 software experience and modern hardware. I don't need to be period correct on every aspect.

@grumpydoc:
- The aim for VGA is indeed 640x480x1 byte that indexes a color palette.
- Whether to use shared video memory or a graphics controller is not decided yet. I initially dismissed a graphics controller but this thread made me rethink. No specific controller in mind (although I saw the new FTDI EVE2 chips recently...)
- Shared memory is not mandatory in this design - although I would get a kick out of it when I get it to work. > 64k is part of the design.

@westfw: I was aiming for the typical Z80 retro type of memory management. Not the 'modern' segmented/paged/virtual memory management type.

Ian.M · « **Reply #33 on:** December 14, 2015, 12:53:33 pm »

A brief historical review may help you find Z80 systems worth studying:

Later generation MSX systems had fairly fancy memory management and could run CP/M, I understand MSX was reasonably popular for a time in the Netherlands.

On the cheap side, the Sinclair (Amstrad) Spectrum +2A and +3 had limited 16K block paging, out of a pool of 128K RAM and 64K ROM, just sufficient to support CP/M+ with an 11K RAMDISK and 61K for user programs.

The Amstrad PCW512 (another CP/M+ machine) had 512K RAM and 48K ROM, again with a 16K block size, but with the capability to map reads and writes to different pages.

The PCW512 might be a good starting point for a 'compatibility' mode to run CP/M+ and access a wide range of development tools (of dubious legality).

On the subject of RAM and paging: i80486 era cache RAM is still reasonably readily available. A typical example being the Winbond W24257A 32Kx8 high speed SRAM, which came in speed grades between 10ns and 20ns. 12ns and 15ns grades are still findable. It has 15 address lines and can be used with /OE and /CS held low, which will give a propagation delay from address line change to valid data out of no worse than the speed grade. If a W24257A was inserted between the high 4 bits of the Z80 address bus (to 4 of its address lines) and the high 8 bits of a 20 bit memory address bus (from the W24257A data bus), with a fast address decoder and let's say 512K or 768K of comparable speed 128Kx8 SRAM, (e.g. ISSI IS61C1024 which can still be found N.O.S. in PDIP), you could comfortably meet the worst case 80ns cycle time of a 12MHz Z80, which IIRC, was the fastest pin-compatibles went.

One more W24257A address line would be used to implement separate write and read paging, driven by the Z80 /RD signal, which is valid much sooner than /WR. Ground two of the remaining address lines for simplicity and connect the rest to an I/O mapped 8 bit latch to permit instant swapping between 256 complete page tables. You'd need a tristate bidirectional buffer linking the Z80 data bus to the high 8 bits of the memory address bus to permit the W24257A to be read or written in the I/O space, and an 8 bit latch for I/O operations to select the page table to access without smashing the current memory page table selection, plus some glue logic to decode a single 8 bit I/O address, connect the read/write table select address line back to A8, swap A15 down to A9 (so tables can be rapidly loaded with OTDR) and handle the W24257A /RD and /WR signals. Use A15 set on that port to access the page table I/O operation latch. Finally, another 8 bit port would be decoded to access the memory operation page table select latch so page tables could be selected and activated with a simple OUT (port), A instruction.

To permit bootup from ROM, A12:19 on the memory bus would have pullups and the W24257A /OE would be held high by a flipflop set on /RESET, and reset by the first access to the memory operation page table select latch. That would map the top 4K of the 1MB address space throughout the 64K address map, which would of course be the Boot ROM, running in RAMless mode, and its first job would be to set up a page table that mapped the ROM starting at address 0, and some RAM at 0x8000 upwards and activated it. From then on, normal Z80 operation with a stack in RAM and memory paging would occur.
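
The boot behaviour and the separate read/write paging can be modelled in a few lines of Python. This is only a sketch under simplifying assumptions: it models a single page table rather than all 256, the class and method names are invented, and the RAM placement at physical page 0 is a hypothetical choice.

```python
# Sketch of the translation described above. The table layout and names
# are assumptions made for illustration only.

class PageTranslator:
    def __init__(self):
        self.enabled = False                 # /OE held high after /RESET
        self.read_map = [0x00] * 16          # real contents unknown at power up
        self.write_map = [0x00] * 16

    def select_table(self):
        """First access to the table-select latch enables translation."""
        self.enabled = True

    def translate(self, z80_addr, is_read):
        page = z80_addr >> 12
        if not self.enabled:
            phys_page = 0xFF                 # pullups on A12:19 read as all 1s
        elif is_read:
            phys_page = self.read_map[page]
        else:
            phys_page = self.write_map[page]
        return (phys_page << 12) | (z80_addr & 0x0FFF)


mmu = PageTranslator()
print(hex(mmu.translate(0x0000, True)))      # 0xff000, boot ROM everywhere
print(hex(mmu.translate(0x8123, True)))      # 0xff123

mmu.read_map[0] = 0xFF                       # ROM readable at address 0
for page in range(8, 16):                    # RAM from 0x8000 upwards
    mmu.read_map[page] = mmu.write_map[page] = page - 8
mmu.select_table()
print(hex(mmu.translate(0x0000, True)))      # 0xff000
print(hex(mmu.translate(0x8123, False)))     # 0x123
```

Before the latch is touched every 4K page lands on the top 4K of the 1MB space, so the reset vector at 0 fetches from the boot ROM without any special decoding.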

Due credit to user 'C' for essentially the same concept above; I've just refined it to use fast real-world parts and optimised some stuff like loading it.

Due to the swappable paging tables by a single OUT instruction, you can easily emulate any 'classic' Z80 system memory map and only have to patch the page selection code in your ripoff of that system's BIOS.

However, I think you are making a massive amount of trouble for yourself compared to using an opcode compatible Z80 successor with a MMU like the Hitachi HD64180 or one of the Zilog Z180 family.

grumpydoc · « **Reply #34 on:** December 14, 2015, 03:48:10 pm »

Quote from: Ian.M on December 14, 2015, 12:53:33 pm

On the subject of RAM and paging. i80486 era cache RAM is still reasonably readily available. A typical example being the Winbond W24257A 32Kx8 high speed SRAM, which came in speed grades between 10ns and 20ns. 12ns and 15ns grades are still findable. It has 15 address lines and can be used with /OE and /CS held low, which will give a propagation delay from address line change to valid data out of no worse than the speed grade. If a W24257A was inserted between the high 4 bits of the Z80 address bus (to 4 of its address lines) and the high 8 bits of a 20 bit memory address bus (from the W24257A data bus), with a fast address decoder and let's say 512K or 768K of comparable speed 128Kx8 SRAM, (e.g. ISSI IS61C1024 which can still be found N.O.S. in PDIP), you could comfortably meet the worst case 80ns cycle time of a 12MHz Z80, which IIRC, was the fastest pin-compatibles went.

A 20MHz part was mentioned, which will tighten up timings to 50ns.

In any case, given the target of 640x480x8 bits per pixel, a pixel clock of 25.175MHz is needed (standard VGA timing @ 60Hz). That means a 40ns cycle for the video. This could be reduced by making the video RAM more than 8 bits wide but given that one can get large fast SRAMs these days it might be possible or even easier not to bother.

I suspect that the best way is to split the video clock cycle in half: one half for video access and one half for CPU access. That shrinks the cycle to 20ns unfortunately, and we have to get some address multiplexing in there as well.

Something like a Cypress CY7C1049D-10VXI would do for video RAM - 10ns 512k x 8bit. For multiplexing you could probably get away with 74AC157s as the typical propagation delays are around 5ns, leaving you "plenty" of time in the 20ns cycle. Worst case propagation delays might leave you feeling a bit blue though as they are > 10ns
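
To put rough numbers on that budget, here is a quick Python sketch. The delays are just the figures quoted in this post (10ns SRAM, about 5ns typical and 10ns worst case for the multiplexer), not values checked against datasheets, and the simple "slot minus delays" model ignores setup times and clock skew.

```python
# Sketch: timing budget for sharing video RAM between VGA and CPU.
# Component delays are the figures quoted in the post, not datasheet values.

PIXEL_CLOCK_HZ = 25.175e6

pixel_ns = 1e9 / PIXEL_CLOCK_HZ          # one 8-bit pixel per clock
slot_ns = pixel_ns / 2                   # half for video, half for CPU


def slack_ns(sram_ns, mux_ns):
    """Time left in one access slot after the mux and SRAM delays."""
    return slot_ns - (sram_ns + mux_ns)


print(round(pixel_ns, 1))                # 39.7
print(round(slot_ns, 1))                 # 19.9
print(round(slack_ns(10, 5), 1))         # 4.9 with typical 74AC157 delay
print(round(slack_ns(10, 10), 1))        # -0.1 with a 10ns worst case
```

With typical delays there is about 5ns to spare per slot; with a 10ns multiplexer delay the budget is already slightly negative, which is the "feeling a bit blue" case.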

A 10ns part such as the Winbond should also be OK even at 20MHz as there's actually about half a clock of address set-up time before MREQ goes active.

Mind you, with the timings as they are I don't think I fancy getting it all going on a breadboard

Ian.M · « **Reply #35 on:** December 14, 2015, 04:51:53 pm »

You could shave a few ns by using low Rds_on analog switches for the glue logic to multiplex the page translator RAM's address lines between I/O access mode and memory address translation mode. That means that *AFTER* address translation, assuming a 10ns translation RAM you still have a 40ns cycle 'window' for the main memory. Using fast analog switches to multiplex the VRAM between the VGA controller and the memory bus could also shave some ns.

However, it might be a better idea to design an AT ISA bus interface allowing ISA bus SVGA cards and other goodies like super I/O cards with IDE and FDC controllers to be used. It might need a wait state for the Z80 while accessing the 'slot' area but that would be a small penalty to pay for predictable contention-free timing and no need to build hairy high speed video circuits or to have to invest time and money in FPGA development tools.
Stick an EIDE to SATA adapter on one of the super I/O card's IDE ports, and you should be able to hook up modern storage as well. The cards are getting a little rare nowadays, but they are still findable, and anything with Linux drivers can be assumed to have sufficiently well documented registers to be a candidate.

legacy · « **Reply #36 on:** December 15, 2015, 03:14:53 pm »

someone wants to help me with my text-VDU ? it's written in vhdl && almost working!

someone interested in V99x* pcb with 4/6 DRAM ?

if so, contact me in PM!

AlxDroidDev · « **Reply #37 on:** December 15, 2015, 08:25:16 pm »

Quote from: Ian.M on December 14, 2015, 12:53:33 pm

However, I think you are making a massive amount of trouble for yourself compared to using an opcode compatible Z80 successor with a MMU like the Hitachi HD64180 or one of the Zilog Z180 family.

Since the MSX was my very first computer, back in 1988, it would be fun to have one.

I have a used HD64180ZFS6X (6MHz), taken from a printer controller board. How hard do you think it is to build an MSX around this chip? It definitely looks like an interesting project, if feasible at all.

Bruce Abbott · « **Reply #38 on:** December 15, 2015, 11:22:35 pm »

Quote from: AlxDroidDev on December 15, 2015, 08:25:16 pm

How hard do you think it is to build an MSX around this chip?

MSX architecture is very simple so I think it could be done without much trouble. Undocumented instructions might be a problem for any software that uses them.

Some day I want to try out the HD64180 or Z180. So far I have made a simple SBC with CMOS Z80, 32k EEPROM, 32k static RAM, 16C550 serial and a breadboard for testing I/O chips. I have just tested an AY3-8910 (works perfectly!). Next up is AY-3-8912, 82C55, TMS9929 and V9958 (all purchased from eBay).

AlxDroidDev · « **Reply #39 on:** December 16, 2015, 01:02:59 am »

Quote from: Bruce Abbott on December 15, 2015, 11:22:35 pm

MSX architecture is very simple so I think it could be done without much trouble. Undocumented instructions might be a problem for any software that uses them.

Some day I want to try out the HD64180 or Z180. So far I have made a simple SBC with CMOS Z80, 32k EEPROM, 32k static RAM, 16C550 serial and a breadboard for testing I/O chips. I have just tested an AY3-8910 (works perfectly!). Next up is AY-3-8912, 82C55, TMS9929 and V9958 (all purchased from eBay).

Very nice! Congrats!

My HD64180 is a QFP-80, but that shouldn't be a problem. Properly etching a board with 80 perfect tiny pads is more complicated than doing all the soldering.
Building a BIOS/CMOS and an EEPROM programmer shouldn't be a problem either, but I believe it is a requirement, right?

What handles the keyboard IO in your SBC ?

jwm_ · « **Reply #40 on:** December 16, 2015, 03:36:17 am »

Have you considered helping things out with software since presumably you are writing your own BIOS? Have an interrupt at 60Hz (or whatever your refresh is) that lets the CPU halt properly, bank out the video memory and then raise a line telling the ramdac to go do its thing. If you bank switch in and out the video ram via BIOS routines, you could even have the CPU carry on its normal program until something actually tries to switch in the video memory then pause until it is done so you will get a degree of concurrency in the common case.

Bruce Abbott · « **Reply #41 on:** December 16, 2015, 03:52:41 am »

Quote from: AlxDroidDev on December 16, 2015, 01:02:59 am

Building a BIOS/CMOS and an EEPROM programmer shouldn't be a problem either, but I believe it is a requirement, right?

EPROM programmers are so cheap now I wouldn't bother building one. I use a MiniPro TL866CS, which goes for as little as US$45 on eBay. It does most (E)EPROMs, and also some PICs, AVRs, and GALs. I like using GALs during development to replace random logic gates (saves a lot of rewiring).

Quote

What handles the keyboard IO in your SBC ?

Keyboard input is via the serial port. I just use Hyperterminal in Windows as my console. To interface to the PC I use a USB to TTL serial adapter (no point in converting to/from RS232 levels when you don't need to!). The driver is very simple. Here is all the code you need to drive a 16550 UART:-

Code: [Select]

; UART I/O port addresses (16550 registers, offsets from uart_base)
uart_base       equ     $00
;
uart_register_0 equ     uart_base + 0   ; data (divisor low byte when DLAB=1)
uart_register_1 equ     uart_base + 1   ; interrupt enable (divisor high byte when DLAB=1)
uart_register_2 equ     uart_base + 2   ; interrupt ident / FIFO control
uart_register_3 equ     uart_base + 3   ; line control
uart_register_4 equ     uart_base + 4   ; modem control
uart_register_5 equ     uart_base + 5   ; line status
uart_register_6 equ     uart_base + 6   ; modem status
uart_register_7 equ     uart_base + 7   ; scratch

;------------------------------------------------------------------------------
; Set UART to 9600 baud, 8 bits, no parity, 1 stop bit.
;
INITUART:       ld      a, $80          ; set DLAB to select the baud rate divisor
                out     (uart_register_3), a
                ld      a, 12           ; divisor low byte = 1843200 / (16 * 9600)
                out     (uart_register_0), a
                ld      a, 0            ; divisor high byte
                out     (uart_register_1), a
                ld      a, $03          ; clear DLAB, select 8N1
                out     (uart_register_3), a
                ret

;------------------------------------------------------------------------------
; Send character in A to serial port
;
TXA:            push    af
tx_ready_loop:  in      a, (uart_register_5)
                bit     5, a            ; transmit holding register empty?
                jr      z, tx_ready_loop
                pop     af
                out     (uart_register_0), a
                ret

;------------------------------------------------------------------------------
; Get character from serial port, returned in A (blocks until one arrives)
;
RXA:            in      a, (uart_register_5)
                bit     0, a            ; data ready?
                jr      z, RXA
                in      a, (uart_register_0)
                ret

;------------------------------------------------------------------------------
; Check to see if character is available
;
CKINCHAR:       in      a, (uart_register_5)
                bit     0, a            ; NZ if char available
                ret
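
If you want a different baud rate, the divisor bytes loaded in INITUART come from dividing the UART clock by 16 times the baud rate. A small Python helper to work them out, as a sketch only; the function name is made up, and the default clock is the 1.8432MHz figure from the comment in the listing.

```python
# Sketch: 16550 divisor latch value for a given crystal and baud rate.
# The 1.8432MHz crystal matches the comment in the listing above.


def uart_divisor(baud, clock_hz=1_843_200):
    """Return (low byte, high byte) of the 16-bit baud rate divisor."""
    divisor, remainder = divmod(clock_hz, 16 * baud)
    if remainder:
        raise ValueError("baud rate is not an exact division of the clock")
    return divisor & 0xFF, divisor >> 8


print(uart_divisor(9600))      # (12, 0)
print(uart_divisor(115200))    # (1, 0)
print(uart_divisor(300))       # (128, 1)
```

The first value goes to register 0 and the second to register 1 while the divisor latch is selected.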

My operating system is a slightly modified version of Grant Searle's ROM BASIC. This has full source code so you can easily reconfigure it to suit your own hardware. The advantage of BASIC is that you can quickly do stuff like accessing I/O ports without having to compile and download code - just like we used to do 30 years ago!

BASIC programs can be saved and loaded via text transfer in Hyperterminal. If I want to run a complex machine code program I burn it into the EEPROM (Winbond W27C512) which only takes a few seconds.

westfw · « **Reply #42 on:** December 16, 2015, 07:04:55 am »

Quote

I like using GALs

What are you using to create the GAL "binaries"? The modern CPLD/FPGA tools are huge (they make AS7 look compact!), and the versions of PALASM I've found seem to still use DOS 8.3 filenames and such...

bson · « **Reply #43 on:** December 16, 2015, 07:47:39 am »

Quote from: obiwanjacobi on December 12, 2015, 09:27:10 am

The main issue I am trying to solve is memory contention between the VGA controller and the CPU. I know about two/dual-ported memory but that is really expensive.

About $7 for 8 VRAM chips, shipped.

http://www.ebay.com/itm/8-uPD41264C-DUAL-PORT-VIDEO-RAM-64Kx4-256x4-155-40ns-5V-0-4-DIP-24-DRAM-VRAM-/221594789095?hash=item33981424e7:m:mDXZfBiRIpdgA5b_TY1r5SA

The datasheet the seller links... http://exdwh.com/uPD41264C-12.pdf
Serial access for video, random access for a CPU.

Not entirely authentic though, quite a bit post-Z80...

There are more variants on the same theme (parallel CPU bus, serial video port).

Could probably have two of these fill the entire Z80 address space and use it as main memory. For ASCII character generation you'd need to clock the output into a 595 or other shift register and consider it byte wide and play around a little with the addressing.

grumpydoc · « **Reply #44 on:** December 16, 2015, 10:11:50 am »

Quote from: bson on December 16, 2015, 07:47:39 am

Quote from: obiwanjacobi on December 12, 2015, 09:27:10 am
The main issue I am trying to solve is memory contention between the VGA controller and the CPU. I know about two/dual-ported memory but that is really expensive.
About $7 for 8 VRAM chips, shipped.

http://www.ebay.com/itm/8-uPD41264C-DUAL-PORT-VIDEO-RAM-64Kx4-256x4-155-40ns-5V-0-4-DIP-24-DRAM-VRAM-/221594789095?hash=item33981424e7:m:mDXZfBiRIpdgA5b_TY1r5SA

The datasheet the seller links... http://exdwh.com/uPD41264C-12.pdf
Serial access for video, random access for a CPU.

Not entirely authentic though, quite a bit post-Z80...

There are more variants on the same theme (parallel CPU bus, serial video port).

Could probably have two of these fill the entire Z80 address space and use it as main memory. For ASCII character generation you'd need to clock the output into a 595 or other shift register and consider it byte wide and play around a little with the addressing.

The problem with the above is that they are a bit small - 64k x 4 - if you are aiming at VGA resolution; 640x480 x 8-bits per pixel is 300k of video RAM.
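
For reference, the sizing works out as follows. This is a back-of-envelope Python sketch; rounding the chip count up to an even number assumes the x4 parts are used in pairs to make a byte-wide bank.

```python
# Sketch: frame buffer size and how many 64K x 4 VRAM chips it needs.
import math

WIDTH, HEIGHT, BITS_PER_PIXEL = 640, 480, 8

frame_bytes = WIDTH * HEIGHT * BITS_PER_PIXEL // 8
chip_bits = 64 * 1024 * 4                       # one 64K x 4 device
chips = math.ceil(frame_bytes * 8 / chip_bits)
chips += chips % 2                              # pairs give a byte-wide bank

print(frame_bytes)             # 307200
print(frame_bytes / 1024)      # 300.0
print(chips)                   # 10
```

So a lot of 8 chips (256k bytes in total) would fall a little short for this mode.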

You can get large, fast dual port RAM - eg http://www.idt.com/document/dst/709099-datasheet but I haven't found any 512k x 8 chips so you would need multiple packages and TQFP-100 isn't exactly a friendly format for DIY.

I think contention-free access would just be possible with discrete parts using 10ns SRAMs and 74AC series logic, as long as some care is taken over timings, gate delays and probably layout (not sure it would work on protoboard).

If I have chance I will try to put a few schematics and timing diagrams together.

Ian.M · « **Reply #45 on:** December 16, 2015, 11:19:23 am »

The question should be: Is it worth implementing HiRes memory mapped 8 bit/pix video on a Z80 (or descendant) system?

To do anything significant without hardware assistance is *SLOW*. Let's assume that for simplicity you only support VGA 640x480. That's 307200 bytes of RAM. Let's further assume a 20MHz clock speed and that one is using extensive loop unrolling e.g. large blocks of LDI instructions. LDI takes 16 T states (a 5 T state saving from LDIR by loop unrolling). To move one screen will therefore take 4,915,200 T states + loop overhead, which comes out to 1/4 second just to scroll the screen.
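
The arithmetic, as a quick Python check. The 21 T state figure for LDIR is the standard per-byte cost while it repeats; loop overhead and any paging needed to reach more than 64K are ignored here.

```python
# Sketch: time to copy one full 640x480x8 frame with unrolled LDI.
FRAME_BYTES = 640 * 480
LDI_T_STATES = 16            # per byte, unrolled
LDIR_T_STATES = 21           # per byte while repeating
CLOCK_HZ = 20_000_000


def copy_seconds(t_states_per_byte):
    """Seconds to move one frame at the given cost per byte."""
    return FRAME_BYTES * t_states_per_byte / CLOCK_HZ


print(FRAME_BYTES * LDI_T_STATES)        # 4915200
print(copy_seconds(LDI_T_STATES))        # 0.24576
print(copy_seconds(LDIR_T_STATES))       # 0.32256
```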

Unless you have a strong need for compatibility with a specific vintage Z80 system, this just doesn't make sense. It's Waltzing Bear territory. The system is obviously crying out for a powerful graphics coprocessor, but that will more than double the development effort required and make it much harder for anyone else wanting to follow the project to source the parts to do so.

One possible answer is a hosted Z80 system. This has a long and honourable tradition behind it going back to the Apple II Z80 SoftCard. Not only does it solve the graphics speed issue, it also provides bulk storage, easy cross-assembling/cross-compiling and if done right, no need for EPROMs, as the host can stuff the startup code into RAM before releasing the Z80 (or successor) reset line. If you put a large dual port RAM between the Z80 and the host, with a FPGA controlling it (with full access to all Z80 bus signals and a register interface to the host), you can do stuff like implement a NMI driven in-circuit debugger for the Z80 giving you single stepping and hardware breakpoints with very little Z80 code required in the monitor 'ROM', which can entirely be contained in the FPGA for minimum 'footprint' on the Z80 side.

Host it off a full GUI capable Linux SBC, and write the graphics engine, storage engine and the bulk of the debugger to run on the host.

I would also suggest giving the Z80 CPU card a 24 bit address bus (yes this involves page translation even on a Z180 card, but that can be done in the FPGA), lots of fast RAM and an S100 bus edge connector. Memory (and I/O) mapping the S100 bus would be handled in the FPGA.

If you ever want to run it standalone, simply reprogram the FPGA to use the dual port RAM to output video, and to provide a basic BIOS and a SD card interface or similar for bulk storage.

legacy · « **Reply #46 on:** December 16, 2015, 01:03:11 pm »

Quote from: Bruce Abbott on December 15, 2015, 11:22:35 pm

V9958

what do you think about realizing a PCB ?
in case, I have already started this project
no time to complete

grumpydoc · « **Reply #47 on:** December 16, 2015, 01:11:12 pm »

Quote from: Ian.M on December 16, 2015, 11:19:23 am

The question should be: Is it worth implementing HiRes memory mapped 8 bit/pix video on a Z80 (or descendent) system.

A very valid point.

I must admit I've been thinking much the same but it's 2015 - designing a Z80 system has to be for fun so why not try to give it VGA graphics?

Quote

To do anything significant without hardware assistance is *SLOW*. Let's assume that for simplicity you only support VGA 640x480. That's 307200 bytes of RAM. Let's further assume a 20MHz clock speed and that one is using extensive loop unrolling e.g. large blocks of LDI instructions. LDI takes 16 T states (a 5 T state saving from LDIR by loop unrolling). To move one screen will therefore take 4,915,200 T states + loop overhead, which comes out to 1/4 second just to scroll the screen.

Z80s, even at 20MHz, are indeed glacial when you start to think about it.

However you wouldn't design any Z80 system with a pixel mapped graphical display which needed block moves for scrolling - at the very least you would design the graphics so that a start address for the frame buffer can be specified, scrolling then just becomes a question of updating the start address.
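
A sketch of what that looks like from the software side, in Python for brevity. It assumes a video controller with a programmable start address that wraps around within the video RAM; the 512k size and the names are only illustrative.

```python
# Sketch: scrolling by moving the frame buffer start address.
# Assumes a video controller with a programmable start address that
# wraps around within the video RAM; names are illustrative only.

BYTES_PER_LINE = 640
VISIBLE_LINES = 480
VRAM_BYTES = 512 * 1024


def scroll_up(start, lines=1):
    """New start address after scrolling the display up by some lines."""
    return (start + lines * BYTES_PER_LINE) % VRAM_BYTES


def line_address(start, line):
    """Where the CPU writes to reach a given visible line."""
    return (start + line * BYTES_PER_LINE) % VRAM_BYTES


start = scroll_up(0, lines=8)            # scroll by one 8-pixel text row
print(start)                             # 5120
print(line_address(start, VISIBLE_LINES - 1))   # 311680
```

After such a scroll only the 8 newly exposed lines (5120 bytes) need to be redrawn, rather than all 307200 bytes being moved.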

I'd be inclined to use a separate graphics controller; there are quite a few to choose from intended to drive LCD displays.

I'd also use one of the more modern Z80s - the eZ80 line looks interesting, certainly something with DMA to improve the speed of just shifting things around in RAM.

legacy · « **Reply #48 on:** December 16, 2015, 01:13:26 pm »

Quote from: Ian.M on December 16, 2015, 11:19:23 am

storage engine and the bulk of the debugger to run on the host

I have started a simple network filesystem, which can work over the serial port
this reduces the target complexity by several orders of magnitude!

as far as for the debugger, I have developed a simple TAP (VHDL) && host code (ANSI/C)
even if the TAP can be implemented to look similar to a gdb-stub, so it can be rewritten in C or assembly

the problem is: tooooooooooooooooooooooooooo much effort for these toys!

I need a day made of 96 hours instead of 24, where
- 7 gone asleep
- 9 gone wasted when I have to work (avionics soooo boring sometimes)
- 2 used for jogging || swimming && to piss out the dog !

Bruce Abbott · « **Reply #49 on:** December 16, 2015, 06:55:24 pm »

Quote from: westfw on December 16, 2015, 07:04:55 am

What are you using to create the GAL "binaries"? The modern CPLD/FPGA tools are huge

Atmel WinCUPL - 11MB on disk.

