Give yourself a cookie.
At power-up the contents of the 74LS670 are unknown.
Disabling its outputs and using pull-up resistors makes the top 16K of the 4M space appear in all four 16K banks of the Z80 address space.
A 74LS273 can be reset at power-up, and one of its outputs can control the output enable of the 74LS670, leaving 7 outputs for other uses.
This uses 5 output ports total for control (1 74LS273, 4 74LS670).
Power-up
4 writes to set the 74LS670, then enable its outputs. You now have full control.
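A rough model of that power-up sequence, just to show the idea. This is a sketch, not a description of any real board: the class and method names are made up, the 8-bit page value is only what 4M / 16K works out to (256 pages), and the polarity of the enable line is not modelled.

```python
PAGE_BITS = 14                      # A0-A13 pass straight through (16K pages)
PAGE_MASK = (1 << PAGE_BITS) - 1

class Mapper:
    """Hypothetical model: 74LS670 map registers gated by one 74LS273 bit."""

    def __init__(self):
        # 74LS670 contents are unknown at power-up; None stands for "unknown".
        self.regs = [None] * 4
        # The 74LS273 is cleared by reset, so the map outputs start disabled.
        self.enabled = False

    def write_map(self, bank, page):
        """Output-port write: load one of the four page registers."""
        self.regs[bank & 3] = page & 0xFF

    def set_enable(self, on):
        """The 74LS273 bit that controls the 74LS670 outputs."""
        self.enabled = bool(on)

    def translate(self, addr):
        """Map a 16-bit Z80 address to a 22-bit address in the 4M space."""
        bank = (addr >> PAGE_BITS) & 3      # A15 and A14 pick the register
        if self.enabled:
            page = self.regs[bank]
            if page is None:
                raise ValueError("map register was never written")
        else:
            page = 0xFF                     # pull-ups hold every page bit high
        return (page << PAGE_BITS) | (addr & PAGE_MASK)


m = Mapper()
boot = m.translate(0xC123)          # outputs disabled: lands in the top 16K
for bank in range(4):               # the 4 writes
    m.write_map(bank, bank)
m.set_enable(True)                  # now the map registers are in control
mapped = m.translate(0xC123)
```

Before the enable, every Z80 bank resolves to the same top 16K page, which is why boot code placed there is visible wherever the Z80 starts fetching.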
The big problem with the 74LS670 is its address-input-to-output delay.
The other option is a cache RAM chip that is faster than the 74LS670.
The downside is that A14 and A15 of the Z80 must be connected directly to the cache RAM. This makes the Z80 output more complicated, but doable. You still need an additional chip to connect the Z80 data bus to the cache RAM to write a new map, plus the proper logic to control it.
Note that a cache RAM is just a very fast static RAM that gives an output after the RAM delay.
One 138 decodes the 4M space into eight 512K spaces.
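To put numbers on the decode: 512K is 2^19, so the three address bits above that (A19 to A21 of the 22-bit mapped address) would be the 138 select inputs. A small sketch, assuming that wiring and assuming the 138 enables are always active; the function name is made up.

```python
def decode_138(phys_addr):
    """Return the eight active-low 138 outputs for a 22-bit mapped address."""
    select = (phys_addr >> 19) & 0x7        # A21..A19 pick one 512K space
    # Exactly one output goes low (0); the other seven stay high (1).
    return [0 if n == select else 1 for n in range(8)]


top_page = decode_138(0x3FC000)     # top 16K of the 4M space
bottom = decode_138(0x000000)       # very first byte of the 4M space
```

With this wiring the top 16K of the 4M space falls in the last 512K space (output 7).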
From reading the other posts since this started:
Memory timing becomes very tight, and gets tighter going up to 20 MHz.
I think you are starting to see that fancy and powerful can also be simple.
Note that any of the 512K spaces left unused gives room for future expansion.
One could be dual-port.