Well, I've beaten it to death by now, and got everything in 1.32m to compile without any horseplay. Something had been enabled that was sending out a 5us HIGH pulse on PD3, which wiped out reading the encoder (see the photo of the PD3 pin). I went back and forth comparing my working (heavily modified) config.h and config_328.h against the distribution files, and got the distribution versions going without much modification. The ZIP file has the Makefile, config.h and config_328.h for the AY-AT with the precision LDO and resistors and the original 8MHz crystal. I've only tested the compile with WinAVR, as that's all I have installed.
I still get erratic readings (10% variance) with cheap ceramic caps, but readings are stable with mica, film, aluminum electrolytics and tantalums, so I'm OK with it having 10% variation on cheesy caps that have over 110 ohms ESR. The other features that I tested work fine, so it's a GO for me. I did disable the 2.5V reference chip, and it didn't make any difference. I'd added some extra grounding to my board since the layout for ground is absolutely pitiful, but removing my extra grounds didn't change anything.
I modified the Makefile to use my USBasp programmer under WinAVR; your configuration may vary. I used the -F switch (override the device signature check) with avrdude 'cos most of my testing was with an ATMEGA328 (no P), but the code works with 328 or 328P, no difference as far as I could tell. The last two pics are after an adjustment cycle with both 328 and 328P.
One suggestion: change the configuration in your standard config_328.h file to match the clone settings for the AY-AT. The default settings won't run the ST7735, which causes trouble for anyone with an AY-AT. Since it's the most popular board with that display, it makes sense to have the defaults match the AY-AT. Fewer complaints of "I did the build and my display doesn't work!" is a Good Thing.

I'd also recommend changing the default
#define ENCODER_A PD2 to
#define ENCODER_A PD1 for the same reason. If more than 70% of the people here are using an AY-AT, then the defaults in the code should match the most common configuration. (I'm guessing at that 70% from the posts over the last 8 months.)
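A minimal sketch of how a board-specific default like that could be selected at compile time. BOARD_AY_AT is a hypothetical switch made up for illustration, not something in the distribution config files, and the PD1/PD2 stand-ins are only there so the fragment also compiles on a PC without <avr/io.h>:

```c
/* Stand-ins for the pin macros that <avr/io.h> normally provides. */
#ifndef PD1
#define PD1 1
#endif
#ifndef PD2
#define PD2 2
#endif

/* Hypothetical board switch (assumption, not in the distribution files). */
#define BOARD_AY_AT

#ifdef BOARD_AY_AT
#define ENCODER_A PD1   /* setting suggested above for the AY-AT */
#else
#define ENCODER_A PD2   /* current default in config_328.h */
#endif
```

With something like this, anyone on a different board would only have to comment out one line instead of hunting through the pin definitions.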
EDIT: I recompiled for 16MHz and inserted a 16 meg crystal, and the inaccuracy with the ceramics doesn't happen, for whatever reason. The readings are stable, and roughly the same as what I remember Karl-Heinz' version reading. I'll do 1.13K next with the 16M crystal and update the results.
Second update: I had to gut the menu out of 1.13K 'cos it was compiling to 112% of flash, and I couldn't find the wasted space where the bootloader was being added in. It was also unstable with the cheap ceramic caps at 16MHz, although Karl-Heinz did say that measurement accuracy would be affected in his code with a 16MHz crystal.
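To put a number on that 112%, here is a rough back-of-the-envelope sketch. It assumes the percentage is measured against the full 32768 bytes of flash on the ATmega328, with no space reserved for a bootloader; flash_overrun_bytes is just a helper name made up for this example:

```c
/* Bytes by which an image exceeds flash, given the reported usage
   in whole percent. Returns 0 if the image fits. */
static unsigned long flash_overrun_bytes(unsigned long flash_size,
                                         unsigned long used_percent)
{
    unsigned long used = flash_size * used_percent / 100UL;
    return (used > flash_size) ? (used - flash_size) : 0UL;
}

/* ATmega328/328P flash size in bytes (32 KB). */
#define ATMEGA328_FLASH_BYTES 32768UL
```

Calling flash_overrun_bytes(ATMEGA328_FLASH_BYTES, 112) gives 3932, so under that assumption roughly 3.9 KB has to come out before the image fits, and more if a bootloader section is reserved on top.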
Thanks for all your work, everyone!