One year later... I have now bought a Vishay VHP101 100 ohm resistor (Y4078100R000T9L), which should be the best in its price class, with an advertised "typically 0 TCR". (I'm aware these specs can be debated.) I mounted it in a shielded metal box, like the previous Vishay Z series (Y1453100R000V9L). I have also upgraded from a Keithley DMM6500 (6.5 digit) to a Keithley 2010 (7.5 digit).
K2010 settings: four-wire mode, offset compensation, 5 NPLC plus 20-sample averaging (12 seconds in total). I use a 2000-SCAN scan card to alternate between the two 100 ohm resistors.
A three-day run was telling. At first, there's a big dip for 1.5 hours while the K2010 warms up. The specs say the warm-up time is 2 hours, so this is OK (although much slower than the DMM6500).
After that, there's a clear correlation between the readings of the old Z series (yellow) and new VHP101 (green) resistors. The VHP101 starts out at roughly -6 ppm and ends at -1 ppm. The Z series starts at +2 ppm and ends at +7 ppm. I blame the K2010 for this overall +5 ppm drift. The K2010 specifies its 24-hour accuracy as 15 ppm of reading + 9 ppm of range. While this specification isn't exactly the same thing as drift, I guess it's an indication of what to expect. (The DMM6500 specifies 20 ppm + 20 ppm, so almost double.)
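To put the two accuracy specs on the same footing, here is a minimal sketch that converts "ppm of reading + ppm of range" into a single figure in ppm of the reading. It assumes both meters measure the 100 ohm resistors on a 100 ohm range; the function name is just for illustration.

```python
def spec_uncertainty_ppm(reading_ohm, range_ohm, ppm_of_reading, ppm_of_range):
    """Combine 'ppm of reading + ppm of range' into ppm of the reading."""
    # Absolute error budget in ohms: one part scales with the reading,
    # the other with the selected range.
    abs_err_ohm = (reading_ohm * ppm_of_reading + range_ohm * ppm_of_range) * 1e-6
    return abs_err_ohm / reading_ohm * 1e6


# Assumption: 100 ohm reading on a 100 ohm range for both meters.
k2010_ppm = spec_uncertainty_ppm(100.0, 100.0, 15, 9)
dmm6500_ppm = spec_uncertainty_ppm(100.0, 100.0, 20, 20)

print(f"K2010:   {k2010_ppm:.1f} ppm")
print(f"DMM6500: {dmm6500_ppm:.1f} ppm")
print(f"ratio:   {dmm6500_ppm / k2010_ppm:.2f}")
```

At full scale the two terms simply add, so the comparison is 24 ppm against 40 ppm. A reading well below full scale would weight the range term more heavily.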
I haven't calculated the short-term noise yet, which was the original topic of this thread. Logging the temperature could be helpful too, but so far the drift continues unabated, so I don't think it's due to temperature variations over the day.
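One way to check the suspicion that the drift is shared by both channels is to split the two traces into a common part (their average) and a differential part (their difference). The sketch below uses only the rough start and end values quoted above, not the full log, and the function name is made up for illustration.

```python
def split_drift(a_ppm, b_ppm):
    """Split two ppm traces into common-mode and differential parts."""
    common = [(a + b) / 2 for a, b in zip(a_ppm, b_ppm)]
    differential = [b - a for a, b in zip(a_ppm, b_ppm)]
    return common, differential


# Rough start/end deviations quoted in the post (ppm).
vhp101_ppm = [-6.0, -1.0]
z_series_ppm = [2.0, 7.0]

common, differential = split_drift(vhp101_ppm, z_series_ppm)
common_drift = common[-1] - common[0]
differential_drift = differential[-1] - differential[0]

print(f"common-mode drift:  {common_drift:+.1f} ppm")
print(f"differential drift: {differential_drift:+.1f} ppm")
```

With these numbers all of the +5 ppm lands in the common part and the difference between the resistors stays at 8 ppm. That is consistent with the meter drifting, although anything else that affects both resistors equally (ambient temperature, for instance) would look the same, which is where a temperature log would help.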
EDIT: The K2010 short-term noise is about ±1 ppm peak-to-peak for both channels, which is roughly half the noise I originally measured with the DMM6500.
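For anyone wanting to reproduce a short-term noise figure from their own log, here is a minimal sketch of one possible approach: convert a short window of readings to ppm of nominal, remove a straight-line trend so that slow drift doesn't inflate the result, then take the peak-to-peak and RMS of what is left. This is not necessarily how the number above was obtained, and the sample data is synthetic, not from my log.

```python
def detrend(values):
    """Subtract the least-squares straight line from equally spaced samples."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    sxx = sum((i - t_mean) ** 2 for i in range(n))
    sxy = sum((i - t_mean) * (v - v_mean) for i, v in enumerate(values))
    slope = sxy / sxx
    return [v - (v_mean + slope * (i - t_mean)) for i, v in enumerate(values)]


def noise_ppm(readings_ohm, nominal_ohm=100.0):
    """Return (peak-to-peak, RMS) noise in ppm after linear detrending."""
    ppm = [(r / nominal_ohm - 1.0) * 1e6 for r in readings_ohm]
    residual = detrend(ppm)
    peak_to_peak = max(residual) - min(residual)
    rms = (sum(x * x for x in residual) / len(residual)) ** 0.5
    return peak_to_peak, rms


# Synthetic example: slow ramp of 0.05 ppm per sample plus a +/-1 ppm wiggle.
readings = [100.0 * (1 + (0.05 * i + (-1) ** i) * 1e-6) for i in range(100)]
p2p, rms = noise_ppm(readings)
print(f"peak-to-peak: {p2p:.2f} ppm, RMS: {rms:.2f} ppm")
```

The window should be short compared with the drift time scale, otherwise a straight line is a poor model of the drift and the leftover curvature gets counted as noise.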