I wouldn't worry about it.
1. MII isn't terrifyingly fast. It's just a CMOS signal. Routes whose propagation delay is much shorter than the edge time (a few nanoseconds?) do not need termination; long routes can be source terminated (if point-to-point routed, which is likely the case here).
2. With a large amplitude, and high threshold, signal quality isn't a big priority. (Do prioritize CLK quality, though.)
3. Parallel planes are great, to the point where capacitors tend not to have much local effect. You can dot a dozen or so bypass caps around the board and probably not worry about local bypass anywhere (ah, but better safe than sorry, of course -- you can always test DNPs later, if time allows).
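To put a number on point 1, here is a rough sketch of the "short compared to the edge" check. The propagation velocity (about 15 cm/ns, typical of FR-4) and the one-sixth threshold are my assumptions, a common rule of thumb rather than anything from the MII spec, and the function name is made up for illustration.

```python
def needs_termination(length_cm, edge_ns, v_cm_per_ns=15.0, ratio=1 / 6):
    """Rule-of-thumb check: is the trace electrically long?

    v_cm_per_ns: assumed propagation velocity on FR-4 (about 15 cm/ns).
    ratio: assumed threshold; the trace counts as "long" when its one-way
           delay exceeds this fraction of the edge time.
    """
    delay_ns = length_cm / v_cm_per_ns  # one-way propagation delay
    return delay_ns > ratio * edge_ns


# A 2 cm hop with a 2 ns edge is electrically short under these assumptions;
# a 15 cm run with the same edge is not.
print(needs_termination(2.0, 2.0))
print(needs_termination(15.0, 2.0))
```

If the check comes back true, a series resistor at the driver is the usual fix for a point-to-point route.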
Consider this: a 1ns edge spans roughly 15-20cm of trace. The image current of the wavefront, which travels along the trace over the ground plane, dips (along the via) through a hole in the plane, and comes out the other side, acts like an impulse current applied between the edges of the ground plane holes. That current spreads out, radially, in the space between planes. As it goes, the impedance starts low (tens of ohms) and drops roughly in inverse proportion to distance. By the time that wavefront is 5 or 10cm away (reaching its peak current flow), the VCC-to-GND impedance is perhaps single ohms. For a peak current on the order of (delta 1.65V) / (100 ohms) = ~16mA, expect a peak VCC-GND voltage, as measured at the edge of the holes in the planes (assuming you could probe such a location!), on the order of (16mA) * (single ohms), or well under 100mV. With a ~1V noise margin, the impact of a single via is negligible here.
(Assuming a ~100 ohm trace, source terminated, 3.3V supply. So the initial wavefront is 1.65V, followed by the 1.65V reflection, for a total 3.3V swing.)
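The arithmetic above can be sketched in a few lines. The 3.3V supply and 100 ohm trace come from the assumptions stated above; the 3 ohm plane impedance is only a stand-in I picked for "single ohms", and the function name is hypothetical.

```python
def via_noise_estimate(vcc=3.3, z0=100.0, plane_z=3.0):
    """Back-of-envelope VCC-GND disturbance from one signal via.

    vcc: supply voltage (V).
    z0: trace impedance (ohms), source terminated.
    plane_z: assumed plane-pair impedance seen by the wavefront (ohms),
             a placeholder for "single ohms".
    """
    step_v = vcc / 2.0          # source termination launches half the swing
    i_peak = step_v / z0        # peak current of the initial wavefront (A)
    v_noise = i_peak * plane_z  # resulting VCC-GND voltage (V)
    return step_v, i_peak, v_noise


step_v, i_peak, v_noise = via_noise_estimate()
print(f"step {step_v:.2f} V, peak current {i_peak * 1e3:.1f} mA, "
      f"plane noise about {v_noise * 1e3:.0f} mV")
```

Even with plane_z pushed up to 5 ohms, the result stays below 100mV, which is small against a ~1V noise margin.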
Now, if you have, say, low noise, sensitive analog or RF circuitry right beside here -- you may need additional bypass, or even a split plane. But for just digital, medium speed, it's very noncritical.
Tim