The reason is fairly obvious, as already pointed out in this thread: the acoustic delay from a microphone mounted *on* the headphones to the inside of your ear is nowhere near long enough to do DSP in, unless you gin up something really crazy with analog-based discrete-time processing. The A/D and D/A delays alone would kill you.
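A back-of-envelope check of that budget (Python sketch; the ~2 cm mic-to-eardrum path and the sample rates are assumptions for illustration, not measurements of any particular headphone):

```python
# How much time the acoustic path gives you, versus what a single sample of
# converter latency costs at a few common rates.
SPEED_OF_SOUND = 343.0    # m/s in room-temperature air
MIC_TO_EARDRUM = 0.02     # m, assumed ~2 cm from an on-cup mic to the eardrum

acoustic_budget_us = MIC_TO_EARDRUM / SPEED_OF_SOUND * 1e6
print(f"acoustic path budget: {acoustic_budget_us:.0f} us")  # ~58 us

# One sample of latency each way (A/D then D/A) at common rates:
for fs in (48_000, 192_000, 768_000):
    per_sample_us = 1e6 / fs
    print(f"{fs/1000:>4.0f} kHz: one sample = {per_sample_us:5.1f} us, "
          f"A/D + D/A = {2 * per_sample_us:5.1f} us")

# Conventional sigma-delta audio converters add tens of samples of group
# delay each way through their decimation filters, which is what blows
# the budget in practice.
```

Even a single sample each way at 48 kHz eats most of that ~58 µs window before any filtering has happened.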
If it can be done in the analog domain, it can be done in the digital domain. The right converters can have a latency of at most one sample, and nothing prevents oversampling and decimation from pushing converter latency below that, although the returns diminish quickly because a single-cycle converter's latency is already low. After that it is just a matter of clocking data through a finite impulse response filter, which adds roughly the same group delay as the analog filter it replaces.
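To put rough numbers on that, a sketch with assumed figures (a 15-tap linear-phase FIR clocked at a 768 kHz oversampled rate, with single-cycle converters):

```python
# Delay budget for a short symmetric (linear-phase) FIR at an oversampled rate.
fs = 768_000          # Hz, assumed oversampled converter/filter clock
numtaps = 15          # short symmetric FIR

# A symmetric FIR delays every frequency by (numtaps - 1) / 2 samples.
filter_delay_samples = (numtaps - 1) / 2
filter_delay_us = filter_delay_samples / fs * 1e6
print(f"filter: {filter_delay_samples:.0f} samples = {filter_delay_us:.1f} us")

# Add one sample each for single-cycle A/D and D/A conversion:
total_us = filter_delay_us + 2 * (1e6 / fs)
print(f"filter + converters: {total_us:.1f} us")   # ~11.7 us total
```

Under those assumed numbers the whole digital path sits comfortably inside the ~58 µs acoustic budget discussed above.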
Unfortunately, the only thing this buys is the ability to tune the filter digitally, and it takes more power and adds more noise, both of which are already problems in analog designs.