Kind of two or three things:
1. Just straight up dynamic range.
If you need to work with 10V-scale signals, you need at least as much supply. It can be single-supply, with or without an offset, but when everything else is zero-referenced that gets quite annoying, and the offsets really add up after a while. So a bipolar supply is a great help. Notably, audio is AC-coupled, so this isn't strictly necessary there, but it can be a nice-to-have -- for example, removing the turn-on *pop* as those coupling capacitors come up to nominal charge.
So, also consider other applications, like control circuits, analog computers, etc. Op-amps were used for a lot more, back in the day!
This point might not be so important for audio, mostly being in the couple-volts range -- but the extra range does allow for transients, and not having to worry about clipping as signals are amped, filtered and attenuated through a signal path, like in a mixer.
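To put rough numbers on that headroom argument -- a back-of-envelope sketch, where the +/-15V rails, the 2V-per-rail output loss, and the +4dBu nominal level are my assumptions for illustration, not anything from a particular design:

```python
import math

def dbu(v_rms):
    """Convert an RMS voltage to dBu (0 dBu = 0.7746 V rms)."""
    return 20 * math.log10(v_rms / 0.7746)

v_rail = 15.0   # assumed supply rail, volts
v_loss = 2.0    # assumed output-stage saturation loss per rail

v_peak = v_rail - v_loss        # ~13 V peak available
v_rms = v_peak / math.sqrt(2)   # ~9.2 V rms for a sine wave

nominal_dbu = 4.0  # pro-audio "+4 dBu" nominal line level
headroom_db = dbu(v_rms) - nominal_dbu
print(f"max sine level: {dbu(v_rms):.1f} dBu, headroom: {headroom_db:.1f} dB")
```

Nearly 18dB of headroom above nominal, which is why +/-15V rails are so comfortable for a mixer-style signal chain full of gain, filter and attenuation stages.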
2. Common mode range.
Especially for EMI/RFI immunity, you need about as much supply voltage as the noise peak you need to reject. In commercial applications this is around 3V conducted, or 3V/m radiated; both of which can be increased by a modest amount due to resonance and antenna effects. (In the RFI range, it's a lot better to filter it off in the first place, especially against op-amps with bipolar input stages that tend to rectify and detect that noise. RF-insensitive amps are becoming more common these days, especially among precision types, where, to achieve their ~uV spec, even without bipolar transistors, low RF at the input is mandatory!)
And if you need more immunity, even with proper filtering/damping to avoid the super high RF and resonance effects -- if you need to meet the 10V levels of most industrial gear, for example, you simply need that voltage range.
A couple years ago I did a 4-20mA receiver and mux for an industrial/automotive environment. Put a cute little +/-15V converter on board -- a single boost regulator chip did the job, only needed about 10mA total load, plus a bit of filtering to make sure it wouldn't interfere with the op-amps.
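For scale, here's a sketch of the voltage budget in a receiver like that, assuming the common 250-ohm sense resistor (my assumption for illustration, not the actual design's value):

```python
# Voltage budget for a 4-20 mA current-loop receiver.
# The 250-ohm sense resistor is a common convention (4-20 mA -> 1-5 V),
# assumed here for illustration.
R_SENSE = 250.0  # ohms

def loop_voltage(i_ma):
    """Voltage across the sense resistor for a given loop current in mA."""
    return (i_ma / 1000.0) * R_SENSE

print(loop_voltage(4.0))   # 1.0 V  -> zero scale
print(loop_voltage(20.0))  # 5.0 V  -> full scale
print(loop_voltage(22.0))  # 5.5 V  -> over-range/fault, still far inside rails
```

Even with over-range or fault currents, the sense voltage stays well inside +/-15V rails, leaving plenty of margin for filter drops and common-mode shifts before anything saturates.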
3. Anything else.
Especially with early op-amps, the output range loses 1-2V from either rail. "Single supply" types are the same way, but to just one side, with the other saturating potentially quite close to zero (though doing so may require additional assumptions, like only pull-down resistors and other loads, so that the pulling-down output never has to work very hard, preferably not at all*). So you might go with +/-12V rails, for example, but only have +/-10V of usable range, give or take.
*Especially for types like LM321/358/324, with a class-B output stage that produces very visible crossover distortion when conduction switches polarity. The pulldown is also very weak: it gets basically to the rail for some 50-100uA of sink current, and sits about a diode drop or more above the rail beyond that. So it's a doubly good idea to have enough external pulldown to keep it behaved.
Same goes for the input common mode range (Vicm), which on some older parts also caused the output stage to hard-saturate in the opposite direction (phase reversal) when overdriven. Modern amp designs are (almost?) all designed to avoid this, fortunately. RRI (rail-to-rail input) types are also quite common, often with a modest worsening of input offset at the extreme (usually +V side) of the Vicm range.
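Sizing that class-A pulldown from the footnote above is a quick calculation. A sketch, with assumed swing and bias numbers (only the ~100uA weak-pulldown figure comes from above; check against your actual part and load):

```python
# Rough sizing of a class-A pulldown resistor for an LM358-style output.
# Swing values are assumptions for illustration (12 V single supply).
V_NEG = 0.0        # negative rail (ground, single supply)
V_OUT_MIN = 0.5    # lowest output voltage expected in-circuit (assumed)
V_OUT_MAX = 10.0   # highest output voltage expected (assumed)
I_BIAS = 1e-3      # target pulldown current, comfortably above ~100 uA

# The resistor must sink at least I_BIAS even at the lowest output voltage,
# so the internal (weak, crossover-prone) pulldown never has to conduct:
r_max = (V_OUT_MIN - V_NEG) / I_BIAS
print(f"R <= {r_max:.0f} ohms")

# Check the sourcing burden at the top of the swing:
i_top = (V_OUT_MAX - V_NEG) / r_max
print(f"op-amp sources {i_top * 1000:.0f} mA at full swing")
```

Note the tradeoff: 20mA is getting near an LM358's sourcing capability, so if the full swing is really needed, relax V_OUT_MIN or I_BIAS (or pick a friendlier op-amp).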
Or just plain old tradition / inertia. Who needs a reason to do anything? Not us humans! All the appnotes say +/-15V? That sounds good, let's do that! Do we really need it, is it even healthy to do so**? Who cares, manufacturer said it, it must be true!
**NE5534 for instance -- as I recall, just an amped-up 741. The extra operating current reduces noise and distortion, albeit at great expense to supply current. Which, at the full rated +/-22V, can dissipate up to 350mW just sitting there, and a heck of a lot more under load -- a PDIP-8 is only good for about a watt at room temperature, so you don't need much load to exceed it!
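That figure is easy to sanity-check -- a quick sketch, taking ~8mA as the maximum supply current (quoted from memory of the datasheet, so treat it as approximate):

```python
# Quiescent dissipation of an NE5534 at its maximum rated supply.
# The 8 mA max supply current is from memory of the datasheet -- approximate.
v_supply_total = 22.0 * 2  # +/-22 V rails -> 44 V across the part
i_quiescent = 8e-3         # ~8 mA maximum supply current
p_idle = v_supply_total * i_quiescent
print(f"{p_idle * 1000:.0f} mW just sitting there")

# A PDIP-8 handles roughly 1 W at room ambient, so a third of the
# budget is gone before the output delivers any load current at all.
```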
As for modern applications -- we have a lot of low-noise and high performance amps available to us now. It can be quite reasonable to receive analog signals in the 15V range, simply dividing them down to start, with tightly matched resistor dividers to maintain CMRR (which may be internal to the amp, as in some instrumentation amps -- which are modestly well available as single-chip solutions, no need to design your own).
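How tight that matching needs to be falls out of the standard worst-case formula for a four-resistor difference amplifier, CMRR ~ (1 + G) / (4 * tol). A quick sketch:

```python
import math

def diff_amp_cmrr_db(gain, tol):
    """Worst-case CMRR of a 4-resistor difference amplifier due to
    resistor mismatch alone: CMRR ~ (1 + gain) / (4 * tolerance)."""
    return 20 * math.log10((1 + gain) / (4 * tol))

print(f"1% resistors, unity gain:    {diff_amp_cmrr_db(1.0, 0.01):.0f} dB")
print(f"0.01% matched, unity gain:   {diff_amp_cmrr_db(1.0, 0.0001):.0f} dB")
```

Roughly 34dB from garden-variety 1% resistors versus ~74dB from a 0.01%-matched network -- which is exactly why the matched divider inside an instrumentation amp is worth paying for.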
Or if that's still not good enough, there are even some weirdo op-amps that work with their inputs outside (usually above) the supplies. These are more comparators than op-amps, as I recall, and their specs might not be that impressive -- particularly input bias current, since these all work by effectively being powered from the input voltage(s); it's just little enough that you can sometimes get away with it -- but it's yet another trick that's available.
Most analog front-ends (AFE) are going to be feeding an ADC, then the rest is handled digitally, giving vast improvements in performance (bit depth, sample rate; power consumption too!), processing power, and especially configurability (the whole software mixer/effect chain can be rewritten in real time, if needed!).
And yes, digital circuitry is so many orders of magnitude ahead of analog that it is, in fact, worthwhile even just in terms of power consumption to bring everything into the digital domain, as early and often as possible -- despite also using exponentially more transistors in the process. It's truly amazing.
The downside is, all that software is arguably harder to write (more bug-prone), and generally slower to write as well (with the tradeoff that it can be patched/updated after release, while hardware can't be patched at all). On the other hand, it costs essentially nothing to reproduce, besides the cost of the chip it runs on -- which is often less than the cost of an equivalent analog solution anyway.
Tim