Decoupling is overblown.
In a 2-layer design, you rarely have anything fast enough, or powerful enough, to require more than infrequent decoupling. When you do (switching regulators being the most common case), simply include them as part of the local subcircuit design -- don't assume an ideal skyhook is there to save you.
In a multilayer design with internal planes, you may have things fast or powerful enough to require frequent bypassing. But you also have the nearly ideal interconnect that is the power plane. Bypass caps almost anywhere on the board look equivalent. It doesn't much matter where you place them. If a chip needs very serious bypassing, caps can be placed beside the pins, or on the opposite side, under the component (e.g. BGAs). For the most part, caps on the opposite side, or not connected as directly to the pins as is possible, might as well not be there -- and here I don't mean that they can be removed entirely, but that they aren't acting locally, and can move around just about anywhere. They may however be removable, depending on whatever's on the planes, and what supply impedance is required.
By the way, the ideal connection to an adjacent pair of pins* is a pair of vias (to the internal planes) on one side, and the decoupling cap on the other. This way, the pins see two impedances in parallel: the cap and the plane. You can equally well use no cap and two pairs of vias. Or two caps and no vias, except that you can't put a cap underneath a component, so...
Note that a cap on the opposite side (say in a 2-layer build) is farther away than a single cap on the same side, so doesn't do much. Single caps are fine in that case.
*VDD/VSS pairs are common, say with MCUs and such, for this reason -- lower ESL. There's usually a bunch of pairs spread around the chip -- all should be connected, if not to planes then to adjacent caps. Whether any given pair of pairs can be connected to a single common cap, say -- depends. (Sometimes, these pairs are temptingly close together!)
...
Excessive bypasses: I think it's more psychological (read: not based on electrical theory). "Nobody ever got fired for buying IBM", you know? Throw in "too many" bypass caps and you have the same situation.
It's not hard to analyze these networks, and justify one's use of decoupling capacitors. Transform pin lengths, trace lengths, via lengths and component body lengths into inductances, and construct the network from the layout topology (don't forget ESR of the various capacitors). This is an easy (under an hour?) SPICE exercise, and allows you to play with virtual placements and types (i.e., vary C and ESR and see how the transient response changes; vary the topology, or the trace lengths, etc.).
Or, once you understand the underlying equations (it's all about poles and zeroes), it's possible to hand-wave through this and estimate what C and ESR are needed to terminate a given branch, or which topologies are easier to terminate or give better filtering or isolation between sections. Then SPICE it anyway for verification, then hook up a pulse generator to a real board and demonstrate it experimentally.
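For the hand-waving step, the numbers you usually want for a single L-C branch are the textbook series-RLC ones: characteristic impedance Z0 = sqrt(L/C), resonant frequency f0 = 1/(2*pi*sqrt(L*C)), and Q = Z0/R. A rough sketch follows, with hypothetical function names. Aiming for Q around 1 is a common damping heuristic and is my assumption here, not a rule; the right target depends on the supply impedance you actually need.

```python
import math

def branch_estimates(l_henry, c_farad, esr_ohm):
    """Series-RLC figures for one branch: (Z0 in ohms, f0 in Hz, Q)."""
    z0 = math.sqrt(l_henry / c_farad)
    f0 = 1 / (2 * math.pi * math.sqrt(l_henry * c_farad))
    q = z0 / esr_ohm
    return z0, f0, q

def esr_for_q(l_henry, c_farad, q_target=1.0):
    """ESR that gives the requested Q for this L and C.

    q_target=1.0 is an ASSUMED starting point, not a requirement.
    """
    return math.sqrt(l_henry / c_farad) / q_target

if __name__ == "__main__":
    # Hypothetical branch: 10 nH of trace plus via inductance into a 1 uF cap
    # with 10 mohm ESR.
    z0, f0, q = branch_estimates(10e-9, 1e-6, 0.01)
    print(f"Z0 = {z0:.3f} ohm, f0 = {f0 / 1e6:.2f} MHz, Q = {q:.1f}")
    print(f"ESR for Q = 1: {esr_for_q(10e-9, 1e-6):.3f} ohm")
```

A high Q tells you the branch will ring and wants either more ESR or a lossier cap in parallel; a Q near or below 1 means it's already reasonably terminated.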

Tim