If I understand correctly: ESR is just a measure of how fast/slow a cap can store/supply electrons at specific frequencies, literally the resistance?
Yes, that's pretty close. Most often, the cap isn't being charged or discharged very far, and just some small ripple voltage is present; in that case, we'd like to know how much ripple voltage we get for a given ripple current, so we need to know the impedance of the capacitor. The resistive part of that impedance is, well, effectively in series with the capacitance, hence ESR (equivalent series resistance).

At most frequencies, the capacitive reactance is minuscule, leaving ESR as the dominant part; obviously, reactance gets more important at low frequencies, which is why we're concerned about a bit of both at mains frequencies. But at switching frequencies, basically ESR.
Handy trick, well, not really a trick, more an observation -- if ESR * C is the time constant at which the part, by itself, can be discharged (i.e. into a dead short), then that's also the rate at which you extract zero energy from it, because it's dissipating all of it internally. If you connect a load, you necessarily increase the resistance of the circuit, making it go slower. And the slower you go, the smaller the fraction ESR is, out of the total resistance. So the ratio of resistances also gives the efficiency: R_load / (R_load + ESR).
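To put rough numbers on that observation, here is a minimal sketch, assuming an ideal fixed capacitor with constant ESR, discharged fully into a plain resistive load. The function names and part values are made up for illustration.

```python
def discharge_efficiency(esr_ohms, load_ohms):
    """Fraction of the stored energy delivered to the load.

    Assumes an ideal capacitor with a constant series resistance (ESR)
    discharged fully into a resistive load; the same current flows in
    both resistances, so energy splits in proportion to resistance.
    """
    return load_ohms / (load_ohms + esr_ohms)


def discharge_time_constant(c_farads, esr_ohms, load_ohms):
    """RC time constant of the discharge, in seconds."""
    return c_farads * (esr_ohms + load_ohms)


# Hypothetical part: 1000 uF with 0.1 ohm ESR.
C = 1000e-6
ESR = 0.1
for load in (0.0, 0.1, 0.9, 9.9):
    tau = discharge_time_constant(C, ESR, load)
    eff = discharge_efficiency(ESR, load)
    print(f"load={load:4.1f} ohm  tau={tau * 1e6:7.1f} us  efficiency={eff:.0%}")
```

With a dead short (0 ohm load) the discharge is as fast as it can be and nothing reaches the load; a load of nine times the ESR takes ten times as long and delivers 90% of the energy.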
That's most important if you're doing deep discharges. Like power supply hold-up, or supercap energy storage, or batteries in general (which can be modeled as nonlinear capacitors: capacitance goes up with charge, so it gets harder and harder to raise the voltage, the more charge you put into it -- eventually flattening out to give the nominal terminal voltage at full charge).
Note that, even at a fairly slow rate, significant heating can still occur in the component -- it only takes say a few watts to overheat a small capacitor, and if you're drawing 100W on average (say a 10J pulse, 10 times a second), and the Q factor is about 10, that's 10 watts dissipated in the component.
So we tend to use electrolytics at relatively small ripple voltages, so that the amount of energy being used is quite small in relation to the total energy stored.
In the extreme -- for applications such as power supply filtering, bypass, etc., we're just using them for the fact that they give a usefully low impedance between the terminals (at AC), while drawing hardly any current (at DC). Indeed in such applications, we hardly even care that the part stores more energy than that tiny increment we're using; we just don't have any other solution to avoid also charging it up to that nominal voltage. (If we had something with effectively high capacitance under bias, and low near zero, we'd be set -- which, sounds suspiciously like a battery, if you've been paying attention! -- but alas, batteries have impedances much too high to be generally useful in circuits this way. There are, however, some specialty ceramic capacitors (specifically: poled ferroelectrics) that have this effect!)
Or put yet another way: whereas we might only need so-and-so capacitance to store the energy used during a cycle (in a bypass application), or such-and-such [additional] capacitance to also keep the voltage ripple low; we might still not have low enough ESR in an electrolytic of that value, to keep total ripple voltage low enough; so we brute-force it by simply using excessively large values. Because electrolytics are so much cheaper and denser per uF than most types, this is often worthwhile!
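As a rough sketch of that sizing argument, assume a constant ripple current I flowing for a time dt each cycle. The capacitive part of the ripple is then I*dt/C and the resistive part is I*ESR. Simply adding the two is a pessimistic simplification, and the part values below are invented for illustration.

```python
def ripple_voltage(i_amps, dt_seconds, c_farads, esr_ohms):
    """Rough peak-to-peak ripple estimate: capacitive droop plus ESR step.

    Assumption: a constant current i_amps drawn for dt_seconds, with the
    two contributions simply added (a worst-case style estimate).
    """
    droop = i_amps * dt_seconds / c_farads   # from charge removed, dV = I*dt/C
    esr_step = i_amps * esr_ohms             # from resistance, dV = I*R
    return droop + esr_step, droop, esr_step


# Hypothetical 100 kHz converter: 1 A drawn for 5 us per cycle.
I, DT = 1.0, 5e-6
for name, c, esr in (("100 uF, 0.5 ohm", 100e-6, 0.5),
                     ("1000 uF, 0.05 ohm", 1000e-6, 0.05)):
    total, droop, esr_step = ripple_voltage(I, DT, c, esr)
    print(f"{name}: droop={droop * 1e3:.1f} mV, "
          f"ESR={esr_step * 1e3:.1f} mV, total={total * 1e3:.1f} mV")
```

In this made-up case the smaller part has plenty of capacitance for the droop alone, but its ESR term dominates the total; the larger part is "too much" capacitance, bought for its lower ESR.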
I don't get that right now, but will look further into how Q is defined. I've looked into that a couple of times but it never really made sense to me (on a "oh, right..." level).
Although... Q is just a unitless thing that defines the "quality" at a given frequency, so then ESR is related to Q because that is simply how we define it. That would make sense... (if my assumption is correct)
Right, Q = Xc / ESR. Or for electrolytics, you often see "tan δ", which is the reciprocal, ESR / Xc. (δ being the loss angle, i.e. how far the phase angle of the impedance falls short of the ideal 90°, so tan gives the ratio of resistance to reactance; this can be a more accurate representation for high-loss components like these, which I guess is why they traditionally use this rating.)
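For a concrete feel, a minimal sketch of those two definitions at a single frequency. The part values are hypothetical, not taken from any datasheet.

```python
import math


def capacitive_reactance(f_hz, c_farads):
    """Magnitude of the capacitive reactance, Xc = 1 / (2*pi*f*C), in ohms."""
    return 1.0 / (2.0 * math.pi * f_hz * c_farads)


def q_factor(f_hz, c_farads, esr_ohms):
    """Q = Xc / ESR at the given frequency."""
    return capacitive_reactance(f_hz, c_farads) / esr_ohms


def tan_delta(f_hz, c_farads, esr_ohms):
    """Dissipation factor, tan(delta) = ESR / Xc = 1 / Q."""
    return esr_ohms / capacitive_reactance(f_hz, c_farads)


# Hypothetical electrolytic: 1000 uF, 0.1 ohm ESR measured at 120 Hz.
f, C, ESR = 120.0, 1000e-6, 0.1
print(f"Xc = {capacitive_reactance(f, C):.3f} ohm")
print(f"Q = {q_factor(f, C, ESR):.2f}, tan delta = {tan_delta(f, C, ESR):.3f}")
```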
And, it would be nice to think of a capacitor as some simple lumped equivalent, like a fixed resistor in series with a fixed capacitor; but that would grossly overestimate the Q at low frequencies, because ESR would be constant, while Xc is inversely proportional to frequency. So Q would rise arbitrarily, as frequency goes down. In reality, it levels off to about 1/tan δ, then falls again as leakage current takes over (in the mHz range; that's milli with a small 'm'!).
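The difference between the two pictures can be sketched numerically. The second model below is an assumption for illustration only: ESR is taken as a fixed high-frequency resistance plus a dielectric loss term proportional to Xc (constant tan δ), and leakage is ignored, so the eventual fall-off at mHz frequencies does not appear. All values are hypothetical.

```python
import math


def q_fixed_esr(f_hz, c_farads, esr_ohms):
    """Q of an ideal C in series with a frequency-independent resistor."""
    xc = 1.0 / (2.0 * math.pi * f_hz * c_farads)
    return xc / esr_ohms


def q_lossy_dielectric(f_hz, c_farads, r_hf_ohms, tan_delta_lf):
    """Q when ESR = r_hf + tan_delta_lf * Xc (assumed model, leakage ignored)."""
    xc = 1.0 / (2.0 * math.pi * f_hz * c_farads)
    esr = r_hf_ohms + tan_delta_lf * xc
    return xc / esr


C, R_HF, TAN_D = 1000e-6, 0.05, 0.1   # hypothetical values
for f in (0.1, 1.0, 10.0, 100.0, 1e3, 1e4):
    print(f"{f:8.1f} Hz  fixed-ESR Q={q_fixed_esr(f, C, R_HF):10.1f}  "
          f"lossy Q={q_lossy_dielectric(f, C, R_HF, TAN_D):6.2f}")
```

The fixed-ESR column keeps climbing as frequency drops, while the lossy column flattens out just under 1/tan δ (10 here).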
I'm not sure what a good mechanical analogy is, because I'm not sure that there's any good and intuitive experience that's also frequency domain, and that's unambiguously not just viscous damping (~fixed resistance). For sure, there are fluids with some weird properties, like, consider a nice bowl of tomato soup: you stir it around, it seems... soupy enough? But let it settle, gently twist the bowl, see how it moves; it's viscoelastic, it acts as a gel, a rigid body, at low shear rates. This... seems like a much stronger effect than what you get in a capacitor (except maybe some ceramic capacitors, when you look at them up close enough that hysteresis becomes apparent?), so I'm not saying it's an example of the effect, more just to say that, definitely, resistance can vary with frequency.
The most direct analogy is probably some types of rubber; often used for shock absorbers, rubber consists of an extended molecular structure with fluid-like domains interwoven with cross-linked molecular bonds. The bonds give it solid strength, but the circuitous paths between crosslinks give it lots of room to stretch and deform, while the dangly bits of the structure act almost like a liquid sloshing around in between. As a result, the Q factor tends to be modest -- certainly still enough that you get a good bouncy reaction from the stuff, but also not like a bouncy superball that returns like 90% of initial -- well, actually that's fine, 10% loss is the same as saying a Q of 10, right? (Strictly, Q is 2π times energy stored over energy lost per cycle, so 10% lost per radian is a Q of 10, while 10% lost per full cycle is nearer 60.) Anyway, particularly the stuff used for damping, tends to have a modest Q and over a wide frequency range I suspect, so it should be a good representation of this. I'm just not sure how intuitive it is, that that particular characteristic might apply (~constant Q vs. frequency).
You mean ESR isn't really important in this specific place in the circuit (being bulk charge capacitors right behind the bridge rectifier)?
Right. And for the other ones, pretty much anything you find will outperform the originals too (but likely not by so much that resonances get uncovered), so, nothing to worry about.
Cheers!
Tim