Mind, I haven't watched the videos, so I don't know what explanation he presents (if any). If you had a blog post, article or book excerpt to refer to, that would be much easier to flip through.
Capacitance depends on desired supply ripple voltage. A typical figure is 10%, so, at 85VAC, that's about 120VDC, and 10% is 12V.
Mains is only supplying current part of the time; the rest of the time, the capacitor discharges through the load. We can use a small-voltage approximation to this exponential discharge* and use the capacitor equation, I = C dV/dt, changing dees to deltas. At 50Hz * FWB, we have 10ms between pulses, about 100mA load (12W input seems reasonable, it won't be 100% efficiency), so C = (0.1A) * (10ms) / (12V) = 83uF. (This is peak-to-peak ripple, by the way.)
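As a quick check of that arithmetic, here is a minimal sketch of the linearized capacitor equation. The function name `ripple_capacitance` is just made up for illustration; the numbers are the ones used above.

```python
def ripple_capacitance(load_current_a, discharge_time_s, ripple_vpp):
    """Linearized capacitor equation: C = I * dt / dV."""
    return load_current_a * discharge_time_s / ripple_vpp

# 85 VAC gives roughly 120 V peak; 10% ripple of that is 12 V peak-to-peak.
ripple_vpp = 0.10 * 120.0

# 100 mA load, 10 ms between full-wave pulses at 50 Hz.
c = ripple_capacitance(0.1, 10e-3, ripple_vpp)
print(f"{c * 1e6:.0f} uF")  # about 83 uF
```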
*But the SMPS is a constant-power load actually, not a fixed resistor; in fact, it's a dependent negative [incremental] resistance. This gives a parabolic discharge curve (i.e. it accelerates downwards). But again, we can approximate a small segment of this curve as a line.
The FWB conduction angle is more than zero, so the dt will be less than this; but certainly more than 5ms or so. The exact value depends on the ripple fraction (with some inverse trig operations to calculate where the lines intercept the sine peaks). The 10ms figure is the worst case [approaching zero conduction angle], so we can call it a safe overestimate, and the resulting capacitance is certainly no more than double what we need.
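For a rough idea of the inverse trig involved, here is a sketch under simplifying assumptions of my own: the capacitor starts discharging exactly at the sine peak, and recharging begins where the next rising half-sine meets (1 - ripple fraction) of the peak. The function name `discharge_time` is hypothetical.

```python
import math

def discharge_time(ripple_fraction, f_mains=50.0):
    """Time from one rectified peak until the next rising half-sine
    reaches (1 - ripple_fraction) * Vpk.

    Assumes discharge starts exactly at the peak (a simplification).
    """
    omega = 2 * math.pi * f_mains
    quarter_period = 1 / (4 * f_mains)  # peak to the following zero crossing
    return quarter_period + math.asin(1 - ripple_fraction) / omega

for r in (0.0, 0.1, 0.2):
    print(f"ripple {r:.0%}: dt = {discharge_time(r) * 1e3:.2f} ms")
# At 10% ripple this comes out around 8.6 ms, between the 5 ms and 10 ms bounds.
```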
83uF still sounds high; we might compromise further and say it only needs to be 10% at nominal rated voltage, maybe 110 or even 220 VAC. Or we don't care about Vpp, so use 20%, for a 10% Vpk instead. These give smaller values, say 13-40uF.
The ultimate limiting conditions are hold-up and heating. The supply voltage must not dip so low that the regulator would shut down between cycles, at minimum input (85VAC). We can use this calculation to find that point: https://www.seventransistorlabs.com/Calc/PSHoldUp.html#cp
Note that, as ripple goes up, conduction angle also goes up, so the capacitor handles less of the dropout. Ballpark calculation, say the input is 120V peak, reg shuts down at 60V, draws 12W, and needs 10ms hold-up: that requires only 22uF.
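The ballpark figure can be reproduced with a plain energy balance for a constant-power load. This is only a sketch; I am assuming the simple relation C = 2 P t / (V1^2 - V2^2), which is not necessarily what the linked calculator does internally, and `holdup_capacitance` is a made-up name.

```python
def holdup_capacitance(power_w, holdup_s, v_start, v_min):
    """Capacitance that delivers power_w for holdup_s while its
    voltage falls from v_start to v_min (constant-power load)."""
    energy_j = power_w * holdup_s
    # Stored energy difference: 0.5 * C * (v_start^2 - v_min^2) = energy_j
    return 2 * energy_j / (v_start**2 - v_min**2)

c = holdup_capacitance(power_w=12, holdup_s=10e-3, v_start=120, v_min=60)
print(f"{c * 1e6:.1f} uF")  # about 22 uF
```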
The other part is heating. Electrolytics aren't good for much ripple fraction; they're very good bypass/filter caps, and that's it. A ripple fraction of 10 or 20% is usually about it. So 22uF might get a bit hot at 85VAC and full load, but we might also argue something sneaky, like we're not rating lifetime at 85VAC, so heck it. And the performance at 120VAC or higher will be fine so we don't mind.
So, in the ballpark of 22uF total seems wise. You can't use much less without sacrificing hold-up, Vin(min), lifetime, etc. (Also, give or take a more accurate calculation of supply ripple and conduction angle, as again, it matters more at high ripple fraction.) You can use more, but the power factor gets worse (narrower conduction angle), and cost and size go up.
If he arrives at the same minimum value, then that's fine.
Based on your quote,
2. How to determine the cutoff frequency? At the time 3:54 (EMC Filter Design Part 4: Differential Mode EMC Filter Design Down to Component Level with time code) Ali says that "we have calculated the biggest harmonic content that we have to attenuate to a certain level, therefore... let's say for now that the cutoff frequency needs to be 7kHz". But how did he do it? How can I calculate this cutoff frequency in my case?
it sounds like perhaps he arrives at another minimum value, but not necessarily the minimum value. A purely signal filtering argument would be fine for X1/X2 type (across-the-line) film caps in a typical (CMC based) line filter. That filtering is done at mains frequency, not DC, so we want to avoid using excessive capacitance there (it's nonpolar, bulky, expensive, draws reactive power). The same isn't true of bulk filtering though.
As for cutoff, it's a three-element, so 3rd order, lowpass filter. We can model the system as a 100 ohm termination (mains source, both LISN lines acting in series in DM), shunt cap, series choke, shunt cap, AC current source (load). We know the load is pulsed (square wave) at Fsw, ballpark 50-200kHz. (These one-chip regulators are pretty messy affairs: using hysteretic or burst operating modes, they draw/deliver a broad spectrum, and likewise the output voltage ripple tends to be rather mediocre.) At Vin(min) we have say 200mA DC load, so it's going between 0 and 400mA, or +/-200mA around the DC baseline. If we need well under 60dBuV at the LISN (i.e., 1mV), that's a transfer function of 1mV/200mA or 5mΩ (that's transresistance, because we're measuring the voltage and current in different places!).
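To make the dBuV and transresistance step concrete, here is a small sketch. The helper name `dbuv_to_volts` is my own; the conversion itself is just the definition of dB relative to 1 uV.

```python
def dbuv_to_volts(level_dbuv):
    """Convert a level in dBuV (dB relative to 1 microvolt) to volts."""
    return 1e-6 * 10 ** (level_dbuv / 20)

v_limit = dbuv_to_volts(60)        # 60 dBuV is 1 mV
i_ripple = 0.2                     # +/-200 mA around the DC baseline
transresistance = v_limit / i_ripple
print(f"{v_limit * 1e3:.1f} mV, {transresistance * 1e3:.1f} mOhm")
```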
If we had a plain old 100 ohm matched filter, of one-port-open type (shunt C required on SMPS side), the 200mA would drop 20V and we need 20/1m = 20k or 86dB voltage attenuation. A 3rd order filter gives -60dB/dec, so we need about 1.4 decades, or from say 50kHz minimum Fsw, 1.84kHz Fc. That gives ballpark Xc = XL = 100Ω at 1.84kHz, or 0.86uF and 8.6mH (actual values will be in some ratio to these, depending on the exact filter prototype chosen, i.e. Butterworth etc.).
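Those steps can be sketched as below. This assumes the idealized asymptotic rolloff of 20 dB/decade per filter order, and ignores the exact prototype response near Fc; the function names are illustrative only.

```python
import math

def required_attenuation_db(v_noise, v_limit):
    """Voltage attenuation needed, in dB."""
    return 20 * math.log10(v_noise / v_limit)

def cutoff_for_attenuation(f_noise, atten_db, order):
    """Cutoff frequency giving atten_db at f_noise, assuming an
    asymptotic rolloff of 20 * order dB per decade."""
    decades = atten_db / (20 * order)
    return f_noise / 10 ** decades

def matched_lc(z0, fc):
    """C and L whose reactance equals z0 at fc."""
    c = 1 / (2 * math.pi * fc * z0)
    l = z0 / (2 * math.pi * fc)
    return c, l

atten = required_attenuation_db(0.2 * 100, 1e-3)    # 20 V down to 1 mV
fc = cutoff_for_attenuation(50e3, atten, order=3)
c, l = matched_lc(100, fc)
print(f"{atten:.0f} dB, Fc = {fc:.0f} Hz, C = {c * 1e6:.2f} uF, L = {l * 1e3:.1f} mH")
```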
By intentionally choosing (or, as it may turn out, necessitated by other constraints, as above) a lower Zo, we get less voltage ripple in the first place, and so need less overall attenuation. We still need to match to Zin though (the 100Ω from the LISN), or else the filter has a strong peak (resonance) at Fc; or we must supply losses ourselves (which is fine because we don't actually care about insertion loss, aside from at LF/DC). We can trade voltage attenuation for impedance, and so achieve the required transresistance with less filtering action. If we start with 10uF, that's 0.3 ohms at 50kHz, which gives us 63mV ripple at the SMPS already. (Actually it'll be quite a bit more than this, as ESR dominates at these frequencies.) Evidently, we only need another ~40dB attenuation, so Fc ~ 2.3kHz would be fine; with 10uF that corresponds to Zo = 6.86Ω, so we can choose L = (6.86Ω)^2 * (10uF) = 470uH, conveniently small. Again, this doesn't account for ESR (and the inductor will have some EPR (equivalent parallel resistance) itself), so we should choose a somewhat higher value to be sure.
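The component arithmetic in that paragraph can be checked with the basic reactance relations. This sketch takes the 10uF and 6.86Ω figures as given and ignores ESR and EPR entirely; the function names are made up.

```python
import math

def reactance_c(f, c):
    """Capacitive reactance Xc = 1 / (2 pi f C)."""
    return 1 / (2 * math.pi * f * c)

def resonant_frequency(l, c):
    """Fo = 1 / (2 pi sqrt(L C))."""
    return 1 / (2 * math.pi * math.sqrt(l * c))

c_bulk = 10e-6
xc = reactance_c(50e3, c_bulk)            # about 0.3 ohm at 50 kHz
ripple_v = 0.2 * xc                       # 200 mA through Xc, ESR ignored
extra_db = 20 * math.log10(ripple_v / 1e-3)

zo = 6.86
l_choke = zo**2 * c_bulk                  # from Zo = sqrt(L / C)
fo = resonant_frequency(l_choke, c_bulk)

print(f"Xc = {xc:.2f} ohm, ripple = {ripple_v * 1e3:.0f} mV, {extra_db:.0f} dB to go")
print(f"L = {l_choke * 1e6:.0f} uH, Fo = {fo:.0f} Hz")
```

The remaining attenuation comes out a little under the ~40dB quoted, which leaves some margin.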
So the typical values of ~10uF and ~1mH each are in the right ballpark.
Perhaps if you go through these relations (I've used nothing more advanced than the reactance of an inductor / capacitor, i.e. XL = 2 pi F L, Xc = 1 / (2 pi F C), and variations thereof, e.g. Zo = sqrt(L/C), Fo = 1 / (2 pi sqrt(L C)) ), you can simplify things down to a point similar to what Shirsavar gives. Or discover which assumptions were made to get there.
Tim