A crude approximation for P_cond comes from the RMS current through the switches (I_rms^2 * Rds(on), for MOSFETs), or the average current times the diode drop (for diodes), or some combination of both (i.e., the internal resistance + Vf model for diodes, also applicable to IGBTs).
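As a rough sketch of that resistance + Vf model (the function name and all the numbers below are made up for illustration, not taken from any datasheet):

```python
def conduction_loss(i_rms, i_avg, r_on=0.0, v_f=0.0):
    """Conduction loss for a resistance + fixed-drop device model, in watts.

    i_rms: RMS device current (A)
    i_avg: average device current (A)
    r_on:  on resistance, e.g. Rds(on) or a diode's slope resistance (ohm)
    v_f:   fixed forward drop (V); zero for a plain MOSFET
    """
    return i_rms**2 * r_on + v_f * i_avg

# Hypothetical waveform: rectangular 10 A pulse at 50% duty.
duty = 0.5
i_pk = 10.0
i_avg = i_pk * duty          # 5 A
i_rms = i_pk * duty**0.5     # about 7.07 A

p_mosfet = conduction_loss(i_rms, i_avg, r_on=0.1)            # 50 * 0.1 = 5 W
p_diode = conduction_loss(i_rms, i_avg, r_on=0.02, v_f=0.8)   # 1 W + 4 W = 5 W
print(f"MOSFET: {p_mosfet:.2f} W, diode: {p_diode:.2f} W")
```

Note the MOSFET term goes as current squared while the Vf term goes as current, so the comparison between the two shifts with load.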
Speaking of analyticity, linear or quadratic models like this are simple enough to write, and may have straightforward solutions, enough that you can draw some conclusions on paper. As soon as you involve an exponential (ideal diode equation), however, the problem becomes transcendental: hopeless on paper, and you need an iterated approximation, or the Lambert W function, to solve it! (Such iterations are SPICE's specialty, however...)
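To make the "iterated approximation" concrete, here is a minimal sketch using Newton's method on a series resistor + ideal diode. The circuit, the saturation current and the thermal voltage are all assumed values, and this is only one of several ways to iterate; it is not how any particular SPICE does it.

```python
import math

def diode_operating_point(v_s, r, i_s=1e-12, n_vt=0.02585, tol=1e-12, max_iter=50):
    """Solve v_s = i*r + n_vt*ln(1 + i/i_s) for the loop current i.

    v_s: source voltage (V), r: series resistance (ohm)
    i_s: assumed diode saturation current (A)
    n_vt: assumed ideality factor times thermal voltage (V)
    """
    i = v_s / r  # starting guess: current with the diode shorted
    for _ in range(max_iter):
        g = i * r + n_vt * math.log1p(i / i_s) - v_s   # KVL residual
        dg = r + n_vt / (i_s + i)                      # derivative of residual w.r.t. i
        step = g / dg
        i -= step
        if abs(step) < tol:
            break
    v_d = n_vt * math.log1p(i / i_s)
    return i, v_d

i_d, v_d = diode_operating_point(v_s=5.0, r=100.0)
print(f"I = {i_d * 1e3:.2f} mA, Vd = {v_d:.3f} V")
```

Solving for current rather than diode voltage keeps the exponential out of the loop (only a logarithm appears), which makes the iteration much less likely to blow up from a poor starting guess.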
Likewise, P_sw can be approximated as the triangular region between rising (falling) voltage and falling (rising) current. The switch dissipates P_cond while on, effectively nothing while off (typical leakage currents are ~uA), and P_sw when commutating between these states. If we treat the edges as straight lines, either coincident (resistive switching) or one after the other (inductive switching, more typical at least for turn-off), then power is the product of those lines, either a triangle (one after the other) or a parabolic section (coincident). In either case (give or take a splash of calculus), we can determine the area under the curve, and thus the switching energy. Finally, multiplying energy by frequency, we get the average power P_sw.
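The "splash of calculus" works out to V*I*t/2 per triangle and V*I*t/6 for the parabolic section. A small sketch, with hypothetical function names and operating point:

```python
def e_sw_sequential(v, i, t_v, t_i):
    """Inductive-style edges, one after the other: two triangles.

    Voltage ramps over t_v at full current, then current ramps over t_i
    at full voltage, so each phase averages V*I/2.
    """
    return 0.5 * v * i * (t_v + t_i)

def e_sw_coincident(v, i, t):
    """Resistive-style edges, both ramping over the same interval t.

    p = V*I*x*(1 - x) with x running 0..1, which integrates to V*I*t/6.
    """
    return v * i * t / 6.0

def p_sw(e_on, e_off, f_sw):
    """Average switching power: energy per cycle times frequency."""
    return (e_on + e_off) * f_sw

# Assumed example: 400 V, 10 A, 20 ns per ramp, 100 kHz.
e_seq = e_sw_sequential(400.0, 10.0, 20e-9, 20e-9)   # 80 uJ
e_coin = e_sw_coincident(400.0, 10.0, 40e-9)         # about 26.7 uJ
p_avg = p_sw(e_seq, e_seq, 100e3)                    # same energy assumed at both edges: 16 W
print(f"E_seq = {e_seq * 1e6:.1f} uJ, E_coin = {e_coin * 1e6:.1f} uJ, P_sw = {p_avg:.1f} W")
```

For the same total transition time, the sequential case costs three times the energy of the coincident case.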
This is actually a bad approximation in most cases! Modern transistors have an extremely pear-shaped capacitance curve. What makes the rise linear in the first place is the Miller effect: as gate voltage rises (falls), drain voltage falls (rises), causing a change in charge across the D-G capacitance. Because the transistor's gain is quite high during switching, the change in gate voltage is small; in fact it plateaus while drain voltage is changing most rapidly.
Well, during the Miller plateau, if Cdg is constant, and Ig is constant, then the drain voltage will be a simple linear ramp. But Cdg actually depends severely on Vds: it's over 100 times higher at low voltages (under 5V, say) than at high voltages (over 200V, say -- typical for a ~600V part). This means most of the gate plateau is spent at low drain voltages, where power dissipation is small.
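To put rough numbers on this, the sketch below integrates dt = Cdg(Vds) * dVds / Ig along a turn-off ramp. The capacitance curve is a made-up smooth function (about 1000pF at 0V falling to a few pF at 400V), not any real part's data, and the gate current, bus voltage and load current are likewise assumed.

```python
def c_dg(v, c_min=5e-12, c_max=1000e-12, v0=10.0):
    """Hypothetical C_dg(V_ds) curve in farads. Not from any datasheet."""
    return c_min + (c_max - c_min) / (1.0 + (v / v0) ** 2)

def plateau_profile(v_bus, i_g, i_load, n=4000):
    """Integrate dt = C_dg(v) * dv / i_g along a 0..v_bus drain ramp.

    Returns a list of (v, t) points and the energy i_load * integral(v dt),
    i.e. the loss with full load current flowing during the voltage ramp.
    """
    dv = v_bus / n
    t = 0.0
    energy = 0.0
    points = [(0.0, 0.0)]
    for k in range(n):
        v_mid = (k + 0.5) * dv            # midpoint rule
        dt = c_dg(v_mid) * dv / i_g
        t += dt
        energy += v_mid * i_load * dt
        points.append(((k + 1) * dv, t))
    return points, energy

V_BUS, I_G, I_LOAD = 400.0, 0.5, 10.0     # assumed operating point
points, e_actual = plateau_profile(V_BUS, I_G, I_LOAD)
t_total = points[-1][1]
t_low = next(t for v, t in points if v >= 20.0)   # time taken to reach 20 V
frac_time_low = t_low / t_total
e_linear = 0.5 * V_BUS * I_LOAD * t_total         # straight-line (triangle) estimate

print(f"plateau time: {t_total * 1e9:.1f} ns, fraction below 20 V: {frac_time_low:.0%}")
print(f"energy vs. linear-ramp estimate: {e_actual / e_linear:.0%}")
```

With this particular curve, well over half the plateau is spent on the first 5% of the voltage swing, and the ramp energy comes out to a small fraction of the straight-line figure. The exact percentages depend entirely on the assumed curve.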
This also means we don't gain much, in terms of EMI response, by introducing extra gate resistance. The drain risetime is dominated by the low-capacitance region (where it spends, say, 20% of its time, but 80% of its voltage swing). All we've done is increase the delay from gate-falling to drain-rising! To effectively limit drain dV/dt, we need to add a snubber, or add external (and linear) capacitance across D-G to extend the Miller slope. The former is preferable, since we're only adding more power dissipation in the latter case.
(However, if you are stacking devices in series, the latter is unavoidable -- switching times will inevitably mismatch, even if the gates are driven from a common transformer, for example. Ensuring all transistors are rising at the same rate, not just the same time, limits how much worst-case voltage any one transistor will see.)
Back to power, the consequence is that we will grossly overestimate how much P_sw these devices incur. But it's tricky, because it depends on which phase of switching you're looking at. In a hard-switching bridge circuit (like a class D amplifier -- in general, load current won't always be positive; sometimes it goes the other way because of a reactive load), you can have the unfortunate case where one transistor is off but resting at ~0V (i.e., when it turned off, the voltage didn't simply swing up and away), and the other transistor turns on hard, yanking the switching node up and effectively shorting the supply into the off transistor. This goes slowly at first, because the off transistor has that huge capacitance at low Vds; then once it's charged past ~20V, capacitance drops off and fwoom, the switch node flings up at great velocity and usually overshoots and rings. In the process, switch current spikes sharply, loading the switching loop (the stray inductances between the high side and low side transistors, and the nearest supply bypass cap) with that peak current, which drives the capacitance to overshoot and ring. This ringing is often in the 30-100MHz range.
Incidentally, a lot of references, to this day, suggest minimizing switching loop area / stray inductance, as if it were somehow an easily attainable goal. This was fine, back in the days of bipolar transistors and slow MOSFETs -- but it is inapplicable today. You can't get the inductance low enough to reach the required level*. What the advice should say is: optimize. If we can't get it effectively to zero, then what value should we pick? A hint: consider the both-switches-open impedance, which is 2*Coss (the D-S capacitance, at whatever voltage we're measuring -- probably take slightly less than the worst case condition). Then consider the both-switches-shorted impedance, which is the loop inductance. The ratio of these has units of squared impedance, interestingly enough. So we might define Zsw = sqrt(Lloop / Coss), the switching impedance. Compare this impedance to the load impedance, Vsupply / Isw(pk).
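A quick sketch of that comparison. The loop inductance, Coss, supply and peak current below are assumed round numbers, and the ring frequency line is just the usual LC resonance formula applied to the same two values.

```python
import math

def switching_impedance(l_loop, c_oss):
    """Zsw = sqrt(Lloop / Coss), in ohms."""
    return math.sqrt(l_loop / c_oss)

def ring_frequency(l_loop, c_oss):
    """Resonant frequency of the loop L against Coss, in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_loop * c_oss))

# Assumed values: 40 nH loop, 100 pF Coss at the operating voltage,
# 400 V supply, 10 A peak switch current.
L_LOOP, C_OSS = 40e-9, 100e-12
V_SUPPLY, I_SW_PK = 400.0, 10.0

z_sw = switching_impedance(L_LOOP, C_OSS)    # 20 ohm
z_load = V_SUPPLY / I_SW_PK                  # 40 ohm
f_ring = ring_frequency(L_LOOP, C_OSS)       # roughly 80 MHz

print(f"Zsw = {z_sw:.1f} ohm, Zload = {z_load:.1f} ohm, ratio = {z_sw / z_load:.2f}")
print(f"ring frequency = {f_ring / 1e6:.0f} MHz")
```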

*Some examples for illustration: a typical TO-220 device has 10nH lead + bondwire inductance. Put two together in a half bridge, plus another 10nH from each to the nearest bypass cap, and you have a total 40nH switching loop inductance. Modern transistors will switch in 10s of ns even without driving them hard, which implies, for a 10A load say, 40nH * 10A/20ns = 20V induced in that loop. (That's the inductor equation, V = L * dI/dt.) Expect a similar amount of overshoot!
Or a pair of PDFN-8s in a DC-DC converter, say 24V input, 5V 20A output. The switching loop has to deliver those 20A within the switching edge, and the loop inductance can't be less than about 4nH for the best possible layout (transistors on opposite sides of a 4-layer board, vias in between, bypass caps adjacent). We might have 4nH * 20A / 10ns = 8V here, a full third of the supply -- this induced voltage acts to momentarily reduce the supply voltage, so the switching will actually be that 1/3 slower as a result (i.e., ~4/3 times what we expected). And this feeds into the P_sw calculation.
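Both footnote examples are the same one-line calculation, V = L * dI/dt, sketched here with the numbers from the text (the function name is just for illustration):

```python
def loop_voltage(l_loop, delta_i, t_sw):
    """Voltage induced across the switching loop, V = L * dI/dt."""
    return l_loop * delta_i / t_sw

v_to220 = loop_voltage(40e-9, 10.0, 20e-9)   # TO-220 half bridge: 20 V
v_pdfn = loop_voltage(4e-9, 20.0, 10e-9)     # PDFN-8 DC-DC converter: 8 V
frac_of_supply = v_pdfn / 24.0               # one third of the 24 V input

print(f"TO-220 loop: {v_to220:.1f} V")
print(f"PDFN-8 loop: {v_pdfn:.1f} V ({frac_of_supply:.0%} of supply)")
```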
Tim