1. Maybe. Only the SLD(8S) passed 5a, and it looks like they're saying this is only true above a certain source resistance and below some pulse duration. Your TPSMC does not show that (appnote, Table 1a).
2. Sure, for what's shown. The TVS will absorb more power though; compare Fig. 13 to 14 -- the 36V part requires higher Rs to survive than the 24V.
If you can ride through the full pulse, or take just a little off the top (shunting to more than half the peak voltage), you can get by with a somewhat smaller diode (or more aggressive Rs, Us, td), or potentially none at all (if the rating is high enough).
MOVs are also an option, if the voltage rating is high. Expect to need a semiconductor rating of triple the MOV nominal rating: a 20V MOV should be used with >60V switches and such. MOVs are cruder clamping devices but much cheaper by energy than TVSs.
The capacitors don't really care. The nominal rating is what it is, but breakdown is typically many times higher. Maybe not safe to depend on that for very long, but then, we're talking very rare transients here.
FYI, you may need to use series capacitors, to allow for the case when one fails shorted due to cracking or breakdown. Often, the two are placed at right angles, so that strain on the PCB is less likely to crack both at once.
3. Well, yeah. Rather overkill for just ESD; it's rated 30kV, and probably good for way more than that, but who even tests ESD any higher?

4. For more ordinary 15kV/8kV IEC or HBM ESD, an SMAJ size or smaller diode is sufficient. You can also get small (SMT chip) MOVs, which I don't think are really competitive in price for low voltage ESD applications, but it's nice to know they're out there.
You might also not use TVSs at all (for signals), but use clamp diodes to dump any ESD into a common (usually supply) rail, which itself has one TVS to handle everything. Mind the trace inductance between signals and whatever's sinking the surge: local bypass is a good idea.
Most of the other automotive transients are comparable or smaller (modest impedance, low energy), so can certainly be handled by the same sort of thing.
5. Current does not flow until the pulse voltage exceeds breakdown of the TVS. Then current flows, i(t) = (u_s(t) - V_br) / Rs. (V_br varies with i(t), but considering it constant is a good enough hand-wave. A simple refinement is to consider the TVS as having some internal resistance as well. They don't give this resistance in the datasheet (only a headline, "Low incremental surge resistance"), but Fig.1 hints at the behavior.)
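To put rough numbers on that, here is a minimal sketch of the formula above, including the optional refinement of an internal TVS resistance. The function names and every value (pulse peak, V_br, Rs, R_tvs) are made-up placeholders for illustration, not taken from any datasheet or test standard.

```python
def tvs_current(u_s, v_br, r_s, r_tvs=0.0):
    """Clamp current (A) for source voltage u_s (V) behind series resistance r_s (ohm).

    v_br is treated as constant; r_tvs is an ASSUMED incremental
    resistance of the TVS (0 reproduces the simple formula).
    """
    if u_s <= v_br:
        return 0.0  # below breakdown: the TVS does not conduct
    return (u_s - v_br) / (r_s + r_tvs)


def tvs_power(u_s, v_br, r_s, r_tvs=0.0):
    """Instantaneous power (W) dissipated in the TVS itself."""
    i = tvs_current(u_s, v_br, r_s, r_tvs)
    return i * (v_br + i * r_tvs)


# Hypothetical numbers: 100 V pulse peak, 30 V breakdown, 2 ohm source.
for r_tvs in (0.0, 0.5):
    i = tvs_current(100.0, 30.0, 2.0, r_tvs)
    p = tvs_power(100.0, 30.0, 2.0, r_tvs)
    print(f"R_tvs = {r_tvs} ohm: i = {i:.1f} A, p = {p:.0f} W")
```

Evaluate it at each point of the pulse waveform to get i(t), and sum p(t) over the pulse for the energy the TVS has to absorb.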
6. Doesn't matter much; the filter can remove some of the surge voltage for you, and it introduces series impedance (the inductor specifically) which helps with the faster pulses. The inductor does need to handle the surge current, and this would be contraindicated for a small device. (For a water pump, it might not matter.)
7. Common mode only matters if there is noise going somewhere. If this module mounts directly on the pump, say, it can be grounded to it, and there's no path for noise to get out -- all it can do is couple to ground, which is, well, ground. If your circuit ground is ground ground, it's a single point.
If your module is inline, installed on a bracket or in a module somewhere, say, it's probably necessary to filter the input, output or both.
"Common mode" means to treat all the input lines as one, and all the output lines as one. Which is true enough at AC: the input +/- are shorted together with a big fat capacitor. There's very little RF voltage between them. They act together. But together, they can act with respect to chassis ground say, or the output (motor winding leads) can act against ground, or the input. In that case, filtering may be desirable.
If you're driving the BLDC with synthesized sine waves (i.e., a full fancy class-D sine wave inverter), you'll need normal-mode output filtering anyway (i.e., each line with respect to circuit ground), which can simply be improved to give less RF vs. GND. You get CM and DM filtering at the same time, in other words.
If you're driving it with "modified sine" or other broad switching waveforms (the typical case), you will need to filter the switching edges, but won't be able to afford much filtering down near the fundamental frequency -- that would take huge caps and chokes.
Let me explain the situation here a bit.
These two situations (PWM vs. wide pulses) have the switching energy above or below Fc (filter cutoff frequency), respectively.
With a normal PWM filter, the filter's cutoff can be below the carrier and harmonics, so that it doesn't need to handle the full inverter power. For example, maybe Fsw is 200kHz, Fc is 40kHz, and the signal bandwidth is, well, not much -- hundreds of Hz is enough for a motor driver; a general audio amplifier would want over 20kHz, though a subwoofer amp could get away with a few kHz.
The spectrum of PWM has the signal reproduced baseband, and around each switching tone. For a 0-20kHz input signal, it appears in the output at 0-20kHz (this is why we can recover the signal with simple filtering!), around Fsw from Fsw-20kHz (lower sideband, flipped) to Fsw+20kHz (upper sideband), around 2Fsw +/- 40kHz, and so on. (The 2nd harmonic only shows up if idle PWM isn't quite 50%; the sidebands always show up though.)
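As a quick way to see where those bands land relative to Fc, here is a small sketch that just tabulates the band edges as described above (+/- n*BW around the n-th switching harmonic). It is bookkeeping only, not a spectrum calculation; `pwm_bands` and the example frequencies are my own illustrative choices.

```python
def pwm_bands(f_sw, bw, harmonics=2):
    """Return (label, f_low, f_high) tuples for the baseband and for the
    sidebands around each switching harmonic, taking the sidebands
    around n*f_sw to extend +/- n*bw as in the description above.
    """
    bands = [("baseband", 0.0, bw)]
    for n in range(1, harmonics + 1):
        bands.append((f"{n}*Fsw sidebands", n * f_sw - n * bw, n * f_sw + n * bw))
    return bands


# Example values: Fsw = 200 kHz, 20 kHz signal bandwidth, Fc = 40 kHz.
F_C = 40e3
for label, lo, hi in pwm_bands(200e3, 20e3):
    side = "below" if hi <= F_C else "above" if lo >= F_C else "straddles"
    print(f"{label}: {lo / 1e3:.0f} to {hi / 1e3:.0f} kHz ({side} Fc)")
```

With these numbers, everything except the baseband sits well above Fc, which is the comfortable PWM case.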
Aside: if we wanted AM radio, we can PWM and get Fsw carrier plus +/- signal BW -- the desired output, an AM signal. By filtering baseband and harmonics out (bandpass around Fsw), we can get AM at high efficiency.
Anyway, if we effectively have Fsw below Fc, then there will be harmonics that get close to Fc, and the energy in those harmonics must be dissipated by the filter -- or it will ring like hell and draw some serious current peaks from the inverter.
There is a solution to this problem. While the inverter has a very low source impedance -- approximately 0 ohms for our purposes -- we can still design a filter that is terminated at just one end. (Filters work by impedance; filtering a system with zero or infinite (open) impedance is meaningless! You must always find an impedance somewhere, and in particular, a resistance, as it is resistance that supplies and dissipates real power.)
We also want a filter that doesn't draw DC current, so we shall terminate it with an R+C network: the C decouples the resistor at DC.
Here's a simple example:

An H-bridge drives a balanced filter network. The supply impedance (2uH + RD clamp) is optional, and depends on the switching loop; don't mind that for now. The inverter drives a 10uH double choke, differential mode (the two windings act in series, and coupled together, they have 40uH total), then C's to GND. This acts more like a differential mode filter due to the coupling between inductors; this would be better with two independent inductors I think, but I hadn't designed it that way at the time (this was from a bunch of years ago).
Note the differential capacitance is 0.11uF (the 0.22's act in series, with GND in the middle). This puts the cutoff frequency at 1 / (2 pi sqrt(40uH 0.11uF)) = 76kHz. (This particular project switched at about 130kHz, safely above Fc.) The resonant impedance is sqrt(40uH / 0.11uF) = 19 ohms; the 51R + 0.22uF across the output helps dampen this, setting a maximum Q around 3, depending on load characteristics. (A smaller resistor would give better damping, but would also hog more signal at high frequencies -- 0.22uF is 8 ohms (the nominal load resistance) at 90kHz. With a 10W capacity, that 1/2W resistor may end up needing to be much larger, in unlucky circumstances.)
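If you want to rerun those numbers with your own parts, a minimal sketch of the arithmetic follows. The helper names are mine, and the 4x series inductance assumes perfect coupling (k = 1) between the two windings, which a real choke won't quite reach.

```python
import math


def series_aiding(l_winding, k=1.0):
    """Total inductance of two equal windings in series-aiding, coupling k (assumed 1)."""
    return 2.0 * l_winding * (1.0 + k)


def lc_cutoff(l, c):
    """Resonant (cutoff) frequency of an LC section, in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l * c))


def lc_impedance(l, c):
    """Characteristic (resonant) impedance of an LC section, in ohms."""
    return math.sqrt(l / c)


def freq_for_reactance(c, x):
    """Frequency (Hz) at which capacitor c has reactance magnitude x ohms."""
    return 1.0 / (2.0 * math.pi * c * x)


l_dm = series_aiding(10e-6)       # 10 uH per winding -> 40 uH
c_dm = (0.22e-6 * 0.22e-6) / (0.22e-6 + 0.22e-6)  # two 0.22 uF in series

print(f"Fc = {lc_cutoff(l_dm, c_dm) / 1e3:.0f} kHz")
print(f"Z0 = {lc_impedance(l_dm, c_dm):.0f} ohms")
print(f"0.22 uF reaches 8 ohms at {freq_for_reactance(0.22e-6, 8.0) / 1e3:.0f} kHz")
```

This reproduces the 76kHz, 19 ohm and 90kHz figures quoted above.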
After that, a common mode choke of 112uH helps take down noise, and this is filtered against 0.2uF (acting in parallel, common mode). The CMC also has some leakage inductance (not labeled -- from what I recall, it was probably around 1uH) which introduces differential filtering against the 0.05uF (acting in series, differential mode), helping filter more RF crud but potentially not being well damped and having a resonance around Fsw or a harmonic (5th harmonic or so, I guess). Another R+C, maybe one each to ground, would be reasonable here.
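A rough check of where those two resonances land, as a sketch only: I'm treating the 112uH as the common mode inductance, and the 1uH leakage figure is a recollection rather than a measurement, so take the second result as a ballpark.

```python
import math


def resonance(l, c):
    """Resonant frequency of an LC pair, in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l * c))


F_SW = 130e3                         # switching frequency of that project

f_cm = resonance(112e-6, 0.2e-6)     # CM choke against 0.2 uF (caps in parallel)
f_dm = resonance(1e-6, 0.05e-6)      # ~1 uH leakage (assumed) against 0.05 uF (caps in series)

print(f"common mode cutoff:   {f_cm / 1e3:.1f} kHz")
print(f"leakage DM resonance: {f_dm / 1e3:.0f} kHz, about {f_dm / F_SW:.1f} x Fsw")
```

The leakage resonance comes out between the 5th and 6th multiple of Fsw, which is why extra damping there would be worth having.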
The filter is quite sensitive to the load; in my case this was an audio amplifier, so some care should be taken to deal with ugly loads, like long speaker cables and piezo tweeters. For your case, the motor should be a well defined load and largely inductive at high frequencies, so the R+C should be adequate damping. Better load isolation can be had by putting R||L networks in series with the output -- this way even if the load is open circuit, the R+Cs still do their job, and if the load is shorted, the L decouples the filter from the load while the R provides damping.
The effect of adding lossy elements (R+Cs in parallel with C; R||L in series with L or source/load) is to smoosh out the frequency response around the cutoff (i.e., in the transition band), and make it less sensitive to source/load variation. Loss is a necessary part of this (filters only filter with respect to resistance!) and so one must be careful that there isn't too much harmonic energy in the transition band; or that the resistors have enough power rating to handle it.
Note that lossy filter components don't make good filters; an R+C is pretty bad at "capacitating" above 1/(2 pi R C), and likewise for the inductor case. Basically, though you're adding reactive components, don't count on them to meet your asymptotic attenuation. (In technical terms, the resistor adds a zero to the transfer function.) If you find you need extra filtering, consider adding another LC stage. Or if you find you need extra attenuation at RF, a small L and C may do (higher Fc). You don't have to stick to 100% named filter prototypes (Butterworth, etc.), the response can be much sloppier here than a perfectly sharp signal filter requires.
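To see how quickly an R+C stops looking like a capacitor, here is a small sketch using the 51R + 0.22uF damper as the example. Function names are mine; it treats the parts as ideal (no ESR or ESL).

```python
import math


def rc_corner(r, c):
    """Frequency (Hz) above which a series R+C looks mostly resistive."""
    return 1.0 / (2.0 * math.pi * r * c)


def rc_series_impedance(r, c, f):
    """Impedance magnitude (ohms) of an ideal series R+C at frequency f (Hz)."""
    x_c = 1.0 / (2.0 * math.pi * f * c)
    return math.hypot(r, x_c)


R, C = 51.0, 0.22e-6
f0 = rc_corner(R, C)
print(f"corner: {f0 / 1e3:.1f} kHz")
for mult in (0.1, 1.0, 10.0, 100.0):
    z = rc_series_impedance(R, C, mult * f0)
    print(f"{mult:>5} x corner: |Z| = {z:.1f} ohms")
```

Below the corner the impedance falls with frequency like a capacitor; above it, it flattens out at R, so the branch adds damping but no further attenuation.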
Tim