2.5A isn't nearly as much as it sounds -- it's obviously not 18V * 2.5A = 45W of continuous output power, or even 1/4 of that (which would be realistic for a circuit at those full ratings into an AC load). It is that much capacity at peak, but that doesn't get you as far as you might imagine, considering the job has to be done in a hundred nanoseconds or less.
As mentioned, the gate has a capacitive characteristic, so it takes peak current to make its voltage move. It also has some internal resistance, which is due to the gate being constructed from a thin layer of aluminum or polysilicon, as well as other internal losses.
The important datasheet figure is Qg(tot), the total gate charge required to go from off to on (the voltages are specified; since E = Q*V, it takes more energy to reach a higher voltage, so the voltage is also important information for this parameter).
The internal losses of both gate and driver (which has an equivalent Rds(on) or output resistance specified in the datasheet) add in series, and this limits how fast the gate can be swung.
Typically, during turn-on or turn-off, a transistor will dissipate approximately Iload * Vsupply for the duration of switching. If it's a power transistor on the mains side of an SMPS, that might be 20A and 320V (6.4kW), so the peak power is extreme! To achieve useful efficiency, the duration of this power dissipation must be kept to a very small fraction of the cycle, never more than 1/20th and typically under 1/100th. For a typical 100kHz switching frequency (Tcycle = 10us), that means under 100ns.
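To put numbers on that budget, here is a minimal Python sketch of the arithmetic above. The function names are made up for illustration; the values are the ones quoted in this post, and it is only a back-of-envelope estimate.

```python
def peak_switching_power(i_load, v_supply):
    """Instantaneous dissipation while the device carries full current at full voltage (W)."""
    return i_load * v_supply


def switching_time_budget(f_sw, max_fraction):
    """Longest allowable switching time (s) for a given fraction of the cycle."""
    t_cycle = 1.0 / f_sw
    return t_cycle * max_fraction


if __name__ == "__main__":
    # Mains-side SMPS example from the text: 20A, 320V, 100kHz.
    p_peak = peak_switching_power(20.0, 320.0)
    print(f"peak power during switching: {p_peak:.0f} W")

    for denominator in (20, 100):
        t_max = switching_time_budget(100e3, 1.0 / denominator)
        print(f"budget at 1/{denominator} of the cycle: {t_max * 1e9:.0f} ns")
```

The two budgets correspond to the "never more than 1/20th" and "typically under 1/100th" limits.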
Combining all these facts, we can estimate the switching time as:
t_r ~= 2 * (Rdriver + Rgate) * Qg(tot) / (Vgs(on) - Vgs(off))
A typical example might have Rdriver = 4.5 ohms (typical of parts like the HIP4081*), Rgate = 5 ohms, Qg = 100nC, and Vgs(on) - Vgs(off) = 10V (usually 10 and 0V, respectively). This gives a total resistance of 9.5 ohms, an equivalent capacitance Qg/Vgs = 10nF, a time constant of 95ns, and a full switching time of about 190ns.
(*Note, by the way, that 2.5A peak is only possible when the output is shorted to the opposite supply, which is true only for the briefest of instants during switching -- much of the time is spent at about half that voltage, so the average current during switching must be lower, which is why it's often valuable to use even *larger* drivers!)
And that's without using a series gate resistor, which is often desirable to avoid parasitic oscillations. (The AC small-signal equivalent of the gate terminal actually looks like a small negative resistance, which gives rise to oscillations in the 20-100MHz range if the gate circuit has too much inductance and too little resistance.)
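The switching-time estimate can be sketched in a few lines of Python. This is just the rule-of-thumb formula above, not a model of the real gate waveform. The function name and the optional r_series argument are my own additions for illustration, and the 10 ohm series resistor in the second call is a hypothetical value, not one taken from any datasheet.

```python
def gate_switching_time(r_driver, r_gate, q_g, v_swing, r_series=0.0):
    """Rule-of-thumb switching time: t ~= 2 * R_total * Qg / Vswing.

    r_series is an optional external gate resistor (assumed 0 by default).
    Returns (equivalent capacitance in F, time constant in s, switching time in s).
    """
    c_eq = q_g / v_swing                      # equivalent gate capacitance
    r_total = r_driver + r_gate + r_series    # resistances add in series
    tau = r_total * c_eq                      # RC time constant
    return c_eq, tau, 2.0 * tau


if __name__ == "__main__":
    # Example values from the text: 4.5 ohm driver, 5 ohm gate, 100nC, 10V swing.
    c_eq, tau, t_sw = gate_switching_time(4.5, 5.0, 100e-9, 10.0)
    print(f"C_eq = {c_eq * 1e9:.0f} nF, tau = {tau * 1e9:.0f} ns, t_sw = {t_sw * 1e9:.0f} ns")

    # Same gate with a hypothetical 10 ohm series resistor added.
    _, _, t_slow = gate_switching_time(4.5, 5.0, 100e-9, 10.0, r_series=10.0)
    print(f"with 10 ohm series resistor: t_sw = {t_slow * 1e9:.0f} ns")
```

The first call reproduces the 10nF, 95ns and 190ns figures; the second shows how quickly an added series resistor eats into the time budget.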
At 100kHz, 190ns is 1.9% of a cycle, so if we're dissipating 1000W during switching (typical of a 100V, 10A application), the average switching loss is 1000W * 0.019 = 19W -- better have a heatsink! And that's not all: this adds to the conduction losses, so the total will be higher.
Conduction losses are due to Rds(on), which might be 75mohm for a device in this class. At 10A, P = I^2 * R = 7.5W, but this is only dissipated during half the waveform, so the dissipation per transistor is half this, or 3.75W.
So you can see, the switching losses can be quite substantial, and there is very good reason to keep switching fast!
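For anyone who wants to plug in their own numbers, here is a rough Python sketch of the loss budget above. It follows the same simplified accounting as this post (full V * I for one switching interval per cycle, conduction for half the waveform). The function names and the duty parameter are assumptions of mine.

```python
def switching_loss(v_supply, i_load, t_sw, f_sw):
    """Average switching loss (W): full V*I dissipated for t_sw once per cycle."""
    return v_supply * i_load * t_sw * f_sw


def conduction_loss(i_load, r_ds_on, duty=0.5):
    """Average conduction loss (W): I^2 * Rds(on), scaled by the conducting fraction."""
    return i_load ** 2 * r_ds_on * duty


if __name__ == "__main__":
    # Example from the text: 100V, 10A, 190ns switching time, 100kHz, 75mohm.
    p_sw = switching_loss(100.0, 10.0, 190e-9, 100e3)
    p_cond = conduction_loss(10.0, 0.075)
    print(f"switching loss:  {p_sw:.2f} W")
    print(f"conduction loss: {p_cond:.2f} W")
    print(f"total per transistor: {p_sw + p_cond:.2f} W")
```

With these numbers the switching term is about five times the conduction term, which is the whole point of pushing for a stronger driver.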
But speed comes at a price, and you must deal with the consequences of that risetime. All inductances and capacitances in the circuit get involved, whether you thought they existed or not! That includes stray wiring inductance as well as component parasitics. A simplified (first order) analysis of this can be made as well, but it is well beyond the scope here.
Tim