Gate or base spreading resistance, and capacitance.
BJTs have a dominant-pole characteristic, with respect to current gain. Current gain drops to 1 at f_T, but realize it's still possible to have power gain above that frequency: you just need to achieve enough voltage gain as well. (Real transistors may not be capable of stable operation at enough voltage gain to be worthwhile, even in grounded base operation, because of Cce and Lb.)
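To put rough numbers on that, here is a minimal sketch assuming an ideal single-pole current gain and a purely resistive gain budget (power gain = current gain x voltage gain). The values beta0 = 100 and f_T = 300MHz are illustrative assumptions, not taken from any datasheet, and the function names are made up for this example.

```python
import math

def beta_mag(f_hz, beta0=100.0, f_t_hz=300e6):
    """|current gain| of an idealized single-pole BJT model.

    Assumed values: beta0 and f_t_hz are illustrative, not datasheet figures.
    The pole sits at f_beta = f_T / beta0, so |beta| falls to about 1 at f_T.
    """
    f_beta = f_t_hz / beta0
    return beta0 / math.sqrt(1.0 + (f_hz / f_beta) ** 2)

def power_gain_db(f_hz, voltage_gain, beta0=100.0, f_t_hz=300e6):
    """Power gain in dB, taken as |current gain| * |voltage gain|.

    This ignores phase and reactive terms; it is only a magnitude estimate.
    """
    return 10.0 * math.log10(beta_mag(f_hz, beta0, f_t_hz) * voltage_gain)

if __name__ == "__main__":
    for f in (300e6, 600e6):
        print(f"{f / 1e6:.0f} MHz: |beta| = {beta_mag(f):.2f}, "
              f"power gain at Av=4: {power_gain_db(f, 4.0):.1f} dB")
```

At twice f_T the current gain is about 0.5 in this model, so any voltage gain above 2 still leaves net power gain, provided the stage stays stable.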
The fastest Si BJTs are made like any small-signal RF type, just "more of it". At least a few RF transistors throughout history have been, basically, a gazillion 2N3904s on a single die, wire-bonded in parallel.
MOSFETs have more of a diffusion characteristic, because the gate spreading resistance R_G is distributed along many cells. Power MOSFETs (most kinds) are optimized for low Rds(on) and switching loss, at the expense of R_G and capacitances. In particular, capacitance at low Vds is on the "extreme" side. High voltage SuperJunction types have ~half their drain charge below 20V! This is good for switching, because more time is spent transitioning through low voltages, keeping losses low.
SuperJunction transistors have conspicuously small feedback capacitance (Crss) at useful drain voltages (>20V), so they might actually be useful. You don't have nearly the power dissipation capability for a proper RF application, though (example: a 600V, 5A transistor -- that's a 3kW switching SOA -- might dissipate 50W on a cool day).
The 2N7000 is a classic "small signal" transistor, but it has a HUGE junction. It's good for almost an ampere, yet has only a fractional-watt rating, so you have to keep either voltage or current very low. Low impedance means more bandwidth, so we choose low voltage and high current.
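The "low impedance means more bandwidth" point is just the RC corner frequency, f = 1/(2*pi*R*C). A quick sketch, assuming a fixed 50pF of capacitance at the node (a hypothetical round number, not a 2N7000 datasheet value):

```python
import math

def rc_corner_hz(r_ohms, c_farads):
    """-3dB corner frequency of a single RC low-pass: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)

# Assumed, illustrative node capacitance (NOT a datasheet figure).
C_NODE = 50e-12

f_50 = rc_corner_hz(50.0, C_NODE)     # 50 ohm load
f_12_5 = rc_corner_hz(12.5, C_NODE)   # 12.5 ohm load

if __name__ == "__main__":
    print(f"50 ohm:   {f_50 / 1e6:.0f} MHz")
    print(f"12.5 ohm: {f_12_5 / 1e6:.0f} MHz")
```

Whatever the real capacitance is, dropping the impedance by 4x moves the corner up by 4x, as long as the capacitance itself doesn't change much with the operating point.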
Typical example:
(2N7002 is the SMT version.) This is a cascode with 50 ohm input (and a small attempt at matching) and 12.5 ohm load (the twisted orange wires go to a transmission line transformer, matching the output into 50 ohms). Operating at 200mA and 3V per transistor, the -3dB bandwidth is something like 50MHz. The midband gain is good (I forget what, 18dB or so?), and it rolls off slowly: there's still modest gain at quite high frequencies (something like 6dB at 200MHz).
Such a gradual slope is indicative of a diffusion effect, or at most, a single pole. (But you'd expect a two-pole cutoff (-12dB/oct), from the cutoff of source resistance with gate capacitance, and the load resistance with drain capacitance.)
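As a sanity check on the slope, here is a small sketch comparing ideal one-pole and two-pole responses against the rough figures quoted above (about 18dB midband, about 50MHz corner, about 6dB at 200MHz). Those figures are from memory, and placing both poles of the two-pole case at 50MHz is a simplifying assumption.

```python
import math

def gain_db(f_hz, midband_db, pole_freqs_hz):
    """Gain in dB of a midband gain followed by ideal real poles.

    Each pole contributes a loss of 10*log10(1 + (f/fp)^2) dB.
    """
    loss_db = sum(10.0 * math.log10(1.0 + (f_hz / fp) ** 2)
                  for fp in pole_freqs_hz)
    return midband_db - loss_db

MIDBAND_DB = 18.0   # approximate, as recalled above
CORNER_HZ = 50e6    # approximate -3dB bandwidth

one_pole = gain_db(200e6, MIDBAND_DB, [CORNER_HZ])
# Assumption: both poles at the same 50MHz corner, for simplicity.
two_pole = gain_db(200e6, MIDBAND_DB, [CORNER_HZ, CORNER_HZ])

if __name__ == "__main__":
    print(f"one pole at 200 MHz:  {one_pole:.1f} dB")
    print(f"two poles at 200 MHz: {two_pole:.1f} dB")
```

The one-pole model lands within a fraction of a dB of the 6dB recalled at 200MHz, while the two-pole model would already be below unity gain there, which is consistent with the response behaving like a single pole (or something gentler) over this range.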
RF transistors are made with the smallest junction possible, and heavy enough gate metallization to keep going up to much higher frequencies. Rds(on) is quite high, so they're pretty bad for switching applications, but in a linear amplifier, you don't care about that.
Typical silicon RF MOSFETs run out of steam around 1GHz, with the fastest pushing around 10GHz, and the slowest being suitable for SW and VHF applications (the advantage being, the slow ones use conventional packages like TO-220, and are available in high voltage ratings). Still higher frequencies are handled with other semiconductors (SiGe:C, GaAs and GaN being the most important).
Tim