Differentiating characteristics of transistors:
BJTs -- the base can be made thin and wide, which causes high resistance (giving a long time constant between applying base voltage and charging junction capacitances) and high hFE. Usually these have a low noise figure too, e.g. 2N5088, BC847C.
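To put rough numbers on that time constant, here is a minimal sketch that treats the base spreading resistance and the junction capacitance as a single RC pole. The function name `base_rc` and the component values are assumptions for illustration, not taken from any datasheet, and a real transistor has more than one pole.

```python
import math

def base_rc(r_bb_ohms, c_junction_farads):
    """Return (time constant in s, -3 dB corner in Hz) for a single RC pole."""
    tau = r_bb_ohms * c_junction_farads
    f_corner = 1.0 / (2.0 * math.pi * tau)
    return tau, f_corner

# Hypothetical values: 500 ohm base resistance, 10 pF total junction capacitance
tau, f_corner = base_rc(r_bb_ohms=500.0, c_junction_farads=10e-12)
print(f"tau = {tau * 1e9:.1f} ns, corner = {f_corner / 1e6:.1f} MHz")
```

With these made-up values the corner lands around 32 MHz, which is the sort of limit that rules a high-hFE, high-resistance part out of RF service.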
Junction breakdown voltage is inversely proportional to doping density (for a given doping profile), while hFE is (sort of) proportional to it. So high voltage transistors have lower hFE, and vice versa. hFE is higher when emitter doping is stronger than collector, which is why Vebo is almost always small, say 7V or less.
To get both high gain and high speed, higher doping and thinner base regions are needed. Less base width (lateral extent) is also a plus. Whereas a BC847 might be made with a single set of diffusions (so the base is connected to a ring of metal, where that diffusion touches the surface, and the emitter is diffused into the middle of that), RF transistors are made with many "fingers", increasing the perimeter and therefore reducing the resistance to the center of any area of the base. As a result of the heavier doping, RF transistors typically also have less hFE, Vebo and Vcbo than general purpose types.
For FETs, it's much the same: less spreading resistance and more optimal capacitance. The comparison here is with switching transistors, which are optimized for lower losses at switching frequencies, with less regard for input capacitance / matching. Since linear operation is common for RF parts, the pressure is towards larger die areas (more dissipation, freedom from 2nd breakdown) and smaller channel widths. That leads to higher Rds(on) (hardly matters -- linear range) and lower capacitances. The design is also optimized for lower feedback capacitance (which a lot of modern switching designs finally achieve too, but this has been a constant pressure for RF parts).
As for transistors intended for specific bands, it's usually a combination of things: the power dissipation is suitable for a typical application (low power portable up to commercial/industrial power amps), the fT is some multiple of the intended band (usually 5-20x, so that the power gain is reasonable), and the package is, well... more or less suitable for the range. This is tricky, because there are a number of transistors in conventional (i.e., terrible) packages, like TO-220, claimed for service at 30MHz and up. As far as I know, they can't even be tuned for unconditional stability, and the s-parameters often show negative values (i.e., the part generates power out the input port -- negative resistance!)...
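Two quick checks go with this, sketched below under simplifying assumptions. The first uses a single-pole model of current gain to show why fT wants to be several times the operating band. The second applies the standard Rollett test (K > 1 and |delta| < 1) to a set of two-port s-parameters. The function names and every numeric value are hypothetical, picked only to illustrate; real parts need measured s-parameters at each frequency of interest.

```python
import math

def current_gain_mag(f_hz, ft_hz, hfe0):
    """|h_fe| for a single-pole model: hfe0 at DC, falling to about 1 at fT."""
    f_beta = ft_hz / hfe0  # corner where the gain starts to roll off
    return hfe0 / math.sqrt(1.0 + (f_hz / f_beta) ** 2)

def rollett_stability(s11, s12, s21, s22):
    """Return (K, |delta|) for a two-port. Assumes s12 * s21 is nonzero."""
    delta = s11 * s22 - s12 * s21
    k = (1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (2 * abs(s12 * s21))
    return k, abs(delta)

def is_unconditionally_stable(s11, s12, s21, s22):
    """True when K > 1 and |delta| < 1 (the usual Rollett conditions)."""
    k, delta_mag = rollett_stability(s11, s12, s21, s22)
    return k > 1 and delta_mag < 1

# Hypothetical part: fT = 300 MHz, hFE = 100, used at 30 MHz (fT is 10x the band)
print("current gain at 30 MHz:", round(current_gain_mag(30e6, 300e6, 100), 2))

# Hypothetical s-parameters (complex values work the same way as these reals)
well_behaved = (0.3 + 0j, 0.01 + 0j, 3.0 + 0j, 0.4 + 0j)
# |S11| > 1: more power reflected from the input than was sent in
power_out_of_input = (1.1 + 0j, 0.1 + 0j, 3.0 + 0j, 0.5 + 0j)

print("well behaved:", is_unconditionally_stable(*well_behaved))
print("|S11| > 1:   ", is_unconditionally_stable(*power_out_of_input))
```

In this model the current gain is roughly fT/f once you are well above the corner, so a 10x margin leaves a gain of about 10. A part with |S11| > 1 fails the stability test no matter what passive network you hang on it.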
The good ones have suitable packages, typically with low inductance flat leads. Transistors made for UHF and up are packaged exclusively this way, usually a wide flat pack with input (gate?) on one side and output (drain?) on the other.
BTW, there are also "pre-matched" transistors. These incorporate matching/tuning circuitry for a particular load (50 ohms?) at the frequency range of interest, and aren't of much use outside that range.
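To see why a matched part is tied to its band, here is a rough sketch of the simplest possible match: a low-pass L network (series inductor, shunt capacitor) stepping a low device impedance up to 50 ohms. This is only an illustration. The 2 ohm device impedance, the 150 MHz frequency and the function name `l_match` are assumptions, and real pre-matched parts may well use different or multi-section networks.

```python
import math

def l_match(r_low, r_high, f_hz):
    """Low-pass L network: series L on the low-R side, shunt C across the high-R side.

    Returns (loaded Q, inductance in H, capacitance in F).
    """
    if not 0 < r_low < r_high:
        raise ValueError("need 0 < r_low < r_high")
    q = math.sqrt(r_high / r_low - 1.0)
    x_series = q * r_low    # inductive reactance in series with r_low
    x_shunt = r_high / q    # capacitive reactance across r_high
    omega = 2.0 * math.pi * f_hz
    inductance = x_series / omega
    capacitance = 1.0 / (omega * x_shunt)
    return q, inductance, capacitance

# Hypothetical: 2 ohm device impedance matched to 50 ohms at 150 MHz
q, l_h, c_f = l_match(r_low=2.0, r_high=50.0, f_hz=150e6)
print(f"Q = {q:.2f}, L = {l_h * 1e9:.1f} nH, C = {c_f * 1e12:.0f} pF")
```

The L and C values are only right at the design frequency, and the larger the impedance ratio, the higher the Q and the narrower the range over which the match holds.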
Tim