Ah yes, back to OP's situation -- BJTs saturate to < 0.2V very easily (in saturation the transistor looks roughly resistive, much like a FET's Rds(on), so Vce(sat) depends on Ic), so getting a rail-to-rail output isn't any harder than with FETs.
With discrete FETs, you have the additional problem that, if you wire them up as a standard CMOS inverter (gates tied together, drains tied together, sources to their respective VDD/VSS), you get massive shoot-through current whenever the input passes through the region where both transistors conduct (100s of mA, worse than an NE555!). We're talking MOSFETs with on-resistances of several ohms here! They aren't small!
So either way, you can get comparable switching speed and saturation (and shoot-through, if you're careless!).
Perhaps the biggest advantage of a BJT is how easy it is to make a current-limited output. Drive the base(s) from a voltage divider, and add an emitter resistor; the output current then limits at roughly (Vbase - Vbe) / Re. (Bonus points: add a diode in series with the supply-rail side divider resistor to compensate for Vbe, so the current is more stable with temperature.) (You can do this with MOSFETs too, but the dropout voltage is much higher and the current less well defined, because Vgs is higher and more variable, and Gm is lower.)
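To put rough numbers on the current limit, here is a back-of-envelope sketch in Python. Everything specific in it is my assumption, not something from OP's circuit: the 5V supply, the divider and emitter resistor values, a Vbe of 0.65V at 25C with the usual rule-of-thumb tempco of about -2mV/C, a divider stiff enough that base current can be ignored, and a compensation diode that tracks Vbe perfectly and sits in series with the divider resistor on the emitter's rail. The function names are made up for illustration.

```python
def vbe_at(temp_c, vbe_25c=0.65, tempco=-2e-3):
    """Rule-of-thumb Vbe versus temperature (assumed: 0.65 V at 25 C, -2 mV/C)."""
    return vbe_25c + tempco * (temp_c - 25.0)


def base_voltage(v_supply, r_far, r_near, v_diode=0.0):
    """Unloaded divider voltage at the base, measured from the emitter's rail.

    r_near goes to the emitter's rail (optionally through a diode dropping
    v_diode); r_far goes to the opposite rail. Base current is ignored.
    """
    return v_diode + (v_supply - v_diode) * r_near / (r_far + r_near)


def limit_current(v_base, v_be, r_emitter):
    """Approximate current limit: whatever is left after Vbe, across Re."""
    return max(v_base - v_be, 0.0) / r_emitter


def relative_drift(compensated, t_cold=25.0, t_hot=85.0, v_supply=5.0,
                   r_far=10e3, r_near=3.3e3, r_emitter=10.0):
    """Fractional change in the current limit between t_cold and t_hot."""
    currents = []
    for temp in (t_cold, t_hot):
        v_be = vbe_at(temp)
        # Assumption: the compensation diode drop equals Vbe at every temperature.
        v_diode = v_be if compensated else 0.0
        v_base = base_voltage(v_supply, r_far, r_near, v_diode)
        currents.append(limit_current(v_base, v_be, r_emitter))
    return (currents[1] - currents[0]) / currents[0]


if __name__ == "__main__":
    print("drift without diode:", relative_drift(False))
    print("drift with diode:   ", relative_drift(True))
```

With the diode in place the Vbe term cancels to first order, so the limit is set mostly by the divider ratio and Re, which is why the drift shrinks. A real diode won't match Vbe exactly, so treat the compensated figure as optimistic.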
For slew rate limiting, you want to add an external "Miller" capacitor (i.e., collector to base, or drain to gate for a FET) to each transistor that's pushing around the output voltage -- for a complementary output drive, you need one each for P and N. Then supply a modest (base/gate) drive current, so that the drive current has to charge/discharge the capacitor; this limits the slew rate to roughly I/C. Additionally, you may find an LC filter (ferrite bead and 100pF+) on the output helps smooth things out even more, especially for pins that connect to the outside world (also, add clamp diodes or a zener/TVS) or to long cable runs (to minimize EMI).
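A quick sketch of the slew-rate arithmetic, in Python. It assumes the external Miller capacitor dominates over the transistor's own feedback capacitance and that the stage has enough gain for the drive current to flow almost entirely into the capacitor; the function names and the 100uA / 100pF / 5V example values are mine, purely for illustration.

```python
def slew_rate(i_drive, c_miller):
    """Output slew rate in V/s: dV/dt = I / C."""
    return i_drive / c_miller


def miller_cap_for(i_drive, target_slew):
    """Capacitance in farads that gives target_slew (V/s) at a given drive current."""
    return i_drive / target_slew


def transition_time(v_swing, i_drive, c_miller):
    """Time in seconds for the output to ramp through v_swing volts."""
    return v_swing / slew_rate(i_drive, c_miller)


if __name__ == "__main__":
    i_drive = 100e-6   # assumed drive current, amps
    c_miller = 100e-12  # assumed Miller capacitor, farads
    print("slew rate (V/us):", slew_rate(i_drive, c_miller) / 1e6)
    print("5 V edge (us):   ", transition_time(5.0, i_drive, c_miller) * 1e6)
```

Pick the drive current first (it also sets how hard the output can pull), then size the capacitor for the edge rate you want.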
Tim