I've been doing some CCP and LCD driving plus some weird calculations with SPI/I2C on top of that and I haven't yet hit that wall of code space...
I did ponder external ROM and RAM etc. somewhere along the line, but I guess I'm just worrying prematurely...
With that said, XC8 and XC16 (I haven't really explored XC32 all that much) in their free form are quite satisfactory for your run-of-the-mill everyday embedded design.
Optimization in XC8 isn't really about code size; it matters because the compiler adds unnecessary bloat without it.
In older versions you could even see instructions put there just to deliberately waste time!
Yes, of course more instructions == more code, but I'm talking about crippling, not lack of optimization. Not a real problem anymore, though.
XC16 and XC32 already produce fine code with -O1, which is available in free mode.