I think we're in general agreement. XC8, for example, handles large data structures using the FSR(s) rather than bank switching, so the compiler adds an extra level of complexity to hide the mess underneath. And yes, I agree that sometimes the compiler refuses to generate code. I would say, though, that if you are using large data structures other than simple arrays on a PIC16, it's probably not the right device for you anyway; as you say, the RAM on PIC16s is pretty small in general. Apart from deliberately constructing a case to force this on a PIC16, I can't remember running into it recently as a problem in a real project.
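To make the "large object" case concrete, here's a minimal sketch in plain C. The names (`log_buf`, `LOG_SIZE`, `log_fill`, `log_sum`) are made up for illustration, and I'm assuming a part where one bank holds about 80 bytes of general-purpose RAM, so a 200-byte array can't live in a single bank. As far as I understand it, this is the kind of object XC8 would reach indirectly through an FSR instead of with bank-select instructions, but the point is that nothing in the source has to say so.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size, picked only because it exceeds an assumed 80-byte bank. */
#define LOG_SIZE 200u

/* One object that is too big for a single bank under that assumption. */
static uint8_t log_buf[LOG_SIZE];

/* Fill the buffer with a repeating 0..9 pattern. */
void log_fill(void)
{
    for (size_t i = 0; i < LOG_SIZE; i++) {
        log_buf[i] = (uint8_t)(i % 10u);
    }
}

/* Sum every byte in the buffer. The loop is ordinary indexed access;
   how each element is actually addressed is left to the compiler. */
uint16_t log_sum(void)
{
    uint16_t total = 0;
    for (size_t i = 0; i < LOG_SIZE; i++) {
        total = (uint16_t)(total + log_buf[i]);
    }
    return total;
}
```

Calling `log_fill()` and then `log_sum()` gives 900 (twenty runs of 0 through 9). The source is the same whether the target has banked RAM or not, which is exactly the mess being hidden.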
I remember working with some of the early PIC16 and PIC18 compilers 10 or 12 years ago, where you had to handle bank switching and memory allocation entirely by hand, which was pretty time-consuming. As your project grew, you found yourself juggling data structures around several times to make objects fit into the various RAM banks. It wasn't pretty.
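For anyone who never had to do this, here's a toy model of the juggling in plain C. It is not how any particular compiler or linker allocates memory; the bank count, bank size, function name `place_first_fit`, and object sizes are all assumptions made up for the example. It just drops each object into the first bank that still has room, which is roughly what you end up doing by hand.

```c
#include <stddef.h>

#define NUM_BANKS  4   /* assumed number of RAM banks */
#define BANK_BYTES 80  /* assumed usable bytes per bank */

/* Place each object into the first bank with enough free space.
   bank_of[i] receives the chosen bank index, or -1 if nothing fits.
   Returns the number of objects that could not be placed. */
int place_first_fit(const int *sizes, size_t count, int *bank_of)
{
    int free_bytes[NUM_BANKS];
    int unplaced = 0;

    for (int b = 0; b < NUM_BANKS; b++) {
        free_bytes[b] = BANK_BYTES;
    }

    for (size_t i = 0; i < count; i++) {
        bank_of[i] = -1;
        for (int b = 0; b < NUM_BANKS; b++) {
            if (sizes[i] <= free_bytes[b]) {
                free_bytes[b] -= sizes[i];
                bank_of[i] = b;
                break;
            }
        }
        if (bank_of[i] < 0) {
            unplaced++;
        }
    }
    return unplaced;
}
```

With eight objects of sizes 30, 30, 30, 30, 50, 50, 50, 50 in that order, two of the 50-byte objects are left without a home, even though the total (320 bytes) exactly matches the assumed RAM. Put the 50-byte objects first and everything fits. Every time something grew, you were back to reshuffling like this.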