I have not looked inside gcc, but I thought it would follow an (at least) 2-phase process: parsing C code into some middle abstraction, then generating code from that for the target cpu. are you saying that those 2 conceptual phases are not cleanly separated in gcc's architecture? (I'm not a compiler back-end expert, so I don't know how hard it would be, but I always thought that compiling a high level language would have nothing at all to do with how the processor implements it.) C flat model? what do you mean by that? C is just a language, and it does not force any concepts on the generated binary code other than pushing stuff to a stack before a call, etc. not sure why C is 'hard' for some cpus, but there must be some subtle details I'm not seeing in how gcc is actually implemented.
in general, I do find it hard to understand why a high level language would be 'easier' on some cpus than on others at all. code gen is code gen, and of course you need a code generator and optimizer for each different cpu arch. so what?