Why is the 8051 obscene? What's bad about its architecture and how is it remedied in better parts?
Very poor support for anything beyond registers and absolute addressing; pointers in particular are badly served. The stack is push/pop only and can't be referenced with an offset. The 2-byte jumps and calls (AJMP/ACALL) only reach within a 2 KB code page, rather than e.g. a much more useful +/- 1 KB relative range. RAM comes in 64 KB banks, not a flat address space.
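To make the code-page complaint concrete, here's a rough sketch in C of the reachability rule. The first function models AJMP/ACALL as documented: the 11-bit operand replaces the low 11 bits of the address of the following instruction, so the target has to sit in the same 2 KB block. The second models a hypothetical +/- 1 KB relative jump, which is not a real 8051 instruction, just the alternative I mean. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* AJMP/ACALL are 2 bytes long. The 11-bit operand replaces the low
 * 11 bits of the address of the NEXT instruction, so the target must
 * share the upper 5 address bits (the 2 KB page) with insn_addr + 2. */
static bool ajmp_can_reach(uint16_t insn_addr, uint16_t target)
{
    uint16_t next = (uint16_t)(insn_addr + 2u);
    return (next & 0xF800u) == (target & 0xF800u);
}

/* HYPOTHETICAL encoding, not an 8051 instruction: a 2-byte jump with a
 * signed 11-bit displacement measured from the next instruction. */
static bool rel11_can_reach(uint16_t insn_addr, uint16_t target)
{
    int32_t disp = (int32_t)target - ((int32_t)insn_addr + 2);
    return disp >= -1024 && disp <= 1023;
}

/* A few spot checks of the two rules. */
static void reach_demo(void)
{
    /* 30 bytes ahead, but across the page boundary at 0x0800. */
    assert(!ajmp_can_reach(0x07F0, 0x0810));
    assert(rel11_can_reach(0x07F0, 0x0810));

    /* Almost 2 KB away, but inside the same page. */
    assert(ajmp_can_reach(0x0000, 0x07FF));
    assert(!rel11_can_reach(0x0000, 0x07FF));
}
```

The first case is the annoying one in practice: a jump to a label a few dozen bytes away can fail to assemble as AJMP purely because of where the code happened to land, and the tools then have to fall back to the 3-byte LJMP.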
Sure, it's fine as a direct replacement for random logic, state machines, etc., with small amounts of code and small amounts of state, written directly in assembler. But it's an awful target for portable languages.
Considering it was designed years after the 8080 and is much worse to program, you have to ask whyyyyy?
Of course the 251 is a big improvement, but that's not the topic.