I disagree, but my opinion is colored by my own experience, having specialized in parallel and distributed computing (HPC, simulations, and the like).
When you can separate the logical tasks for each core, the end result can be much simpler than the same functionality running on a single core.
The key difference is understanding and using the various mechanisms one can use for communications (and passing data) between the cores, and how to separate the "jobs" effectively.
In other words, you do need to learn/know more programming techniques, but it is worth it in the end. Message passing, message queues, and mailboxes are extremely common, and very often used in systems programming, especially in low-level graphics interfaces. Inter-core interrupts (where code in one core causes an interrupt in a designated other core, or in all other cores) are also useful, but more for synchronization than for data/message passing. RPis use a mailbox interface for communicating between the VC and the ARM core, for example. If you care to learn by combining background theory with experimentation, I assure you, you can get the hang of it quickly.
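The message-queue/mailbox idea can be tried out without touching any hardware mailbox. Here is a minimal sketch in Python (my choice purely for illustration, nothing RPi-specific), using `multiprocessing` queues as mailboxes between OS processes, which the scheduler is free to place on separate cores:

```python
from multiprocessing import Process, Queue

def worker(inbox: Queue, outbox: Queue) -> None:
    """Consume messages from inbox, process them, post results to outbox."""
    while True:
        msg = inbox.get()          # blocks until a message arrives
        if msg is None:            # sentinel value: shut down cleanly
            break
        outbox.put(msg * msg)      # stand-in "work": square the payload

def run_demo() -> list:
    inbox, outbox = Queue(), Queue()
    p = Process(target=worker, args=(inbox, outbox))
    p.start()
    for n in (1, 2, 3):
        inbox.put(n)               # post three messages to the worker
    results = [outbox.get() for _ in range(3)]
    inbox.put(None)                # tell the worker to exit
    p.join()
    return results

if __name__ == "__main__":
    print(run_demo())              # → [1, 4, 9]
```

With a single worker draining a FIFO queue, ordering is preserved; with several workers on the same inbox you get a simple work-sharing pattern instead, and the results can arrive out of order.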
It does require some experience to design the separation between logical workloads in a parallel/multi-core system effectively, though.
If we look at historical precedents, the first parallelization primitives in programming languages revolved around parallelizing loops and such, which really isn't that useful in real life; we just had to learn the better ways, eventually.
I don't think there are any right or wrong answers as such. It is like many engineering things: there are millions of different ways of skinning a cat.
Some time after I posted what you quoted, I partly changed my mind, because some real-time tasks would actually benefit from the separation (partly as you describe). What I had in mind was more along these lines: one of the CPUs could take on any intensive interrupt-handling sections, with the corresponding timing jitter and increases in maximum latency.
The other core could then take few or no interrupts, and hence progress through what it is supposed to do with rather deterministic timing.
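A user-space analogue of that split can be sketched by pinning the latency-sensitive process to its own core. This is a hedged, Linux-only sketch; real interrupt steering would be done separately via IRQ affinity masks (`/proc/irq/*/smp_affinity`), which this code does not touch:

```python
import os

def pin_to_core(core: int) -> bool:
    """Pin the calling process to a single core, if the platform allows it.

    Returns True when the affinity call is available and succeeded.
    Linux-only: os.sched_setaffinity does not exist on macOS/Windows.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {core})        # pid 0 == the current process
    return True

if __name__ == "__main__":
    if pin_to_core(0):
        # All threads of this process now run only on core 0; the
        # deterministic work can live here, away from the busy core.
        print(os.sched_getaffinity(0))     # → {0}
```

On its own this only keeps *this* process off the other cores; combining it with IRQ affinity (routing device interrupts to the "noisy" core) is what gives the quiet core its low jitter.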
I see your point, which is a very good one. If I understood it correctly, you are saying that splitting real-life software tasks/jobs between different CPUs (or threads), in some but presumably not all cases, can be a useful way of partitioning a task into efficiently sized 'chunks' for software developers to handle.
Single-core performance (barring possible future developments/inventions, e.g. quantum computers, although those would be more like a huge number of tiny cores acting in parallel) is unlikely to speed up that much more, because of laws-of-physics limitations, such as the speed of light and limits on how small, low-capacitance, and fast real-life producible integrated circuits can become.
Also, pushing clock frequency tends to use disproportionately more power than lower-frequency solutions (dynamic power scales roughly with frequency times voltage squared, and higher frequencies usually need higher voltages).
Whereas putting an ever-increasing number of cores in the same processor package, for CPUs or graphics cards, tends to be cost-, size-, and power-efficient, so it could well be the way forward.
But there could be barriers, such as Amdahl's law.
https://en.wikipedia.org/wiki/Amdahl%27s_law
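Amdahl's law puts a hard ceiling on that approach: if a fraction p of the work parallelizes and the rest is serial, n cores give at most S = 1 / ((1 − p) + p/n), which approaches 1/(1 − p) no matter how many cores you add. A quick sketch of the arithmetic:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Maximum speedup per Amdahl's law: S = 1 / ((1 - p) + p / n)."""
    p = parallel_fraction
    return 1.0 / ((1.0 - p) + p / cores)

# Even with 95% of the work parallelizable, speedup saturates:
# amdahl_speedup(0.95, 8)    ≈ 5.9
# amdahl_speedup(0.95, 1024) ≈ 19.6   (the ceiling is 1 / 0.05 = 20)
```

So the serial 5% dominates long before core counts get exotic, which is exactly why designing the workload separation well matters more than raw core count.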