To recap the options:
* GPIO
Quite obvious: if all you need to communicate is a single command, a sync signal, an enable signal, etc., why not just use a binary input/output? Such a signal could come from a logic gate or a push-button, so why not from another MCU? Or is this too obvious/trivial?
When more than a few bits of data are needed (for example ADC measurement values, setpoints, audio, image data, waveforms, or commands to do something), you need a generic digital communication bus that can transfer an arbitrary amount of data:
* UART
Great for asynchronous command/response streams. Big minus: it does not implement packet delimitation in hardware. The communication unit is typically 8 bits, so you have to write the code that marks where a message begins and where it ends.
* SPI
Great for exchanging memory regions, for example, giving setpoints and receiving measurements. Big plus: hardware delimitation of packets, since the chip-select line frames each transfer. Big minus: many MCU SPI peripherals are broken, buggy or hard to use, so said plus may be impossible to fully utilize.
You can use UART for what SPI is best at, or vice versa; you just need to write a bit more code on top of the low-level protocol.
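To give an idea of how much "a bit more code" is on the UART side, here is a minimal sketch of software packet delimitation. It borrows the byte-stuffing scheme from SLIP (RFC 1055): a reserved END byte closes each message, and any END or ESC byte inside the payload is escaped. The names `frame_encode`, `frame_rx_byte` and `frame_rx_t`, and the 64-byte buffer size, are made up for this example; treat it as one possible approach, not the only one.

```c
#include <stddef.h>
#include <stdint.h>

/* Special byte values as defined by SLIP (RFC 1055). */
#define SLIP_END     0xC0u  /* marks the end of a message        */
#define SLIP_ESC     0xDBu  /* escape prefix                     */
#define SLIP_ESC_END 0xDCu  /* ESC + this = literal 0xC0 in data */
#define SLIP_ESC_ESC 0xDDu  /* ESC + this = literal 0xDB in data */

/* Encode one message for transmission.
 * Returns the number of bytes written to out, or 0 if out is too small. */
size_t frame_encode(const uint8_t *msg, size_t len, uint8_t *out, size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t b = msg[i];
        if (b == SLIP_END || b == SLIP_ESC) {
            if (n + 2 > cap) return 0;
            out[n++] = SLIP_ESC;
            out[n++] = (b == SLIP_END) ? SLIP_ESC_END : SLIP_ESC_ESC;
        } else {
            if (n + 1 > cap) return 0;
            out[n++] = b;
        }
    }
    if (n + 1 > cap) return 0;
    out[n++] = SLIP_END;
    return n;
}

/* Receiver state; zero-initialize before use. Buffer size is arbitrary. */
typedef struct {
    uint8_t buf[64];
    size_t  len;
    int     escaped;  /* previous byte was SLIP_ESC                */
    int     bad;      /* overflow or bad escape: drop this message */
} frame_rx_t;

/* Feed one received byte (e.g. from the UART RX interrupt).
 * Returns the message length when a complete message is in rx->buf,
 * 0 otherwise. Empty and damaged messages are silently dropped. */
size_t frame_rx_byte(frame_rx_t *rx, uint8_t b)
{
    if (b == SLIP_END) {
        size_t n = rx->bad ? 0 : rx->len;
        rx->len = 0;
        rx->escaped = 0;
        rx->bad = 0;
        return n;
    }
    if (rx->escaped) {
        rx->escaped = 0;
        if (b == SLIP_ESC_END)      b = SLIP_END;
        else if (b == SLIP_ESC_ESC) b = SLIP_ESC;
        else { rx->bad = 1; return 0; }
    } else if (b == SLIP_ESC) {
        rx->escaped = 1;
        return 0;
    }
    if (rx->len < sizeof rx->buf) rx->buf[rx->len++] = b;
    else rx->bad = 1;
    return 0;
}
```

The receiver resynchronizes at the next END byte after any glitch, which is exactly the property the raw UART byte stream lacks. A real link would usually add a length field or CRC on top; that is left out here.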
* Custom bit-banged protocol on GPIO
If low data rates are OK, bit-banging in an ISR is trivial. Now you can do anything, such as using Manchester encoding to make a signal that can pass through a transformer or coupling capacitor.
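As a rough illustration of the Manchester idea, the sketch below turns each data bit into two half-bit symbols, so every encoded byte contains equally many highs and lows. That DC balance is what lets the signal through a transformer or coupling capacitor. The polarity used here (1 = low then high, 0 = high then low) follows the IEEE 802.3 convention, but that choice is an assumption; the opposite convention is also in use. The function names are hypothetical.

```c
#include <stdint.h>

/* Encode one byte, MSB first, into 16 half-bit symbols.
 * Data bit 1 -> half-bits 0,1 (low then high)
 * Data bit 0 -> half-bits 1,0 (high then low) */
uint16_t manchester_encode(uint8_t byte)
{
    uint16_t out = 0;
    for (int i = 7; i >= 0; i--) {
        out = (uint16_t)(out << 2);
        out |= ((byte >> i) & 1u) ? 0x1u : 0x2u;
    }
    return out;
}

/* Decode 16 half-bit symbols back into a byte.
 * Returns 0..255 on success, or -1 if any pair is 00 or 11,
 * which can never occur in valid Manchester code. */
int manchester_decode(uint16_t sym)
{
    int byte = 0;
    for (int i = 7; i >= 0; i--) {
        unsigned pair = (sym >> (2 * i)) & 0x3u;
        if (pair == 0x1u)      byte = (byte << 1) | 1;
        else if (pair == 0x2u) byte = byte << 1;
        else                   return -1;
    }
    return byte;
}
```

On the transmit side, a timer ISR firing at twice the bit rate could shift the 16-bit word out MSB first, writing one half-bit to the GPIO pin per tick. The receive side additionally needs clock recovery from the edges, which is not shown.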