If you want to run code or access data seamlessly from external memory devices, make sure those devices can be integrated into the memory map of your microcontroller, preferably unbanked.
Of course you can connect a (parallel) SRAM chip to any microcontroller that has enough I/O, but you don't want to introduce software read/write functions that slow down every access, and above all the memory is then not seamlessly accessible through the memory map.
Well, you still can, but not if you need the performance.
The FSMC on the STM32F407 (and others) maps external memory directly into the address space (figure 18). It has a maximum speed of 60 MHz, which is slower than internal SRAM, but still workable.
You can just initialize the FSMC for your memory device and then access it through the physical address from figure 18. That's what I did in a project where I used a 512 kB SRAM as a bulk sample buffer, simply using pointers starting at 0x6000 0000.
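To make the pointer approach concrete, here is a minimal sketch of such a sample buffer. It assumes the SRAM sits on FSMC bank 1 (which starts at 0x6000 0000) and that the FSMC has already been initialized; the `sample_buffer_t` type and its functions are hypothetical names for illustration, not the code from my project.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: SRAM on FSMC bank 1, mapped at 0x60000000, FSMC already set up. */
#define EXT_SRAM_BASE 0x60000000UL
#define EXT_SRAM_SIZE (512UL * 1024UL) /* 512 kB device */

/* Hypothetical bulk sample buffer living in external SRAM. */
typedef struct {
    volatile uint16_t *base; /* first sample slot */
    size_t capacity;         /* number of 16-bit slots */
    size_t count;            /* slots currently in use */
} sample_buffer_t;

static void sample_buffer_init(sample_buffer_t *sb,
                               volatile uint16_t *base, size_t capacity)
{
    sb->base = base;
    sb->capacity = capacity;
    sb->count = 0;
}

/* Store one sample; returns 0 on success, -1 when the buffer is full. */
static int sample_buffer_push(sample_buffer_t *sb, uint16_t sample)
{
    if (sb->count >= sb->capacity) {
        return -1;
    }
    sb->base[sb->count++] = sample;
    return 0;
}

/* On the target, after FSMC init, the buffer would be set up like this:
 *   sample_buffer_init(&sb, (volatile uint16_t *)EXT_SRAM_BASE,
 *                      EXT_SRAM_SIZE / sizeof(uint16_t));
 */
```

Because the buffer only sees a base pointer, the same code can be exercised on a PC with an ordinary array before it ever touches the external chip.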
If you want to place variables from your C code there with preinitialized values, then things become a bit more complicated. I think you'll need to modify the linker script to tell the linker that it can place data there, and also modify the startup code so that it copies the initial values from FLASH to that RAM region, or zeroes the variables that have no initializer. In addition, the startup code needs to initialize the FSMC before it starts doing this. This all happens before main() is called, and some IDEs will even hide the startup code from the user if there is no need to show it.
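As a rough sketch of what the extra startup work looks like, the two helpers below do the copy and the zeroing for an external region. The linker symbol names and `fsmc_init()` in the closing comment are hypothetical; the real names depend on what you define in your own linker script and startup file.

```c
#include <stdint.h>

/* Copy the initial values of preinitialized variables from their load
 * address (FLASH) to their run address (external SRAM), word by word. */
static void ext_copy_data(const uint32_t *src, uint32_t *dst,
                          const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Zero the region that holds variables without an initializer. */
static void ext_zero_bss(uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = 0u;
    }
}

/* Sketch of the call site in the reset handler. All names below are
 * hypothetical and would have to be defined in the linker script:
 *
 *   extern uint32_t _siextdata, _sextdata, _eextdata;
 *   extern uint32_t _sextbss, _eextbss;
 *
 *   fsmc_init();  // must run before the external SRAM is touched
 *   ext_copy_data(&_siextdata, &_sextdata, &_eextdata);
 *   ext_zero_bss(&_sextbss, &_eextbss);
 *   // ...then the usual internal .data/.bss setup and main()
 */
```

With GCC, individual variables can then be steered into such a region with `__attribute__((section("...")))`, using whatever section name your linker script maps to the external SRAM.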
DMA to external memory is not a problem at all. If you look at figure 5 of the STM32F407 datasheet, you'll see that the FSMC connects straight into the AHB bus matrix. DMA operations on it are just memory-to-memory transfers, and can still reach many megabytes per second without much trouble.
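One practical detail when moving large blocks: the DMA item counter (NDTR) on the STM32F4 is 16 bits wide, so a single transfer covers at most 65535 items, and as far as I know only DMA2 can do memory-to-memory transfers on these parts. The sketch below splits a big copy into chunks. The `dma_xfer_fn` hook is a hypothetical stand-in for "start one transfer and wait for it to finish"; here it is backed by `memcpy` so the logic can be tried on a PC.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* NDTR, the DMA item counter on the STM32F4, is 16 bits wide. */
#define DMA_MAX_ITEMS 65535u

/* Hypothetical hook: perform one memory-to-memory transfer of `items`
 * 32-bit words and return once it has completed. */
typedef void (*dma_xfer_fn)(uint32_t *dst, const uint32_t *src,
                            uint16_t items);

/* Split a large copy into DMA-sized chunks; returns the chunk count. */
static size_t dma_copy_words(dma_xfer_fn xfer, uint32_t *dst,
                             const uint32_t *src, size_t words)
{
    size_t chunks = 0;
    while (words > 0) {
        uint16_t n = (words > DMA_MAX_ITEMS) ? (uint16_t)DMA_MAX_ITEMS
                                             : (uint16_t)words;
        xfer(dst, src, n);
        dst += n;
        src += n;
        words -= n;
        chunks++;
    }
    return chunks;
}

/* Host stand-in for the hardware transfer (assumption: on the target
 * this would program a DMA2 stream and wait for transfer complete). */
static void memcpy_xfer(uint32_t *dst, const uint32_t *src, uint16_t items)
{
    memcpy(dst, src, (size_t)items * sizeof(uint32_t));
}
```

On the target, `dst` would simply be an address inside the FSMC region, since the DMA controller sees the external SRAM like any other memory.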
If you step up to the STM32F46x series, those parts also have a Quad-SPI interface with memory-mapped mode. You tell the peripheral how to perform random-access reads on the serial memory, and the hardware handles the rest whenever you access it from your code, DMA, etc.