In general a project writes the FLASH starting at the base, i.e. 0x8000000.
I have an "overlay" scheme where the base 32k remains (the "boot block") and the overlays start at 0x8008000.
I found that Cube does correctly program the FLASH from 0x8008000. (I actually have a separate Cube build, in a VM, for the overlays.) So it seems that the Cube+SWD debugger system does not just erase the whole CPU. But how does it do it? It so happens that the bottom 32k is separately erasable on the 32F417, being two 16k sectors (which, no surprise, is why I chose 32k for the boot block), but what if you wrote the linkfile with a starting address of base+4k? That falls in the middle of a sector, so Cube would have to read the sector first, and merge the data, before programming.
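To make the sector question concrete, here is a minimal sketch that maps an address to its FLASH sector. It assumes the 1 MB STM32F4 layout (four 16k sectors, one 64k, seven 128k); the names FLASH_BASE_ADDR, sector_of and is_sector_start are made up for illustration and are not HAL or Cube functions. It says nothing about what Cube actually does internally.

```c
#include <stdint.h>

/* Assumed layout: 1 MB STM32F4 part, sectors 0-3 = 16k, 4 = 64k, 5-11 = 128k. */
#define FLASH_BASE_ADDR 0x08000000u   /* hypothetical name, not the HAL macro */
#define NUM_SECTORS     12

static const uint32_t sector_size[NUM_SECTORS] = {
    16u * 1024, 16u * 1024, 16u * 1024, 16u * 1024,
    64u * 1024,
    128u * 1024, 128u * 1024, 128u * 1024, 128u * 1024,
    128u * 1024, 128u * 1024, 128u * 1024
};

/* Return the index of the sector containing addr, or -1 if outside FLASH. */
int sector_of(uint32_t addr)
{
    uint32_t start = FLASH_BASE_ADDR;
    if (addr < start)
        return -1;
    for (int i = 0; i < NUM_SECTORS; i++) {
        uint32_t end = start + sector_size[i];
        if (addr < end)
            return i;
        start = end;
    }
    return -1;
}

/* Return 1 if addr is the first byte of a sector, else 0.
   An image starting here can be programmed after erasing only its own
   sectors; an image starting mid-sector shares a sector with whatever
   sits below it. */
int is_sector_start(uint32_t addr)
{
    uint32_t start = FLASH_BASE_ADDR;
    for (int i = 0; i < NUM_SECTORS; i++) {
        if (addr == start)
            return 1;
        start += sector_size[i];
    }
    return 0;
}
```

With this layout, 0x8008000 is the first byte of sector 2, so an overlay starting there never touches sectors 0 and 1. A start of base+4k (0x8001000) lands inside sector 0, which is the case where erasing would also wipe the 4k below it.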
A second question is what happens when you try to single step through the overlay. Obviously, if you build the boot block and the overlay in one Cube project (various threads in years past, with people telling me I am stupid, and boot blocks should always be separate projects) then Cube has access to all the symbols and it all works. But what if you build the overlay in a separate Cube installation? That works too, inasmuch as the overlay runs OK, but you can't debug it.
Now, the running Cube does not have access to the symbols used to build the overlay. It's an interesting problem. For it to work, all the addresses relating to the overlay (text, data, bss, common, etc) must be identical in both builds. This is achievable but turns out to be very hard, and I struggle to see why: the GCC tool set seems to generate the output in a varying order. I am sure there is no guarantee on the order, so lots of people will say I am stupid, again. But why is the order different between two builds? One has a base of 0x8008000 and nothing below that; the other has a base of 0x8000000, some code in there, then a gap, and then code starting at 0x8008000. In those two cases the ordering of stuff is not the same from 0x8008000 onwards.
What happens is subtle. Single stepping works starting from 0x8008000 but after maybe 100 lines the next step lands in hyperspace. It seems to happen not with my own code but when calling a library function (stdlib etc) like memcpy. The locations of such functions are determined by the linker, but apparently in some arbitrary way.
It should be possible to get data and bss in the same order but even that is elusive; if you have say
uint32_t fred1;
uint32_t fred2;
the linker can just place fred2 before fred1. Usually it doesn't but sometimes it just does it. Then due to the effect of various align 4 directives in the linkfile, the addresses all change from some point on.
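One thing the C language does guarantee is the order of members inside a struct, so a sketch of a workaround is to wrap such variables in one. The struct name overlay_vars, the instance ovl and the helper fred_order_ok are hypothetical, purely for illustration. The linker still chooses where ovl itself goes, so this only pins the relative order of fred1 and fred2, not their absolute addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper: struct members are laid out in declaration order,
   so fred2 can never end up before fred1. */
struct overlay_vars {
    uint32_t fred1;
    uint32_t fred2;
};

/* Single object; the linker places it as a unit. */
struct overlay_vars ovl;

/* Compile-time checks (C11): fred1 is first, fred2 follows it. */
_Static_assert(offsetof(struct overlay_vars, fred1) == 0,
               "fred1 must be the first member");
_Static_assert(offsetof(struct overlay_vars, fred2) >
               offsetof(struct overlay_vars, fred1),
               "fred2 must follow fred1");

/* Run-time version of the same check, using the actual addresses. */
int fred_order_ok(void)
{
    return (uintptr_t)&ovl.fred1 < (uintptr_t)&ovl.fred2;
}
```

The cost is that every access becomes ovl.fred1 rather than fred1, which is a fair amount of editing in an existing project.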
If I were writing these tools I would never do something which gratuitously reorders stuff.
Anyway, it may be possible to transfer a symbol table from the Cube used to build the overlay to the Cube used to build the boot block or used to build the whole project. Then you could step through the boot block and continue stepping through the overlay (and set breakpoints etc). Is there any way to do that? It would have to override existing symbols of the same name.