I think you got it right. The minor difference between what I proposed and what nctnico proposed is the copying step in the bootloader. Copying B->A lets you use a single app binary linked to a single address space, but it has costs: somewhat increased complexity in the bootloader (it has to erase and write flash; not bad at all, but still something), halved write endurance of the flash (probably not a big deal at all), and significant boot latency after the update (erasing is slow, often taking seconds). That last one might be a real problem.
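To make the B->A copy step concrete, here is a rough sketch against a simulated flash in RAM. Everything in it is made up for illustration: the slot and page sizes, the `update_pending` flag (which on real hardware would live in its own flash page), and the `power_budget_pages` parameter, which only exists so that a power loss mid-copy can be simulated. The point is that the flag is cleared only after A has been verified against B, so an interrupted copy simply restarts on the next boot.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: two equal slots, simulated as RAM arrays. */
#define SLOT_SIZE 1024u
#define PAGE_SIZE 256u
#define NUM_PAGES (SLOT_SIZE / PAGE_SIZE)

static uint8_t slot_a[SLOT_SIZE];  /* the app always runs from here       */
static uint8_t slot_b[SLOT_SIZE];  /* staging area for a downloaded image */
static bool update_pending;        /* assumption: persisted in flash      */

/* Stand-ins for the real flash driver; erase is the slow step on hardware. */
static void flash_erase_page(uint8_t *page)
{
    memset(page, 0xFF, PAGE_SIZE);
}

static void flash_write_page(uint8_t *dst, const uint8_t *src)
{
    memcpy(dst, src, PAGE_SIZE);
}

/*
 * Copy B over A if an update is pending.
 * power_budget_pages simulates losing power after that many pages.
 * Returns true when slot A is safe to boot.
 */
static bool bootloader_apply_update(unsigned power_budget_pages)
{
    if (!update_pending)
        return true;                    /* nothing to do, boot A as it is */

    for (unsigned p = 0; p < NUM_PAGES; p++) {
        if (p == power_budget_pages)
            return false;               /* simulated power loss mid-copy */
        flash_erase_page(&slot_a[p * PAGE_SIZE]);
        flash_write_page(&slot_a[p * PAGE_SIZE], &slot_b[p * PAGE_SIZE]);
    }

    if (memcmp(slot_a, slot_b, SLOT_SIZE) != 0)
        return false;                   /* flag stays set, retry next boot */

    update_pending = false;             /* clear only after verification */
    return true;
}
```

A real bootloader would also check the integrity of B (a CRC, for example) before touching A; that is left out here to keep the sketch short.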
IMHO having to link two binaries is a non-issue (you just run the linker twice with slightly different linker scripts), but this is also a matter of taste. If you do that, the bootloader can simply choose whether to run A or B. The downside is that you have to either tell the updater, from the app, which of the two files the MCU wants (easy if you have some kind of periodic status message anyway), or always send both files concatenated, doubling the transfer length over your update link.
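For the A/B variant, the bootloader decision can be as small as the sketch below. The header fields are my assumptions, not anything from a specific bootloader: `valid` stands in for a real integrity check such as a CRC over the image, and `sequence` is a counter the updater would bump on every successful write. `slot_wanted_for_update()` is the piece of information the app would put into its status message so the updater knows which of the two linked files to send.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { SLOT_NONE = -1, SLOT_A = 0, SLOT_B = 1 } slot_t;

/* Hypothetical per-slot header stored next to each image. */
typedef struct {
    bool     valid;     /* stands in for a CRC check of the image */
    uint32_t sequence;  /* higher means more recently written     */
} slot_header_t;

/* Bootloader side: pick the newest valid image, or none at all. */
static slot_t choose_boot_slot(const slot_header_t *a, const slot_header_t *b)
{
    if (a->valid && b->valid)
        return (b->sequence > a->sequence) ? SLOT_B : SLOT_A;
    if (a->valid)
        return SLOT_A;
    if (b->valid)
        return SLOT_B;
    return SLOT_NONE;   /* nothing bootable: stay in the bootloader */
}

/* App side: the update must go to the slot we are NOT running from. */
static slot_t slot_wanted_for_update(slot_t running)
{
    return (running == SLOT_A) ? SLOT_B : SLOT_A;
}
```

The running image is never erased, so a power failure during the transfer just leaves the other slot invalid and the bootloader keeps booting the old one.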
If you want flashing that is 100% robust against power failures, and I think you should in $current_year, there is no way around effectively halving the flash. That happens in every scenario except the non-robust one.