As mentioned repeatedly, do not even attempt to dead-bug prototype this chip.
Even if you could work like a machine night and day and get absolutely perfect results the very first time (and you cannot, make no mistake), all of your decoupling caps would be too far away to have any effect, all of your controlled-impedance traces and differential pairs would be mismatched, and you would never see anything close to the full performance of the device, if you could get it to boot at all. And that assumes you can actually execute 1500 pins' worth of fine-pitch dead-bug soldering without accidentally mashing them and ripping balls off the chip, coughing and getting dirt somewhere that cooks into a short, or making some other mistake.
With packages like this, reflow is the only option. Vapor phase soldering may not be a requirement, but at the very least you need a board with a footprint, a preheater, a good hot air station, a BGA stencil that matches the package, good solder paste, and good flux, ideally products designed for this kind of soldering. That's the bare minimum even with good soldering skills.
I'm not the kind to tell people not to do something because they can't, but it is frankly foolish to attempt such a huge project without understanding the constraints. 1500 fine-pitch contact points (a few hundred of which are probably mandatory just to power the device up) should dissuade most from attempting it... but the physical limitations of such a layout are too great to overcome with persistence alone, unless your target clock rate is just a fraction of what the chip can operate at. The reason decoupling caps are put on the bottom of the PCB for big BGAs isn't just to save space: in many cases, the loop area of the path to and from a cap placed on the top of the board, next to the chip, adds enough inductance at high frequencies to make the cap functionally nonexistent.
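To put rough numbers on the loop-area point, here is a minimal sketch that models a decoupling cap as a series R-L-C and compares two mounting inductances. Every value in it is an assumption picked for illustration, not a figure from any datasheet: a 100 nF cap, about 1 nH of loop inductance for a cap mounted directly under the BGA, about 20 nH for a cap hanging off a couple of centimeters of dead-bug wire, and 10 mOhm of ESR.

```python
import math

def cap_impedance(freq_hz, cap_f, loop_l_h, esr_ohm=0.01):
    """Magnitude of a series R-L-C model of a mounted decoupling cap."""
    w = 2 * math.pi * freq_hz
    reactance = w * loop_l_h - 1 / (w * cap_f)
    return math.hypot(esr_ohm, reactance)

def self_resonant_freq(cap_f, loop_l_h):
    """Frequency above which the mounted cap starts to look inductive."""
    return 1 / (2 * math.pi * math.sqrt(loop_l_h * cap_f))

# Assumed, illustrative values only.
CAP = 100e-9      # 100 nF decoupling cap
L_PCB = 1e-9      # ~1 nH loop, cap directly under the BGA
L_WIRE = 20e-9    # ~20 nH loop, cap on dead-bug wires

if __name__ == "__main__":
    for label, loop_l in (("under-BGA on PCB", L_PCB), ("dead-bug wire", L_WIRE)):
        srf_mhz = self_resonant_freq(CAP, loop_l) / 1e6
        z = cap_impedance(100e6, CAP, loop_l)
        print(f"{label}: SRF ~ {srf_mhz:.1f} MHz, |Z| at 100 MHz ~ {z:.2f} ohm")
```

Above self-resonance the impedance is dominated by the loop inductance and scales with it, so under these assumptions the wired cap presents roughly twenty times the impedance of the PCB-mounted one at 100 MHz. Plug in your own estimates for the inductances; the trend is the point, not the exact values.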
Anyway, see what you can do with the chips on the boards they are already on - it will be much easier with decoupling, power, and soldering taken care of, and then you can try to find and break out IOs and things to reprogram at least somewhat to your desires. When you get some comfort with dealing with the chip and you have at least one that works, maybe try designing a carrier board for the chip in a CAD package. It will be an immense amount of work, dozens of hours, but with a board from a PCB fab with tight enough tolerances to accommodate such a big chip, you may actually be able to keep decoupling caps and rail generation close enough to use the chip normally and take advantage of all of its IOs. In the meantime, trying similar work with MUCH smaller chips would probably help develop the skills and understanding needed to deal with such a beast.
To that point - maybe an early project would be to try to dead-bug some DDR memory. I'd be impressed if you could even get it to work (and that's not a slight on your skills, this comes down to layout restrictions and precision). Again, because of trace length matching and impedance control, even with a chip that has a tiny fraction of the number of balls and operates at a lower frequency, I don't think that memory would be usable at rated speed by a normal ARM MCU in a prototype without a PCB. There's a possibility, sure, and it may be a fun exercise to try... but for the big time sink it will be, at least it won't cost hundreds of dollars, and it should give you an idea of how difficult the task you're describing is.
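If you want a feel for the length-matching part of that before picking up the iron, here is a rough sketch that converts a wire length mismatch into timing skew and expresses it as a fraction of one bit time. The numbers are assumptions for illustration: a bare wire in air (effective dielectric constant of about 1), a 30 mm mismatch, and an 800 MT/s data rate. It only covers propagation delay; it says nothing about impedance discontinuities, crosstalk, or wire inductance, which are likely to hurt at least as much.

```python
import math

C_MM_PER_NS = 299.792458  # speed of light in vacuum, mm per ns

def delay_ps(length_mm, eps_eff=1.0):
    """Propagation delay over a length, for an assumed effective dielectric constant."""
    return length_mm * math.sqrt(eps_eff) / C_MM_PER_NS * 1000.0

def skew_fraction_of_bit(mismatch_mm, data_rate_mtps, eps_eff=1.0):
    """Skew from a length mismatch as a fraction of one bit time."""
    bit_time_ps = 1e6 / data_rate_mtps  # 1 MT/s corresponds to a 1e6 ps bit time
    return delay_ps(mismatch_mm, eps_eff) / bit_time_ps

if __name__ == "__main__":
    # Assumed, illustrative values: 30 mm mismatch, 800 MT/s, wire in air.
    frac = skew_fraction_of_bit(30.0, 800.0)
    print(f"skew ~ {delay_ps(30.0):.0f} ps, {frac:.1%} of a bit time")
```

Compare the result against the setup and hold margins in the actual memory datasheet rather than trusting a rule of thumb; the sketch just shows how quickly hand-cut wire lengths eat into the budget.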