... I think offhand a 5/5 rule should still work? For more than two rows in from the side, obviously, you'll still need multilayer construction.
Nope. 3/3 only. If you did use a 5 mil track, you would be left with only about 2.4 mil of spacing on each side, which could work but would be tight.
The pads are on a 500um pitch, the balls are 300um, and they will need 250um NSMD pads (that's about 10 mil pads). That leaves 250um (about 10 mils) of pad-to-pad spacing. Dividing that by 3 (space/track/space) yields 83.3um, which is about 3.3 mils, so 3/3 spacing is good. Also, there is room for a single 250um (10 mil) via between 4 pads (dog-bone style), and it would have to be laser drilled at 4 mils, leaving a 3 mil annular ring.
So we have 10 mil pads, 10 mil vias and 4 mil drill. That's what I would do if I had to do it and had a very dense package.
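To make the arithmetic above easy to re-run for other pad or track sizes, here is a rough Python sketch. The function names are made up for illustration, and it assumes a single track centered in the gap between two adjacent pads.

```python
UM_PER_MIL = 25.4  # 1 mil = 0.001 inch = 25.4 um


def um_to_mil(um):
    """Convert micrometers to mils."""
    return um / UM_PER_MIL


def routing_channel(pitch_um, pad_um):
    """Return (pad-to-pad gap, one third of that gap), both in um.

    The third is the width available if space/track/space are all equal.
    """
    gap_um = pitch_um - pad_um
    return gap_um, gap_um / 3


def spacing_for_track(gap_um, track_mil):
    """Clearance in mils on each side of one track centered in the gap."""
    return (um_to_mil(gap_um) - track_mil) / 2


def annular_ring(via_mil, drill_mil):
    """Annular ring width in mils for a given via pad and drill diameter."""
    return (via_mil - drill_mil) / 2


gap_um, third_um = routing_channel(pitch_um=500, pad_um=250)
print(f"pad-to-pad gap: {gap_um} um = {um_to_mil(gap_um):.1f} mil")
print(f"space/track/space: {third_um:.1f} um = {um_to_mil(third_um):.2f} mil each")
print(f"spacing with a 5 mil track: {spacing_for_track(gap_um, 5):.2f} mil")
print(f"annular ring, 10 mil via / 4 mil drill: {annular_ring(10, 4):.1f} mil")
```

Swapping in a different pad diameter (for example, if your fab or the package datasheet suggests something other than 250um) shows quickly whether 3/3 still fits.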
However, this CP236 package has some room in the middle, so you might get away with doing this on just 2 layers at a cheap board house. You need to find a house that can do at least 3/3 spacing so you can route between the pads to get at the inner rows. Whether 2 layers is enough depends on how fast you are running external to the FPGA and how much I/O you are actually using. If needed, some inner row pads can be routed into the large open space, where larger vias will fit. However, anything high-speed off the FPGA will give you trouble on 2 layers, and like everyone else, I recommend 4 layers at least.
As an aside: OSH Park only claims 5/5. Users have been able to do 3/3 reliably, but OSH Park doesn't guarantee it. For 2 layer boards, the smallest drill size is 13 mil, with a 7 mil annular ring. That means they want a 27 mil via. For 4 layer boards, the smallest drill size is 10 mil, with a 4 mil annular ring (that's an 18 mil via). You might be able to get a relatively inexpensive 2 or 4 layer board from them.
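The via sizes quoted above are just drill plus twice the annular ring. As a rough sanity check, the sketch below also estimates how much copper-to-copper clearance each via would have if it were centered between four 250um pads on the 500um grid. That geometry (square grid, via dead center on the diagonal) is my assumption, and the function names are hypothetical.

```python
import math

UM_PER_MIL = 25.4  # 1 mil = 25.4 um


def via_diameter(drill_mil, ring_mil):
    """Finished via pad diameter in mils: drill plus a ring on both sides."""
    return drill_mil + 2 * ring_mil


def diagonal_clearance(pitch_um, pad_um, via_mil):
    """Clearance in mils between a via centered among four pads and each pad.

    Assumes a square pad grid. A negative result means the via copper
    would overlap the pads.
    """
    # Distance from the via center to each of the four pad centers.
    center_to_center_um = pitch_um * math.sqrt(2) / 2
    clearance_um = center_to_center_um - pad_um / 2 - via_mil * UM_PER_MIL / 2
    return clearance_um / UM_PER_MIL


# (label, drill in mils, annular ring in mils), values as quoted in this answer
options = [
    ("laser-drilled via", 4, 3),
    ("OSH Park 4 layer minimum", 10, 4),
    ("OSH Park 2 layer minimum", 13, 7),
]

for label, drill, ring in options:
    via = via_diameter(drill, ring)
    clearance = diagonal_clearance(500, 250, via)
    print(f"{label}: {via} mil via, {clearance:.2f} mil clearance to pads")
```

Under those assumptions, only the 10 mil via leaves usable clearance between the pads (roughly 4 mil). The 18 mil via lands essentially on the pad edges and the 27 mil via overlaps them, which is why the larger vias are only an option out in the open area in the middle of the package.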
Also, @FrankBuss's suggestion to consider external SRAM is a valid point and worth looking into.