Sure - you can download the KiCAD project with gerbers, as well as the assembly PDF file
here. This PDF was generated by a Python script from the pick-and-place file and gerbers, and it helped a lot during assembly.
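For anyone who wants to build something similar, here is a minimal sketch of the first step only: reading a pick-and-place file and grouping parts into an assembly checklist. It is not the actual script; it assumes the file is a CSV export with the columns Ref, Val, Package, PosX, PosY, Rot, Side, and the function name and sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def group_placements(pos_csv_text):
    """Group pick-and-place rows by (side, value, package).

    Assumes a CSV header of Ref,Val,Package,PosX,PosY,Rot,Side.
    Returns a dict mapping (side, value, package) to a sorted list of refs.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(pos_csv_text)):
        key = (row["Side"], row["Val"], row["Package"])
        groups[key].append(row["Ref"])
    # Plain string sort, so C10 lands before C2; good enough for a checklist.
    return {key: sorted(refs) for key, refs in groups.items()}


# Made-up sample data, not taken from the real board.
SAMPLE = """Ref,Val,Package,PosX,PosY,Rot,Side
C1,100n,C_0402,10.0,20.0,0.0,top
C2,100n,C_0402,12.0,20.0,90.0,top
R1,10k,R_0402,15.0,22.5,0.0,bottom
"""

if __name__ == "__main__":
    for (side, val, pkg), refs in sorted(group_placements(SAMPLE).items()):
        print(f"{side:6} {val:6} {pkg:8} x{len(refs)}: {', '.join(refs)}")
```

Grouping by side and value means you can place all identical parts in one go, which is where a per-value assembly drawing saves the most time.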
Routing this was actually kind of fun - you have to think ahead to make sure you don't block yourself before all pins are broken out. Kinda like playing chess.
As you can see, this is the second revision of the board, as I made a number of mistakes in the first one (good thing I discovered them before I started assembly, so I only wasted money on the PCB). The technique I came up with is to create a package pinout diagram in Excel and mark pins as I route them, so that I always know which pins are already in use. With FPGAs you have a great deal of flexibility when it comes to pinout, in the sense that you can reassign pins to functions in order to facilitate routing. You do have to watch out for the fact that different I/O banks can (and in my case do) have different logic voltage levels, and that some pins are dedicated to fixed functions (power/ground, JTAG, configuration, etc.). I also didn't want to go with 6 layers, as they are much more expensive to manufacture, which is why I had to settle on some compromises: not breaking out all FPGA balls, putting a few signal traces on the power layer, and having some "jumper traces" on the top layer under the BGA (which is not a recommended practice, but I had no choice). I'm desperately waiting for the new Spartan-7 chips to become available, as they come in an FBGA-196 package specifically designed to be broken out on 4-layer boards.
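The spreadsheet bookkeeping described above can also be done in a few lines of Python if you prefer. This is only a sketch of the idea, not what I actually used: the class name, ball names, bank numbers and voltages below are all hypothetical, and a real pin table would come from the vendor's package pinout file.

```python
class PinTracker:
    """Tracks which FPGA balls are already routed, like a pinout spreadsheet."""

    def __init__(self, pins, bank_voltage):
        # pins: ball -> (bank, fixed_function); fixed_function is None for user I/O.
        self.pins = dict(pins)
        # bank_voltage: bank -> I/O supply voltage of that bank, in volts.
        self.bank_voltage = dict(bank_voltage)
        self.used = {}  # ball -> net name

    def assign(self, ball, net, volts):
        """Mark a ball as routed, rejecting dedicated, taken or wrong-voltage pins."""
        bank, fixed = self.pins[ball]
        if fixed is not None:
            raise ValueError(f"{ball} is a dedicated {fixed} pin")
        if ball in self.used:
            raise ValueError(f"{ball} already carries {self.used[ball]}")
        if self.bank_voltage[bank] != volts:
            raise ValueError(
                f"{ball} is in bank {bank} at {self.bank_voltage[bank]} V, "
                f"but {net} needs {volts} V"
            )
        self.used[ball] = net

    def free_balls(self, volts):
        """List unrouted user I/O balls in banks running at the given voltage."""
        return sorted(
            ball
            for ball, (bank, fixed) in self.pins.items()
            if fixed is None
            and ball not in self.used
            and self.bank_voltage[bank] == volts
        )


# Hypothetical five-ball excerpt of a package, for illustration only.
PINS = {
    "A1": (None, "GND"),
    "A2": (14, None),
    "A3": (14, None),
    "B1": (15, None),
    "B2": (None, "JTAG"),
}
BANK_V = {14: 3.3, 15: 1.8}

if __name__ == "__main__":
    tracker = PinTracker(PINS, BANK_V)
    tracker.assign("A2", "LED0", 3.3)
    print("free 3.3 V balls:", tracker.free_balls(3.3))
    print("free 1.8 V balls:", tracker.free_balls(1.8))
```

The voltage check is the part a plain spreadsheet doesn't give you for free: it catches a 3.3 V signal landing in a 1.8 V bank at the moment you assign it rather than after the boards arrive.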
In general, routing big BGAs may seem intimidating at first, but once you get into the routine, so to speak, it's not that hard. Just think ahead and break out the inner balls first (to lessen the chance of cornering yourself). Break out the components the traces are going to first, so that you know in which order the traces need to come from the FPGA. Then break out the FPGA balls you want to connect them to, making sure that the "bus" has the same trace order as the component you're connecting it to, and then connect them, adjusting trace width where needed (for example for controlled-impedance traces, or for power lines, which need to be thicker).
Also, since these boards cost a non-trivial amount of money to manufacture, don't rush it: check, double-check and triple-check all package pinouts (unless you've used them before and know for sure they are correct; I've messed this up more times than I'm willing to confess). If you need traces with controlled impedance (as I do here for the HyperRAM chips), make sure you reach out to the fab you're going to use to find out the layer stackup, so that you can calculate the trace widths from that information. Do backups often and be prepared to go a few versions back in case you realize that you've cornered yourself (and trust me, you will do that a few times until you get some experience). Use the 3D view to check that there is enough clearance between components (especially connectors, as they tend to have parts protruding beyond the footprint). Also think about how you're going to assemble the board when you decide where to place components, and place noisy magnetics and power parts away from sensitive analog and high-speed traces. Another thing: even though KiCAD's DRC is a royal pain in the behind, make sure you run it and fix all errors which are actual errors (it often issues false alarms, like complaining that package pins are too close to each other - nothin' you can do about that! - or some other nonsense like that). This DRC tool also seems to ignore copper pour-to-pour clearances, so make sure you check those manually.
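On the controlled-impedance point, here is a rough sketch of how a trace width can be estimated once the fab gives you the stackup. It uses the common IPC-2141 closed-form approximation for a surface microstrip, which is not necessarily what I or your fab would use, and it is only reasonable for roughly 0.1 < w/h < 2. The stackup numbers in the example (0.2 mm dielectric, 0.035 mm copper, Er of 4.3) and the 50 ohm target are placeholders, not values from this board.

```python
import math


def microstrip_z0(w, h, t, er):
    """Approximate surface microstrip impedance in ohms (IPC-2141 formula).

    w: trace width, h: dielectric height to the reference plane,
    t: copper thickness (all in the same unit), er: relative permittivity.
    """
    return 87.0 / math.sqrt(er + 1.41) * math.log(5.98 * h / (0.8 * w + t))


def microstrip_width(z0, h, t, er):
    """Invert the formula above to get the trace width for a target impedance."""
    return (5.98 * h / math.exp(z0 * math.sqrt(er + 1.41) / 87.0) - t) / 0.8


if __name__ == "__main__":
    # Placeholder stackup values in millimetres; ask your fab for the real ones.
    h_mm, t_mm, er = 0.2, 0.035, 4.3
    w_mm = microstrip_width(50.0, h_mm, t_mm, er)
    print(f"estimated width: {w_mm:.3f} mm")
    print(f"check: {microstrip_z0(w_mm, h_mm, t_mm, er):.1f} ohm")
```

Treat the result as a starting point and confirm it with the fab's own calculator or a field solver, since closed-form approximations like this can be off by several percent.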
And since this thread is about assembly, I ended up using leaded paste for both sides. I assembled the bottom side first, then used some bolts and nuts to make standoffs so that I could assemble the top side. During top-side reflow even big 1210 components stayed in place just fine, not to mention smaller parts. I purchased a very affordable stereo microscope from AmScope (SE-410-XYZ, $200+S/H,
here), also an excellent (if quite expensive, but totally worth it) vacuum pickup tool with a set of tips for small components from Zephyrtronics ($493 for the tool and tips + S/H,
here), and I did the reflow in a cheap Chinese T-962 oven (with some minor hardware "upgrades" and community firmware). These tools made assembly a breeze, and I hope they will serve me well in future projects.