Thanks for the advice! I decided to stop being lazy and finally invest the time to get my Arch Linux netbook going and usable, so it's running Diamond now.
Side question: what about making an array of FPGAs, an array of arrays, i.e. one bigger array? Is there anything wrong with doing that to get a bigger FPGA for less money (comparing logic blocks per dollar)?
If you have what is essentially a monolithic design that you intend to partition between FPGAs, you are creating a world of hurt for yourself! Whilst what you describe is technically feasible, you will have to partition your design into independent chunks with a minimal interface between chips, and that interface must be defined at board-creation time (or at least the number of pins and the IO standard must be fixed then). The partitions of your design also need a roughly even split in resource usage, which is often not the case. Not to mention that you now have multiple designs to implement, multiple designs to close timing on, and multiple chips to program every time you change your code, making your iteration time vastly greater. Assuming you value your time, you are likely better off buying the larger FPGA.
Having said that, if you have the same design that you want to replicate multiple times (one instance per FPGA), then the approach you describe makes a lot of sense. It could also make sense if your design consisted of several largely independent blocks with simple, well-defined interfaces (e.g. a video processing system that feeds a video encoder, where the interface between the two is just a pixel stream). If you need high data throughput between the FPGAs, you would likely be best off using a high-speed serial interface, with FPGAs that have appropriate gigabit transceivers.
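To get a feel for whether a gigabit serial link is enough for that pixel-stream example, here's a quick back-of-the-envelope check. The frame size, frame rate, and bit depth are assumed example values, not figures from the thread:

```python
# Rough bandwidth estimate for a raw pixel stream between two FPGAs.
# All parameters are assumptions for illustration: 1080p at 60 fps, 24 bits/pixel.
width, height = 1920, 1080
fps = 60
bits_per_pixel = 24

raw_bps = width * height * fps * bits_per_pixel
print(f"raw pixel stream: {raw_bps / 1e9:.2f} Gbit/s")  # prints "raw pixel stream: 2.99 Gbit/s"
```

So a single multi-gigabit transceiver lane could just about carry it (before accounting for line-coding overhead such as 8b/10b), whereas doing the same over ordinary GPIO would need a wide parallel bus.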
There is no single chip that comes anywhere near the amount of hardware I'm aiming for. The current plan is to make little "modules" of 25 or 100 FPGAs each, with some ROM on the opposite side of the board behind each FPGA (this saves on BGA routing if the pitch is similar; the traces just go straight through). That scales the hardware and ROM equally.

I want to build a Linux kernel set up for FPGA computing, with no processors. As far as I know, FPGA computers are only novelty video game systems; the only serious attempt I've found is COPACOBANA (which uses multiple FPGAs, by the way), and that's meant for encryption cracking and doesn't really count, as it has no operating system. I know it's likely a massive, years-long project, and it's going to be way more expensive in hardware to match a CPU. But I'm going to do it anyway, if I can. Rather than allocating time on a processor core to processes, I intend to make a kernel that allocates HARDWARE to processes (and it will probably need a hardware equivalent of virtual memory mapping).

Starting out, I've selected an FPGA with nonvolatile, instant-on, background-reconfigurable configuration memory, as this takes care of a lot of FPGA issues. Eventually, though, there's an FPGA I'd like to migrate to that doesn't have those features but has the most bang per buck in terms of hardware. A processorless reconfigurable computer is what I'm aiming for; bigger things later lol.
I've learned that Linux programs are mostly distributed with source code, which makes me feel like there's a way to compile everything in a form optimized for FPGAs, rather than running a soft CPU for anything that wasn't designed to run on an FPGA. "We'll get there when we get there."
The FPGA I'm starting out with is the Lattice MachXO3L (the version with flash memory isn't a good fit, due to the 100k-cycle life of the flash; it wouldn't last long being used as RAM in a computer). Eventually, I feel switching to their ECP5 chips would offer the most bang per buck, i.e. make it cheapest to scale the computer's power up. Imagine there's no limit on how much power we need, so you just want the cheapest hardware per logic element.
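On the flash-endurance point, a quick sketch shows why a 100k-cycle memory can't serve as RAM. The write rate here is an assumed, deliberately conservative figure; real RAM sees orders of magnitude more writes per second to hot locations:

```python
# How long would flash rated for 100,000 write cycles last if used as RAM?
# write_rate_hz is an assumption: only 1,000 writes/s to a single cell,
# which is far gentler than real RAM traffic.
endurance_cycles = 100_000
write_rate_hz = 1_000

lifetime_s = endurance_cycles / write_rate_hz
print(f"worn out after {lifetime_s:.0f} seconds "
      f"(~{lifetime_s / 60:.1f} minutes)")  # prints "worn out after 100 seconds (~1.7 minutes)"
```

Even under that unrealistically light load, the cell wears out in under two minutes, so the concern above is well founded.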
Knowing little about FPGAs, and having never programmed one, am I horribly wrong anywhere? Any major issues I'm not seeing?