FPGA / Flash-backed LLMs on FPGA?
« Last post by NiHaoMike on Today at 04:59:35 am »
My understanding of LLMs is that there's a very large "lookup table" (the model weights) that's iterated through many times while it generates the output. It's normally stored in RAM, which gets quite expensive for the larger models. But what about storing it on Flash instead? The reason I see cited against this is that SSDs are too slow and would bottleneck it, but what if you just had an FPGA with a lot of Flash chips wired to it? Those are readily available in the form of "ioDrive" SSDs, with 365 GB versions going for about $32. Reprogramming one of those for LLM use would be quite a task, but would it even be technically feasible?
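
To put a rough number on the bottleneck question, here's a back-of-envelope sketch in Python. It assumes the whole weight table has to be read once per generated token (roughly true for a dense model with no batching, as far as I understand it), so the token rate is capped at read bandwidth divided by model size. The function names and the figures in the example are placeholders I made up for illustration, not measured specs of an ioDrive or of any particular model.

```python
def tokens_per_second(model_bytes: float, read_bandwidth_bytes_per_s: float) -> float:
    """Upper bound on token rate if every token needs one full pass over the weights.

    Assumption: generation is purely bandwidth-bound (compute and latency ignored).
    """
    if model_bytes <= 0 or read_bandwidth_bytes_per_s < 0:
        raise ValueError("model size must be positive and bandwidth non-negative")
    return read_bandwidth_bytes_per_s / model_bytes


def required_bandwidth(model_bytes: float, target_tokens_per_s: float) -> float:
    """Aggregate read bandwidth (bytes/s) needed to sustain a target token rate."""
    if model_bytes <= 0 or target_tokens_per_s < 0:
        raise ValueError("model size must be positive and target rate non-negative")
    return model_bytes * target_tokens_per_s


if __name__ == "__main__":
    GB = 1e9
    model_size = 100 * GB  # hypothetical model size, not a real model
    # Hypothetical aggregate Flash read bandwidths to compare, in bytes/s.
    for bandwidth in (1 * GB, 10 * GB, 100 * GB):
        rate = tokens_per_second(model_size, bandwidth)
        print(f"{bandwidth / GB:.0f} GB/s -> at most {rate:.2f} tokens/s")
    # Bandwidth needed for a hypothetical target of 5 tokens/s.
    print(f"{required_bandwidth(model_size, 5) / GB:.0f} GB/s needed for 5 tokens/s")
```

So the question mostly comes down to how much aggregate read bandwidth the FPGA can actually pull from all the Flash channels in parallel, compared with the size of the model.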