1024 taps is a LOT... are you sure you need that many? Does Altera have built-in IP for FIR filters? I know Xilinx does, and it lets you trade area for speed by feeding it a clock that's faster than the input data rate, so each multiplier gets reused for several taps. A couple of months ago I used it to implement a 144-tap LPF on an Artix-7: 625 kHz sample rate, 32-bit, dual channel, 40 MHz clock. It takes up 14 of the 120 DSP slices on a 50k LE Artix-7, plus a negligible amount of other resources.
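To put rough numbers on the area-for-speed tradeoff, here's a back-of-the-envelope sketch in Python. The function name `estimate_multipliers` and the `symmetric` flag are made up for illustration and are not part of any vendor tool. It assumes a fully time-multiplexed filter where each multiplier does one multiply-accumulate per clock, and it counts multipliers, not DSP slices: a 32-bit operand is wider than a single DSP slice's multiplier, so each multiply likely spans more than one slice, and this estimate won't reproduce the 14-slice figure directly.

```python
import math


def estimate_multipliers(taps, channels, sample_rate_hz, clock_hz, symmetric=False):
    """First-order multiplier count for a time-multiplexed FIR (illustrative only).

    Assumes one multiply-accumulate per multiplier per clock cycle and ignores
    operand width, pipeline overhead, and any tool-specific optimizations.
    """
    # Clock cycles available to compute each output sample.
    cycles_per_sample = clock_hz // sample_rate_hz
    if cycles_per_sample < 1:
        raise ValueError("clock must be at least as fast as the sample rate")

    # A symmetric (linear-phase) filter can pre-add mirrored samples,
    # roughly halving the multiplies per output.
    macs_per_channel = math.ceil(taps / 2) if symmetric else taps
    total_macs = macs_per_channel * channels

    # Each multiplier is reused cycles_per_sample times per output sample.
    return math.ceil(total_macs / cycles_per_sample)


if __name__ == "__main__":
    # The 144-tap, dual-channel, 625 kHz / 40 MHz case from the post.
    print(estimate_multipliers(144, 2, 625_000, 40_000_000))
    print(estimate_multipliers(144, 2, 625_000, 40_000_000, symmetric=True))
```

With 40 MHz / 625 kHz = 64 clocks per sample, the 144-tap dual-channel case needs about 5 multipliers without exploiting symmetry, or about 3 with it. The same arithmetic shows why 1024 taps gets expensive quickly unless the clock-to-sample-rate ratio is large.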