FPGAs have some serious limitations. Consider the following task: there is a 48-bit counter incrementing by one every clock cycle and compared against a set value. Simple, right? Design has to have zero latency, (meaning: pipelining is not allowed.)

`always@(posedge clk)`

begin

counter<=counter+48'b1;

if (counter==set_value) out_compare<=1;

else out_compare<=0;

NO.

for one: most recent stuff (Virtex 7, Spartan 6 etc) have 6 input LUTs which means that comparison of two 48-bit values will be synthesized as 16 LUTs, each performing comparison of 3 bits from one value and 3 from the other, then results of those 16 LUTS are ANDed by another 3 LUTs and then this one is ANDed by another LUT. So you have 3 levels of logic which pretty much limits you to below 150 MHz. Now you add a binary counter. Synthesize that in FPGA fabric and you go like 'w00t, 40MHz'. That's because due to carry logic setup time of all your DFFs (48 of them) adds up. Ok, but there is a DSP48A1 on a Spartan6, why not use that? You can, it needs about 2 clock cycles to reload to certain value, start counting down and spit out first valid result.

When you can pipeline stuff at will, you can get your 300Mhz or whatever, but when this is not and option, things start to get hard.