I don't particularly care for this style and currently see no specific justification for using it in VHDL with modern tools (if anyone has a good rationale, I'll be glad to read it). I have seen it used quite often by older people (so this is still likely to be the preferred style of many older teachers).
For all of us old geezers out here, can you post an example of the newer style preferred by younger people?
Just take the two-process FSM (one clocked process to stuff next_state into current_state, and one combinatorial process where the actual work is done) and rewrite it as one large clocked process.
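For anyone who wants to see the two styles side by side, here is a minimal sketch. The entity, port names (start, done, busy) and state names are made up for illustration and are not taken from the OP's design; a synchronous, active-high reset is also an assumption.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical three-state FSM used only to compare the two coding styles.
entity fsm_demo is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;  -- synchronous, active high (assumption)
    start : in  std_logic;
    done  : in  std_logic;
    busy  : out std_logic
  );
end entity fsm_demo;

-- Style 1: two processes (state register + combinatorial next-state/output logic).
architecture two_process of fsm_demo is
  type state_t is (ST_IDLE, ST_RUN, ST_WRAP);
  signal current_state, next_state : state_t := ST_IDLE;
begin
  state_reg : process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        current_state <= ST_IDLE;
      else
        current_state <= next_state;
      end if;
    end if;
  end process state_reg;

  next_logic : process (current_state, start, done)
  begin
    -- Defaults on every path, otherwise synthesis infers latches.
    next_state <= current_state;
    busy       <= '0';
    case current_state is
      when ST_IDLE =>
        if start = '1' then
          next_state <= ST_RUN;
        end if;
      when ST_RUN =>
        busy <= '1';  -- combinatorial output, may glitch between clock edges
        if done = '1' then
          next_state <= ST_WRAP;
        end if;
      when ST_WRAP =>
        next_state <= ST_IDLE;
    end case;
  end process next_logic;
end architecture two_process;

-- Style 2: one clocked process; state and output are both registers.
architecture one_process of fsm_demo is
  type state_t is (ST_IDLE, ST_RUN, ST_WRAP);
  signal state : state_t := ST_IDLE;
begin
  fsm : process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        state <= ST_IDLE;
        busy  <= '0';
      else
        case state is
          when ST_IDLE =>
            if start = '1' then
              state <= ST_RUN;
              busy  <= '1';  -- set on the transition so busy lines up with ST_RUN
            end if;
          when ST_RUN =>
            if done = '1' then
              state <= ST_WRAP;
              busy  <= '0';
            end if;
          when ST_WRAP =>
            state <= ST_IDLE;
        end case;
      end if;
    end if;
  end process fsm;
end architecture one_process;
```

In the one-process version there is no sensitivity list to keep in sync and no default assignments needed to avoid latches, and busy comes straight out of a flip-flop. The price is that registered outputs have to be assigned on the transition into a state if you want them aligned with that state, as done for busy here.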
Really, it doesn't matter if the combinatorial process is glitchy as long as everything settles out before the next clock because other processes aren't going to trigger off the output until the next clock anyway. Even inside the large clocked process approach, signals still transition (and bounce around) following a clock transition. As long as things settle out before the next clock, no harm, no foul.
Exactly and exactly. This FSM style is quite common as I said above, although I personally don't like it and see more negative points than positive, but that's just my opinion. I tend to favor clocked processes only, which overall look more readable to me, although "clockless" processes are certainly possible. There is no inherent problem if you use them wisely. Timing analysis will normally catch potential issues as long as it's in the same clock domain and remains inside the FPGA. (Also, don't underestimate modern synthesis tools. They do a lot more than translating your code "literally". Trying to "hand optimize" code on a very low level, based on assumptions of what the synthesis will spit out, is often fruitless. Some vendors release guides for writing RTL for their specific tools, with sometimes rather different approaches.)
Again, most issues with delays between paths inside an FPGA, as long as they are in the same clock domain, will be caught by timing analysis. One thing the OP didn't state either is whether their tool reported timing violations, so that would obviously be a first step.
But in the OP's case, there is an external signal. Even though it can be considered to be in the same clock domain, which the OP only made clear later on (seems reasonable for a slave FIFO, but remains to be checked in the FX2 manual), its characteristics are completely outside the control of the FPGA tools' timing analysis, and it may have excessive delay/skew, hence the possible need to resynchronize it IMO.
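By resynchronizing I mean something along these lines. This is only a sketch of a generic two-flop input synchronizer for a single-bit signal; ext_flag and flag_sync are hypothetical names, not the OP's actual FX2 signals, and whether one or two register stages are appropriate depends on the interface timing given in the FX2 manual.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Registers an external single-bit input twice before the FSM sees it.
entity ext_resync is
  port (
    clk       : in  std_logic;
    ext_flag  : in  std_logic;  -- external pin (hypothetical name)
    flag_sync : out std_logic   -- version that is safe to use inside the FPGA
  );
end entity ext_resync;

architecture rtl of ext_resync is
  signal meta   : std_logic := '0';  -- first stage, may go metastable
  signal stable : std_logic := '0';  -- second stage, settled value
begin
  process (clk)
  begin
    if rising_edge(clk) then
      meta   <= ext_flag;
      stable <= meta;
    end if;
  end process;

  flag_sync <= stable;
end architecture rtl;
```

Note that this adds two clock cycles of latency to the flag, which the FSM has to tolerate, and that it is only suitable for single-bit signals, not for multi-bit buses whose bits may arrive skewed relative to each other.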
I strongly recommend using test benches as well, but it's usually not possible to catch this kind of issue with simulation.