From the links above, the uC will receive 2 interrupts per phase per cycle, or about 360 interrupts per second (assuming 60 Hz). Then all that needs to happen is for a Triac to get fired at some time delay after each zero crossing. Only 3 timers are necessary, one per phase, each started at the two zero crossings per cycle. That's probably why the projects are using Mega boards: the ATmega328P on the standard Arduino has three timers, but Timer0 is used by the Arduino core for millis() and delay(), which leaves only two free. The Mega's ATmega2560 has six.
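As a quick sanity check on those numbers, here is a minimal sketch in plain C of the arithmetic. The function names are mine, not from any of the linked projects. Note that every zero-crossing interrupt is followed by a timer interrupt, so the total interrupt load is roughly double the zero-crossing rate.

```c
#include <stdint.h>

/* Zero-crossing interrupts per second: each phase crosses zero twice per cycle. */
static uint32_t zero_cross_irqs_per_sec(uint32_t mains_hz, uint32_t phases)
{
    return 2u * phases * mains_hz;
}

/* Total interrupts per second if every zero crossing also produces
   one timer-expiry interrupt to fire the Triac. */
static uint32_t total_irqs_per_sec(uint32_t mains_hz, uint32_t phases)
{
    return 2u * zero_cross_irqs_per_sec(mains_hz, phases);
}

/* Length of one half cycle in microseconds (integer division, rounded down).
   The firing delay has to fit inside this window. */
static uint32_t half_cycle_us(uint32_t mains_hz)
{
    return 1000000u / (2u * mains_hz);
}
```

For 3 phases at 60 Hz that works out to 360 zero-crossing interrupts and 720 interrupts in total per second, with about 8333 us available per half cycle.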
I'm not sure how effective the Arduino libraries are going to be. I would be thinking in terms of writing the code to interact directly with the hardware.
The interrupt routines need to be very short. Once zero crossing occurs, we need to get the timer started quickly. When the timer interrupt occurs, we need to fire the Triac quickly. There are a lot of things going on and a lot of interrupts firing.
One idea that just flew past: Use 3 Arduinos, one per phase. It simplifies the code substantially as each uses the exact same code and simply interrupts on the appropriate zero crossing, starts a timer, gets an interrupt on timeout and fires the Triac.
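To show how little each board would have to do, here is a rough sketch of the per-phase logic as two handlers. It is hardware-independent C: the struct fields are hypothetical stand-ins for a timer compare register and a port pin, and on a real AVR the two functions would be the bodies of the external-interrupt and timer ISRs. Treat it as an illustration of the sequence, not working drive code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hardware of one phase. */
typedef struct {
    uint16_t delay_ticks;   /* firing delay chosen by the main loop */
    uint16_t timer_target;  /* value the delay timer would be loaded with */
    bool     timer_armed;   /* true while the delay timer is running */
    bool     gate_on;       /* state of the Triac gate drive */
} phase_ctrl;

/* Zero-crossing interrupt: drop the gate drive and start the delay timer. */
static void on_zero_cross(phase_ctrl *p)
{
    p->gate_on = false;
    p->timer_target = p->delay_ticks;
    p->timer_armed = true;
}

/* Timer-expiry interrupt: fire the Triac and stop the timer. */
static void on_timer_expire(phase_ctrl *p)
{
    if (!p->timer_armed) {
        return;             /* ignore an expiry with no zero crossing before it */
    }
    p->timer_armed = false;
    p->gate_on = true;
}
```

Both handlers are only a few assignments, which is the point: with one phase per board there is nothing else competing for the interrupt time.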
There are details omitted: How long should the timer delay be? It could be 0 ns if we want 100% voltage, so we need to account for that. I suppose there needs to be an analog input to read where some speed-control knob is set. Just the interrupt prologue code is going to account for some additional delay.
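One possible way to turn the knob reading into a timer delay is sketched below. Everything numeric here is an assumption on my part: a 16 MHz clock, a timer prescaler of 8 (0.5 us per tick), 60 Hz mains, a 10-bit ADC, a small floor standing in for the interrupt entry overhead, and a margin so the Triac is never fired right at the end of the half cycle. The mapping is linear in delay, which is not linear in output voltage, so a real controller would probably want a lookup table instead.

```c
#include <stdint.h>

#define F_CPU_HZ         16000000UL  /* assumed CPU clock */
#define TIMER_PRESCALE   8UL         /* assumed prescaler: 0.5 us per tick */
#define MAINS_HZ         60UL        /* assumed mains frequency */
#define ADC_MAX          1023UL      /* full scale of a 10-bit ADC */
#define MIN_DELAY_TICKS  20UL        /* assumed floor for ISR entry overhead */
#define END_MARGIN_TICKS 400UL       /* assumed margin before the next zero crossing */

/* Timer ticks in one half cycle: 16666 with the values above. */
#define TICKS_PER_HALF_CYCLE (F_CPU_HZ / TIMER_PRESCALE / (2UL * MAINS_HZ))

/* Map a knob reading to a firing delay in timer ticks.
   Knob at full scale gives the shortest delay (close to 100% voltage);
   knob at zero gives the longest delay the margin allows. */
static uint16_t firing_delay_ticks(uint16_t adc)
{
    if (adc > ADC_MAX) {
        adc = ADC_MAX;
    }
    uint32_t ticks = (uint32_t)(ADC_MAX - adc) * TICKS_PER_HALF_CYCLE / ADC_MAX;
    if (ticks < MIN_DELAY_TICKS) {
        ticks = MIN_DELAY_TICKS;
    }
    if (ticks > TICKS_PER_HALF_CYCLE - END_MARGIN_TICKS) {
        ticks = TICKS_PER_HALF_CYCLE - END_MARGIN_TICKS;
    }
    return (uint16_t)ticks;
}
```

With these assumed values the result always fits in a 16-bit timer, and the "0 ns" case simply clamps to the floor rather than asking the timer for a delay shorter than the code can deliver.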
I haven't looked at any of the projects in detail. I would just buy a variable speed drive.
Given the idea of 3 uCs, I might be inclined to use something much smaller than an Arduino but I would need to resolve device programming before I walked away from the Arduino. In any event, I wouldn't be using any of the Arduino supplied code. I would want my code very tight!