The bypass caps are there to keep the silicon devices running stably. Putting, say, a 1uF and a 100nF cap right next to each shift register and the Arduino chip means that while you're switching the higher-current LEDs, the supply won't sag at those chips and you won't get glitches (hopefully).
Now, if you do away with digitalWrite() on the Arduino and toggle the pins directly, you get much faster output (it only takes 2 clock cycles to set a pin state). That leaves you more time to process your animations and gets you a higher frame rate. For a 10x10x10 RGB cube this seriously adds up. You'll be shifting in (10+10) x 3 = 60 bits each step for the colours; double that for the clock pin and add 2 toggles for the register (latch) pin, assuming you're using one, and you get 122 pin state changes per layer, or 1220 for all 10 layers. Add your row pulses and that comes to about 200us per frame spent writing, which means you have tonnes of time left to process your animations.
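If you want to sanity-check those numbers, here's a rough sketch of the arithmetic in C++. The bit counts and the 2 cycles per pin write come from the post above; the 16 MHz clock is my assumption (typical for an Uno-class board), and all the names are made up for illustration. The comment at the top shows what a direct port write looks like on an ATmega328P, where Arduino pin 8 is PB0.

```cpp
#include <cstdio>

// On an ATmega328P-based Arduino, a direct pin write looks like:
//   PORTB |=  _BV(PB0);   // pin 8 high, instead of digitalWrite(8, HIGH)
//   PORTB &= ~_BV(PB0);   // pin 8 low,  instead of digitalWrite(8, LOW)
// The code below only does the timing arithmetic, so it also builds on a PC.

// Figures from the post: (10 + 10) bits per colour, 3 colours, 10 layers.
constexpr unsigned kBitsPerColour = 10 + 10;
constexpr unsigned kColours       = 3;
constexpr unsigned kLayers        = 10;

// From the post: a direct pin write takes about 2 clock cycles.
constexpr unsigned kCyclesPerWrite = 2;

// ASSUMPTION (not stated in the post): 16 MHz CPU clock.
constexpr double kClockMHz = 16.0;

// Data bits, doubled for the clock pin, plus 2 toggles of the latch pin.
constexpr unsigned pinChangesPerLayer() {
    return kBitsPerColour * kColours * 2 + 2;
}

// All layers together, not counting the row pulses.
constexpr unsigned pinChangesPerFrame() {
    return pinChangesPerLayer() * kLayers;
}

// Time spent on those pin writes, in microseconds.
constexpr double frameWriteMicros() {
    return pinChangesPerFrame() * kCyclesPerWrite / kClockMHz;
}

// Prints the budget; call this from main() on a PC.
void printFrameBudget() {
    std::printf("pin changes per layer: %u\n", pinChangesPerLayer());
    std::printf("pin changes per frame: %u\n", pinChangesPerFrame());
    std::printf("write time per frame:  %.1f us\n", frameWriteMicros());
}
```

With those assumptions the pin writes alone come to 152.5us per frame, so "about 200us" once the row pulses and loop overhead are added looks like the right ballpark. If your code writes both edges of every clock pulse separately, count one extra write per bit.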