You have been thoroughly duped by the Arduino framework.
Arduino sucks beginners into blinking-LED programs and teaches them very sloppy programming, and when they need to do more than blink a few LEDs their projects start to fall apart.
To get from "arduino" to "decent programming" you have to completely change course: plow your way through some documentation, read a few good books, and practice with what you learn.
The proper way is to divide a program into a bunch of small tasks. You let ISRs handle the input and output, and the ISRs read from and write to buffers. Then, in your main program, you either keep a keen eye on your buffers or you wait and react to signals from the ISRs.
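To make the ISR-plus-buffer idea concrete, here is a minimal sketch of a single-producer, single-consumer ring buffer. The names (`uart_rx_isr`, `uart_read`, `RX_BUF_SIZE`) are made up for illustration, and the ISR is written as a plain function that takes the received byte; on real hardware you would call it from the UART interrupt vector of your particular chip. It also assumes that reading and writing a single byte is atomic, which holds on typical 8-bit parts.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_BUF_SIZE 64u  /* hypothetical size; must be a power of two for the mask */

static volatile uint8_t rx_buf[RX_BUF_SIZE];
static volatile uint8_t rx_head;  /* written only by the ISR */
static volatile uint8_t rx_tail;  /* written only by the main loop */

/* Hypothetical receive ISR body: stores one byte and returns immediately. */
void uart_rx_isr(uint8_t byte)
{
    uint8_t next = (uint8_t)((rx_head + 1u) & (RX_BUF_SIZE - 1u));
    if (next != rx_tail) {        /* buffer full: drop the byte */
        rx_buf[rx_head] = byte;
        rx_head = next;
    }
}

/* Called from the main loop: fetches one byte if any is waiting. */
bool uart_read(uint8_t *out)
{
    if (rx_tail == rx_head) {
        return false;             /* nothing received yet */
    }
    *out = rx_buf[rx_tail];
    rx_tail = (uint8_t)((rx_tail + 1u) & (RX_BUF_SIZE - 1u));
    return true;
}
```

Because each index has exactly one writer, the main loop never has to disable interrupts to read a byte. One slot always stays empty to tell "full" from "empty", so this buffer holds at most 63 bytes.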
One of the goals is to keep the interrupts small, because they get executed at "random" intervals and hold up everything else while they run. The "bulk" of the code is executed in the main loop, which has (or should have) much looser timing constraints.
This means that tasks like "sampling the inputs", "reading serial data" or "writing serial data" are all done by the ISRs in 100 µs (or even smaller) chunks, and at exactly the right moment. Your main loop is (almost) free to do the filtering algorithms and other housekeeping tasks.
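As a rough sketch of how the main loop can run housekeeping tasks on a schedule without ever blocking, assume a timer ISR that does nothing but bump a tick counter. Everything here (`timer_tick_isr`, `periodic_t`, `periodic_due`) is a hypothetical name for illustration, and the timer setup itself is chip-specific and left out.

```c
#include <stdbool.h>
#include <stdint.h>

static volatile uint8_t tick_count;  /* written only by the timer ISR */

/* Hypothetical timer ISR body: one increment, nothing else. */
void timer_tick_isr(void)
{
    tick_count++;
}

/* One periodic task: when it last ran and how often it should run (in ticks). */
typedef struct {
    uint8_t last;
    uint8_t period;
} periodic_t;

/* Polled from the main loop; returns true once per elapsed period. */
bool periodic_due(periodic_t *t)
{
    uint8_t now = tick_count;  /* single-byte read */
    /* Unsigned subtraction stays correct when the counter wraps at 256. */
    if ((uint8_t)(now - t->last) >= t->period) {
        t->last = (uint8_t)(t->last + t->period);
        return true;
    }
    return false;
}
```

The main loop just keeps polling `periodic_due()` for each task and does other work in between, so no cycles are burned waiting. With an 8-bit counter the period has to stay well below 256 ticks, and the loop has to come around at least once per wrap.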
Managing such ISRs and the way they interact with buffers and the main loop is where real programming starts.
Also, depending on how many samples you take, calculating RMS values and filtering can take a serious chunk out of the available clock cycles, though I don't know how much. You should benchmark such algorithms, but make a careful distinction between the uC doing something useful and wasting valuable cycles in a delay() loop. (Which is sloppy programming "taught" by "arduino" to "beginners" who have to unlearn it later.)
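For something to benchmark, here is one possible integer-only RMS routine. It is a sketch under my own assumptions (signed 16-bit samples, result rounded down), not necessarily what your project needs, and `isqrt64` and `rms_i16` are names I made up. It avoids floating point entirely, which matters on a uC without an FPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Integer square root, rounded down, using the bit-by-bit method. */
uint32_t isqrt64(uint64_t x)
{
    uint64_t res = 0;
    uint64_t bit = (uint64_t)1 << 62;  /* highest power of four in 64 bits */

    while (bit > x) {
        bit >>= 2;
    }
    while (bit != 0) {
        if (x >= res + bit) {
            x -= res + bit;
            res = (res >> 1) + bit;
        } else {
            res >>= 1;
        }
        bit >>= 2;
    }
    return (uint32_t)res;
}

/* RMS of n signed 16-bit samples, rounded down; returns 0 for n == 0. */
uint16_t rms_i16(const int16_t *samples, size_t n)
{
    if (n == 0) {
        return 0;
    }
    uint64_t sum_sq = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t s = samples[i];
        sum_sq += (uint64_t)(s * s);  /* at most 2^30, fits in int32_t */
    }
    return (uint16_t)isqrt64(sum_sq / n);
}
```

To benchmark it, read a free-running hardware timer (or `micros()` if you stay on Arduino) right before and right after the call, and compare the difference against the time you have between sample blocks. That measures real work, not time burned in a delay().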
If processor speed really is a bottleneck, there are plenty of options: microcontrollers running at several hundred MHz, and microcontrollers with built-in floating-point instructions.