Current DIY and most commercial RTF FPV copters are all built the same way: a camera with an analog PAL or NTSC output, a separate video transmitter (usually at 5.8 GHz), a 2.4 GHz receiver that outputs servo signals, a controller board that takes those (analog) servo signals, and individual ESC boards, one per motor, which the controller board commands and which drive the brushless motors. On top of that come multiple batteries (or one battery and some extra BECs) to provide the voltages for the different modules. The advantage of this setup is that it is flexible, and you can combine modules from many different manufacturers. But I think there are some disadvantages.
First, the ESCs could be integrated into the main control board. This would make it easier to update the ESC algorithms, lower the latency and allow better logging (motor stall, current etc.).
Then the receiver could be integrated and changed to use digital signals. This would make it more flexible: no more limited number of channels, and no loss of accuracy because of the PWM modulation (this needs a new transmitter, too). Would it be possible to use WLAN? How high is the latency?
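To make the digital control link idea a bit more concrete, here is a minimal sketch of what one control frame could look like. Everything in it is my assumption and not an existing RC protocol: the start marker, the 16-bit sequence number, 16-bit channel values and the CRC32 check are just picked for illustration. The point is only that the channel count is a field in the frame instead of a hardware limit.

```python
import struct
import zlib

MAGIC = 0xA5  # hypothetical start-of-frame marker, not from any real protocol


def pack_channels(seq, channels):
    """Pack a sequence number and up to 255 channel values (0..65535 each)."""
    header = struct.pack("<BHB", MAGIC, seq & 0xFFFF, len(channels))
    body = struct.pack("<%dH" % len(channels), *channels)
    crc = zlib.crc32(header + body) & 0xFFFFFFFF
    return header + body + struct.pack("<I", crc)


def unpack_channels(frame):
    """Return (seq, channels), or None if the frame is malformed or damaged."""
    if len(frame) < 8:
        return None
    magic, seq, count = struct.unpack_from("<BHB", frame, 0)
    payload_end = 4 + 2 * count
    if magic != MAGIC or len(frame) != payload_end + 4:
        return None
    (crc,) = struct.unpack_from("<I", frame, payload_end)
    if zlib.crc32(frame[:payload_end]) & 0xFFFFFFFF != crc:
        return None
    channels = list(struct.unpack_from("<%dH" % count, frame, 4))
    return seq, channels


# Twelve channels in one frame; a receiver would simply drop damaged frames.
frame = pack_channels(7, [0, 32768, 65535, 1234, 5, 6, 7, 8, 9, 10, 11, 12])
print(len(frame), unpack_channels(frame))
```

The sequence number would let the receiver notice lost or reordered frames, which matters if the link runs over something like WLAN/UDP.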
And then the camera could be connected to the main controller board with a digital interface (like the industry-standard CSI interface, for which you can get powerful, cheap cameras). If the controller board uses an FPGA (the power required by the FPGA would be only a fraction of what the motors draw), the FPGA could even do some real-time image processing, e.g. for obstacle avoidance or speed calculation.
With a high-resolution CSI camera, the FPGA could create an MP4 stream and log it to an SD card as well. No more need for an extra action cam. For the video downlink, the FPGA could scale the image to reduce the bandwidth. The easiest solution would then be to convert it to PAL or NTSC and use existing transmitter and receiver hardware. How high is the video latency of a CSI camera? If it first buffers one image, it could already be too slow for some FPV experts.
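For a rough feeling of what "buffers one image" costs, the delay is simply the frame time, 1/fps. Here is a small calculation, assuming nothing else in the chain adds delay and ignoring blanking intervals; the frame rates and the 720-line figure are just example values of mine, not data from a specific camera.

```python
def frame_buffer_latency_ms(fps, buffered_frames=1):
    """Delay added by holding back whole frames before passing them on."""
    return 1000.0 * buffered_frames / fps


def line_buffer_latency_ms(fps, lines_per_frame, buffered_lines=1):
    """Delay added by holding back only some lines (blanking ignored)."""
    return 1000.0 * buffered_lines / (fps * lines_per_frame)


for fps in (25, 30, 60, 120):
    print("%3d fps: one frame = %.2f ms, one line of 720 = %.4f ms"
          % (fps, frame_buffer_latency_ms(fps), line_buffer_latency_ms(fps, 720)))
```

So a full-frame buffer at 25 fps already costs 40 ms, while a pipeline that only buffers a few lines would add far less than a millisecond.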
But the video feed could be sent digitally, too. I think a line-based transmission would be even better than the analog system. With analog video transmission, sometimes multiple frames are lost until the receiver re-synchronizes. With sufficient synchronization information in the digital signal, only a few lines should get lost. Would WLAN be fast enough for this, too? Using 2.4 GHz has the advantage that the signal has fewer problems with reflections or shielding by trees etc. than 5.8 GHz, and you could use an off-the-shelf WLAN transceiver. Of course, because the camera downlink is digital, it could also carry telemetry data that is not visually embedded in the video feed, so that the video receiver can decide what to do with it, or how to log or process it.
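Here is a small sketch of what I mean by line-based transmission with synchronization information. Each packet carries a frame counter and a line number, so the receiver always knows where a line belongs and a lost packet costs exactly one line. The header layout and the class name are hypothetical, and real video would of course be compressed and much larger.

```python
import struct

# Hypothetical header: 16-bit frame counter, 16-bit line number.
HEADER = struct.Struct("<HH")


def packetize(frame_id, lines):
    """Turn one frame (a list of equally long pixel rows) into one packet per line."""
    return [HEADER.pack(frame_id & 0xFFFF, y) + bytes(row)
            for y, row in enumerate(lines)]


class LineReceiver:
    """Keeps the last good copy of every line; lost packets leave old lines in place."""

    def __init__(self, height, width):
        self.width = width
        self.buffer = [bytes(width) for _ in range(height)]

    def feed(self, packet):
        if len(packet) != HEADER.size + self.width:
            return False  # wrong size, ignore
        _frame_id, y = HEADER.unpack_from(packet)
        if y >= len(self.buffer):
            return False  # line number out of range, ignore
        self.buffer[y] = packet[HEADER.size:]
        return True


# Demo: frame 2 loses lines 3 and 4 on the way, the rest still updates.
rx = LineReceiver(height=8, width=4)
for p in packetize(1, [[10] * 4] * 8):
    rx.feed(p)
for y, p in enumerate(packetize(2, [[20] * 4] * 8)):
    if y not in (3, 4):
        rx.feed(p)
print([row[0] for row in rx.buffer])
```

The receiver never has to re-synchronize: the two lost lines just show the previous frame's content until the next frame overwrites them. Telemetry could ride along as a packet with a reserved line number.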
This system would not be as flexible as the traditional system, but it would result in a much better copter if built for one specific model.
And regarding the PID tuning: why does this have to be done manually? I'm not an expert, but IIRC there are algorithms which do this automatically. If it starts with some reasonable default values and then measures lots of data, including GPS, accelerometer, gyro and maybe even evaluates the video feed (could be done offline on a PC), this shouldn't be too difficult.
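To illustrate the idea of starting from default gains and improving them from measured data, here is a toy sketch. It uses a simple coordinate-descent search (often called "twiddle") against a simulated first-order rate response. Both the plant model and its numbers are assumptions of mine, and I don't claim this is how any flight controller firmware does it; on a real copter the cost would have to come from logged flights or an identified model.

```python
def step_cost(gains, tau=0.1, dt=0.002, steps=500):
    """Integrated squared error of a unit step on an assumed first-order plant."""
    kp, ki, kd = gains
    if min(gains) < 0.0:
        return float("inf")  # reject negative gains
    rate = prev_rate = integral = cost = 0.0
    for _ in range(steps):
        error = 1.0 - rate
        integral += error * dt
        derivative = -(rate - prev_rate) / dt  # derivative on the measurement
        u = kp * error + ki * integral + kd * derivative
        u = max(-10.0, min(10.0, u))  # assumed actuator limit
        prev_rate = rate
        rate += dt * (u - rate) / tau  # Euler step of the toy plant
        cost += error * error * dt
    return cost


def twiddle(cost, gains, deltas, sweeps=30):
    """Try each gain up and down; keep changes that lower the cost."""
    gains, deltas = list(gains), list(deltas)
    best = cost(gains)
    for _ in range(sweeps):
        for i in range(len(gains)):
            old = gains[i]
            for candidate in (old + deltas[i], old - deltas[i]):
                gains[i] = candidate
                trial = cost(gains)
                if trial < best:
                    best = trial
                    deltas[i] *= 1.1  # success: search a bit further next time
                    break
            else:
                gains[i] = old  # neither direction helped
                deltas[i] *= 0.9  # search closer next time
    return gains, best


defaults = [0.5, 0.0, 0.0]  # "reasonable default values" for kp, ki, kd
tuned, tuned_cost = twiddle(step_cost, defaults, [0.5, 0.5, 0.01])
print("default cost:", step_cost(defaults))
print("tuned gains:", tuned, "cost:", tuned_cost)
```

A pure squared-error cost like this tends to push the gains up as far as the actuator limit allows, so a real tuner would also have to penalize overshoot, noise sensitivity and motor heating.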