To me the specs seem a bit excessive, and/or written down as if you're married to the PIC32MZ.
I've built a DC power meter (basically 1-channel data acquisition) on an STM32F407 at 150MHz. It samples a 16-bit ADC at 550ksps and sends the raw data out over UDP. Most TCP/IP stacks I found were right on the limit for this chip at that bandwidth, so I opted for UDP. Because packet loss is inevitable with UDP (and I couldn't have that), I kept each sent data frame in a circular buffer of frame-sized blocks, in 4Mbit of external SRAM. This holds data for a few tenths of a second, which is more than enough time for a Python script or PC application to signal that a particular expected frame ID timed out. Each frame is ~1KiB.
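To give an idea of the PC side, here is a minimal sketch of frame-ID tracking in Python. It is not the actual script: the class name `FrameTracker`, the 0.1 s timeout and the assumption that frame IDs simply count up from 0 without wrapping are all made up for illustration. The only real constraint is that the timeout stays well below the buffer depth on the device (4Mbit is 512KiB, and 550ksps of 16-bit samples is about 1.1MB/s, so a bit under half a second).

```python
import time


class FrameTracker:
    """Tracks received frame IDs and reports gaps that have timed out.

    Assumes IDs start at 0 and increase by 1 per frame (no wrap-around
    handling); both are illustrative assumptions.
    """

    def __init__(self, timeout_s=0.1, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.next_id = 0    # lowest ID not yet seen in order
        self.missing = {}   # frame_id -> time the gap was first noticed

    def on_frame(self, frame_id):
        """Call for every received frame, in arrival order."""
        now = self.clock()
        if frame_id in self.missing:
            # A late or resent frame filled a known gap.
            del self.missing[frame_id]
        elif frame_id >= self.next_id:
            # Every ID skipped between next_id and this frame is now a gap.
            for lost in range(self.next_id, frame_id):
                self.missing[lost] = now
            self.next_id = frame_id + 1
        # Anything else is a duplicate and is ignored.

    def timed_out(self):
        """Return the IDs that have been missing for at least timeout_s."""
        now = self.clock()
        return sorted(fid for fid, seen in self.missing.items()
                      if now - seen >= self.timeout_s)
```

The receive loop would call `on_frame()` for each UDP packet and periodically send the IDs returned by `timed_out()` back to the device, which can still find those frames in its SRAM buffer.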
I used DMA to transfer SPI data to the RAM buffers, triggered from a timer/external interrupt input. By using DMA straight on the UDP payload buffer I also avoided 1 memcpy. However, you do need to keep several UDP buffers, because in my case the ethernet was a separate OS task that needed to "pick up" these buffers at its own rate.
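The buffer hand-off between acquisition and the ethernet task can be modelled roughly like this. It is a plain Python model of the bookkeeping only, not firmware; `FramePool` and its method names are invented for this sketch, and on the real chip the equivalent would be RTOS queues of buffer pointers.

```python
from collections import deque


class FramePool:
    """Model of a fixed pool of payload buffers shared by two tasks."""

    def __init__(self, count, size):
        self.free = deque(bytearray(size) for _ in range(count))
        self.ready = deque()   # filled buffers waiting for the ethernet task
        self.dropped = 0       # times acquisition found no free buffer

    def acquire(self):
        """Acquisition side: get an empty buffer for the next DMA transfer."""
        if not self.free:
            self.dropped += 1
            return None
        return self.free.popleft()

    def submit(self, buf):
        """Acquisition side: hand a filled buffer to the ethernet task."""
        self.ready.append(buf)

    def take(self):
        """Ethernet side: pick up the oldest filled buffer, if any."""
        return self.ready.popleft() if self.ready else None

    def release(self, buf):
        """Ethernet side: return a buffer once it has been sent."""
        self.free.append(buf)
```

The `dropped` counter is the number to watch: if it ever increases, the pool is too small for the worst-case delay of the ethernet task.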
All in all, at 550ksps with DMA it only took 10-12% of CPU (measured from the RTOS using an idle counter).
At a later stage I added 2 digital inputs as well. These are captured with a DMA transfer from a GPIO port, clocked from the external ADC timebase; the captured data is then parsed to put 8 digital samples into 1 byte (compression). I then packed this data at the end of each frame.
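For illustration, the packing step looks roughly like this when written out in Python (the real thing was a tight C loop). The function name `pack_pin`, the bit order (oldest sample in bit 0) and dropping an incomplete tail are assumptions of this sketch.

```python
def pack_pin(port_samples, pin):
    """Pack one pin's state from consecutive GPIO port reads, 8 per byte.

    port_samples: sequence of integers, one full port read per ADC sample.
    pin:          bit position of the digital input within the port word.
    The oldest sample of each group of 8 ends up in bit 0 (assumption).
    Samples that do not fill a whole byte at the end are dropped.
    """
    out = bytearray()
    whole = len(port_samples) - len(port_samples) % 8
    for i in range(0, whole, 8):
        byte = 0
        for k in range(8):
            bit = (port_samples[i + k] >> pin) & 1
            byte |= bit << k
        out.append(byte)
    return bytes(out)
```

With two inputs on pins 0 and 1 of the same port, calling this once per pin turns every 8 port reads into 2 bytes of payload.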
This addition was quite CPU heavy, as you can imagine; however, with a tightly written loop and local GCC -O3 optimisation it only added 20% CPU usage. I suppose 30MIPS (20% of 150MHz) is fair to process 1.1MSPS of data (2 inputs at 550ksps), but not exceptional (27 cycles/sample).
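The cycle budget is quick to check. The variable names are only for this calculation, and it treats 20% load on a 150MHz core as 30 million cycles per second:

```python
cpu_hz = 150e6          # core clock
added_load = 0.20       # extra CPU usage from the digital inputs
sample_rate = 550e3     # ADC timebase, samples per second per input
inputs = 2              # digital inputs sampled on that timebase

cycles_per_second = cpu_hz * added_load        # cycles spent on packing
samples_per_second = sample_rate * inputs      # digital samples to process
cycles_per_sample = cycles_per_second / samples_per_second

print(f"{cycles_per_sample:.1f} cycles per digital sample")
```

That works out to roughly 27 cycles for each digital sample, which includes the shift, mask and store plus loop overhead.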
Going back to your requirements:
10ksps on a 150MHz chip with 512KB of SRAM seems out of proportion. SRAM isn't even the biggest issue; badly written code is. Repetitive, real-time code needs to be fast and predictable. I first tried this project with interrupts but gave up because I was stuck at 250ksps. Entering the interrupt, starting the SPI transfer, bus delays, and reading & saving the sample together took way too long.
If you want to use 4 ADCs you may not be able to use DMA that easily because of bus collisions. Also think about reasserting chip-select on each transfer. In my case I used a trick in hardware: the interrupt output of the ADC chip-selects the ADC itself. The STM32F407 couldn't generate chip-select signals on a DMA stream; maybe newer chips have more options.
Also, like I said, 512K of SRAM is a spec pretty much unique to the PIC32MZ, which marries you to that chip. You can get that much RAM or more on any flagship ARM Cortex chip by adding external S(D)RAM, but that requires a load of wires on the PCB. You can use DMA transfers on external SRAM; STM32 has memory-to-memory transfers. On STM32 (and many other ARM chips; not sure about PIC32MZ without looking it up) the external SRAM integrates completely into the memory space.
The rest of your specs seem quite reasonable, although I'm not sure what "low power" means. Remember that Ethernet is pretty energy hungry because of the termination on the Ethernet wiring.