With 1:1000 steps between the shunts, the likely 12-bit resolution of the µC-internal ADC may be a little low: at the bottom of a range only about 4 counts (4096/1000) are left. That holds even if one can gain some 2 extra bits from oversampling.
I would consider a separate, higher-resolution ADC, like the MCP3421.
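
To put rough numbers on the resolution argument, here is a minimal C sketch. It assumes a 12-bit ADC and the usual oversample-and-decimate scheme (sum 4^n samples, shift right by n to gain n bits). The names `adc_read_fn`, `fake_adc_read`, `adc_oversample` and `counts_at_low_end` are made up for illustration and are not from any vendor library. Note that oversampling only gains bits if there is roughly 1 LSB of noise on the signal.

```c
#include <stdint.h>

/* Hypothetical callback type: returns one raw 12-bit reading (0..4095). */
typedef uint16_t (*adc_read_fn)(void);

/* Hypothetical stand-in for the real ADC read: always mid-scale. */
uint16_t fake_adc_read(void)
{
    return 2048;
}

/* Oversample and decimate: sum 4^extra_bits raw samples, then shift
 * right by extra_bits. The result has (12 + extra_bits) bits.
 * Keep extra_bits small (<= 4) so the 32-bit sum cannot overflow. */
uint32_t adc_oversample(adc_read_fn read_raw, unsigned extra_bits)
{
    uint32_t n = 1u << (2u * extra_bits);   /* 4^extra_bits samples */
    uint32_t sum = 0;

    for (uint32_t i = 0; i < n; i++) {
        sum += read_raw();
    }
    return sum >> extra_bits;
}

/* ADC counts left at the bottom of a range when the ranges are
 * spaced by the given ratio (e.g. 1000 for 1:1000 shunt steps). */
uint32_t counts_at_low_end(unsigned bits, uint32_t ratio)
{
    return (1u << bits) / ratio;
}
```

With 1:1000 steps, `counts_at_low_end(12, 1000)` gives 4 counts and `counts_at_low_end(14, 1000)` gives 16, so even with 2 extra bits from oversampling the low end of each range stays coarse.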
Compared to the µC-internal ADC, the LM321 may not be that bad; however, one may get away with that extra op-amp. AFAIK the MAX4238 can also run from 3.3 V.