Hi, sounds like a nice project. I think either a precision JFET-input opamp or a chopper-stabilized MOSFET opamp would give you better performance, at least on paper. If you go to the Analog Devices or TI website and run through the parametric search, you should be able to find some parts with better specs, but I have some other suggestions to consider.
The first thing is to quantify your limitations from the physics side. 16-bit ADC resolution is nice, but you have to consider how well your optical measurement translates to the parameter you actually care about. I would expect that a relative accuracy of 1%, or certainly 0.1%, is more than enough. I don't know much about oximeters, but my guess is that what you really want is a lot of dynamic range: you have two signals and all you care about is their ratio, but their absolute values may vary by an order of magnitude or more. To get 0.1% accuracy at 1 microamp while keeping 100 microamps on scale, you have to resolve 1 nA out of a 100 microamp full scale -- one part in 100,000, or about 17 bits -- so you need a lot of extra ADC bits.

The old-school solution to this is switchable gain or logarithmic amplifiers. Log amps are especially attractive because they make taking the ratio of two signals quite simple: a pair of log amps feeding a differencing amplifier gives you a signal proportional to the log of the ratio. You can feed this into a low resolution ADC and display the result in optical density straight away.

On the other hand, we are lucky enough to live in the age of cheap and plentiful ADCs. Don't be afraid to throw a 24-bit ADC at the problem. The usual suspects have a wide selection of 24-bit ADCs that run from a few samples/s up to a few kS/s. Most of them don't actually deliver 24 effective bits (17-22 is common), especially at the higher sample rates, but they are still quite good. You can get them with built-in multiplexers for multi-channel acquisition, buffer amplifiers, and occasionally programmable gain amplifiers for truly stupendous dynamic range.

Averaging samples from a 12-bit ADC to get 16 effective bits can work, but you have to be careful. A perfectly noiseless DC value will always generate the same ADC code, and averaging will not improve matters. For sample averaging to work you *need* random noise (dither) -- at a minimum a few LSBs, maybe more.
Sometimes your signal will be noisy enough that this is no problem; in other cases you will have to add your own dither. Just keep in mind that you then have to average the noise back out, so the number of samples you need is larger than you might expect: each extra effective bit costs roughly a factor of four in sample count, so going from 12 to 16 bits takes on the order of 256 averaged samples. Unless cost or parts count is a serious issue I would look for a higher resolution ADC.
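If you want to convince yourself of the dither requirement, it only takes a few lines to simulate. This is just a sketch with an idealized quantizer; the 10.3 LSB input and the ±1 LSB uniform dither are arbitrary illustrative choices:

```python
import random

def adc_quantize(v):
    # Idealized ADC: round the input (expressed in LSBs) to the nearest code.
    return round(v)

def averaged_reading(v, n_samples, dither_lsb=0.0):
    # Average n_samples conversions, optionally adding uniform dither noise.
    total = 0
    for _ in range(n_samples):
        total += adc_quantize(v + random.uniform(-dither_lsb, dither_lsb))
    return total / n_samples

random.seed(1)
v = 10.3  # a noiseless DC input sitting between two ADC codes

no_dither = averaged_reading(v, 10000)
with_dither = averaged_reading(v, 10000, dither_lsb=1.0)

print(no_dither)    # exactly 10.0: every conversion returns the same code
print(with_dither)  # close to 10.3: dither lets averaging resolve sub-LSB detail
```

Without dither the average stays stuck at the quantized code no matter how many samples you take; with a couple of LSBs of noise it converges on the true value.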
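And to put a number on the log-amp idea from earlier: the difference of two log-amp outputs is the log of the ratio, so the ADC only has to span the range of the ratio rather than the full dynamic range of either signal. A toy calculation, assuming ideal log amps with a made-up 60 mV/decade scale factor (roughly what a room-temperature transdiode log amp gives you):

```python
import math

K_DECADE = 0.060   # volts per decade of input current; assumed scale factor

def log_amp(i, i_ref=1e-9):
    # Ideal log amp: output proportional to log10 of the input photocurrent.
    return K_DECADE * math.log10(i / i_ref)

# Two photocurrents differing by a factor of 20 in absolute level:
i1, i2 = 3e-6, 60e-6

# A differencing amplifier on the two log outputs yields the log of the
# ratio; the i_ref terms cancel, so absolute levels drop out entirely.
diff = log_amp(i1) - log_amp(i2)
print(round(diff / K_DECADE, 3))  # log10(i1/i2) = -1.301
```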
For better accuracy you don't want to measure at DC if you can avoid it. You can make your amplifier close to perfect, but there are some things you can't fix. Your photodiode will have a dark current due to the reverse bias voltage. This is rather temperature sensitive, so when someone clamps this thing on their finger the dark current is going to start rising as the diode heats up. There will also be some background current due to ambient light; hopefully it is small, but it won't be down in the nanoamp regime.

You already have half the solution to this: you take a reference measurement with the light off, then switch on the light and take another measurement. The problem is that if you want to average for a long time for better resolution, all of those background terms can drift. The logical conclusion of this is the chopper technique. Pulse your LED at a fixed frequency (something like 70-200 Hz, avoiding harmonics of your power line frequency), subtract the amplifier output on alternating half-cycles (LED on minus LED off), and then average over many cycles. This is the technique of the lock-in amplifier, and it essentially eliminates the problems associated with dark current, background signal, offset voltage, opamp bias current, and the 1/f noise of your amplifier. You can do the subtraction and averaging in software if your ADC can sample faster than your chopping frequency, or do it old school with analog switches and an averaging capacitor, then feed the filtered signal to a high resolution ADC running at 16.6 S/s for line frequency rejection.
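The software version of the lock-in is only a few lines. Here is a rough simulation -- the chop rate, drift rate, and noise level are all made-up numbers, just to show a small chopped signal being recovered from a much larger, slowly drifting background:

```python
import random

def lockin_average(samples, n_half):
    # Square-wave lock-in: per chop cycle, average the LED-on half, subtract
    # the average of the LED-off half, then average the differences over all
    # cycles. Assumes the record starts on an "on" half-cycle.
    total, cycles = 0.0, 0
    for i in range(0, len(samples) - 2 * n_half + 1, 2 * n_half):
        on = sum(samples[i:i + n_half]) / n_half
        off = sum(samples[i + n_half:i + 2 * n_half]) / n_half
        total += on - off
        cycles += 1
    return total / cycles

random.seed(2)
signal = 0.050                       # the chopped photocurrent we care about
samples = []
for k in range(20000):               # 100 chop cycles of 200 samples each
    background = 1.0 + 1e-6 * k      # large dark/ambient level, slowly drifting
    noise = random.gauss(0.0, 0.02)
    led_on = (k // 100) % 2 == 0     # 100 samples per half-cycle
    samples.append(background + noise + (signal if led_on else 0.0))

print(lockin_average(samples, 100))  # recovers roughly 0.050
```

A plain average of the same record would be dominated by the background; the alternating subtraction cancels it along with offsets and most of the slow drift.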
Finally, you should think about calibration. You will need some way to measure and verify the relationship between your measured photocurrents and the number you actually care about: oxygen level. I guess you are basically measuring the absorption ratio at two wavelengths. The photocurrent will depend on the relative brightness of the two LEDs and their orientation, so at a minimum you need a calibration cycle where you measure the maximum photocurrent from each LED. It is important that this measurement be as similar to the real measurement as possible, and also be designed so that variations in measurement conditions do not change the result excessively.
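As an illustration of the normalization step, here is a sketch of a Beer-Lambert style calculation. The currents are made up, and the absorbance ratio is just a figure of merit -- a real oximeter maps this ratio to an oxygen level through an empirically fitted calibration curve, not a formula:

```python
import math

def absorbance(i_measured, i_max):
    # Optical density relative to the calibration-cycle maximum for that LED,
    # which normalizes out LED brightness and detector sensitivity.
    return -math.log10(i_measured / i_max)

# Hypothetical calibration maxima: different for the two LEDs because their
# brightness and the photodiode response at each wavelength differ.
i_red_max, i_ir_max = 80e-6, 120e-6   # amps, from the calibration cycle
i_red, i_ir = 8e-6, 30e-6             # amps, during an actual measurement

ratio = absorbance(i_red, i_red_max) / absorbance(i_ir, i_ir_max)
print(round(ratio, 3))  # 1.661
```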
Anyway, that is probably more than enough rambling. Almost certainly the solution I would come up with would be total overkill -- I am used to trying to wring every last drop out of each photon, and to do so at high speed. For what I expect is a low frequency, modest accuracy application you probably can't go too far wrong.