To check for jitter from the LM311, one could trigger the scope on the output of the LM311 and look at the input signal. This could be tricky, as the voltages are really small, so a normal scope will not see much. The trouble is that it does not take much noise to upset the comparators. The integrator output changes rather slowly (e.g. 250 V/s), so it only takes some 250 µV to delay the comparator by 1 µs. Such a small signal will be a problem with a separate test jig too.
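As a quick check of that arithmetic, here is a small sketch that converts a noise voltage at the comparator input into a timing shift, simply as noise divided by slope. The 250 V/s slope is the example figure from above; the other noise levels in the loop are arbitrary values picked for illustration, not measurements.

```python
def comparator_delay_s(noise_v, slew_v_per_s):
    """Timing shift of the comparator switching point caused by a
    voltage error noise_v on a ramp with slope slew_v_per_s."""
    return noise_v / slew_v_per_s


SLEW_V_PER_S = 250.0  # example integrator slope from the text

# Noise levels in microvolts; illustrative values only, not measured data.
for noise_uv in (25, 100, 250, 1000):
    delay_s = comparator_delay_s(noise_uv * 1e-6, SLEW_V_PER_S)
    print(f"{noise_uv:5d} uV of noise -> {delay_s * 1e6:.2f} us shift")
```

The relation is linear, so amplifying the signal by 10 before the comparator makes the ramp 10 times steeper and cuts the timing shift from a given amount of comparator noise by the same factor.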
One way to exclude / reduce noise from the comparators would be to give them an amplified signal, e.g. with a small amplifier (gain of about 10 for small signals, dropping to 1 for large signals) between IC202 and the comparators. This addition might be easier than an extra test jig for the comparators.
It might also be interesting to have the scope triggered from the forcing signal and look at the outputs of the comparators. Ideally there should be no more than about 100-200 ns of variation due to the clocked feedback.
From looking at the counter in the digital section, I cannot see a way a mistake there would add noise to individual conversions without affecting the long-term average (at least if the circuit is as in the plan; they could have done it differently!).
It might be a good idea to check the old data again by plotting a histogram. The distribution of values might give more clues about the origin of the errors: it could still be a few measurements that are off by a lot, or more normal noise.
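A rough sketch of such a check, using only the Python standard library, is below. It bins the readings into a text histogram and estimates what fraction lies far outside the bulk of the distribution, using the median and the median absolute deviation so that a few wild readings do not inflate the noise estimate. The function names, the 5 sigma threshold and the readings themselves are my own placeholders; the demo data are synthetic and would be replaced by the logged values.

```python
import random
import statistics


def histogram(values, n_bins=10):
    """Return (lowest value, bin width, list of counts per bin)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values are equal
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # top edge goes in last bin
        counts[idx] += 1
    return lo, width, counts


def outlier_fraction(values, k=5.0):
    """Fraction of readings more than k robust sigma away from the median."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    sigma = 1.4826 * mad  # MAD scaled to match sigma for a Gaussian
    return sum(abs(v - med) > k * sigma for v in values) / len(values)


# Synthetic placeholder data: Gaussian noise plus a few large excursions.
random.seed(1)
readings = [random.gauss(0.0, 1.0) for _ in range(500)]
for i in range(0, 500, 100):
    readings[i] += 20.0

lo, width, counts = histogram(readings, n_bins=20)
for i, c in enumerate(counts):
    print(f"{lo + i * width:8.2f} | {'#' * c}")
print("fraction beyond 5 sigma:", outlier_fraction(readings))
```

A clean bell shape with essentially no readings beyond 5 sigma would point to ordinary noise, while a narrow core with a handful of far-off bins would point to occasional bad conversions.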
It might also help to compare the noise in the 5 digit mode. Here the data are from shorter times and thus might show the trouble even more strongly. The higher forcing frequency might give some more information. It should also be clearer here whether the noise is really too high.
With the 5 digit mode one could also do a test on the effect of the forcing waveform. With a smaller resistance for R221 (e.g. 22 kΩ in parallel) one would get a stronger forcing signal. This should result in more noise if the forcing part is the problem. It is difficult to do the test in 6 digit mode, as the integrator might saturate.
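To get a feeling for how much stronger the forcing would become, here is a small sketch. It assumes the forcing current is simply inversely proportional to the resistance at R221, which is my simplification and not taken from the schematic. The R221 values in the loop are placeholders only; the real value has to be read from the plan.

```python
def parallel(r1, r2):
    """Resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def forcing_gain(r221, r_added):
    """Factor by which the forcing signal grows when r_added is put in
    parallel with R221, assuming the current scales as 1/R."""
    return r221 / parallel(r221, r_added)


R_ADDED = 22e3  # the 22 kOhm parallel resistor suggested in the text

# Placeholder values for R221; NOT taken from the actual schematic.
for r221 in (22e3, 47e3, 100e3):
    print(f"R221 = {r221 / 1e3:5.0f} k -> "
          f"{parallel(r221, R_ADDED) / 1e3:5.1f} k effective, "
          f"forcing x{forcing_gain(r221, R_ADDED):.2f}")
```

Under that assumption the gain works out to 1 + R221 / 22 kΩ, so the effect of the test depends strongly on the original value of R221.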