Careful: "as accurately as possible" implies picoseconds -- that is, what is possible these days. But you only need milliseconds? That's not accurate at all, as electronic circuits go. In fact it's quite easy (which helps!).
Also, is that from press to press, or release to press? That is, if a "button" is held down for a long time, should the hold duration be included or not? Or does it not matter because the "press" timing is consistent (from whatever your optos are driven by)?
In any case, it's a simple logic-analyzer-style exercise. That many inputs can easily be polled on a millisecond schedule, and then you merely need some code to count samples between changes on each channel. A regular Arduino wouldn't do that very easily, if at all, but a Due might, or any other board with similar power. A PC might not, because you're limited by USB polling rate and OS multitasking; timing to within tens of ms is about what you can rely on in that case.
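To make the "count samples between changes" idea concrete, here is a rough host-side sketch in Python. It is only an illustration of the logic, not firmware: I'm assuming each poll delivers one integer whose bit n is the level of channel n, and that polls arrive at a fixed period (1 ms here). The names detect_changes and SAMPLE_PERIOD_MS are made up for the example.

```python
from typing import Iterable, Iterator, Tuple

SAMPLE_PERIOD_MS = 1  # assumed polling interval; not specified in the question


def detect_changes(
    samples: Iterable[int], num_channels: int
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (sample_index, channel, new_level, samples_since_last_change).

    Each sample is a bitmask: bit n holds the level of channel n.
    The first sample only establishes the starting levels.
    """
    mask = (1 << num_channels) - 1
    prev = None
    last_change = [0] * num_channels  # sample index of each channel's last edge
    for i, word in enumerate(samples):
        word &= mask  # ignore bits beyond the channels we care about
        if prev is None:
            prev = word
            continue
        changed = word ^ prev  # 1 bits mark channels that toggled
        ch = 0
        while changed:
            if changed & 1:
                level = (word >> ch) & 1
                yield (i, ch, level, i - last_change[ch])
                last_change[ch] = i
            changed >>= 1
            ch += 1
        prev = word
```

Multiply the sample count by SAMPLE_PERIOD_MS to get a time in ms. Because new_level is reported too, the same stream answers both the press-to-press and the release-to-press question; you just pick which edges to pair up.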
Actually, on further contemplation, an AVR might be able to pull off a few hundred bits at a 1 ms rate; it'll be tight, and most likely written in ASM. Needless to say, you won't get that performance out of an ordinary Arduino sketch, though a more powerful Arduino could manage it.
And then, I don't know what format you want out of it. Presumably some sort of event saying that a "button" was "pressed" with so-and-so delay, and then the event triggers other functions: logging, alerts, state machine, whatever. That could be transmitted via serial, say. You'll need quite a high baud rate to capture every possible event (i.e., if every input is toggling at ~1ms), or something to prevent high event rates (perhaps a holdoff, so that a button will not register an event until some delay has passed since the last trigger). It should probably be buffered, which is fine; buffered serial I/O is typical.
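To put rough numbers on the baud rate and show what a holdoff might look like, here is a small Python sketch. Everything specific in it is an assumption for illustration: the channel count, the event size in bytes, 8N1 framing (10 bits on the wire per byte), and the names required_baud and Holdoff. The holdoff shown is the non-retriggerable kind: the timer restarts only on an accepted event, not on suppressed ones.

```python
from typing import List, Optional


def required_baud(
    num_channels: int,
    events_per_channel_per_s: float,
    bytes_per_event: int,
    bits_per_byte: int = 10,  # assumes 8N1: start bit + 8 data bits + stop bit
) -> float:
    """Worst-case serial rate needed if every channel fires continuously."""
    return num_channels * events_per_channel_per_s * bytes_per_event * bits_per_byte


class Holdoff:
    """Per-channel holdoff: drop events that arrive too soon after the last
    accepted event on the same channel."""

    def __init__(self, num_channels: int, holdoff_ms: int) -> None:
        self.holdoff_ms = holdoff_ms
        self.last_accepted: List[Optional[int]] = [None] * num_channels

    def accept(self, channel: int, t_ms: int) -> bool:
        """Return True if the event at time t_ms should be reported."""
        last = self.last_accepted[channel]
        if last is not None and t_ms - last < self.holdoff_ms:
            return False  # still inside the holdoff window
        self.last_accepted[channel] = t_ms
        return True
```

For example, with a hypothetical 64 channels, each producing 1000 events per second, and a 4-byte event record, required_baud(64, 1000, 4) comes to 2,560,000 baud, which is well beyond a standard UART setting like 115200. A 50 ms holdoff caps each channel at 20 events per second and brings the same setup down to 51,200 baud.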
In any case, this is probably a good candidate for an embedded solution, or maybe some existing hardware of various sorts.
- Logic analyzers: cheap and plentiful, though not usually offering quite as many inputs (say 16, maybe 32 per module), and far finer timing precision than you need (fractional microseconds), and probably proprietary interfaces (but maybe some offer plugins or interprocess messaging that can implement that?).
- USB GPIO expanders as mentioned: may not have great timing? A multitasking OS isn't a great place to handle timing finer than a few ms here or there. (There are some ways around that, of varying difficulty.) Documented and ready-to-go interface, shouldn't be too painful to program with.
- Maybe a serial port that supports synchronous operation, married to a bunch of shift registers. Could potentially be as simple as opening a serial port, prodding it to generate clocks, and reading in binary bytes for processing. Still subject to OS timing.
- Other modest-bandwidth devices come to mind, like audio or video capture, which again needs a serializer. Upside, these are buffered so the hardware, OS and software all have the means to deal with latency.
- Terrible hacks -- put the optos in place of keyboard switches? Keyboards probably don't scan frequently enough, and can't have too many buttons pressed at once (or, most can't anyway). Would be kind of hilarious to see working.
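For the shift-register option in the list above, the "reading in binary bytes for processing" step might look something like this in Python. It is a sketch under assumptions of my own: the byte stream is already frame-aligned, each frame is a fixed number of bytes, and the first byte received holds the highest-numbered channels (that depends entirely on how the registers are chained). The function names are hypothetical, and actually opening the port is left out.

```python
from typing import List


def unpack_frames(data: bytes, bytes_per_frame: int) -> List[int]:
    """Split a raw byte stream into one integer bitmask per sample frame.

    Any incomplete frame at the end of the buffer is dropped; a real reader
    would keep it and prepend it to the next chunk.
    """
    usable = len(data) - len(data) % bytes_per_frame
    return [
        int.from_bytes(data[i:i + bytes_per_frame], "big")
        for i in range(0, usable, bytes_per_frame)
    ]


def channel_level(frame: int, channel: int) -> int:
    """Return 0 or 1 for the given channel within one frame's bitmask."""
    return (frame >> channel) & 1
```

With 16 inputs (two bytes per frame), unpack_frames(bytes([0x00, 0x01, 0x80, 0x00]), 2) gives two frames, the first with only channel 0 high and the second with only channel 15 high. Since the frames arrive at the clock rate the port generates, the frame index doubles as the timestamp, which is what makes this tolerable despite OS latency, provided no bytes are dropped.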
I thought I had a few more ideas, but nothing else is coming to mind... Anyway, there are lots of possibilities, but you're going to have to program something somewhere to use any of them, and writing a user program plus an embedded program is one of the better compromises.
Tim