What I think is interesting is this question: will such a cache actually bootstrap any potential future civilization?
Consider if we discovered such a cache today. Would it be a curiosity? Would it be useful? Would it be inscrutable and useless?
Consider what we are doing today. We're storing data, with plain language descriptions. Which, by the way--
And if history is anything to go by, problems with decoding it would have nothing to do with understanding 2020 technology and everything to do with understanding 2020 English
They're storing it in multiple languages; you could've spent a few seconds checking.
Which, we know from historical precedent, is very likely to be helpful. The Rosetta Stone is famous for a reason!
If we discovered a cache that was similarly labeled, likely we would have some understanding of one or several of the languages given -- there are only a few historic languages known today, from any context, that are just about indecipherable. And it's not clear whether some of them are merely elaborate hoaxes...
Indeed, most languages that were common 1000 years ago are still common today, allowing for some fairly aggressive drift in that time. We have plenty of surviving works, and academics today understand them very well.
So that's language. What about content?
Suppose we found the archive of a semi-ancient civilization. It contains a large cross-section of their technical knowledge. What the heck can we do with it?
Well...
1. Translate the documents, as needed.
2. The documents describe how to access, reproduce and work with the data.
3. We will need a computer of some sort to process the data. The GitHub archive is QR coded, LZMA compressed, byte based, and has two very fundamental data formats: plain text in UTF-8 (for the most part?), and raw binary. (The text of course is stored as binary, but the meaning of those bytes must be decoded and translated, a substantial effort.) Probably, our hypothetical historic archive has similar methods employed, but we must figure out what, and we must figure out what encoding they used -- maybe something like ASCII but based on the local language, maybe something awful like EBCDIC, maybe something entirely different like custom PDF font encodings, or like some vintage game consoles have used*.
*Tile-based graphics, with hard memory limitations, force a simplified character set. Text might be encoded on a dedicated tilemap, or as part of the base tilemap at whatever offsets the characters happened to land on. Things get very interesting when Kanji and Latin scripts must be supported, as some (JP/US) game titles did.
4. But what if we don't have a computer to process those data? If a more primitive civilization picks up our data, we might hope they can bootstrap off of it. If more advanced, they might at least enjoy the quaintness of our playthings, and recreate them on a hobby basis for example, much as hobbyists today recreate vintage architectures, instruction sets and so forth.
If more primitive, the best we can hope for is that the data get published as widely as possible; the most basic, accessible and translatable data at least. A team of researchers could dedicate their effort to decompressing things by hand, and probably they could prioritize repos by what seems most promising.
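For whoever does have a computer, the decoding side of step 3 is fairly mechanical once the formats are known. Here is a rough sketch in Python, assuming the QR layer has already been read back into bytes and the payload is an .xz/LZMA stream. The function names (classify, decode_payload) are made up for illustration and are not taken from the archive's actual tooling; cp037 is just one EBCDIC variant that Python happens to ship.

```python
import lzma


def classify(raw: bytes):
    """Guess whether a byte string is UTF-8 text or opaque binary."""
    try:
        return "text", raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: hand the bytes back untouched.
        return "binary", raw


def decode_payload(compressed: bytes):
    """Decompress an LZMA (.xz) payload, then classify the result."""
    return classify(lzma.decompress(compressed))


# Round trip: compress some text, then recover it.
blob = lzma.compress("hello, future".encode("utf-8"))
kind, content = decode_payload(blob)

# The same word under a different code table: these EBCDIC bytes are
# not valid UTF-8, so a reader who assumes UTF-8 sees only "binary".
ebcdic_hello = "HELLO".encode("cp037")
ebcdic_kind, _ = classify(ebcdic_hello)
recovered = ebcdic_hello.decode("cp037")
```

The point of the last few lines is that the bytes alone don't say which table they belong to; the plain-language documentation has to carry that.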
But what would be promising? Probably not the gigs of inscrutable minified JavaScript and other crap. Text files would be nice, just for basic understanding. (Historical similarity.) What would be really cool are examples of simple, low-level machines, as hardware, emulator, instruction set, etc.
I think what might end up happening is this:
Say we discovered such a cache around 1900. Or 1950, or 1800 for that matter. With the technology present in each of those eras, we would be able to construct some sort of very basic, low level machine. In 1800, it would likely be a hybrid of mechanical and human function. (I'm sure figures like Babbage would be utterly fascinated by such findings!) In 1900, electromechanical to electronic; in 1950, vacuum tube to solid state.
I can imagine it would be a hobby just to construct models of various instruction sets and virtual machines -- the archaeo-6502 might well be one of the first implemented. What's funny is, it might not prove very useful:
a. 6502 is a simple instruction set by VLSI standards, but it still takes thousands of transistors. ENIAC used as many tubes and relays (well.. a few more), and was far more limited in general computing terms (though far more powerful, numerically, per instruction).
b. What use is a computer? First priority, automate computors (human computers). Your basic four-banger calculators, and some more advanced sequential machines. Then automate sequences of operations: accounting (IBM's bread and butter of the day), creating tables, and military applications (many of the early computers were merely for firing tables and other what-today-seems-like-bullshit tasks). Everything from Babbage to early IBM to ENIAC. Nothing very general-purpose if at all, there's far too much hardware required to build such a thing, let alone any understanding of how to do that.
c. But these are still very big machines. Thousands of working parts, whatever the technology: at best these would be the hobbies of prominent academics, with any luck, helped with patronage from anyone they could convince of the value of these things -- especially corporate, government and military.
d. But also the competitive advantage is hard to tell. Likely many would argue that these historic records are humanity's shared achievement, and thus they would be translated and published widely.
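To give a feel for what "constructing a model of an instruction set" amounts to, here is a toy interpreter for a handful of 6502 opcodes, in Python. It is a sketch only: five opcodes, binary-mode addition, no flags other than carry, BRK treated as a plain halt, and a load address (0x0200) picked arbitrarily. The function name run_6502_subset is made up for illustration.

```python
def run_6502_subset(program: bytes, max_steps: int = 1000) -> dict:
    """Interpret a tiny subset of 6502 opcodes and return the final state."""
    mem = bytearray(65536)          # flat 64 KiB address space
    origin = 0x0200                 # arbitrary load address for this sketch
    mem[origin:origin + len(program)] = program
    a = 0                           # accumulator
    carry = 0                       # carry flag (the only flag modeled)
    pc = origin                     # program counter

    for _ in range(max_steps):
        op = mem[pc]
        pc += 1
        if op == 0x00:              # BRK: treated as "halt" here
            break
        elif op == 0xA9:            # LDA #imm: load accumulator
            a = mem[pc]
            pc += 1
        elif op == 0x18:            # CLC: clear carry
            carry = 0
        elif op == 0x69:            # ADC #imm: add with carry (binary mode)
            total = a + mem[pc] + carry
            pc += 1
            carry = 1 if total > 0xFF else 0
            a = total & 0xFF
        elif op == 0x85:            # STA zp: store accumulator in zero page
            mem[mem[pc]] = a
            pc += 1
        else:
            raise ValueError(f"opcode {op:#04x} not modeled in this sketch")

    return {"a": a, "carry": carry, "zero_page": bytes(mem[:256])}


# LDA #2; CLC; ADC #3; STA $10; BRK
state = run_6502_subset(bytes([0xA9, 0x02, 0x18, 0x69, 0x03, 0x85, 0x10, 0x00]))
```

On paper (or in software) this is an afternoon's work; the thousands of working parts in points a. and c. are what it costs to do the same thing in hardware.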
What I can't decide is whether some of these problems would solve themselves. It was a very long time before we got modern operating systems, languages and semantics -- early pioneers could see it back in the '50s, 1850s for that matter; but it wasn't until the '60s and '70s that these sorts of things were finally realized (e.g. the introduction of Algol, Lisp, C..), and decades further before they became widely accessible -- outside of academic and professional environments -- personal computing. Perhaps the accessibility of the ideas alone would lead to a cultural revolution, perhaps we would have languages structured after the historic examples, before any machines even exist to implement them? Perhaps we'd have assemblers and compilers right away, rather than waiting some years, decades, for their introduction (with great ease of use, mind)?
It may very well be that, without the widespread availability of hardware to actually test out these ideas on, they might never catch on. Even if widely published.
But I digress; such is the nature of the what-if.
Tim