A teacher long ago told me that you can know a formula, but if you don't know how to use it, it is worthless.
That's exactly why I, too, push people to use simple tools like dimensional analysis. Fancy name, but it just means ignoring the numeric values and tracking only the units through the formula, to see whether it makes any sense at all. The next step is plugging in simple powers of ten with known, human-scale results, to see whether the formula produces garbage. It takes only minutes, but tells you a lot about the formula.
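For a concrete example (my pick, nothing special): the period of a simple pendulum, T = 2*pi*sqrt(L/g), passes both checks:

```latex
% Dimensional check: drop the numbers, keep only the units.
[T] = \sqrt{\frac{[L]}{[g]}}
    = \sqrt{\frac{\mathrm{m}}{\mathrm{m/s^2}}}
    = \sqrt{\mathrm{s^2}} = \mathrm{s}
% Powers-of-ten check: L = 1\,\mathrm{m},\ g \approx 10\,\mathrm{m/s^2}
% gives T \approx 2\pi\sqrt{0.1\,\mathrm{s^2}} \approx 2\,\mathrm{s},
% about right for a playground swing.
```

If the units had come out as anything but seconds, or the estimate had been milliseconds or hours, the formula (or my use of it) would have to be wrong.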
I do not bother to remember any formulas off-hand. It just isn't worth the effort and the risk of misremembering. It is much better to understand where and how things apply. For example, in physics, it is useful to know that momentum is conserved in all collisions within an isolated system; that kinetic energy is additionally conserved in "elastic collisions", where no internal deformation or change occurs, as if the objects were infinitely hard spheres; and that total energy is conserved in all isolated systems. You can quickly find the relevant formulas and apply them. For things like "sentence embedding" in large language models, you need to understand that it is a numerical vector representation of the input text or words, which can be used as input to the actual numerical neural network; the details of exactly how it is computed, how two sentence embeddings are compared to each other, and how the different approaches differ, you can look up whenever you need them.
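As a small illustration of the comparison part: the usual measure between two embeddings is cosine similarity. A minimal sketch in C, with toy four-dimensional vectors standing in for real model output (real embeddings have hundreds or thousands of dimensions):

```c
#include <math.h>
#include <stdio.h>

/* Cosine similarity between two embedding vectors: +1 means pointing the
 * same way (similar meaning), 0 means orthogonal (unrelated), -1 opposite.
 * Where the vectors come from is the embedding model's business. */
static double cosine_similarity(const double *a, const double *b, size_t n)
{
    double dot = 0.0, na = 0.0, nb = 0.0;
    for (size_t i = 0; i < n; i++) {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (sqrt(na) * sqrt(nb));
}

int main(void)
{
    /* Toy "embeddings" of two sentences; invented numbers for illustration. */
    const double s1[] = { 0.8, 0.1, 0.0, 0.6 };
    const double s2[] = { 0.7, 0.2, 0.1, 0.7 };
    printf("similarity = %.3f\n", cosine_similarity(s1, s2, 4));
    return 0;
}
```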
In programming, it means there is no use in trying to remember interface details, like the parameter order of memset(). Just keep a terminal window open, ready for "man -s 2,3 function", or a browser window open to the Linux man pages online (which contain the standard C documentation and mention which systems and standards provide each interface, so they aren't "just" for Linux), or whatever documentation you use. You'll soon become so efficient at looking up the exact details that memorizing the interfaces (and occasionally remembering them wrong, wasting precious development time) becomes counterproductive. You only need to understand when to use which interface, not remember the nitty-gritty details.
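The memset() case is a good example of why: the signature is memset(void *s, int c, size_t n), buffer first, then the fill byte, then the count, and swapping the last two compiles cleanly while doing nothing. A minimal sketch:

```c
#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[16];

    /* Correct order: pointer, fill byte, count. */
    memset(buf, 0, sizeof buf);

    /* The classic misremembering, memset(buf, sizeof buf, 0), would
     * compile without a peep but fill zero bytes; the man page settles
     * the order in seconds. */
    printf("first byte: %d\n", buf[0]);
    return 0;
}
```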
In such terms, neural networks and large language models are not difficult to understand. The only big step is discarding any preconceptions one might have, and keeping one's mind open for true understanding, instead of trying to force the new information into preconceived notions. Typically, this is called "thinking outside the box", but really, it is more of an attitude: you can learn to set aside your own preconceptions and assumptions. It hurts to admit, yes, but our current 'knowledge' often limits what 'new' we are ready to learn.
It is a typical human error to confuse existing knowledge (wisdom) with the ability to solve problems (intelligence).
Another saying I believe still holds up is "If the human brain were so simple that we could understand it, we would be so simple that we couldn't".
Exactly. For example, I might sound "clever", but that's all just the result of hard effort; I'm not that intelligent.
It is extremely difficult to design an intelligence test for those more intelligent than the test maker in the g-factor sense; IQ tests that measure ability and knowledge are easier.
It is even more difficult to design a test to measure something you cannot exactly describe or define, like "intelligence".
"Wisdom" is much easier to test, and because of the common error of conflating the two, many tests claiming to measure "intelligence" or problem-solving ability actually just test for "wisdom", i.e. the breadth of one's experience with existing problem solutions.
But I also believe that intelligence comes in many forms. Some shine at math, while others shine in music or art, and so on.
We also lack proper terms to describe these. I like to use "intelligence" specifically in the g-factor sense, as the ability to solve new problems not previously encountered (even in analogous forms). For the others, I use various other terms.
True creativity, creating something genuinely new instead of "creating" something "new" by mixing existing things, is a very, very big one.
Creating something interesting by mixing existing things is also "creative" in a sense, and some neural networks can do this extremely well. This, too, is useful in practice, but I don't like to conflate it with truly-new creativity. It needs a word of its own; I don't have one. Synthesizy?
Intuition, and especially intuitive leaps, are another: it is like a transformation applied to a pattern that lets you match it against a completely different one, much like the Fourier, Laplace, or Z-transforms used in signal processing. Sentence embedding in large language models is itself such a transform, with results similar to how a Fourier transform of the digits of two multiplicands lets you calculate their product with a simple element-wise multiplication, an inverse Fourier transform, and carry propagation.
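Here is a minimal sketch of that multiplication trick, with toy numbers and a naive O(N^2) DFT standing in for a real FFT library: transform the digit sequences, multiply element-wise, transform back, and propagate the carries:

```c
#include <complex.h>
#include <math.h>
#include <stdio.h>

#define N 8   /* transform length: at least len(a) + len(b) - 1 digits */

/* Naive O(N^2) discrete Fourier transform; sign = -1 forward, +1 inverse.
 * An FFT library would compute the same thing in O(N log N). */
static void dft(const double complex *in, double complex *out, int sign)
{
    const double pi = acos(-1.0);
    for (int k = 0; k < N; k++) {
        out[k] = 0;
        for (int j = 0; j < N; j++)
            out[k] += in[j] * cexp(sign * 2.0 * pi * I * j * k / N);
        if (sign > 0)
            out[k] /= N;   /* the inverse transform carries the 1/N factor */
    }
}

int main(void)
{
    /* Digits of 123 and 456, least significant first, zero-padded to N. */
    double complex a[N] = { 3, 2, 1 };
    double complex b[N] = { 6, 5, 4 };
    double complex fa[N], fb[N], fc[N], c[N];

    dft(a, fa, -1);
    dft(b, fb, -1);
    for (int k = 0; k < N; k++)
        fc[k] = fa[k] * fb[k];   /* element-wise product = digit convolution */
    dft(fc, c, +1);

    /* The convolution yields "digits" that can exceed 9; carry them. */
    long digit[N], carry = 0;
    for (int j = 0; j < N; j++) {
        long v = lround(creal(c[j])) + carry;
        digit[j] = v % 10;
        carry = v / 10;
    }
    for (int j = N - 1; j >= 0; j--)
        printf("%ld", digit[j]);   /* prints 00056088, i.e. 123 * 456 */
    printf("\n");
    return 0;
}
```

The point is not the arithmetic but the shape of the move: transform, do something trivial in the transformed domain, transform back.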
Even "simple" pattern matching, especially if adaptive (i.e. it does not look for an exact copy, but uses something like the abstraction-filtration-comparison test used for similarity in the copyright sense in the USA), is extremely useful. Many, many human jobs have pattern matching as a core component: for example, you have a set of rules, and you need to apply them.
The problem with using LLMs and transformers for pattern matching is that there is no way to know how accurate any given match is, because the matching itself is the product of the entire network. A completely separate mechanism is needed to check the results for accuracy or applicability. Current use cases require us humans to do that check ourselves. To anyone familiar with human nature, that means the checks are rarely, basically never, done. And we can see the results: lawyers citing cases that do not exist because they used an LLM, and so on. Ugly.
I myself enjoy the discovery process, the act of solving the problem in a way where I can show how I found the result. I believe that being able to show how the result was obtained, and to derive from that the rules for when the result applies, is about half the worth of the entire solution. I dislike the current use patterns of LLMs exactly because they discard this half as if it were worthless and not needed anyway: as if a quick wrong answer were worth just as much as a slow, proven-correct one. I do not like that, because to me it is exactly equivalent to applying formulas blindly while presenting the results in the same language in which verified results tend to be presented. It is too close to intentionally lying to me. At least humans are sometimes held accountable for how they apply the rules they're supposed to pattern-match against.
However, that does not change the fact that neural networks are an extremely powerful tool. Once again, it is just a question of how we humans use them. I wholeheartedly support many use cases, including things like summarizing existing texts, and object to some. I'm also a bit peeved that some profit hugely off models whose training data they ripped from the web; I'm not sure that is fair. I want all interactions to be mutually beneficial.