Large Language Models, the likes of ChatGPT, are a new and powerful force among us. They are mysterious to most people and shrouded in misconceptions which are, to make things worse, promulgated by prominent figures in the field. Here is an excerpt from an interview between Scott Pelley, the host of the TV show 60 Minutes, and Geoffrey Hinton, one of the three researchers dubbed ‘the Godfathers of Deep Learning,’ who received the 2018 Turing Award for their work on deep learning:
Pelley: You believe [large language models] can understand.
Hinton: Yes.
Pelley: You believe they are intelligent.
Hinton: Yes.
Pelley: You believe these systems have experiences of their own and can make decisions based on those experiences in the same sense as people do.
Hinton: Yes.1
This series of posts is a response to such assertions. I'd like to dispel the misconceptions about LLMs (Large Language Models) by explaining how they work at a finer level than the common abracadabra offered on the internet, in language accessible to the layperson. I hope to be conceptually as thorough as possible without getting into technicalities, providing analogies and examples on trickier points, and invoking a bit of high-school-level math when I deem it unavoidable.
By understanding how LLMs work, I hope you'll become convinced that LLMs are not ‘intelligent,’ or at least see in what sense they are and aren't, and that they are indeed, despite what Hinton has (elsewhere) claimed, ‘merely’ sophisticated autocompletion systems.
I'll say in advance that it is not that I believe humans and their minds possess supernatural qualities. Brains are biological computers. Still, while I think artificial intelligence is indeed possible, perhaps even within my lifetime, LLMs are not there yet, and there's quite a road ahead before any artificial system reaches that point. This will be discussed in further detail in the last part of this series.
These are not the only objectionable things Hinton said in that interview. A few more:
Pelley: You think these AI systems are better at learning than the human mind.
Hinton: I think they may be, yes. And at present they're quite a lot smaller. So even the biggest chatbots only have about a trillion connections in them. The human brain has about 100 trillion, and yet, in the trillion connections in a chatbot it knows far more than you do in your 100 trillion connections. Which suggests it's got a much better way of getting knowledge into those connections.
There's a very qualified sense in which I agree that these systems learn better than humans, but the comparison Hinton makes is preposterous. A chatbot doesn't know anything, any more than a Chinese grammar book knows Chinese better than you do. As for learning, we might as well say that a piece of paper learns something when you write on it, and since it captures the information immediately and never forgets it, perhaps with its zero connections it's an even better learner than the chatbot.
Hinton: We have a very good idea of sort of roughly what it's doing. But as soon as it gets really complicated, we don't actually know what's going on any more than we know what's going on in your brain.
Pelley: What do you mean we don't know exactly how it works? It was designed by people.
Hinton: No it wasn't. What we did was we designed the learning algorithm. That's a bit like designing the principle of evolution. But when this learning algorithm then interacts with data, it produces complicated neural networks that are good at doing things, but we don't really understand exactly how they do those things.
There's plenty to object to here. Were I to take Hinton's statements on these matters entirely seriously, they would deserve a post of their own. In brief, however:
The brain is far more complex than any existing neural network, and certainly less well understood by ‘us’.
Interpretability of neural networks —the ability to explain how or why a network arrived at a certain prediction rather than another— has long been a field of study, and much progress has been made. I think one thing that gets confounded in this research is the ‘why’ and the ‘how’. It's a topic of its own, but in short, to say we don't understand how exactly they do those things is simply false.
We understand much more about biology than merely the principle of evolution.
The learning algorithm doesn't produce a complicated neural network when it interacts with data. The complicated neural network is there from the beginning, and it was designed by humans; it was not some fluke, it was intelligent design. When the learning algorithm interacts with the data, it sets the values of the network's parameters (see the sketch below).
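To make that last point concrete, here is a minimal sketch in PyTorch. The layer sizes, optimizer settings, and function names are arbitrary choices made purely for illustration; the point is only that the architecture is written down by a person before any data is seen, and training merely adjusts the numbers stored inside that fixed structure.

```python
import torch
import torch.nn as nn

# The architecture is specified up front: layer types, sizes, and how they
# connect are all human design decisions, not something "produced" by the data.
model = nn.Sequential(
    nn.Linear(784, 256),  # a designer chose 784 inputs and 256 hidden units
    nn.ReLU(),
    nn.Linear(256, 10),   # a designer chose 10 outputs
)

# At this point the network already exists, complete with (randomly
# initialized) parameters; no learning has happened yet.
print(sum(p.numel() for p in model.parameters()))  # parameter count is already fixed

# Training (data omitted here): the optimizer only changes parameter values.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

def training_step(inputs, targets):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()   # compute gradients with respect to the existing parameters
    optimizer.step()  # nudge the parameter values; the structure never changes
    return loss.item()
```

Nothing in the training loop adds layers or rewires connections; the same objects defined at the top are simply filled with different numbers as learning proceeds.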
If any of this doesn't make sense yet, hopefully this series of posts will explain it.
For what it's worth, another of the ‘godfathers,’ Yann LeCun, judging by an extract from an interview with him, does not share Hinton's views.