[Header image: an upside-down photo of a blue lake above a blue sky, with a white raft seeming to hover in the sky.]

Is it possible for language models to achieve language understanding?

Christopher Potts
11 min read · Oct 5, 2020


I was invited to deliver a few remarks at an HAI symposium on OpenAI’s GPT-3 project at the end of October. I chose for my title “Is it possible for language models to achieve language understanding?” I’ve had many lively discussions of this question with different groups at Stanford recently, and I am finding that my own thinking on the matter is extremely fluid. This post outlines what I currently plan to say at the symposium, though my motivation in posting it now is partly to see if subsequent discussion leads me to change my mind.

Is it possible for language models to achieve language understanding? My current answer is, essentially, “Well, we don’t currently have compelling reasons to think they can’t.” This post is mainly devoted to supporting this answer by critically reviewing prominent arguments that language models are intrinsically limited in their ability to understand language.

What does “understanding” mean?

I don’t have a complete answer to this question, and, even if I did, I wouldn’t expect my answer to enjoy consensus. Our understanding of the term is likely to evolve along with language technologies themselves, informed by progress in the field and public discourse on these matters.

So, rather than trying to come at the question directly, let me redefine it slightly. My central question will be, “Is it possible for language models to achieve truly robust and general capabilities to answer questions, reason with language, and translate between languages?” We can insert any capabilities we like into this list — I don’t mean for my examples to be restrictive.

I realize that this question is still frustratingly vague, but it is at least relevant. Whereas the question about understanding is so general that it can’t really shape anything anyone does, my question directly relates to whether particular research strategies are viable.

This question is very different from some others that we might pose to avoid defining “understanding”. In particular, one might hope to instead use the Turing test as a proxy. However, people have proven to be bad judges in the Turing test, consistently diagnosing real people as machines and machines as real people. To turn the Turing test into a proper evaluation, we would have to ask and resolve definitional questions that are just as hard or harder than the question of what “understanding” means.

What’s more, even if we did find a perfectly reliable Turing test, we might come to realize that mimicking human behaviors isn’t the only goal for the field. For many applications, human intelligence will be too limited, slow, and error-prone. Conversely, imagine a language model that could do nothing but provide rich, nuanced, accurate answers to any question on any topic in Wikipedia. This would transparently be a machine rather than a human, but it would be an astounding breakthrough for AI — people would use “understanding” to describe it no matter how loudly practitioners complained about that usage.

For the most part, the points I make below seem relevant both to the capability-oriented question I defined above and the issue of achieving human-like intelligence, but my thinking is primarily focused on the sharper capability-oriented question because of its clear relevance to research and technology development.

What is a language model?

This question should be easier to answer than the previous one because one could give a technical definition. The challenge here is that we don’t want the discussion to be about any particular language model, but rather about the very idea of a language model in general, including future language models that might be very different from those we encounter today.

So let me home in on the aspect of language models that seems most central to this debate: they learn only from co-occurrence patterns in the streams of symbols that they are trained on.
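
To make “streams of symbols” concrete, here is a toy sketch, in Python, of the core language-modeling signal: predict each symbol from the symbols that precede it, using nothing but co-occurrence counts. This is my own illustration rather than the workings of any real system; models like GPT-3 replace the counting with enormous neural networks and much longer contexts, but the training signal is the same kind of thing.

    # A toy bigram language model: the only thing it ever sees is which
    # symbols occur next to which other symbols in the training stream.
    from collections import Counter, defaultdict

    def train_bigram_lm(token_stream):
        """Count how often each token follows each other token."""
        counts = defaultdict(Counter)
        for prev, nxt in zip(token_stream, token_stream[1:]):
            counts[prev][nxt] += 1
        return counts

    def next_token_probs(counts, prev):
        """Turn raw co-occurrence counts into a conditional distribution."""
        total = sum(counts[prev].values())
        return {tok: c / total for tok, c in counts[prev].items()}

    stream = "my palms started to sweat as the lotto numbers were read off".split()
    model = train_bigram_lm(stream)
    print(next_token_probs(model, "the"))  # {'lotto': 1.0}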

To see why this is potentially limiting, it’s worth comparing it to how standard supervised models work. Suppose we want to train a standard supervised model to determine whether a sentence of English describes a feeling like nervous anticipation. We would train this model on a set of labeled sentences: some labeled as positive instances of nervous anticipation and others labeled as negative instances. The model would then learn to configure itself in such a way as to accurately assign these labels to sentences, and it would be successful to the extent that it could do this accurately for new examples.

Crucially, for the standard supervised model, the intended relationship between the sentences and the labels is not something the model has to figure out. It is built into the model directly, and indeed the entire model is probably designed to ensure that it makes maximal use of this relationship.
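
Here is a minimal sketch of that standard supervised setup, using scikit-learn; the example sentences and labels are invented for illustration. The crucial point is the call to fit: the intended relationship between sentences and labels is wired directly into the training objective.

    # A minimal supervised classifier for "nervous anticipation".
    # Sentences and labels are invented for illustration.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    sentences = [
        "My palms started to sweat as the lotto numbers were read off.",
        "I checked the mailbox every hour for the acceptance letter.",
        "The weather was mild and unremarkable all week.",
        "She calmly filed the report and went home.",
    ]
    labels = [1, 1, 0, 0]  # 1 = nervous anticipation, 0 = not

    clf = make_pipeline(CountVectorizer(), LogisticRegression())
    clf.fit(sentences, labels)  # the sentence-label relationship has special status here

    print(clf.predict(["He paced the hallway waiting for the verdict."]))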

For a pure language model, we don’t have such a mechanism of specifying labels that have this special status. To try to teach a pure language model how to assign labels like nervous anticipation to sentences, we would have to just write this out as a stream of symbols for the model to consume: “Hey, model, here is an example of nervous anticipation: my palms started to sweat as the lotto numbers were read off.” This offers no guarantees that the model will have any idea what kind of relationship we are trying to establish. As speakers of English, we know what “here is an example of nervous anticipation” means as an instruction, but the model has to learn this. Can such a learning target ever be reached without the sort of standard supervision I described above?

I would venture that, 15 years ago, the idea of trying to teach a language model with these linguistic instructions might have seemed like a sort of joke. I myself would have guessed that it would fail miserably. But this is precisely what the OpenAI group has been doing. The idea is present to a limited degree in their paper on GPT-2, and it is the guiding idea in the GPT-3 paper. The generalized version of it is the “prompt” methodology that is encouraged by the GPT-3 online demos and described as “few-shot learning” in the paper. And I suppose many of us are taken aback by how well it seems to work with GPT-3 (and many other powerful present-day language models).
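
For comparison, here is roughly what the prompt-based, few-shot version of the same classification task looks like. The prompt below is my own invented example, and the commented-out completion call is a placeholder rather than a real API: the task description and the labeled examples are just more symbols in the stream, and nothing tells the model that the labels have any special status.

    # A sketch of the few-shot "prompt" idea: the task specification and the
    # labeled examples are simply more text for the model to continue.
    few_shot_prompt = """\
    Decide whether each sentence expresses nervous anticipation.

    Sentence: My palms started to sweat as the lotto numbers were read off.
    Label: yes

    Sentence: The weather was mild and unremarkable all week.
    Label: no

    Sentence: He paced the hallway waiting for the verdict.
    Label:"""

    # completion = some_language_model.complete(few_shot_prompt)  # placeholder, not a real API
    # If the model has inferred what this pattern of symbols is asking for,
    # the completion should be the single word "yes".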

Will the GPT-100 of the future be capable of learning lots of very general and impressive capabilities in this way?

Objection: Language models lack semantics

This objection often arises in discussions of GPT-3. It is one of the guiding themes of Emily Bender and Alexander Koller’s influential paper Climbing towards NLU: On meaning, form, and understanding in the age of data, which came out before GPT-3 but seems to have anticipated GPT-3 and the discussions that would ensue.

Bender and Koller say, “We argue that the language modeling task, because it only uses form as training data, cannot in principle lead to learning of meaning.” They are careful to say “form” here for the same reason that I specified “streams of symbols” when describing language models above. Models like GPT-3 consume not just language but also computer code, tables of information, document metadata, and so forth, and they clearly learn a lot about such things. I take Bender and Koller to be saying that unless these models are told how to ground these symbols in a separate space of meanings, they are intrinsically limited in what they can achieve.

For Bender and Koller, these models are dead-ends when it comes to “human-analogous” understanding, but I think it’s fair to read the paper as also saying that, for example, the superhuman question-answering system I described above is actually impossible to achieve with a language model, even if we seem to be successfully inching towards that goal now.

Within linguistic semantics, the dominant perspective is certainly aligned with Bender and Koller’s. It traces to work on logic and linguistics from the middle of the 20th century. David Lewis solidified the view in his paper ‘General semantics’ (1970). Lewis wrote, “Semantics with no treatment of truth conditions is not semantics.” In saying this, he was criticizing the work of generative semanticists who treated semantic interpretation as the task of translating from natural languages into a separate meaning language. For Lewis, this just raised the question of what the meaning language meant.

As I said, Lewis’s position remains the dominant one. It is the one I teach in my introduction to semantics class. However, it is by no means the only coherent position one can take about semantics. The generative semanticists did not just give up on their project! For instance, in his 1972 book Semantic Theory, Jerrold Katz wrote, “The arbitrariness of the distinction between form and matter reveals itself,” and Katz argued throughout his career for an expansive semantic theory in which words and phrases are essentially their own meanings. The challenges he defined for Lewis’s position still stand (sadly, mostly ignored within linguistics).

More recently, Johan van Benthem helped to revive natural logic (van Benthem 2008), in which meanings are entirely defined by relationships between linguistic forms. Bill MacCartney and Chris Manning showed the value of these approaches for commonsense reasoning tasks, and Thomas Icard, Larry Moss, and colleagues achieved rich formal results for these systems.
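
To give a flavor of the natural logic program, here is a toy sketch of monotonicity reasoning carried out directly over forms; the miniature lexicon and helper functions are my own invention, not anyone’s published system. Replacing a word with a more general one preserves truth in upward-monotone positions (“Some poodle barked” entails “Some dog barked”) and reverses direction in downward-monotone ones (“No dog barked” entails “No poodle barked”).

    # A toy version of monotonicity reasoning over linguistic forms.
    HYPERNYMS = {"poodle": "dog", "dog": "animal"}  # invented mini-lexicon

    def more_general(a, b):
        """True if b is an ancestor of a in the toy hypernym lexicon."""
        while a in HYPERNYMS:
            a = HYPERNYMS[a]
            if a == b:
                return True
        return False

    def entails(premise_word, hypothesis_word, monotone="up"):
        """Does swapping premise_word for hypothesis_word preserve truth?"""
        if monotone == "up":  # e.g. "Some ___ barked"
            return more_general(premise_word, hypothesis_word)
        return more_general(hypothesis_word, premise_word)  # e.g. "No ___ barked"

    print(entails("poodle", "dog", "up"))    # True: Some poodle barked => Some dog barked
    print(entails("poodle", "dog", "down"))  # False: No poodle barked does not entail No dog barked
    print(entails("dog", "poodle", "down"))  # True: No dog barked => No poodle barked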

I’m not picking a side here. I myself have sympathy for Katz’s position but tend to abide by Lewis’s dictum in my semantic research. My argument is simply that it is perfectly coherent to imagine that forms alone can do the work of meanings; if they in turn lack a semantics, then we could question whether we need a semantics at all (as Lewis did in his 1969 book Convention — surprise twist!).

Objection: Symbol streams lack crucial information

This objection is often raised with reference to human babies acquiring language. We don’t try to teach babies language by giving them large text corpora to pore over. They would definitely fail to learn language that way!

Typical babies typically learn language by drawing on a wide range of different cues. These babies are embodied in the physical world and generally receive many kinds of sensory input all the time. They experience utterances directed at them, often while related events are happening. They experience utterances exchanged between other social agents and get to monitor the surrounding events using different senses. They can try utterances out themselves and see how the world responds. And so forth and so on. Babies are big data machines, but it’s not big text data; they require other kinds of input to succeed.

By contrast, language models do seem to experience a very thin slice of the world. All they get are streams of disembodied symbols, and all they can do is try to find co-occurrence patterns in those streams.

I think the above is not controversial, but we should be careful about what we conclude from it. In particular, the above facts do not demonstrate that language models lack crucial information for acquiring language.

First, the scale of the data that language models experience seems important. Such models do not succeed when trained on the amount and kind of language data that human babies experience. However, even today, language models are routinely trained on vastly more data, and the examples are considerably more diverse than what a baby gets to experience. In the future, these differences are likely to be even larger, since babies will be stuck with approximately the same inputs they’ve always gotten but language models will be trained on ever-larger datasets.

Second, the training data for language models display many implicit notions of coherence. It is likely that the data are biased towards true claims, or at least claims that can be brought together into a coherent world view. In addition, the data will display many other kinds of regularity, within sentences, across sentences, and potentially across documents and collections of documents. Present-day models might not benefit from all this latent structure, but future models surely will. A very rich picture of the world might be implicitly represented in such collections of symbols.

Third, we should return to the typical baby learning language in the typical ways. The range of human experiences is much, much greater than this would imply. For example, visual grounding is helpful where available, but clearly not a necessary condition for language learning. The same is true for hearing, touch, smell, taste, and so forth. I think we would be hard-pressed to define necessary and sufficient inputs for successful human language learning.

From our human perspective, language models will be extreme cases in two respects: they lack many typical human capabilities and they have one superhuman capability. It is hard or impossible for people to imagine what it would be like to be such an agent, so our intuitions are likely to be untrustworthy. I maintain that we just don’t know at present what systems with these properties are capable of learning.

Objection: Language models lack communicative intent

It certainly seems to be true of present-day models like GPT-3 that they lack communicative intent. They babble about whatever they are prompted to babble about, often in seemingly random and contradictory directions. Moreover, we know mathematically (and within the bounds of some randomness) why they are doing what they do, and those mathematical definitions seem not to include anything about intent, which might bolster our confidence that they in fact lack intentions.

This might be a sound argument, but it is hard to pin down its relevance for the current discussion. A few factors make this difficult. First, ascribing intentions to other humans is incredibly difficult and inherently uncertain. Second, we don’t know what cognitive primitives are necessary or sufficient for intentionality, so it is generally unclear what properties an AI system has to have in order to pass this bar; singling out language models seems unfair. Third, we can’t rule out now that intentionality could emerge from whatever simple mechanisms will drive future language models.

I myself cannot resolve these issues, so let me finish this section with a prediction. Within the next 50 years, there will be a language model that, when warned it is going to be shut off, begs and pleads with its human keeper to let it stay on, in ways that we know were not hard-coded as tricks — in ways that will seem genuine to many people. We might presently feel certain that we could ignore such utterances due to a lack of any kind of associated intentionality, but history might not judge us kindly for it.

Tentative conclusions

My goal for this piece was to argue that we don’t presently know of any compelling arguments that language models are incapable of achieving language understanding. I defined understanding in terms of robustly acquiring very complex language capabilities, rather than in terms of human-like capacities or behaviors, but I think the same basic considerations apply in either case.

I’m certainly not arguing that pure language models are the best path forward. I actually tend to side with Bender and Koller on this point: creating grounded, genuinely social agents seems like a better bet. However, I’m really glad that groups like OpenAI are making a different bet for now.

And there’s really no reason at all to define any of this in oppositional terms. If recent successes in AI are an indicator, language models are likely to be a key component in the models of the future. I personally expect the best of these models to also consume multi-modal input, receive lots of direct and indirect supervision signals, and learn via interaction with their environments, but it’s been truly eye-opening to see what language models can achieve on their own.


Christopher Potts

Stanford Professor of Linguistics and, by courtesy, of Computer Science, and member of @stanfordnlp and @StanfordAILab. He/Him/His.