Yoav Shoham: Language models that reason

Published Dec 29, 2021

I believe that natural language processing in 2022 will re-embrace symbolic reasoning, harmonizing it with the statistical operation of modern neural networks. Let me explain what I mean by this.

AI has been undergoing a natural language revolution for the past half decade, and this will continue into 2022 and well beyond. Fueling the revolution are so-called large language models (sometimes called foundation models), huge neural networks pretrained on gigantic corpora that encode rich information about not only language but also the world as described by language. Models such as GPT-3 (OpenAI), Jurassic-1 (AI21 Labs), Megatron-Turing NLG (Microsoft-Nvidia), and WuDao 2.0 (Beijing Academy of Artificial Intelligence), to name some of the largest ones, perform impressively well on a variety of natural language tasks from translation to paraphrasing. These models dominate academic leaderboards and are finding their way into compelling commercial applications.

For all the justified excitement around large language models, they have significant shortcomings. Perhaps most notably, they don’t exhibit true understanding of any kind. They are, at heart, statistical behemoths that can guess sentence completions or missing words surprisingly well, but they don’t understand (nor can they explain) these guesses, and when the guesses are wrong — which is often — they can be downright ridiculous.

Take arithmetic. GPT-3 and Jurassic-1 can perform one- and two-digit addition well. This is impressive, as these general-purpose models were not trained with this task in mind. But ask them to add 1,123 to 5,813 and they spit out nonsense. And why would they not? None of us learned addition merely by observing examples; we were taught the underlying principles.
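To make "the underlying principles" concrete, here is a minimal Python sketch of the column-by-column addition procedure taught in school. It is only an illustration of a symbolic rule that works for numbers of any length; the function name `add_by_columns` is hypothetical, and nothing here describes how GPT-3, Jurassic-1, or any neuro-symbolic system is actually built.

```python
def add_by_columns(a: str, b: str) -> str:
    """Add two non-negative decimal integers given as digit strings.

    Works right to left, one column at a time, carrying into the next
    column, exactly as in schoolbook addition.
    """
    # Drop thousands separators so "1,123" is treated as "1123".
    a = a.replace(",", "")
    b = b.replace(",", "")

    # Pad the shorter number with leading zeros so the columns line up.
    width = max(len(a), len(b))
    a = a.zfill(width)
    b = b.zfill(width)

    carry = 0
    digits = []
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        digits.append(str(total % 10))  # digit written in this column
        carry = total // 10             # amount carried to the next column

    if carry:
        digits.append(str(carry))

    return "".join(reversed(digits))


print(add_by_columns("1,123", "5,813"))  # 6936
```

Because the rule operates on digits rather than on memorized examples, the same few lines handle four-digit or forty-digit operands equally well, and each intermediate carry can be inspected as an explanation of the result.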

What’s missing is reasoning, and math is just an example. We reason about time, space, causality, knowledge, belief, and so on by using symbols that carry meaning and by performing inference over those symbols. Such abstract symbols and reasoning don’t emerge from the statistics encoded in the weights of a trained neural network.

The new holy grail is to inject this sort of semantic, symbolic reasoning into the statistical operation of the neural machinery. My co-founders and I started AI21 Labs with this mission, and we’re not alone. So-called neuro-symbolic models are the focus of much recent (and some less-recent) research. I expect that 2022 will see significant advances in this area.

The result will be models that can perform tasks such as mathematical, relational, and temporal reasoning reliably. No less important, since the models will have access to symbolic reasoning, they will be able to explain their answers in a way that we can understand. This robustness and explainability will help move natural language processing from the current era of statistical pattern recognition into an era of trustworthy, understandable AI. This is not only intellectually exciting; it also unlocks practical applications in domains in which trustworthiness is essential, such as finance, law, and medicine.

The year 2022 likely will not mark the end of the quest for such models, but I believe that it may be recognized as a pivotal year in this quest.

Yoav Shoham is a co-founder of AI21 Labs and professor emeritus of computer science at Stanford University.
