Earlier language models powered by Word2Vec and GloVe embeddings yielded confused chatbots, grammar tools with middle-school reading comprehension, and not-half-bad translations. The latest generation is so good, some people consider it dangerous.

What happened: A new breed of language models wrote news that readers rated as credible as the New York Times and contributed to an article in the New Yorker. Happily, these models didn’t fulfill fears that they would unleash a dark tide of disinformation.

Driving the story: In 2019, researchers made a leap in natural language performance. The new models first become generally proficient by pretraining on a huge, unlabeled dataset. Then they master a given task or subject matter via fine-tuning on a smaller, specialized corpus.

  • While earlier models like ULMFiT (by Jeremy Howard and Sebastian Ruder) and ELMo (from the Allen Institute for AI and University of Washington) demonstrated pretraining’s potential, Google’s BERT was the method’s first breakout success. Released in late 2018, BERT scored so high on the GLUE language-understanding benchmark that, for the first time, the test’s organizers compared a model’s performance against human baseline scores. In June, a Microsoft derivative called MT-DNN beat the human scores.
  • In mid-February, OpenAI announced GPT-2, a pretrained model it deemed too dangerous to release due to its ability to churn out convincing computer-generated prose. Trained on 40GB of text from web pages linked on Reddit, it didn’t fuel a fake-news apocalypse, but it did contribute to a novel, to avant-garde song lyrics, and to Game of Thrones fan fiction. The organization finally published the full model in November.
  • In between, a parade of models from Baidu, Carnegie Mellon and Google Brain, Facebook, and elsewhere topped the NLP benchmarks in turn. Many were based on the transformer architecture and took advantage of BERT-style bidirectional encoding.
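The pretrain-then-fine-tune recipe can be sketched with a deliberately tiny, hypothetical example. This toy code bears no resemblance to how BERT or GPT-2 are actually trained; it only illustrates the two phases: learn general word statistics from unlabeled text, then reuse them while learning a specific task from a small labeled set.

```python
from collections import Counter

def pretrain(unlabeled_texts):
    """Phase 1: learn general-purpose word statistics from unlabeled text."""
    counts = Counter(w for t in unlabeled_texts for w in t.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}  # word frequencies

def fine_tune(freqs, labeled_examples):
    """Phase 2: learn task-specific word weights, using the pretrained
    frequencies to downweight common (hence uninformative) words."""
    weights = Counter()
    for text, label in labeled_examples:
        for w in text.split():
            # Words the pretrained model saw rarely get larger task weights.
            weights[w] += (1 if label == 1 else -1) / (freqs.get(w, 0.0) + 1e-6)
    return weights

def predict(weights, text):
    # Positive total score -> class 1, otherwise class 0.
    return 1 if sum(weights[w] for w in text.split()) > 0 else 0

unlabeled = ["the movie was long", "the food was cold", "the day was fine"]
labeled = [("great movie", 1), ("awful food", 0)]
model = fine_tune(pretrain(unlabeled), labeled)
print(predict(model, "great day"))  # prints 1
```

The point of the split is economy: the expensive, general phase runs once on plentiful unlabeled data, while the cheap, task-specific phase needs only a handful of labeled examples.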

Behind the news: In July 2018 — months before BERT came out — DeepMind researcher Sebastian Ruder anticipated pretraining’s impact on natural language processing. Further, he predicted that breakthroughs in NLP would revolutionize AI as a whole. He based his argument on the energizing effect pretrained vision models had on computer vision circa 2012, the moment many in the field identify as the start of the deep learning explosion.

Where things stand: Despite the year’s innovations, language models still have room to grow: Even GPT-2’s 1.5 billion parameters often spit out gobbledygook. As for whether the latest models are capable of disrupting democracy with potent disinformation: U.S. election season is coming up fast.
