Writer’s Unblock

Language models keep getting bigger and better.

Published

Dec 23, 2020

Neural networks for natural language processing got bigger, more prolific, and more fun to play with.

What happened: Language models, which already had grown to gargantuan size, continued to swell, yielding chatbots that mimic AI luminaries and have very strange ideas about horses.

Driving the story: OpenAI’s colossal 175-billion-parameter text generator GPT-3 showcased ongoing progress in natural language processing. It also exemplified widespread trends in machine learning: an exponential rise in parameter counts, the growing prevalence of unsupervised learning, and increasing generalization.

GPT-3 writes more coherent text than its predecessor, GPT-2 — so much so that tricksters used it to produce blog articles and Reddit comments that fooled human audiences. Other users showed off the technology’s inventiveness in unique ways, such as drafting philosophical essays and inventing conversations with historical figures.
Language modeling boosted tools for businesses, for instance by helping Apple’s autocorrect differentiate among languages, enabling Amazon’s Alexa to follow shifts in conversation, and updating the DoNotPay robot lawyer to file lawsuits against telemarketers who unlawfully call U.S. citizens.
Meanwhile, OpenAI trained GPT-2 on pixel data to produce iGPT, which is capable of filling in partially obscured pictures to generate images of uncanny weirdness.

Where things stand: In language models, bigger clearly is better — but it doesn’t stop there. iGPT portends models trained on both images and words. Such models, which are in the works at OpenAI, at least, may be smarter, and weirder, than the giant language models of 2020.

Learn more: Our NLP special issue includes stories about counteracting bias in word embeddings, making conversation, and choosing the right words, plus an exclusive interview with NLP pioneer Noam Shazeer. Learn how to build your own NLP models in the Natural Language Processing Specialization on Coursera.