Talking bubbles inside talking bubbles

Bigger is Better

Natural language processing has lately come to resemble an arms race, as the big AI companies build models with ever larger parameter counts. Microsoft recently held the record, but not for long.

Generative BST example and graph

Big Bot Makes Small Talk

Facebook recently rolled out its entry in the World’s Biggest Chatbot sweepstakes. In keeping with the company’s social-networking dominance, the bot is designed to excel at chitchat on any subject.

A chatbot called Meena and a graph comparing it with other chatbot services

Toward Open-Domain Chatbots

Progress in language models is spawning a new breed of chatbots that, unlike their narrow-domain forebears, have the gift of gab. Recent research tests the limits of conversational AI.

Series of images related to Jukebox, a deep learning system by OpenAI

Roll Over, Beyoncé

A new generative model croons like Elvis and raps like Eminem. It might even make you think you’re listening to a lost demo by the Beatles. OpenAI released Jukebox, a deep learning system that has generated thousands of songs in styles from country to metal and soul.

Data related to Reformer

Transformers Transformed

Transformer networks have revolutionized natural language processing, but they hog processor cycles and memory. New research demonstrates a more frugal variation.
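
The cost comes from standard self-attention, which scores every token against every other token, so memory grows quadratically with sequence length. A minimal numpy sketch of plain scaled dot-product attention makes the bottleneck concrete (sizes are hypothetical; Reformer-style methods aim to avoid materializing this matrix):

    import numpy as np

    def full_attention(q, k, v):
        # Plain scaled dot-product attention. The (n, n) score matrix
        # is the memory bottleneck: it grows quadratically with n.
        n, d = q.shape
        scores = q @ k.T / np.sqrt(d)                     # shape (n, n)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
        return weights @ v                                # shape (n, d)

    n, d = 4096, 64                          # hypothetical sizes
    q = k = v = np.random.randn(n, d).astype(np.float32)
    out = full_attention(q, k, v)
    # The float32 score matrix alone takes n * n * 4 bytes:
    print(f"score matrix: {n * n * 4 / 1e6:.0f} MB for n = {n}")   # ~67 MB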

Screen capture of a joke-telling chatbot developed by Google Brain

Bot Comic

Androids may not dream of electric sheep, but some crack jokes about horses and cows. Meena, a 2.6-billion-parameter chatbot developed by Google Brain, showed impressive conversational ability, discussing a variety of topics.

Math equations represented as trees

Neural Networks Study Math

In tasks that involve generating natural language, neural networks often map an input sequence of words to an output sequence of words. Facebook researchers used a similar technique on sequences of mathematical symbols, training a model to map math problems to math solutions.
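
The key step is serializing an expression's syntax tree into a token sequence, much as text is tokenized, so an ordinary sequence-to-sequence model can read and write it. Below is a toy sketch of prefix-order serialization; the Node class and token names are our own minimal stand-ins, not the researchers' code:

    from dataclasses import dataclass
    from typing import List, Union

    @dataclass
    class Node:
        op: str                  # operator, e.g. "add", "mul", "pow"
        left: "Expr"
        right: "Expr"

    Expr = Union[Node, str]      # leaves are variables or constants

    def to_prefix(e: Expr) -> List[str]:
        # Flatten an expression tree into prefix-order tokens.
        if isinstance(e, str):
            return [e]
        return [e.op] + to_prefix(e.left) + to_prefix(e.right)

    # x**2 + 3*x  ->  "add pow x 2 mul 3 x"
    expr = Node("add", Node("pow", "x", "2"), Node("mul", "3", "x"))
    print(" ".join(to_prefix(expr)))
    # A seq2seq model is then trained to map problem sequences
    # (say, an integrand) to solution sequences (an antiderivative).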

Excerpt from The Squire, an AI-written short film

Here Be Dragons

AI is contributing to paintings, music, and now a whimsical fantasy video. The Squire is an amateur romp through a snowy realm of knights in armor and damsels in distress. The script was composed by AI Dungeon 2, an interactive text-adventure game based on the GPT-2 language model.

Single Headed Attention RNN (SHA-RNN)

Language Modeling on One GPU

The latest large, pretrained language models rely on trendy layers based on transformer networks. New research shows that these newfangled layers may not be necessary.
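
The model in question, Merity's Single Headed Attention RNN (SHA-RNN), pairs a recurrent network with just one attention head. As a rough sketch of the shape of that computation, here is a single causal attention head over stand-in LSTM outputs; the random projections and sizes are our placeholders, not the paper's:

    import numpy as np

    def single_head_attention(h, d_k=64, seed=0):
        # One causal attention head over hidden states h of shape (n, d).
        # In SHA-RNN this sits on top of an LSTM; the projection
        # matrices here are random placeholders.
        n, d = h.shape
        rng = np.random.default_rng(seed)
        Wq, Wk, Wv = (rng.standard_normal((d, d_k)) / np.sqrt(d)
                      for _ in range(3))
        q, k, v = h @ Wq, h @ Wk, h @ Wv
        scores = q @ k.T / np.sqrt(d_k)
        scores += np.triu(np.full((n, n), -1e9), k=1)   # mask future positions
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        return w @ v                                    # shape (n, d_k)

    h = np.random.randn(128, 512)            # stand-in for LSTM outputs
    print(single_head_attention(h).shape)    # (128, 64)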

Yann LeCun

Yann LeCun: Learning From Observation

How is it that many people learn to drive a car fairly safely in 20 hours of practice, while current imitation learning algorithms take hundreds of thousands of hours, and reinforcement learning algorithms take millions of hours? Clearly we’re missing something big.

Illustration of a fireplace with "Happy holidays" cards in English, Spanish and French

Language Models Get Literate

Earlier language models powered by Word2Vec and GloVe embeddings yielded confused chatbots, grammar tools with middle-school reading comprehension, and not-half-bad translations. The latest generation is so good, some people consider it dangerous.

Sesame Street characters together

Inside AI’s Muppet Empire

As language models show increasing power, a parallel trend has received less notice: The vogue for naming models after characters in the children’s TV show Sesame Street.

Automatically generated text summary

Keeping the Facts Straight

Automatically generated text summaries are becoming common in search engines and news websites. But existing summarizers often mix up facts. For instance, a victim’s name might get switched for the perpetrator’s.

Comparison between TrXL and GTrXL

Melding Transformers with RL

Large NLP models like BERT can answer questions about a document thanks to the transformer network, a sequence-processing architecture that retains information across much longer sequences than previous methods. But transformers have had little success in reinforcement learning — until now.
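
The fix reported in this work (GTrXL) swaps the transformer's additive residual connections for GRU-style gates biased so that each layer starts out near the identity function, which helps RL training get off the ground. A schematic numpy sketch of such a gate, with simplified, hypothetical parameter shapes:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def gated_residual(x, y, Wr, Ur, Wz, Uz, Wg, Ug, bg=2.0):
        # GRU-style gate merging a sublayer's input x with its output y,
        # in place of the usual residual connection x + y. The bias bg
        # pushes the update gate toward 0 at initialization, so the
        # layer initially passes x through almost unchanged.
        r = sigmoid(y @ Wr + x @ Ur)          # reset gate
        z = sigmoid(y @ Wz + x @ Uz - bg)     # update gate, biased low
        h = np.tanh(y @ Wg + (r * x) @ Ug)    # candidate activation
        return (1 - z) * x + z * h

    d = 8
    rng = np.random.default_rng(0)
    Ws = [rng.standard_normal((d, d)) * 0.1 for _ in range(6)]
    x, y = rng.standard_normal((4, d)), rng.standard_normal((4, d))
    out = gated_residual(x, y, *Ws)
    print(np.abs(out - x).mean())   # small: gate is near-identity at init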

Proposed model for abstractive summarization of a scientific article

Two Steps to Better Summaries

Summarizing a document in its own words is a longstanding problem in natural language processing. Researchers recently took a step toward human-level performance in this task, known as abstractive summarization (as opposed to extractive summarization, which copies passages from the source).