Math equations represented as trees

Neural Networks Study Math: A sequence-to-sequence model solves math problems.

In tasks that involve generating natural language, neural networks often map an input sequence of words to an output sequence of words. Facebook researchers used a similar technique on sequences of mathematical symbols, training a model to map math problems to math solutions.
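The caption above hints at how the problems become learnable: each equation is represented as a tree, then serialized into a sequence of tokens that a translation-style model can map to the serialized solution. Here is a minimal sketch of that serialization in prefix notation (an illustration of the idea, not the researchers' code):

```python
# Illustrative sketch: flattening an expression tree into a prefix-notation
# token sequence, the kind of input a sequence-to-sequence model consumes.

def to_prefix(node):
    """Recursively serialize an expression tree into prefix tokens."""
    if isinstance(node, tuple):        # interior node: (operator, operand, ...)
        op, *children = node
        tokens = [op]
        for child in children:
            tokens.extend(to_prefix(child))
        return tokens
    return [str(node)]                 # leaf: a variable or constant

# The derivative of x * cos(x) with respect to x, written as a tree:
expr = ("diff", ("*", "x", ("cos", "x")), "x")
print(to_prefix(expr))  # ['diff', '*', 'x', 'cos', 'x', 'x']
```

Once problems and solutions are flattened this way, the task looks like machine translation between two languages of mathematical symbols.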
Excerpt from The Squire, an AI-written short film

Here Be Dragons: AI Dungeon 2 generated the script for a short movie.

AI is contributing to paintings, music, and now a whimsical fantasy video. The Squire is an amateur romp through a snowy realm of knights in armor and damsels in distress. The script was composed by AI Dungeon 2, an interactive text-adventure game based on the GPT-2 language model.
Single Headed Attention RNN (SHA-RNN)

Language Modeling on One GPU: Single-headed attention competes with transformers.

The latest large, pretrained language models rely on trendy layers based on transformer networks. New research shows that these newfangled layers may not be necessary.
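For reference, single-headed attention boils down to one scaled dot-product over the sequence. A minimal sketch (illustrative only, not the paper's implementation):

```python
# Illustrative sketch of single-headed scaled dot-product attention.
import numpy as np

def single_head_attention(q, k, v):
    """q, k, v: arrays of shape (T, d). Returns attended values, shape (T, d)."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                    # (T, T) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ v                               # weighted sum of values

T, d = 5, 8
x = np.random.randn(T, d)
print(single_head_attention(x, x, x).shape)  # (5, 8) self-attention output
```

Computing a single (T, T) attention map rather than one per head is part of the paper's case for training competitive language models on a single GPU.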
Yann LeCun

Yann LeCun — Learning From Observation: The power of self-supervised learning

How is it that many people learn to drive a car fairly safely in 20 hours of practice, while current imitation learning algorithms take hundreds of thousands of hours, and reinforcement learning algorithms take millions of hours? Clearly we’re missing something big.
Illustration of a fireplace with "Happy holidays" cards in English, Spanish and French

Natural Language Processing Models Get Literate: Why 2019 was a breakthrough year for NLP

Earlier language models powered by Word2Vec and GloVe embeddings yielded confused chatbots, grammar tools with middle-school reading comprehension, and not-half-bad translations. The latest generation is so good, some people consider it dangerous.
Sesame Street characters together

Inside AI’s Muppet Empire: Why Are So Many NLP Models Named After Muppets?

As language models show increasing power, a parallel trend has received less notice: The vogue for naming models after characters in the children’s TV show Sesame Street.
Automatically generated text summary from FactCC with misleading facts highlighted in different colors.

Keeping the Facts Straight: NLP system FactCC fact-checks automatically generated summaries.

Automatically generated text summaries are becoming common in search engines and news websites. But existing summarizers often mix up facts. For instance, a victim’s name might get switched for the perpetrator’s.
Comparison between TrXL and GTrXL

Melding Transformers with RL

Large NLP models like BERT can answer questions about a document thanks to the transformer network, a sequence-processing architecture that retains information across much longer sequences than previous methods. But transformers have had little success in reinforcement learning — until now.
Proposed model for abstractive summarization of a scientific article

Two Steps to Better Summaries

Summarizing a document in newly composed words is a longstanding problem in natural language processing. Researchers recently took a step toward human-level performance in this task, known as abstractive summarization (as opposed to extractive summarization, which copies key sentences verbatim from the source).
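To make the distinction concrete, here is a toy extractive summarizer that merely copies the highest-scoring sentence verbatim; an abstractive system generates new wording instead (a hypothetical example, unrelated to the model in the story):

```python
# Toy extractive summarizer: score sentences by average word frequency and
# copy the best one verbatim. Abstractive summarization would instead
# generate a new sentence in its own words.
from collections import Counter
import re

def extract_top_sentence(text):
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"\w+", text.lower()))
    def score(s):
        tokens = re.findall(r"\w+", s.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)
    return max(sentences, key=score)

doc = ("Transformers changed NLP. Transformers also reshaped summarization. "
       "The weather was nice.")
print(extract_top_sentence(doc))  # copies one source sentence word for word
```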
Pipeline for identifying sentences containing evidence of supplement-drug interactions (SDIs) and supplement-supplement interactions (SSIs)

Hidden Findings Revealed

Drugs undergo rigorous experimentation and clinical trials to gain regulatory approval, while dietary supplements get less scrutiny. Even when a drug study reveals an interaction with supplements, the discovery tends to receive little attention.
GPT-2 text generator

Putting Text Generators on a Leash

Despite dramatic recent progress, natural language generation remains an iffy proposition. Even users of the muscular GPT-2 text generator have to press the button a number of times to get sensible output. But researchers are figuring out how to exert greater control over generated text.
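One generic lever for reining in a generator is to constrain its sampling distribution, for instance by sampling only from the k most likely next tokens. A minimal sketch of top-k sampling (a common baseline, not the specific methods covered in the story):

```python
# Illustrative sketch of top-k sampling with a temperature knob.
import numpy as np

def top_k_sample(logits, k=5, temperature=0.8):
    """Sample a token id from the k highest-scoring candidates."""
    rng = np.random.default_rng()
    top = np.argsort(logits)[-k:]           # indices of the k largest logits
    scaled = logits[top] / temperature      # <1 sharpens, >1 flattens
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()                    # softmax over the top k only
    return rng.choice(top, p=probs)

vocab_logits = np.random.randn(50)  # stand-in for a model's next-token logits
print(top_k_sample(vocab_logits))   # id of the sampled token
```

Smaller k and lower temperature trade diversity for coherence, one crude form of the control the researchers are after.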
Graph related to Language Model Analysis (LAMA)

What Language Models Know

Watson set a high bar for language understanding in 2011, when it famously whipped human competitors in the televised trivia game show Jeopardy! IBM’s special-purpose AI cost around $1 billion to develop. Research suggests that today’s best language models can accomplish similar tasks right off the shelf.
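The LAMA benchmark behind the graph above probes this by posing cloze statements to a masked language model and checking whether it fills in the fact. A small sketch of the idea using the Hugging Face transformers library (the model choice here is illustrative):

```python
# Cloze-style factual probing in the spirit of LAMA.
# Requires: pip install transformers
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for candidate in fill("The capital of France is [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
# A model that "knows" the fact ranks "paris" at or near the top.
```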
OpenAI's GPT-2 results

How to Share Dangerous AI

OpenAI raised eyebrows in February when it announced — and withheld — the full version of its groundbreaking language model, GPT-2. Six months later, the company has re-examined the decision.
Bert and Ernie from Sesame Street

BERT Is Back

Less than a month after XLNet overtook BERT, the pole position in natural language understanding changed hands again. RoBERTa is an improved BERT pretraining recipe that beats its forebear, becoming the new state-of-the-art language model — for the moment.
