Data showing how new pretrained language models might learn facts like weight and cost

The Measure of a Muppet

The latest pretrained language models have shown a remarkable ability to learn facts. A new study drills down on issues of scale, showing that such models might learn the approximate weight of a dog or cost of an apple, at least to the right order of magnitude.
Examples of contrastive learning

Learning From Words and Pictures

It’s expensive to pay doctors to label medical images, and the relative scarcity of high-quality training examples can make it hard for neural networks to learn features that make for accurate diagnoses.
Data related to Nvidia's Pay Attention When Required (Par) approach

Selective Attention

Large transformer networks work wonders with natural language, but they require enormous amounts of computation. New research slashes processor cycles without compromising performance.
Proof Search Tree

The Proof Is in the Network: An NLP Transformer Based on GPT that Creates Math Proofs

OpenAI’s Generative Pre-Trained Transformer (GPT) architecture has created coherent essays, images, and code. Now it generates mathematical proofs as well.
AI medical chatbot having a conversation with a patient

GPT-3 Is No MD

The world’s most sophisticated language model won’t replace your doctor anytime soon. Researchers at Nabla, an AI-enabled healthcare platform, found that GPT-3 lacks the logical reasoning skills to be a useful medical chatbot.
Illustration of two witches with half a pumpkin each and the moon behind them

The AI Community Splinters

Will international rivalries fragment global cooperation in machine learning? Countries competing for AI dominance will lash out at competitors.
Graphs related to different attention mechanisms

More Efficient Transformers

As transformer networks move to the fore in applications from language to vision, the time it takes them to crunch longer sequences becomes a more pressing issue. A new method lightens the computational load using sparse attention.
Graphs with data related to Microsoft's library DeepSpeed

Toward 1 Trillion Parameters: DeepSpeed PyTorch Library Supports Large AI and NLP Models

An open source library could spawn trillion-parameter neural networks and help small-time developers build big-league models. Microsoft upgraded DeepSpeed, a library that accelerates the PyTorch deep learning framework.
Bert (muppet) and information related to BERT (transformer-based machine learning technique)

Do Muppets Have Common Sense?

Two years after it pointed a new direction for language models, Bert still hovers near the top of several natural language processing leaderboards. A new study considers whether Bert simply excels at tracking word order or learns something closer to common sense.
Graphs and data related to transformer networks

The Transformation Continues

Transformer networks are gaining popularity as a high-accuracy alternative to recurrent neural networks. But they can run slowly when they’re applied to long sequences.
Graphs and data related to language models and image processing

Transforming Pixels

Language models like Bert, Ernie, and Elmo have achieved spectacular results based on clever pre-training approaches. New research applies some of those Sesame Street lessons to image processing.
Examples of clothes image-text combo search

That Online Boutique, But Smarter

Why search for “a cotton dress shirt with button-down collar, breast pockets, barrel cuffs, scooped hem, and tortoise shell buttons in grey” when a photo and the words “that shirt, but grey” will do the trick? A new network understands the image-text combo.
Examples and explanation of an automatic headline generation

AI Makes Headlines: An NLP System for Automatically Generating Headlines

Which headline was written by a computer? A: FIFA to Decide on 2022 World Cup in March B: Decision in March on 48-team 2022 World Cup, Says Infantino
Examples of detection of animals in images using Detection Transformer (DETR).

Computer Vision Transformed

The transformer architecture that has shaken up natural language processing may replace recurrent layers in object detection networks. A Facebook team led by Nicolas Carion and Francisco Massa simplified object detection pipelines by using transformers, yielding Detection Transformer (DETR).
Virtual bot speaking

Bots Don’t Need Social Distancing

A chatbot is providing companionship for the locked-down and lonely. Downloads of Replika, a chatbot designed to be a virtual friend, have spiked during the coronavirus pandemic, reports the New York Times.
