Animation showing example questions and answers obtained by a pretrained language model

Ask Me in a Different Way: Prompt Engineering Improves Few-Shot Learning Results

Pretrained language models like GPT-3 have shown notable proficiency in few-shot learning. Given a prompt that includes a few example questions and answers (the shots) plus an unanswered question (the task), such models can generate an accurate answer.
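A minimal sketch of how such a prompt can be assembled (the question-answer pairs and the Q:/A: format below are illustrative, not a prescribed template):

    # Build a few-shot prompt: example Q/A pairs (the shots) followed
    # by the unanswered question (the task). Pairs are illustrative.
    shots = [
        ("What is the capital of France?", "Paris"),
        ("What is the capital of Japan?", "Tokyo"),
    ]
    task = "What is the capital of Canada?"

    prompt = "".join(f"Q: {q}\nA: {a}\n\n" for q, a in shots)
    prompt += f"Q: {task}\nA:"  # the model completes the answer
    print(prompt)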
Series of images showing some of the findings of the new study by researchers at Stanford’s Institute for Human-Centered AI

Weak Foundations Make Weak Models: Foundation AI Models Pass Flaws to Fine-Tuned Variants

A new study examines a major strain of recent research: huge models pretrained on immense quantities of uncurated, unlabeled data and then fine-tuned on a smaller, curated corpus.
Information about a new unsupervised pretraining method called VICReg

More Reliable Pretraining: Pretraining Method Helps AI Learn Useful Representations

Pretraining methods generate basic representations for later fine-tuning, but they’re prone to collapse, in which the network produces the same representation for every input. New work proposes a solution.
Sequence of scenes from famous arcade games

Solve RL With This One Weird Trick: How to Get Better Performance From Reinforcement Learning

The previous state-of-the-art model for playing vintage Atari games took advantage of a number of advances in reinforcement learning (RL). The new champion is a basic RL architecture plus a trick borrowed from image generation.
Graph showing Expire-Span, which enables attention to ignore tokens that aren’t useful to the task at hand

Sharper Attention: NLP Transformer Technique for More Efficient Token Usage

Self-attention enables transformer networks to track relationships between distant tokens — such as text characters — in long sequences, but the computational resources required grow quadratically with input size.
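A minimal NumPy sketch of scaled dot-product self-attention makes that cost visible: the score matrix has one entry per pair of tokens (dimensions here are arbitrary toy values):

    # Scaled dot-product self-attention in NumPy. The (n, n) score
    # matrix pairs every token with every other token, so memory and
    # compute grow quadratically with sequence length n.
    import numpy as np

    n, d = 8, 16                       # toy sequence length and width
    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))

    scores = Q @ K.T / np.sqrt(d)      # shape (n, n)
    scores -= scores.max(axis=-1, keepdims=True)   # stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    output = weights @ V               # shape (n, d)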
Image recognition examples

Smaller Models, Bigger Biases

Compression methods like parameter pruning and quantization can shrink neural networks for use in devices like smartphones with little impact on accuracy — but they also exacerbate a network’s bias.
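A rough sketch of both steps on a single weight matrix, with an illustrative 50 percent sparsity target and 8-bit quantization (not the specific settings any particular study used):

    # Two common compression steps on a weight matrix: magnitude
    # pruning (zero out small weights) and uniform 8-bit quantization.
    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(size=(256, 256)).astype(np.float32)

    # Prune: keep only the largest-magnitude half of the weights.
    threshold = np.quantile(np.abs(W), 0.5)
    W_pruned = np.where(np.abs(W) >= threshold, W, 0.0)

    # Quantize: map remaining weights to 256 evenly spaced levels.
    scale = np.abs(W_pruned).max() / 127.0
    W_int8 = np.round(W_pruned / scale).astype(np.int8)
    W_dequant = W_int8.astype(np.float32) * scale  # used at inference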
Graphs, images and data related to the activation function known as ReLU

Upgrade for ReLU

The activation function known as ReLU builds complex nonlinear functions across layers of a neural network, making functions that outline flat faces and sharp edges. But how much of the world breaks down into perfect polyhedra?
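A brief illustration of the contrast: ReLU has a hard kink at zero, while a smooth activation such as softplus rounds it off (softplus stands in here for a generic smooth alternative, not necessarily the upgrade the new work proposes):

    # ReLU is piecewise linear, so a ReLU network's output is built
    # from flat pieces joined at sharp edges; a smooth activation
    # rounds off the corners. Softplus is one illustrative example.
    import numpy as np

    def relu(x):
        return np.maximum(x, 0.0)

    def softplus(x):
        return np.log1p(np.exp(x))  # smooth approximation of ReLU

    x = np.linspace(-3, 3, 7)
    print(relu(x))      # exact zero below the kink, linear above it
    print(softplus(x))  # smooth everywhere, no sharp corner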
Simpler multilayer neural network

Revenge of the Perceptrons

Why use a complex model when a simple one will do? New work shows that the simplest multilayer neural network, with a small twist, can perform some tasks as well as today’s most sophisticated architectures.
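The sketch below captures the flavor of such all-MLP designs under simplifying assumptions (random weights, no normalization, illustrative sizes): one MLP mixes information across patches, which is the twist, and another mixes across channels:

    # All-MLP block in the spirit of recent MLP-only vision models:
    # one MLP mixes across patches (the twist), another across channels.
    import numpy as np

    rng = np.random.default_rng(0)
    patches, channels, hidden = 16, 32, 64
    X = rng.normal(size=(patches, channels))

    def mlp(x, d_in, d_out, d_hidden):
        W1 = rng.normal(size=(d_in, d_hidden)) * 0.1
        W2 = rng.normal(size=(d_hidden, d_out)) * 0.1
        return np.maximum(x @ W1, 0.0) @ W2   # one hidden ReLU layer

    X = X + mlp(X.T, patches, patches, hidden).T   # mix across patches
    X = X + mlp(X, channels, channels, hidden)     # mix across channels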
Frozen Pretrained Transformer (FPT) explained

Transformers: Smarter Than You Think

The transformer architecture has shown an uncanny ability to model not only language but also images and proteins. New research found that it can apply what it learns from the first domain to the others.
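The Frozen Pretrained Transformer named in the caption points at the core idea: freeze the pretrained transformer’s weights and train only small input and output layers on the new domain. Below is a minimal sketch using an untrained stand-in model and illustrative sizes, not the paper’s exact configuration:

    # Sketch of the frozen-transformer idea: keep the pretrained core
    # fixed and train only a small input projection and output head.
    # The encoder here is an untrained stand-in; sizes are illustrative.
    import torch
    import torch.nn as nn

    core = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=64, nhead=4), num_layers=2)
    for p in core.parameters():
        p.requires_grad = False         # freeze self-attention and FFN

    embed = nn.Linear(16, 64)           # trainable: map new domain in
    head = nn.Linear(64, 10)            # trainable: map features out

    x = torch.randn(8, 32, 16)          # (batch, sequence, features)
    h = core(embed(x).transpose(0, 1))  # encoder expects (seq, batch, dim)
    y = head(h.mean(dim=0))             # (batch, 10) predictions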
Series of images showing how single trained network generates 3D reconstructions of multiple scenes

One Network, Many Scenes

To reconstruct the 3D world behind a set of 2D images, machine learning systems usually require a dedicated neural network for each scene. New research enables a single trained network to generate 3D reconstructions of multiple scenes.
Image showing how object detectors work

I Know It When I See It

Object detectors typically detect only items that were labeled in their training data. A new method liberates them to locate and recognize a much wider variety of objects.
A four-legged robot walking over difficult and changing terrain

Walking the Dog

A reinforcement learning system enabled a four-legged robot to amble over unfamiliar, rapidly changing terrain.
Automated player learning by watching recorded gameplay

Behavioral Cloning Shootout

Neural networks have learned to play video games like Dota 2 via reinforcement learning by playing for the equivalent of thousands of years (compressed into far less time). In new work, an automated player learned not by playing for millennia but by watching a few days’ worth of recorded gameplay.
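Behavioral cloning reduces to supervised learning on recorded (observation, action) pairs. A minimal PyTorch sketch with placeholder tensors standing in for real gameplay data:

    # Behavioral cloning: fit a policy to recorded (observation,
    # action) pairs with a supervised loss. Data and network sizes
    # below are placeholders, not actual gameplay recordings.
    import torch
    import torch.nn as nn

    policy = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 8))
    optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    observations = torch.randn(256, 64)    # stand-in for encoded frames
    actions = torch.randint(0, 8, (256,))  # stand-in for recorded inputs

    for _ in range(10):                    # a few gradient steps
        loss = loss_fn(policy(observations), actions)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()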
Few-shot Learning with a Universal Template (FLUTE)

Pattern for Efficient Learning

Getting high accuracy out of a classifier trained on a small number of examples is tricky. You might train the model on several large-scale datasets prior to few-shot training, but what if the few-shot dataset includes novel classes? A new method performs well even in that case.
AI generated videos and VideoGPT training pipeline

Synthetic Videos on the Double

Using a neural network to generate realistic videos takes a lot of computation. New work performs the task efficiently enough to run on a beefy personal computer.
