transformers

Perceptrons Are All You Need: Google Brain's Multi-Layer Perceptron Rivals Transformers

The paper that introduced the transformer famously declared, “Attention is all you need.” On the contrary, new work shows you may not need transformer-style attention at all. Hanxiao Liu and colleagues at Google Brain introduced gMLP, a simple architecture that performed some language and vision tasks as well as transformers.
2 min read

Ask Me in a Different Way: Prompt Engineering Improves Few-Shot Learning Results

Pretrained language models like GPT-3 have shown notable proficiency in few-shot learning. Given a prompt that includes a few example questions and answers (the shots) plus an unanswered question (the task), such models can generate an accurate answer.
2 min read

Weak Foundations Make Weak Models: Foundation AI Models Pass Flaws to Fine-Tuned Variants

A new study examines a major strain of recent research: huge models pretrained on immense quantities of uncurated, unlabeled data and then fine-tuned on a smaller, curated corpus.
2 min read

Sharper Attention

Self-attention enables transformer networks to track relationships between distant tokens — such as text characters — in long sequences, but the computational resources required grow quadratically with input size. Expire-Span enables attention to ignore tokens that aren’t useful to the task at hand.
2 min read

Transformers: Smarter Than You Think

The transformer architecture has shown an uncanny ability to model not only language but also images and proteins. New research on the Frozen Pretrained Transformer (FPT) found that it can apply what it learns from the first domain to the others.
2 min read

I Know It When I See It

Object detectors typically detect only items that were labeled in their training data. A new method liberates them to locate and recognize a much wider variety of objects.
2 min read

Synthetic Videos on the Double

Using a neural network to generate realistic videos takes a lot of computation. New work, VideoGPT, performs the task efficiently enough to run on a beefy personal computer.
2 min read

One Model for Vision-Language

Researchers have proposed task-agnostic architectures for image classification and for language tasks separately. New work proposes a single architecture for vision-language tasks.
2 min read

What AI Knows About Proteins

Transformer models trained on sequences of amino acids that form proteins have had success classifying and generating viable sequences. New research shows that they also capture information about protein structure.
2 min read

Greener Machine Learning

A new study suggests tactics for machine learning engineers to cut their carbon emissions. Led by David Patterson, researchers at Google and UC Berkeley found that AI developers can shrink a model’s carbon footprint a thousand-fold by streamlining architecture...
1 min read

Image Generation Transformed

A recent generative adversarial network (GAN) produced more coherent images using modified transformers that replaced fully connected layers with convolutional layers. A new GAN achieved a similar end using transformers in their original form.
2 min read

Large Language Models for Chinese

Researchers unveiled competition for GPT-3, the reigning large language model. According to Synced Review, the Beijing Academy of Artificial Intelligence, a research collective funded by the Chinese government, described four models collectively called Wu Dao.
2 min read

Attention for Image Generation

Attention quantifies how each part of one input affects the various parts of another. Researchers behind the GANsformer image generator added a step that reverses this comparison to produce more convincing images.
2 min read

Chatbots Against Depression

A language model is helping crisis-intervention volunteers practice their suicide-prevention skills. The Trevor Project, a nonprofit organization that operates a 24-hour hotline for LGBTQ youth, uses a “crisis contact simulator” to train its staff in how to talk with troubled teenagers.
1 min read

Transformer Variants Head to Head

The transformer architecture has inspired a plethora of variations. Yet researchers have used a patchwork of metrics to evaluate their performance, making them hard to compare. New work aims to level the playing field.
2 min read
