Animation showing 3 main types of data augmentation and random cropping of a picture
Machine Learning Research

Cookbook for Vision Transformers: A Formula for Training Vision Transformers

Vision Transformers (ViTs) are overtaking convolutional neural networks (CNNs) in many vision tasks, but procedures for training them are still tailored for CNNs. New research investigated how various training ingredients affect ViT performance.
Animated overview of PP-Matting
Machine Learning Research

Automating Mattes for Visual Effects: New ML Method Produces Image Mattes More Easily

Researchers at Baidu introduced PP-Matting, an architecture that, given an image, estimates the transparency of pixels surrounding foreground objects to create mattes without requiring additional input.
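A matte is a per-pixel alpha map that blends foreground against background. The standard compositing equation it feeds into can be sketched in a few lines of numpy (an illustrative sketch, not PP-Matting's code):

```python
import numpy as np

def composite(fg, bg, alpha):
    """Blend a foreground over a background using a per-pixel alpha matte.

    alpha values lie in [0, 1]: 1 means pure foreground, 0 pure background.
    """
    return alpha[..., None] * fg + (1 - alpha[..., None]) * bg

fg = np.ones((2, 2, 3))        # white foreground
bg = np.zeros((2, 2, 3))       # black background
alpha = np.full((2, 2), 0.5)   # half-transparent matte
out = composite(fg, bg, alpha) # every pixel blends to 0.5 gray
```

A matting model like PP-Matting estimates `alpha` from the image; compositing then swaps in any background.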
Information related to Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC)
Machine Learning Research

Update Any Language Model: New Method to Update Pretrained Language Models

The ability to update language models is essential to incorporate new information and correct undesirable behaviors. Previous methods are unwieldy and often fail as the amount of new data increases. New work offers a workaround.
Illustration shows different self-attention mechanisms used by Transformer-based AI models.
Machine Learning Research

Attention to Rows and Columns: Altering Transformers' Self-Attention Mechanism for Greater Efficiency

A new approach alters transformers' self-attention mechanism to balance computational efficiency with performance on vision tasks.
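One common way to trade full self-attention for efficiency is axial attention: attend within each row of the feature map, then within each column, so cost grows with row and column lengths rather than their product. A rough numpy sketch of that idea (the paper's exact mechanism may differ, and learned projections are omitted):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(x):
    """Self-attention along rows, then columns, of an (H, W, D) feature map.

    For simplicity, queries, keys, and values are the features themselves;
    a real model would apply learned linear projections first.
    """
    h, w, d = x.shape
    # Row attention: each position attends to the others in its row.
    rows = softmax(x @ x.transpose(0, 2, 1) / np.sqrt(d)) @ x
    # Column attention: transpose so columns become rows, repeat, undo.
    cols = rows.transpose(1, 0, 2)
    cols = softmax(cols @ cols.transpose(0, 2, 1) / np.sqrt(d)) @ cols
    return cols.transpose(1, 0, 2)

out = axial_attention(np.random.default_rng(0).normal(size=(4, 6, 8)))
```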
Machine Learning Research

Object-Detection Transformers Simplified: New Research Improves Object Detection With Vision Transformers

ViTDet, a new system from Facebook, adds an object detector to a plain pretrained transformer.
Animated chart shows how AI can avoid mistaking an image's subject for its context.
Machine Learning Research

Taming Spurious Correlations: New Technique Helps AI Avoid Classification Mistakes

When a neural network learns image labels, it may confuse a background item for the labeled object. New research avoids such mistakes.
Overall architecture of GEM.
Machine Learning Research

What a Molecule’s Structure Reveals: Baidu Creates AI to Classify Molecular Properties

The authors trained a modified GNN on a dataset of 18 million molecules to predict molecular properties.
Animated graphs showing how an ensemble of fine-tuned models can provide better performance.
Machine Learning Research

Ensemble Models Simplified: Averaging Weights Matches Ensemble Performance

A CLIP model whose weights were the mean of an ensemble of fine-tuned models performed as well as the ensemble and better than its best-performing constituent.
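Averaging the weights of fine-tuned models is simple to implement, assuming each model is represented as a dict of parameter arrays with matching keys and shapes (names here are illustrative, not the paper's code):

```python
import numpy as np

def average_weights(state_dicts):
    """Average the parameters of several fine-tuned models.

    All models must share one architecture, so every state dict has the
    same keys and array shapes; the result is a single averaged model.
    """
    keys = state_dicts[0].keys()
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}

# Toy example: three "models", each a single weight matrix of 0s, 1s, or 2s.
models = [{"w": np.full((2, 2), float(i))} for i in range(3)]
soup = average_weights(models)  # w becomes the elementwise mean: all 1.0
```

Unlike a true ensemble, the averaged model costs no more to run at inference time than any single constituent.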
Animated flowcharts show how the ProtCNN AI model classifies proteins.
Machine Learning Research

Protein Families Deciphered: Machine Learning Categorizes Proteins Based on Their Functions

Convolutional neural networks separate proteins into functional families without considering their shapes.
A flowchart shows how a jury learning method reduces annotator bias in machine learning models.
Machine Learning Research

Choose the Right Annotators: Jury-Learning Helps Remove Bias from NLP Models

A new machine learning method attempts to account for biases that may be held by certain subsets of labelers.
Flowcharts show how a new contrastive learning approach uses metadata to improve AI image classifiers
Machine Learning Research

Learning From Metadata: Descriptive Text Improves Performance for AI Image Classification Systems

Images in the wild may not come with labels, but they often include metadata. A new training method takes advantage of this information to improve contrastive learning.
Machine Learning Research

Humanized Training for Robot Arms: New Research Improves Robot Performance and Adaptability

Robots trained via reinforcement learning usually study videos of robots performing the task at hand. A new approach used videos of humans to pre-train robotic arms.
Two randomly cropped pictures
Machine Learning Research

Tradeoffs for Higher Accuracy: Data Augmentation Plus Weight Decay Can Boost Some AI Models

Vision models can be improved by training them on several altered versions of the same image and also by encouraging their weights to be close to zero. Recent research showed that both can have adverse effects that may be difficult to detect.
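Weight decay's pull toward zero is easiest to see in the update rule itself. A minimal SGD step with decoupled weight decay (an illustrative sketch, not tied to the paper's experiments):

```python
import numpy as np

def sgd_step(w, grad, lr=0.1, weight_decay=0.01):
    """One SGD step with decoupled weight decay.

    First shrink the weights toward zero by a factor (1 - lr * weight_decay),
    then apply the usual gradient update.
    """
    return w * (1 - lr * weight_decay) - lr * grad

w = np.array([1.0, -2.0])
w_new = sgd_step(w, grad=np.zeros(2))  # with zero gradient, weights only shrink
```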
Masked Auto-Encoder (MAE) explanation
Machine Learning Research

Who Was That Masked Input? Pretraining Method Improves Computer Vision Performance

Researchers have shown that it’s possible to train a computer vision model effectively on around 66 percent of the pixels in each training image. New work used 25 percent, saving computation and boosting performance to boot.
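Masked pretraining of this kind feeds the encoder only a small fraction of image patches. A minimal sketch of random 75 percent masking (the mask ratio matches the 25 percent figure above, but the helper's name and interface are assumptions, not the authors' code):

```python
import numpy as np

def random_patch_mask(num_patches, mask_ratio=0.75, rng=None):
    """Return sorted indices of patches left visible after random masking."""
    rng = rng or np.random.default_rng(0)
    num_visible = int(num_patches * (1 - mask_ratio))
    perm = rng.permutation(num_patches)
    return np.sort(perm[:num_visible])  # the encoder sees only these patches

visible = random_patch_mask(196)  # a 14x14 patch grid; 49 patches stay visible
```

Because the encoder processes only the visible quarter of patches, both memory and compute drop sharply during pretraining.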
Graph Transformer with positional encoding
Machine Learning Research

A Transformer for Graphs: New Method for Processing Graph Data with Transformers

Transformers can learn a lot from sequential data like words in a book, but they’ve shown limited ability to learn from data in the form of a graph. A new transformer variant gives graphs due attention.
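Graph transformers commonly inject structural information through positional encodings derived from eigenvectors of the graph Laplacian; a sketch of that general approach (the new method's specifics may differ):

```python
import numpy as np

def laplacian_positional_encoding(adj, k=2):
    """Per-node positional encodings from the normalized graph Laplacian.

    Returns the k eigenvectors with the smallest nonzero eigenvalues,
    skipping the trivial constant eigenvector at eigenvalue zero.
    """
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    return vecs[:, 1:k + 1]

# A 4-node cycle graph: each node connects to its two neighbors.
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
pe = laplacian_positional_encoding(adj)  # one 2-dim encoding per node
```

These encodings give each node a coordinate that reflects its position in the graph's structure, much as sinusoidal encodings locate tokens in a sequence.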
