Series of images showing how single trained network generates 3D reconstructions of multiple scenes

One Network, Many Scenes

To reconstruct the 3D world behind a set of 2D images, machine learning systems usually require a dedicated neural network for each scene. New research enables a single trained network to generate 3D reconstructions of multiple scenes.
Few-shot Learning with a Universal Template (FLUTE)

Pattern for Efficient Learning

Getting high accuracy out of a classifier trained on a small number of examples is tricky. You might train the model on several large-scale datasets prior to few-shot training, but what if the few-shot dataset includes novel classes? A new method performs well even in that case.
Computer vision is probing the history of ancient pottery

Sorting Shattered Traditions

Researchers at Northern Arizona University developed a machine learning model that identifies different styles of Native American painting on ceramic fragments and sorts the shards by historical period.
Two images showing the process of turning handwriting into text

The Writing, Not the Doodles

Systems designed to turn handwriting into text typically work best on pages with a consistent layout, such as a single column unbroken by drawings, diagrams, or extraneous symbols. A new system removes that requirement.
Architecture of vision-language tasks

One Model for Vision-Language

Researchers have proposed task-agnostic architectures for image classification tasks and language tasks. New work proposes a single architecture for vision-language tasks.
Model identifying erroneous labels in popular datasets

Labeling Errors Everywhere

Key machine learning datasets are riddled with mistakes: on average, 3.4 percent of examples in 10 commonly used benchmark datasets are mislabeled, and the detrimental impact of such errors rises with model size.
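One common way to surface likely label errors is to flag examples where a trained model assigns the given label far less probability than some other class. The sketch below illustrates that idea; the function name and threshold are illustrative, not the researchers' actual method.

```python
import numpy as np

def flag_suspect_labels(probs: np.ndarray, labels: np.ndarray, threshold: float = 0.5):
    """probs: (n_examples, n_classes) predicted class probabilities.
    labels: (n_examples,) given integer labels.
    Returns indices of examples whose given label looks unlikely."""
    given = probs[np.arange(len(labels)), labels]  # model's confidence in the given label
    top = probs.max(axis=1)                        # confidence in the most likely class
    # Suspect if the given label is unlikely AND another class is far more likely.
    return np.where((given < threshold) & (top - given > threshold))[0]

probs = np.array([
    [0.95, 0.05],   # labeled 0, model agrees
    [0.10, 0.90],   # labeled 0, model strongly disagrees -> suspect
    [0.60, 0.40],   # labeled 1, model mildly disagrees -> not flagged
])
labels = np.array([0, 0, 1])
print(flag_suspect_labels(probs, labels))  # -> [1]
```

Flagged examples would then be reviewed by human annotators rather than discarded automatically.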
Data related to SElf-supERvised (SEER), an image classifier pretrained on uncurated, unlabeled images

Pretraining on Uncurated Data

It’s well established that pretraining a model on a large dataset improves performance on downstream tasks after fine-tuning. In sufficient quantity and paired with a big model, even data scraped from the internet at random can contribute to the performance boost.
Sequence showing a training step that uses different perspectives of the same patient to enhance unsupervised pretraining

Same Patient, Different Views

When you lack labeled training data, pretraining a model on unlabeled data can compensate. New research pretrained a model three times to boost performance on a medical imaging task.
Sequence related to image processing

Vision Models Get Some Attention

Self-attention is a key element in state-of-the-art language models, but it struggles to process images because its memory requirement grows quadratically with the number of input elements. New research addresses the issue with a simple twist on a convolutional neural network.
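A back-of-the-envelope calculation shows why full self-attention is costly for images: the attention map compares every input element with every other, so its size is quadratic in the number of pixels. The function below is purely illustrative.

```python
def attention_matrix_bytes(height: int, width: int, bytes_per_float: int = 4) -> int:
    """Memory for one full self-attention map over an image,
    treating each pixel as one input element."""
    n = height * width              # number of input elements
    return n * n * bytes_per_float  # N x N pairwise attention scores

# A 32x32 image is manageable...
print(attention_matrix_bytes(32, 32) / 1e6, "MB")    # ~4.2 MB
# ...but a 256x256 image is not: memory scales with the fourth power of side length.
print(attention_matrix_bytes(256, 256) / 1e9, "GB")  # ~17.2 GB
```

This is why vision models typically restrict attention to local windows, downsampled feature maps, or patches rather than raw pixels.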
Graphs and data related to ReLabel, a technique that labels any random crop of any image.

Good Labels for Cropped Images

In training an image recognition model, it’s not uncommon to augment the data by cropping original images randomly. But if an image contains several objects, a cropped version may no longer match its label. Researchers developed a way to make sure random crops are labeled properly.
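The core idea can be sketched as follows: keep a dense per-pixel label map for each image (e.g., class scores from a strong pretrained model) and pool the scores inside each crop, so the crop's label reflects what it actually contains. Names and shapes here are a hypothetical illustration, not the authors' code.

```python
import numpy as np

def crop_label(label_map: np.ndarray, top: int, left: int, h: int, w: int) -> np.ndarray:
    """label_map: (H, W, num_classes) per-pixel class scores.
    Returns a normalized class distribution for the given crop."""
    region = label_map[top:top + h, left:left + w]  # scores inside the crop
    scores = region.mean(axis=(0, 1))               # average-pool over pixels
    return scores / scores.sum()                    # normalize to a distribution

# Toy example: left half of the image is class 0 ("dog"), right half class 1 ("cat").
label_map = np.zeros((4, 8, 2))
label_map[:, :4, 0] = 1.0   # dog pixels
label_map[:, 4:, 1] = 1.0   # cat pixels

print(crop_label(label_map, 0, 0, 4, 4))  # crop over the left half -> [1. 0.]
print(crop_label(label_map, 0, 2, 4, 4))  # crop straddling both   -> [0.5 0.5]
```

A crop containing only the dog gets a pure "dog" label, while a crop spanning both objects gets a soft label, instead of inheriting the whole image's single label.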
Graphs and data related to ImageNet performance

ImageNet Performance: No Panacea

It’s commonly assumed that models pretrained to achieve high performance on ImageNet will perform better on other visual tasks after fine-tuning. But is it always true? A new study reached surprising conclusions.
System Oscar+ working

Sharper Eyes for Vision+Language

Models that interpret the interplay of words and images tend to be trained on richer bodies of text than images. Recent research worked toward giving such models a more balanced knowledge of the two domains.
Art pieces with subjective commentary regarding their emotional impact

How Art Makes AI Feel

An automated art critic spells out the emotional impact of images. Led by Panos Achlioptas, researchers at Ecole Polytechnique, King Abdullah University, and Stanford University trained a deep learning system to generate subjective interpretations of art.
Different data related to the phenomenon called underspecification

Facing Failure to Generalize

Models with the same architecture trained on the same data may show the same performance in the lab, yet respond very differently to data they haven’t seen before. New work finds this inconsistency to be pervasive.
Series of images and graphs related to cancer detection

Shortcut to Cancer Treatment

Doctors who treat breast cancer typically use a quick, inexpensive tumor-tissue stain test to diagnose the illness and a slower, more costly one to determine treatment. A new neural network could help doctors to go straight from diagnosis to treatment.
