ImageNet

12 Posts

Animation showing 3 main types of data augmentation and random cropping of a picture

Cookbook for Vision Transformers: A Formula for Training Vision Transformers

Vision Transformers (ViTs) are overtaking convolutional neural networks (CNNs) in many vision tasks, but procedures for training them are still tailored for CNNs. New research investigated how various training ingredients affect ViT performance.

Abeba Birhane: Clean up web datasets

From language models to vision models, deep neural networks show improved performance, higher efficiency, and better generalization. Yet these systems also perpetuate bias and injustice.
Animated image showing the transformer architecture of processing an image

Transformer Speed-Up Sped Up: How to Speed Up Image Transformers

The transformer architecture is notoriously inefficient when processing long sequences, which is a problem for images, since they are essentially long sequences of pixels. One way around this is to break up input images and process the pieces separately.
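To make the sequence-length issue concrete, here is a minimal sketch of the common patch trick (not code from the research covered here; the function name and the 16-pixel patch size are illustrative assumptions):

import numpy as np

def image_to_patches(image, patch_size=16):
    # Split an H x W x C image into non-overlapping patch_size x patch_size
    # patches and flatten each patch into a single token vector.
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    patches = image.reshape(h // patch_size, patch_size, w // patch_size, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # (rows, cols, patch_h, patch_w, c)
    return patches.reshape(-1, patch_size * patch_size * c)

# A 224 x 224 RGB image becomes 196 patch tokens rather than 50,176 pixel tokens.
print(image_to_patches(np.zeros((224, 224, 3))).shape)  # (196, 768)

Treating each patch as one token shortens the sequence the transformer must attend over, which is where the efficiency gain comes from.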
Model identifying erroneous labels in popular datasets

Labeling Errors Everywhere: Many deep learning datasets contain mislabeled data.

Key machine learning benchmark datasets are riddled with incorrect labels. On average, 3.4 percent of examples in 10 commonly used datasets are mislabeled, and the detrimental impact of such errors rises with model size.
Blurred human faces in different pictures

De-Facing ImageNet: Researchers blur all faces in ImageNet.

ImageNet now comes with privacy protection. The team that manages the machine learning community’s go-to image dataset blurred all the human faces pictured in it and tested how models trained on the modified images performed on a variety of image recognition tasks.
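For a sense of the operation involved, here is a minimal sketch of detecting and blurring faces with OpenCV (the team’s actual pipeline is not described here; the stock Haar cascade and blur kernel size are assumptions for illustration):

import cv2

def blur_faces(image_bgr):
    # Detect faces with a stock Haar cascade and replace each region with a Gaussian blur.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
        face = image_bgr[y:y + h, x:x + w]
        image_bgr[y:y + h, x:x + w] = cv2.GaussianBlur(face, (51, 51), 0)
    return image_bgr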
Data related to SElf-supERvised (SEER), an image classifier pretrained on unlabeled images

Pretraining on Uncurated Data: How unlabeled data improved computer vision accuracy.

It’s well established that pretraining a model on a large dataset improves performance on downstream tasks after fine-tuning. In sufficient quantity and paired with a big model, even data scraped from the internet at random can deliver that performance boost.
Graphs and data related to ImageNet performance

ImageNet Performance, No Panacea: ImageNet pretraining won't always improve computer vision.

It’s commonly assumed that models pretrained to achieve high performance on ImageNet will perform better on other visual tasks after fine-tuning. But is it always true? A new study reached surprising conclusions.
Different data related to the phenomenon called underspecification

Facing Failure to Generalize: Why some AI models exhibit underspecification.

Models with identical architectures trained on the same data may show the same performance in the lab, yet respond very differently to data they haven’t seen before. New work finds this inconsistency to be pervasive.
Examples of InstaHide scrambling images

A Privacy Threat Revealed: How researchers cracked InstaHide for computer vision.

With access to a trained model, an attacker can use a reconstruction attack to approximate its training data. A method called InstaHide recently won acclaim for promising to make such examples unrecognizable to human eyes while retaining their utility for training.
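For context, InstaHide scrambles each training image roughly as follows (a minimal sketch based on my reading of the published description, not the authors’ code; the mixing count k and the Dirichlet weights are illustrative choices):

import numpy as np

def instahide_encode(private_img, public_imgs, k=4, rng=np.random.default_rng(0)):
    # Mix the private image (pixel values in [-1, 1]) with k-1 randomly chosen
    # public images of the same shape, then flip the sign of every pixel at random.
    idx = rng.choice(len(public_imgs), size=k - 1, replace=False)
    weights = rng.dirichlet(np.ones(k))  # random convex combination of the k images
    mixed = weights[0] * private_img
    for w, i in zip(weights[1:], idx):
        mixed = mixed + w * public_imgs[i]
    sign_mask = rng.choice([-1.0, 1.0], size=mixed.shape)
    return sign_mask * mixed

The article covers how researchers showed that scrambling of this kind can nonetheless be undone well enough to threaten privacy.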
Tree farm dataset

Representing the Underrepresented: Many important AI datasets contain bias.

Some of deep learning’s bedrock datasets came under scrutiny as researchers combed them for built-in biases. They found that popular datasets impart biases against socially marginalized groups to trained models because of the ways those datasets were compiled, labeled, and used.
Collage of self portraits

Unsupervised Prejudice: Image classification models learned bias from ImageNet.

Social biases are well documented in decisions made by supervised models trained on ImageNet’s labels. But they also crept into the output of unsupervised models pretrained on the same dataset.
ImageNet face recognition labels on a picture

ImageNet Gets a Makeover: The effort to remove bias from ImageNet

Computer scientists are struggling to purge bias from one of AI’s most important datasets. ImageNet’s 14 million photos are a go-to collection for training computer-vision systems, yet their descriptive labels have been rife with derogatory and stereotyped attitudes toward race, gender, and sex.
