Vision

257 Posts

High-level overview of the STEGO architecture at train and prediction steps
Vision

Segmented Images, No Labeled Data: Improved unsupervised learning for semantic segmentation

Training a model to separate the objects in a picture typically requires labeled images for best results. Recent work upped the ante for training without labels.
Moving slide with information about AWS AI Service Cards.
Vision

Transparency for AI as a Service: Amazon introduces service cards to enhance responsible AI.

Amazon published a series of web pages designed to help people use AI responsibly. Amazon Web Services introduced so-called AI service cards that describe the uses and limitations of some models it serves.
3 graphs showing projections of data usage. Each one shows two extrapolations of data usage.
Vision

Will We Have Enough Data?

The world’s supply of data soon may fail to meet the demands of increasingly hungry machine learning models. Researchers at Epoch AI found that a shortage of text data could cause trouble as early as this year. Vision data may fall short within a decade.
Reza Zadeh photographed during a conference
Vision

Reza Zadeh: Active Learning Takes Off

As we enter the new year, there is a growing hope that the recent explosion of generative AI will bring significant progress in active learning. This technique, which enables machine learning systems to generate their own training examples and request them to be labeled...
Illustration of three deer doing holiday household chores: washing a champagne flute, cooking pie and wrapping a gift
Vision

One Model Does It All: Multi-task AI models got more sophisticated in 2022.

Individual deep learning models proved their mettle in hundreds of tasks. The scope of multi-task models expanded dramatically in the past year.
Illustration of a snowman with a top hat and glasses
Vision

AI's Eyes Evolve: Vision transformer research exploded in 2022.

Work on vision transformers exploded in 2022. Researchers published well over 17,000 ViT papers during the year. A major theme: combining self-attention and convolution.
Sequence showing how FIFA's Video Assistant Referee (VAR) works
Vision

The World Cup's AI Referee

The outcome of the FIFA World Cup 2022 depends on learning algorithms. The quadrennial championship tournament of football, which wraps up this week, is using machine learning to help human arbiters spot players who break a rule that governs their locations on the field.
Ground truth video of a road on the left and predicted video with MaskViT on the right
Vision

Seeing What Comes Next: Transformers predict future video frames.

If a robot can predict what it’s likely to see next, it may have a better basis for choosing an appropriate action — but it has to predict quickly. Transformers, for all their utility in computer vision, aren’t well suited to this because of their steep computational and memory requirements...
Network architecture of Reasoner
Vision

What the Missing Frames Showed: Machine Learning Describes Masked Video Events

Neural networks can describe in words what’s happening in pictures and videos — but can they make sensible guesses about things that happened before or will happen afterward? Researchers probed this ability.
Computer vision model detecting grain-storage facilities in an aerial photo
Vision

Ukraine's Lost Harvest Quantified: AI Analysis Shows Ukraine War Grain Farming Impacts

Neural networks are helping humanitarian observers measure the extent of war damage to Ukraine’s grain crop. Analysts from the Yale School of Public Health and Oak Ridge National Laboratory built a computer vision model that detects grain-storage facilities in aerial photos.
Series of images showing different AI tools for farmers
Vision

Smarts for Farms: Microsoft Open Sources AI Systems for Agriculture

The next green revolution may be happening in the server room. Microsoft open-sourced a set of AI tools designed to help farmers cut costs and improve yields.
List of AI tools used to improve fast food services
Vision

Food Forecaster: Chipotle Tests AI For Predicting Customer Demand

The ability to predict customer demand could make fast food even faster. The Mexican-themed Chipotle restaurant chain is testing AI tools that forecast demand, monitor ingredients, and ensure that workers fill orders correctly.
Map showing areas likely to have been damaged by Hurricane Ian
Vision

Wreckage Recognition: AI System Spots Hurricane Ian Damage

A machine learning model identified areas likely to have been damaged by Hurricane Ian as it swept through the southern United States.
Vision

The Dark Side of the Moon — Lit Up! AI Illuminates Dark Regions of the Moon

Neural networks are making it possible to view parts of the Moon that are perpetually shrouded by darkness. Researchers devised a method called Hyper-effective Noise Removal U-net Software (HORUS) to remove noise from images of the Moon’s south pole.
Animation showing 3 main types of data augmentation and random cropping of a picture
Vision

Cookbook for Vision Transformers: A Formula for Training Vision Transformers

Vision Transformers (ViTs) are overtaking convolutional neural networks (CNNs) in many vision tasks, but procedures for training them are still tailored for CNNs. New research investigated how various training ingredients affect ViT performance.
