Series of pictures of drivers
Vision

The View Through the Windshield: New Zealand Uses Computer Vision to Spot Distracted Drivers

Overhead cameras equipped with computer vision are spotting distracted drivers on the road. A system from Melbourne-based Acusensus alerts police when drivers are engaged in risky activities such as using a cell phone, not wearing a seatbelt, or speeding.
Overview of Graph Hyper Network (GHN-2)
Vision

Who Needs Training? Graph neural network selects optimal weights for image tasks.

When you’re training a neural network, it takes a lot of computation to optimize its weights using an iterative algorithm like stochastic gradient descent. Wouldn’t it be great to compute the best parameter values in one pass? A new method takes a substantial step in that direction.
Industrial gauges being placed
Vision

Remote Meter Reader: Computer vision tool reads analog gauges at industrial sites.

Industrial gauges are often located on rooftops, underground, or in tight spaces — but they’re not out of reach of computer vision. The Okinawa startup LiLz Gauge provides a system that reads analog gauges and reports their output to a remote dashboard.
The performance of different downstream (DS)
Vision

The Limits of Pretraining: More pretraining doesn't guarantee a better fine-tuned AI.

The higher the accuracy of a pretrained model, the better its performance after fine-tuning, right? Not necessarily. Researchers conducted a meta-analysis of image-recognition experiments and performed some of their own.
Fake face diagram - FaceSynthetics
Vision

Fake Faces Are Good Training Data: Synthetic data improves face recognition performance.

Collecting and annotating a dataset of facial portraits is a big job. New research shows that synthetic data can work just as well.
Woman walking in a store scanning codes
Vision

Let the Model Choose Your Outfit: Inside Amazon's AI-powered clothes stores.

Amazon’s first brick-and-mortar clothing store is getting ready to deliver automated outfit recommendations. The ecommerce giant announced plans to open a flagship Amazon Style location at a Los Angeles-area mall this year.
Transformer Architecture
Vision

Transformers See in 3D: Using transformers to visualize depth in 2D images.

Visual robots typically perceive the three-dimensional world through sequences of two-dimensional images, but they don’t always know what they’re looking at. For instance, Tesla’s self-driving system has been known to mistake a full moon for a traffic light.
Multimodal deep learning model
Vision

AI Versus the Garbage Heap: How Amazon uses AI to cut waste.

Amazon reported long-term success using machine learning to shrink its environmental footprint. The online retailer developed a system that fuses product descriptions, images, and structured data to decide how an item should be packed for shipping.
Man with gun walking by detector
Vision

Stopping Guns at the Gate: How Camden Yards uses AI to scan for weapons.

A Major League Baseball stadium will be using computer vision to detect weapons as fans enter. What’s new: A system called Hexwave will look for firearms, knives, and explosives carried by baseball fans who visit Camden Yards, home field of the Baltimore Orioles.
Photograph of Yale Song
Vision

Yale Song: Foundation models for vision

Large models pretrained on immense quantities of text have been proven to provide strong foundations for solving specialized language tasks. My biggest hope for AI in 2022 is...
Yoav Shoham
Vision

Yoav Shoham: Language models that reason

I believe that natural language processing in 2022 will re-embrace symbolic reasoning, harmonizing it with the statistical operation of modern neural networks. Let me explain what I mean by this.
Alexei Efros
Vision

Alexei Efros: Learning from the ground up

Things are really starting to get going in the field of AI. After many years (decades?!) of focusing on algorithms, the AI community is finally ready to accept the central role of data and the high-capacity models that are capable of taking advantage of this data.
Wolfram Burgard
Vision

Wolfram Burgard: Train robots in the real world

Robots are tremendously useful machines, and I would like to see them applied to every task where they can do some good. Yet we don’t have enough programmers for all this hardware and all these tasks.
Abeba Birhane
Vision

Abeba Birhane: Clean up web datasets

From language to vision models, deep neural networks are marked by improved performance, higher efficiency, and better generalizations. Yet, these systems are also marked by perpetuation of bias and injustice.
A living room made out of cups of coffee: the people, the seats, the chimney, the lamp, all gather around a cozy fire.
Vision

One Architecture to Do Them All: Transformer: The AI architecture that can do it all.

The transformer architecture extended its reach to a variety of new domains. What happened: Originally developed for natural language processing, transformers are becoming the Swiss Army Knife of deep learning.
