Object-Detection Transformers Simplified: New Research Improves Object Detection With Vision Transformers

ViTDet, a new system from Facebook, adds an object detector to a plain pretrained transformer.
Two randomly cropped pictures

Tradeoffs for Higher Accuracy: Data Augmentation Plus Weight Decay can Boost Some AI Models

Vision models can be improved by training them on several altered versions of the same image and also by encouraging their weights to be close to zero. Recent research showed that both can have adverse effects that may be difficult to detect.
Masked Auto-Encoder (MAE) explanation

Who Was That Masked Input? Pretraining Method Improves Computer Vision Performance

Researchers have shown that it’s possible to train a computer vision model effectively on around 66 percent of the pixels in each training image. New work used 25 percent, saving computation and boosting performance to boot.
House for sale AD

U.S. Acts Against Algorithmic Bias: Meta Removes Bias from its Ad Algorithms

Regulators are forcing Meta (formerly Facebook) to display certain advertisements more evenly across its membership. The United States government compelled Meta to revise its ad-placement system to deliver ads for housing to members regardless of their age, gender, or ethnicity.
Gato’s performance on simulated control tasks | Image captions generated by Gato

One Model, Hundreds of Tasks: Multimodal Transformer Performs Over 600 Different Tasks

Researchers took a step toward achieving a longstanding goal: One model that performs a whole lot of very different tasks. Scott Reed, Konrad Żołna, Emilio Parisotto and a team at DeepMind announced Gato.
Overview of Graph Hyper Network (GHN-2)

Who Needs Training? Graph neural network selects optimal weights for image tasks.

When you’re training a neural network, it takes a lot of computation to optimize its weights using an iterative algorithm like stochastic gradient descent. Wouldn’t it be great to compute the best parameter values in one pass? A new method takes a substantial step in that direction.
AI Research SuperCluster (RSC)

New Supercomputer on the Block: All about Meta's AI Research SuperCluster.

Facebook’s parent company is staking its future on a new compute cluster. Meta unveiled AI Research SuperCluster (RSC), which is designed to accelerate training of large models for applications like computer vision, natural language processing, and speech recognition.
Photograph of Yale Song

Yale Song: Foundation models for vision.

Large models pretrained on immense quantities of text have been proven to provide strong foundations for solving specialized language tasks. My biggest hope for AI in 2022 is...
A living room made out of cups of coffee: the people, the seats, the chimney, the lamp, all gather around a cozy fire.

One Architecture to Do Them All: Transformer: The AI architecture that can do it all.

The transformer architecture extended its reach to a variety of new domains. What happened: Originally developed for natural language processing, transformers are becoming the Swiss Army Knife of deep learning.
An illustration shows a cozy cabin where all the furniture is made out of coffee mugs.

Transformers Take Over: Transformers Applied to Vision, Language, Video, and More

In 2021, transformers were harnessed to discover drugs, recognize speech, and paint pictures — and much more.
Illustration of a woman riding a sled

Multimodal AI Takes Off: Multimodal Models, such as CLIP and DALL-E, are taking over AI.

While models like GPT-3 and EfficientNet, which work on text and images respectively, are responsible for some of deep learning’s highest-profile successes, approaches that find relationships between text and images made impressive
Demonstration of Suspicious User Control tool in the Stream Chat section (Twitch)

Troll Recognition: Twitch uses AI to flag trolls who try to avoid bans.

A prominent online streaming service is using a machine learning model to identify trolls who try to get around being banned.
Google's Decision Transformer

Reinforcement Learning Transformed: Transformers succeed at reinforcement learning tasks.

Transformers have matched or exceeded earlier architectures in language modeling and image classification. New work shows they can achieve state-of-the-art results in some reinforcement learning tasks as well.
A conversation between a human and an open-domain chatbot.

Long-Haul Chatbot: Facebook Chatbot is Able to Carry on Long Conversations

Facebook released a chatbot that summarizes dialog on the fly and uses the summary to generate further repartee.
Example comparing a nonaugmented model (left) to a model with internet-augmentation (right)

This Chatbot Does Its Research: Facebook Chatbot Uses the Internet to Inform its Answers

Chatbots often respond to human input with incorrect or nonsensical answers. Why not enable them to search for helpful information?

