Adversarial Attacks

11 Posts

Two images showing RETRO Architecture and Gopher (280B) vs State of the Art
Adversarial Attacks

Large Language Models Shrink: Gopher and RETRO Prove Lean Language Models Can Push Boundaries

DeepMind released three papers that push the boundaries — and examine the issues — of large language models.
2 min read
Examples of InstaHide scrambling images
Adversarial Attacks

A Privacy Threat Revealed

With access to a trained model, an attacker can use a reconstruction attack to approximate its training data. A method called InstaHide recently won acclaim for promising to make training examples unrecognizable to human eyes while retaining their utility for training.
2 min read
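For a concrete picture of the scheme in question: InstaHide roughly works by mixing each private image with a few randomly chosen other images (mixup-style) and then flipping the sign of each pixel at random. The Python sketch below is a minimal illustration of that idea, not the authors' implementation; the image shapes and the number of mixed images `k` are assumptions made for the example.

```python
import numpy as np

def instahide_encode(private_img, public_imgs, k=4, rng=None):
    """Roughly InstaHide-style encoding: mix the private image with k-1
    randomly chosen public images, then apply a random per-pixel sign flip.
    Images are float arrays with identical shapes."""
    rng = np.random.default_rng() if rng is None else rng

    # Pick k-1 public images to mix with the private one.
    idx = rng.choice(len(public_imgs), size=k - 1, replace=False)
    stack = np.stack([private_img] + [public_imgs[i] for i in idx])

    # Random mixing coefficients that sum to 1 (mixup-style).
    coeffs = rng.dirichlet(np.ones(k)).reshape(-1, *([1] * private_img.ndim))
    mixed = (coeffs * stack).sum(axis=0)

    # Random per-pixel sign flip -- the step meant to hide visual content.
    mask = rng.choice([-1.0, 1.0], size=mixed.shape)
    return mask * mixed

# Toy usage: encode one 32x32 RGB "private" image against a small public pool.
rng = np.random.default_rng(0)
private_img = rng.uniform(-1, 1, size=(32, 32, 3))
public_imgs = rng.uniform(-1, 1, size=(100, 32, 32, 3))
encoded = instahide_encode(private_img, public_imgs, k=4, rng=rng)
```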
Screen captures of online platform Dynabench
Adversarial Attacks

Dynamic Benchmarks

Benchmarks provide a scientific basis for evaluating model performance, but they don’t necessarily map well to human cognitive abilities. Facebook aims to close the gap through a dynamic benchmarking method that keeps humans in the loop.
2 min read
Data and examples related to a new technique to detect portions of an image
Adversarial Attacks

The Telltale Artifact

Deepfakes have gone mainstream, allowing celebrities to star in commercials without setting foot in a film studio. A new method helps determine whether such endorsements — and other images produced by generative adversarial networks — are authentic.
2 min read
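The summary above doesn't say how the detector works. One widely reported telltale of GAN-generated images is the periodic pattern left by upsampling layers, which shows up as isolated spikes in the image's frequency spectrum. The sketch below illustrates only that general idea, not the article's method; the grayscale input, masking radius, and peak-to-mean score are assumptions chosen for the example.

```python
import numpy as np

def spectral_peak_score(img_gray):
    """Crude GAN-artifact heuristic: compute the log-magnitude 2D FFT of a
    grayscale image and measure how 'spiky' the high-frequency region is.
    Upsampling layers in many GANs leave periodic, grid-like artifacts that
    appear as strong isolated peaks away from the spectrum's center."""
    f = np.fft.fftshift(np.fft.fft2(img_gray))
    log_mag = np.log1p(np.abs(f))

    # Mask out the low-frequency center, which dominates natural images.
    h, w = log_mag.shape
    yy, xx = np.ogrid[:h, :w]
    r = np.hypot(yy - h / 2, xx - w / 2)
    high_freq = log_mag[r > min(h, w) / 8]

    # Ratio of the strongest high-frequency bin to the typical one:
    # larger values suggest periodic (possibly synthetic) structure.
    return float(high_freq.max() / (high_freq.mean() + 1e-8))

# Toy usage with random arrays; a real check would load actual photos.
rng = np.random.default_rng(0)
natural_like = rng.normal(size=(256, 256))
periodic = natural_like + 0.5 * np.sin(np.arange(256) * np.pi / 2)  # fake grid artifact
print(spectral_peak_score(natural_like), spectral_peak_score(periodic))
```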
Images and data related to a t-shirt that tricks a variety of object detection models into failing to spot people
Adversarial Attacks

Hidden in Plain Sight

With the rise of AI-driven surveillance, anonymity is in fashion. Researchers working on clothing that evades face recognition systems designed a t-shirt that tricks a variety of object detection models into failing to spot people.
2 min read
Data and information related to shortcut learning
Adversarial Attacks

When Models Take Shortcuts

Neuroscientists once thought they could train rats to navigate mazes by color. In fact, rats don't perceive color at all; they were relying on the distinct odors of differently colored paints. New work finds that neural networks are prone to the same sort of mismatch between what training is meant to teach and what a model actually learns.
2 min read
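As a toy illustration of shortcut learning (a hypothetical setup, not the experiments behind the article): when a spurious feature happens to track the label in the training data, a model can score perfectly by reading that feature alone and then fall to near chance once the correlation disappears.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000

# "Real" signal: a weakly predictive feature. "Shortcut": a spurious feature
# that happens to match the label perfectly, but only in the training data.
y = rng.integers(0, 2, size=n)
real_feature = y + rng.normal(scale=2.0, size=n)   # noisy but genuine
shortcut = y.astype(float)                          # spurious, perfect in training
X_train = np.column_stack([real_feature, shortcut])

model = LogisticRegression(max_iter=1000).fit(X_train, y)

# At test time the shortcut no longer tracks the label.
y_test = rng.integers(0, 2, size=n)
real_test = y_test + rng.normal(scale=2.0, size=n)
broken_shortcut = rng.integers(0, 2, size=n).astype(float)
X_test = np.column_stack([real_test, broken_shortcut])

print("train accuracy:", model.score(X_train, y))     # ~1.0, thanks to the shortcut
print("test accuracy:", model.score(X_test, y_test))  # near chance
```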
Data related to neural networks
Adversarial Attacks

Undercover Networks

Neural networks can spill their secrets to those who know how to ask: researchers demonstrated that adversaries can recover a model's parameter values by measuring its power use. A new approach secures models from such prying.
2 min read
Autonomous vehicle detecting images projected on the street
Adversarial Attacks

Phantom Menace

Some self-driving cars can’t tell the difference between a person in the roadway and an image projected on the street. A team of researchers used projectors to trick semiautonomous vehicles into detecting people, road signs, and lane markings that didn’t exist.
2 min read
Graph related to LIME and SHAP methods
Adversarial Attacks

Bias Goes Undercover

As black-box algorithms like neural networks find their way into high-stakes fields such as transportation, healthcare, and finance, researchers have developed techniques to help explain models’ decisions. New findings show that some of these methods can be fooled.
2 min read
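The summary above doesn't describe the mechanism. One published way to fool perturbation-based explainers such as LIME and SHAP is a "scaffolded" model that detects the synthetic perturbations the explainer generates and answers them with an innocuous model, while real inputs still reach the biased one; whether that is the finding covered here is an assumption. The sketch below illustrates only the routing idea, with made-up models and a toy detector.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

class ThresholdModel:
    """Toy stand-in for a trained classifier: predicts 1 when one chosen
    feature exceeds zero."""
    def __init__(self, column):
        self.column = column
    def predict(self, X):
        return (X[:, self.column] > 0).astype(int)

class ScaffoldedModel:
    """Routes inputs: samples flagged as explainer-generated perturbations go
    to an innocuous model; everything else goes to the biased model."""
    def __init__(self, biased_model, innocuous_model, perturbation_detector):
        self.biased = biased_model
        self.innocuous = innocuous_model
        self.detector = perturbation_detector
    def predict(self, X):
        flagged = self.detector.predict(X).astype(bool)
        out = self.biased.predict(X)
        out[flagged] = self.innocuous.predict(X[flagged])
        return out

# Real data has correlated features; LIME/SHAP-style perturbations resample
# features independently, which breaks that correlation and makes them detectable.
rng = np.random.default_rng(0)
base = rng.normal(size=(1000, 1))
X_real = np.hstack([base, base + 0.1 * rng.normal(size=(1000, 1)),
                    rng.normal(size=(1000, 3))])
X_pert = np.column_stack([rng.permutation(col) for col in X_real.T])

detector = RandomForestClassifier(random_state=0).fit(
    np.vstack([X_real, X_pert]),
    np.concatenate([np.zeros(len(X_real)), np.ones(len(X_pert))]),
)

# Biased model keys on feature 0 (the "sensitive" one); innocuous model on feature 2.
scaffold = ScaffoldedModel(ThresholdModel(0), ThresholdModel(2), detector)
print(scaffold.predict(X_real[:5]))  # real inputs behave like the biased model
print(scaffold.predict(X_pert[:5]))  # perturbations are mostly routed to the innocuous model
```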
DeepPrivacy results on a diverse set of images
Adversarial Attacks

Anonymous Faces

A number of countries restrict commercial use of personal data without consent unless the data is fully anonymized. A new paper proposes a way to anonymize images of faces, purportedly without degrading their usefulness in applications that rely on face recognition.
2 min read
T-shirt covered with images of license plates
Adversarial Attacks

This Shirt Hates Surveillance

Automatic license plate readers capture thousands of vehicle IDs each minute, allowing law enforcement and private businesses to track drivers with or without their explicit consent.
1 min read
