Adversarial Attacks

13 Posts

Don’t Steal My Style: Glaze tool prevents AI from learning an artist’s style.

Asked to produce “a landscape by Thomas Kinkade,” a text-to-image generator fine-tuned on the pastoral painter’s work can mimic his style in seconds, often for pennies. A new technique aims to make it harder for algorithms to mimic an artist’s style.

Champion Model Is No Go: Adversarial AI Beats Master KataGo Algorithm

A new algorithm defeated a championship-winning Go model using moves that even a middling human player could counter. Researchers trained a model to defeat KataGo, an open source Go-playing system that has beaten top human players.

Large Language Models Shrink: Gopher and RETRO prove lean language models can push boundaries.

DeepMind released three papers that push the boundaries — and examine the issues — of large language models.

A Privacy Threat Revealed: How researchers cracked InstaHide for computer vision.

With access to a trained model, an attacker can use a reconstruction attack to approximate its training data. A method called InstaHide recently won acclaim for promising to make such examples unrecognizable to human eyes while retaining their utility for training.

Dynamic Benchmarks: A platform for fooling language models

Benchmarks provide a scientific basis for evaluating model performance, but they don’t necessarily map well to human cognitive abilities. Facebook aims to close the gap through a dynamic benchmarking method that keeps humans in the loop.

The Telltale Artifact: A technique for detecting GAN-generated deepfakes

Deepfakes have gone mainstream, allowing celebrities to star in commercials without setting foot in a film studio. A new method helps determine whether such endorsements — and other images produced by generative adversarial networks — are authentic.

Hidden in Plain Sight: Researchers make clothes that fool face recognition.

With the rise of AI-driven surveillance, anonymity is in fashion. Researchers working on clothing that evades face recognition systems designed a t-shirt that tricks a variety of object detection models into failing to spot people.

When Models Take Shortcuts: The causes of shortcut learning in neural networks

Neuroscientists once thought they could train rats to navigate mazes by color. Rats don’t perceive colors at all. Instead, they rely on the distinct odors of different colors of paint. New work finds that neural networks are prone to this sort of misalignment between training goals and learning.

Undercover Networks: Protecting neural networks from differential power analysis

Neural networks can spill their secrets to those who know how to ask. A new approach secures them from prying eyes. Researchers demonstrate that adversaries can find out a model’s parameter values by measuring its power use.

Phantom Menace: Fake images can fool some self-driving cars.

Some self-driving cars can’t tell the difference between a person in the roadway and an image projected on the street. A team of researchers used projectors to trick semiautonomous vehicles into detecting people, road signs, and lane markings that didn’t exist.

Bias Goes Undercover: Adversarial attacks can fool explainable AI techniques.

As black-box algorithms like neural networks find their way into high-stakes fields such as transportation, healthcare, and finance, researchers have developed techniques to help explain models’ decisions. New findings show that some of these methods can be fooled.

Anonymous Faces

A number of countries restrict commercial use of personal data without consent unless the data is fully anonymized. A new paper proposes a way to anonymize images of faces, purportedly without degrading their usefulness in applications that rely on face recognition.

This Shirt Hates Surveillance

Automatic license plate readers capture thousands of vehicle IDs each minute, allowing law enforcement and private businesses to track drivers with or without their explicit consent.
