Ilya Sutskever
Reinforcement Learning

Ilya Sutskever: OpenAI’s co-founder on building multimodal AI models

The past year was the first in which general-purpose models became economically useful. GPT-3, in particular, demonstrated that large language models have surprising linguistic competence and the ability to perform a wide variety of useful tasks.
AI-driven balloon reaching high altitude
How to Drive a Balloon: How high-altitude balloons navigate using AI.

Helium balloons that beam internet service to hard-to-serve areas are using AI to navigate amid high-altitude winds. The balloons are operated by Loon, the Alphabet division that provides wireless internet via polyethylene blimps.
Fighter pilot in action
Phantom Menace: Fighter pilot trains against augmented reality jet.

A fighter pilot battled a true-to-life virtual enemy in midair. In the skies over southern California, an airman pitted his dogfighting skills against an AI-controlled opponent that was projected onto his augmented-reality visor.
Takes from Agence, an interactive VR project
RL Agents SOS!: Inside Agence, a reinforcement learning video game.

A new multimedia experience lets audience members help artificially intelligent creatures work together to survive. Agence, an interactive virtual reality (VR) project, blends audience participation with reinforcement learning to create an experience that’s half film, half video game.
Occupancy Anticipation, a navigation system that predicts unseen obstacles, at work
Guess What Happens Next: Research teaches robots to predict unseen obstacles.

New research teaches robots to anticipate what’s coming rather than focusing on what’s right in front of them. Researchers developed Occupancy Anticipation (OA), a navigation system that predicts unseen obstacles in addition to observing those in its field of view.
Different chess moves
Chess: The Next Move: Chess masters use AI to test variants of the game.

AI has humbled human chess masters. Now it’s helping them take the game to the next level. DeepMind and retired chess champion Vladimir Kramnik trained AlphaZero, a reinforcement learning model that bested human experts in chess, Go, and Shogi, to play-test changes in the rules.
Sequence of an autonomous fighter pilot
AI Versus Ace: An AI fighter pilot beat a human ace in a virtual dogfight.

An autonomous fighter pilot shot down a human aerial ace in virtual combat. Built by defense contractor Heron Systems, the system also defeated automated rivals from seven other companies to win the AlphaDogfight trial.
Data related to experience replay
Experience Counts: Research proposes an upgrade to experience replay.

If the world changes every second and you take a picture every 10 seconds, you won’t have enough pictures to observe the changes clearly, and storing a series of pictures won’t help. On the other hand, if you take a picture every tenth of a second, then storing a history will help model the world.
Series of pictures of people smiling
Deepfakes for Good: Tencent on the commercial value of deepfakes

A strategy manifesto from one of China’s biggest tech companies declares, amid familiar visions of ubiquitous AI, that deepfakes are more boon than bane.
Information related to Policy Adaptation during Deployment (PAD)
Same Job, Different Scenery: A reinforcement learning technique for visual changes

People who take driving lessons in the daytime don’t need instruction in driving at night. They recognize that the difference doesn’t disturb their knowledge of how to drive. Similarly, a new reinforcement learning method manages superficial variations in the environment without retraining.
Man with prosthetic leg walking
AI Steps Up: This prosthetic leg uses AI to learn a human-like stride.

A prosthetic leg that learns from the user’s motion could help amputees walk more naturally. Researchers from the University of Utah designed a robotic leg that uses machine learning to generate a human-like stride.
Data related to a new reinforcement learning approach
Eyes on the Prize: Vision-only reinforcement learning improves generalizability.

When the chips are down, humans can track critical details without being distracted by irrelevancies. New research helps reinforcement learning models similarly focus on the most important details.
Takes from the video game Source of Madness
Monsters in Motion: Source of Madness is an AI-powered Lovecraftian nightmare.

How do you control a video game that generates a host of unique monsters for every match? With machine learning, naturally. The otherworldly creatures in Source of Madness learn how to target players through reinforcement learning.
Graphs and data related to Plan2Vec
Visual Strategies for RL: Plan2Vec helps reinforcement learning with complex tasks.

Reinforcement learning can beat humans at video games, but humans are better at coming up with strategies to master more complex tasks. New work enables neural networks to connect the dots.
Data related to reinforcement learning and optimization of worker productivity and income equality
Taxation With Vector Representation: A reinforcement learning approach to better tax policy

Governments have struggled to find a tax formula that promotes prosperity without creating extremes of wealth and poverty. Can machine learning show the way?

