Reinforcement Learning

75 Posts

Robo-Football From Simulation to Reality: Reinforcement learning powers humanoid robots to play football

Humanoid robots can play football (known as soccer in the United States) in the real world, thanks to reinforcement learning.

Learning Language by Exploration: Agent develops language skills through simulated exploration tasks

Machine learning models typically learn language by training on tasks like predicting the next word in a given text. Researchers trained a language model in a less focused, more human-like way.

Assembly pseudocode before and after applying the AlphaDev swap move

AI Builds Better Sorting Algorithms: AlphaDev, a new system for high-speed sorting of lists and numbers

Online sorting algorithms run trillions of times a day to organize lists according to users’ interests. New work found faster alternatives. Daniel J. Mankowitz and colleagues at Google developed AlphaDev, a system that learned to generate algorithms that sort three...

Stratego Master: DeepNash, the RL system that plays Stratego like a master

Reinforcement learning agents have mastered games like Go that provide complete information about the state of the game to players. They’ve also excelled at Texas Hold ’Em poker, which provides incomplete information, as few cards are revealed.

Bug Finder: A system that provides feedback with near human-level accuracy

One challenge to making online education available worldwide is evaluating an immense volume of student work. Especially difficult is evaluating interactive computer programming assignments such as coding a game.

Letting Chatbots See Your Data: Coding framework LlamaIndex enables data interaction with LLMs

A new coding framework lets you pipe your own data into large language models. LlamaIndex simplifies the code developers need to summarize, reason over, and otherwise manipulate data from documents, databases, and apps using models like GPT-4.

Optimizing Matrix Multiplication: AlphaTensor for faster matrix multiplication, explained

Matrix multiplication is executed so often in deep learning, video games, and scientific computing that even a slight acceleration can save substantial amounts of processing time. New work finds ways to speed up this crucial operation.

Transformer-based system simulating the Atari game "Pong"

Efficient Reinforcement Learning: IRIS used reinforcement learning to master Atari games with little gameplay.

Both transformers and reinforcement learning models are notoriously data-hungry. They may be less so when they work together. Vincent Micheli and colleagues at the University of Geneva trained a transformer-based system to simulate Atari games using a small amount of gameplay.

Examples of learned gaits acquired on a variety of real-world terrains

Real-World Training on the Double: A new method rapidly trains robots in the real world.

Roboticists often train their machines in simulation, where the controller model can learn from millions of hours of experience. A new method trained robots in the real world in 20 minutes.

Architecture for PointGoal Navigation on a legged robot

Streamlined Robot Training: Robots trained in lo-fi simulation perform better in reality.

Autonomous robots trained to navigate in a simulation often struggle in the real world. New work helps bridge the gap in a counterintuitive way.

Scene from a video game called Rocket League

AI Cheat Bedevils Popular Esport: Gamers are using AI to cheat in Rocket League.

Reinforcement learning is powering a new generation of video game cheaters. Players of Rocket League, a video game that ranks among the world’s most popular esports, are getting trounced by cheaters who use AI models originally developed to train contestants.

Google’s Rule-Respecting Chatbot: Research helps AI chatbots be more truthful and less hateful.

Amid speculation about the threat posed by OpenAI’s ChatGPT chatbot to Google’s search business, a paper shows how the search giant might address the tendency of such models to produce offensive, incoherent, or untruthful dialog.

Yoshua Bengio: Deep learning pioneer Yoshua Bengio looks forward to neural nets that can reason.

Recent advances in deep learning largely have come by brute force: taking the latest architectures and scaling up compute power, data, and engineering. Do we have the architectures we need, and all that remains is to develop better hardware and datasets so we can keep...

Model that defeats KataGo, an open source Go-playing system

Champion Model Is No Go: Adversarial AI Beats Master KataGo Algorithm

A new algorithm defeated a championship-winning Go model using moves that even a middling human player could counter. Researchers trained a model to defeat KataGo, an open source Go-playing system that has beaten top human players.

Robot with an arm, camera, and gripper handing over a plastic bottle to a person

Parsing Commands Into Actions: NLP Helps Google Robot Understand Spoken Instructions

A new method enables robots to respond helpfully to verbal commands by pairing a natural language model with a repertoire of existing skills.
