Automated player learning by watching recorded gameplay
Machine Learning Research

Behavioral Cloning Shootout: AI learns to play Counter-Strike: Global Offensive.

Neural networks have learned to play video games like Dota 2 via reinforcement learning by playing for the equivalent of thousands of years (compressed into far less time). In new work, an automated player learned not by playing for millennia but by watching a few days’ worth of recorded gameplay.
Few-shot Learning with a Universal Template (FLUTE)

Pattern for Efficient Learning: A training method for few-shot learning in computer vision.

Getting high accuracy out of a classifier trained on a small number of examples is tricky. You might train the model on several large-scale datasets prior to few-shot training, but what if the few-shot dataset includes novel classes? A new method performs well even in that case.
AI-generated videos and VideoGPT training pipeline

Synthetic Videos on the Double: VideoGPT is an efficient generative AI system for video.

Using a neural network to generate realistic videos takes a lot of computation. New work performs the task efficiently enough to run on a beefy personal computer.
Two images showing the process of turning handwriting into text

The Writing, Not the Doodles: A handwriting detection AI model for messy paper.

Systems designed to turn handwriting into text typically work best on pages with a consistent layout, such as a single column unbroken by drawings, diagrams, or extraneous symbols. A new system removes that requirement.
Neural networks generating novel views of a 3D scene based on existing pictures

3D Scene Synthesis for the Real World: Generating 3D scenes with radiance fields and image data

Researchers have used neural networks to generate novel views of a 3D scene based on existing pictures plus the positions and angles of the cameras that took them. In practice, though, you may not know the precise camera
Architecture of vision-language tasks

One Model for Vision-Language: A general-purpose AI for vision and language tasks.

Researchers have proposed task-agnostic architectures for image classification tasks and language tasks. New work proposes a single architecture for vision-language tasks.
Protein structures

What AI Knows About Proteins: NLP systems can be used to code amino acids.

Transformer models trained on sequences of amino acids that form proteins have had success classifying and generating viable sequences. New research shows that they also capture information about protein structure.
A new method for compressing images and yielding better classification

What Machines Want to See: An image compressor for more accurate computer vision

Researchers typically downsize images for vision networks to accommodate limited memory and accelerate processing. A new method not only compresses images but yields better classification.
Process showing how FastNeRF accelerates the photorealistic 3D rendering method

Virtual Reality in Real Time: FastNeRF renders 3D scenes at 200 frames per second.

Ideally, real-time 3D applications such as virtual and augmented reality transition smoothly between different viewpoints of a scene — but generating a fresh perspective can take time. New research speeds the process.
Minecraft video capture

3D Object Factory: Researchers train neural networks to build in Minecraft.

In the open-ended video game Minecraft, players extract blocks of virtual materials from a 3D environment to assemble objects of their own design, from trees to cathedrals. Researchers trained neural networks to generate these structures.
Diagram showing how Project Debater works

Up for Debate: IBM's NLP-powered debate bot mines LexisNexis.

IBM’s Watson question-answering system stunned the world in 2011 when it bested human champions of the TV trivia game show Jeopardy! Although the Watson brand has fallen on hard times, the company’s language-processing prowess continues to develop.
How Semantic Similarity Video Retrieval (SVR) works

Toward Better Video Search: An NLP system for improved video search

Researchers at the University of Bristol led by Michael Wray propose a new benchmark, Semantic Similarity Video Retrieval (SVR), that evaluates video retrieval systems by their ability to rank many similar videos. They also built a system that performed well on it.
System designed to isolate changes in the pose of a two-dimensional figure

Motion Mapper: An AI system for automated animations for video game sprites

In some animated games, different characters can perform the same actions — say, walking, jumping, or casting spells. A new system learned from unlabeled data to transfer such motions from one character to another.
A generative adversarial network (GAN)

Image Generation Transformed: New research combines GANs with transformers.

A recent generative adversarial network (GAN) produced more coherent images using modified transformers that replaced fully connected layers with convolutional layers. A new GAN achieved a similar end using transformers in their original form.
Data related to SElf-supERvised (SEER), an image classifier pretrained on unlabeled images

Pretraining on Uncurated Data: How unlabeled data improved computer vision accuracy.

It’s well established that pretraining a model on a large dataset improves performance on fine-tuned tasks. In sufficient quantity and paired with a big model, even data scraped from the internet at random can contribute to the performance boost.
