Machine Learning Research

515 Posts

In a lab, four robots move a metal frame using graph neural network coordination on a platform.

Coordinating Robot Teams: Google DeepMind’s RoboBallet project blends GNNs with RL to drive teams of 8 robotic arms

In factories, where teams of robotic arms work in tight spaces, their motions are programmed by hand to keep them from interfering with one another. Researchers automated this programming using graph neural networks trained via reinforcement learning.
Graph shows Ernie-4.5 outperforming competitors in document understanding and visual reasoning tasks.

Baidu’s Multimodal Bids: Giant Ernie 5 natively generates multiple media; Ernie-4.5-VL-28B-A3B-Thinking tops Vision-Language metrics

Baidu debuted two models: a lightweight, open-weights, vision-language model and a giant, proprietary, multimodal model built to take on U.S. competitors.
GIF showing a 360° walkthrough of a conference room with a wooden table, high-back chairs, wall screens, and ceiling lights.

Generated, Editable Virtual Spaces: World Labs makes Marble world model public, adds Chisel editing tool

Models that generate 3D spaces typically render them on the fly as users move through them, without producing a persistent world that can be explored later. A new model produces 3D worlds that can be exported and modified.
GIF showing AI object detection tagging penguins on a beach, cars in traffic, and dancing people.

Open 3D Generation Pipeline: Meta’s SAM 3 image segmentation models can analyze and create bodies and other objects

Meta’s Segment Anything Model (SAM) has evolved from an image-segmentation model into an open-weights suite for generating 3D objects. SAM 3 segments images, SAM 3D turns the segments into 3D objects, and SAM 3D Body produces 3D models of any people among the segments. You can experiment with all three.
Diagram shows AI traits with pipelines for "evil" vs. "helpful" responses to user queries on animal treatment.

Toward Steering LLM Personality: Persona Vectors allow model builders to identify and edit out sycophancy, hallucinations, and more

Large language models can develop character traits like cheerfulness or sycophancy during fine-tuning. Researchers developed a method to identify, monitor, and control such traits.
Table shows Gemini 3 Pro leading in benchmarks, outperforming Gemini 2.5, Claude Sonnet 4.5, and GPT-5.1.

Google Dominates Arena Leaderboards (For the Moment): Gemini 3 Pro and Nano Banana Pro boast best-in-class multimodal reasoning and image generation

Google introduced Gemini 3 Pro and Nano Banana Pro, its flagship vision-language and image-generation models, and deployed them to billions of users worldwide.
Image illustrates the Self-Search method, simulating web searches to improve model accuracy in tests.

More-Efficient Agentic Search: Researchers fine-tune models to search their own parameters to boost recall

Large language models may have learned knowledge that’s relevant to a given prompt, but they don’t always recall it consistently. Fine-tuning a model to search its parameters as though it were searching the web can help it find knowledge in its own weights.
Visual map outlines cybercrime operation phases, highlighting AI-driven processes and human validation steps.

Anthropic Cyberattack Report Sparks Controversy: Security researchers question whether coding agents allow unprecedented automated attacks

Independent cybersecurity researchers pushed back on a report by Anthropic that claimed hackers had used its Claude Code agentic coding system to perpetrate an unprecedented automated cyberattack.
Chart highlights Kimi K2’s top performance in agentic tasks, outperforming rivals in reasoning and coding.

Top Agentic Results, Open Weights: Kimi K2 Thinking outperforms proprietary models with new techniques for agentic tool use

The latest open-weights large language model from Moonshot AI challenges top proprietary LLMs at agentic tasks by executing hundreds of tool calls sequentially and pausing to think between calls.
White Waymo vehicle near water, city skyline visible; displays autonomous service for urban freeways.

Self-Driving Cars on U.S. Freeways: Waymo deploys autonomous cars on California and Arizona expressways

Waymo became the first company to offer fully autonomous, driverless taxi service on freeways in the United States.
Series of graphs transformed via tokenization and transformer layers, resulting in predicted outputs.

Forecasting Multiple Time Series: Amazon’s Chronos-2 sorts out tangled variables to make better predictions

Transformers are well suited to predicting future values of time series like energy prices, wages, or weather, but, as in those examples, multiple time series often influence one another. Researchers built a model that can forecast multiple time series simultaneously.
AI models are compared on a graph showing benchmark accuracy from 20% to 100%, highlighting GPT-5's rise.

The Year AI Went Industrial: The State of AI Report 2025 says AI’s barriers aren’t technological but social and material

A year-in-review report heralds the dawn of AI’s industrial era.
Bar chart shows HunyuanImage 3.0's performance against Nano Banana and Seedream 4.0, highlighting differences.

Better Images Through Reasoning: HunyuanImage-3.0 uses reinforcement learning and thinking tokens to better understand prompts

A new image generator reasons over prompts to produce outstanding pictures.
Chart illustrates exact and approximate memorization percentages in different Gemma models.

Masking Private Data in Training Sets: Google researchers released VaultGemma, an open-weights model redacting personal information

Large language models often memorize details in their training data, including private information that may appear only once, like a person’s name, address, or phone number. Researchers built the first open-weights language model that’s guaranteed not to remember such facts.
Chart displays MiniMax-M2 with high intelligence and competitive pricing, outshining other models.

Open-Weights Coding Leader: MiniMax-M2’s lightweight footprint and low costs belie its top performance

An open-weights model from Shanghai-based MiniMax challenges top proprietary models on key benchmarks for coding and agentic tasks.