Machine Learning Research

591 Posts

Flowchart shows book text split, input summary, model training, and memorization testing in LLM workflow.

Fine-Tuning LLMs to Expand on Summaries Unearths Pretraining Texts: Fine-tuning can strip models of copyright alignment guidelines

Fine-tuning large language models on a seemingly benign task that would be useful to writers — expanding plot summaries into paragraphs of polished fiction — causes them to regurgitate substantial portions of books on which they were pretrained.
Flowchart depicting LLMs memorizing and responding to state media, affecting language-specific outputs.

Qwen3.7-Max Adds Speed and Power: Alibaba's latest proprietary model challenges U.S. rivals

Alibaba updated its flagship large language model for long-running agentic work, pushing it into the top rank among LLMs built in China.
Diagram showing step-by-step image creation process, featuring bears, cats, and birds as examples.

Planning Generated Images In Stages: Meta improves image models by plotting and revising generations step-by-step

Text-to-image generators that use diffusion or flow-matching typically compose a whole image at once (although they refine the whole image in steps).
An arm juggles three EU star-adorned rings, representing the EU balancing new AI regulatory amendments.

Europe Pauses Some AI Regulations: European Union regulators delay some AI Act provisions, delete others

The European Union weakened some provisions of its landmark AI Act and delayed others after businesses and policymakers argued the law made European companies less competitive.
Gemini 3.5 Flash shows improved performance, surpassing previous model scores in most benchmarks.

Gemini 3.5 Flash Pairs Smarts With Speed: Google's updated Flash levels up, approaching top models but raising prices

Google’s faster model brings substantive gains at a substantially higher price, part of a rising trend in prices per token.
The chart compares AI benchmark efforts with employment and capital in U.S. job sectors, highlighting discrepancies.

Toward Agent Benchmarks That Reflect Human Work: AI agents may not be getting better at full range of economically valuable labor

AI agents seem to be increasingly capable of performing economically valuable tasks, but current benchmarks measure this capability only narrowly.
Diagram showing threat actor using AI to find vulnerabilities and bypass two-factor authentication.

Cybersecurity Alarms Grow Louder: Google study shows LLM-generated malware is getting harder to track and stop

An AI-generated script to bypass two-factor authentication signals a dawning era of industrial-scale cyberattacks, according to a Google report.
Performance data table displays metrics for conversational models, emphasizing TML-Interaction-Small's results.

Built-In Conversational Interactivity: Thinking Machines reveals its first interaction model, a new type of multimodal AI

Conversational models typically wait for a turn before they respond.
A woman in martial arts attire faces off against a cartoon lobster in a futuristic cityscape.

Hermes Agent Challenges OpenClaw: OpenClaw created a class of personal agents; upstart Hermes Agent is outworking it

OpenClaw, the immensely popular AI agent, has fast-rising competition.
Map with UK sites; flowchart depicts mammogram study steps, highlighting AI’s role alongside doctors.

AI Mammogram Diagnosis Under Real-World Conditions: Two studies test Google's breast cancer detection models in clinics

Introduced in 2020, Google’s AI system for detecting breast cancer in mammograms still hasn't been used to diagnose current patients.
Graph depicts GPT-Realtime-2's performance across sectors, competing with other speech-to-speech models.

OpenAI Challenges Speech-to-Speech Leaders: Realtime API updates audio models that reason, transcribe, and translate

An update of OpenAI’s speech-to-speech model lets developers tune the tradeoff between speed and reasoning.
Chart compares U.S. and PRC AI model performance over time, highlighting Elo scores and increasing trends.

U.S. to Evaluate Upcoming Models: U.S. government will test AI models for national security risks, other hazards prior to release

The U.S. government said it will evaluate cutting-edge models before they’re available to the public, a sharp reversal of the White House’s earlier hands-off policy.
Diagram showing sequential task learning steps with images of robotic tasks and flow arrows.

Robots That Adapt to New Tasks: Sony and university researchers train robots on new tasks without catastrophic forgetting

Neural networks can forget how to perform earlier tasks as they learn new ones.
Infographic showing Nvidia's chip design flow, highlighting placer, router, and optimization stages.

How Nvidia Uses AI to Design Chips: Chipmaker's models design circuits, verify designs, and test new layouts

Nvidia’s chief scientist dreams of telling an AI model to design a new GPU, then skiing for a couple of days while the system does the job.
Through a rainy window, a pizza worker prepares food beneath menu boards and a red neon "Pizza" sign.

ByteDance Bids for Video Leadership: ByteDance adds state-of-the-art Seedance 2.0 video to CapCut, while OpenAI retreats

As OpenAI prepares to shut down Sora, ByteDance made its own video generation model available to hundreds of millions of users.