Veo 3 adds synchronized audio to realistic video

Linux researcher uses LLMs to find security holes

Data Points

Published May 26, 2025

In today’s edition, you’ll learn more about:

A new earth system model that can predict weather disasters
Google’s speedy diffusion-based text generation model
Claude 4’s new system prompts for power users
Microsoft’s AI agent marketplace for developers

But first:

Google launches Veo 3 and Flow for video generation with audio

Google DeepMind released Veo 3, its latest video generation model. Veo 3 can create notably realistic videos with speech, dialogue, voice-overs, music, and sound effects from text and image prompts. The technology enables marketers and filmmakers to produce video that previously required extensive production resources, with some companies reporting 50 percent reductions in costs and time-to-market. Google also launched Flow, an AI filmmaking tool available to Google AI Pro and Ultra subscribers in the U.S. Veo 3 is currently in private preview on Vertex AI, with broader availability expected in the coming weeks. (Google)

OpenAI’s o3 model helps discover a zero-day vulnerability in Linux kernel

A security researcher used OpenAI’s o3 model to discover CVE-2025-37899, a dangerous vulnerability in the Linux kernel’s ksmbd server. The researcher provided o3 with approximately 12,000 lines of code from the SMB protocol implementation, using only the standard API without additional frameworks or tools. The vulnerability occurs when multiple connections share session objects, allowing one thread to free memory while another thread still accesses it, potentially enabling arbitrary code execution in kernel context. This marks the first publicly documented case of an LLM finding this type of complex vulnerability, showing that LLMs (while still finding many false positives) can meaningfully assist expert vulnerability researchers. (Sean Heelan’s Blog)

Microsoft’s Aurora AI model outperforms numerical Earth system forecasts

Microsoft Research introduced Aurora, a versatile model trained on over one million hours of diverse geophysical data. Researchers claim Aurora can predict weather, air quality, ocean waves, and tropical cyclone tracks more accurately and efficiently than current operational systems. The model achieves state-of-the-art performance across multiple domains: it beats the Copernicus Atmosphere Monitoring Service (CAMS) on 74 percent of air pollution forecasting targets, surpasses ocean wave models on 86 percent of targets, outperforms seven operational centers for tropical cyclone tracking, and exceeds high-resolution weather models on 92 percent of targets. Aurora’s architecture uses a 3D Swin Transformer that can handle different resolutions, variables, and pressure levels, making it adaptable to various Earth system prediction tasks through fine-tuning. The model operates at computational speeds that are orders of magnitude faster than traditional numerical models — for example, generating air pollution forecasts approximately 100,000 times faster than CAMS while running on a single GPU. For machine learning researchers, Aurora may help develop architectures that can efficiently process 3D spatiotemporal data while maintaining physical consistency across multiple scales and modalities. (Nature)

Google unveils Gemini Diffusion, a blazing-fast experimental language model

At I/O, Google DeepMind demonstrated Gemini Diffusion, an experimental language model that generates text at 1,000 to 2,000 tokens per second, four to five times faster than Google’s current fastest model. The model uses diffusion techniques, traditionally employed in image generation, to refine random noise into coherent text by processing multiple parts simultaneously rather than generating one token at a time like traditional autoregressive models. Gemini Diffusion matches the coding performance of larger models while excelling at tasks requiring iterative refinement, such as mathematical reasoning and code generation. If successful, diffusion-based text models could reshape the competitive landscape among AI companies, particularly for coding agents and specialized applications where speed and accuracy matter more than narrative flow. Google has opened a waitlist for researchers to access the experimental demo, though no public release date or pricing has been announced. (Google)
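To see why working on many positions at once can reduce the number of sequential steps, consider the toy comparison below. It is not Gemini Diffusion's algorithm, and it involves no noise or learned model: the `target` list stands in for whatever a model would predict, and `per_pass` is a made-up parameter for how many positions get resolved in one pass. It only counts steps.

```python
def sequential_decode(target):
    """Emit one token per step, left to right."""
    output, steps = [], 0
    for token in target:
        output.append(token)
        steps += 1
    return output, steps


def parallel_refine(target, per_pass):
    """Start fully masked; each pass fills up to `per_pass` positions at once."""
    if per_pass < 1:
        raise ValueError("per_pass must be at least 1")
    output = [None] * len(target)  # None marks a position not yet resolved
    steps = 0
    while None in output:
        pending = [i for i, tok in enumerate(output) if tok is None]
        for i in pending[:per_pass]:
            output[i] = target[i]
        steps += 1
    return output, steps


target = "the quick brown fox jumps over the lazy dog".split()
seq_out, seq_steps = sequential_decode(target)
par_out, par_steps = parallel_refine(target, per_pass=4)
print(seq_steps, par_steps)  # 9 3
```

Both functions produce the same nine tokens, but the second needs three passes instead of nine steps. Actual speedups depend on how expensive each pass is and how many passes the model needs, which this sketch does not model.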

Claude 4 system prompts offer useful info for power users

Anthropic published the system prompts for Claude Opus 4 and Claude Sonnet 4, offering users an unofficial manual for optimizing their interactions with these AI models. The prompts reveal detailed instructions about Claude’s personality, safety guidelines, and capabilities, including warnings against reproducing copyrighted content and guidance on when to use search tools. Notable features include support for “thinking blocks” where Claude can switch modes during processing, integration with tools like web search that can execute up to five queries for complex requests, and the Artifacts feature’s support for libraries like Three.js, React, and TensorFlow that can help create interactive applications. Anthropic notably omitted the tool-specific prompts, which were later discovered through leaked versions; these provide further crucial details about Claude’s full capabilities. (Simon Willison’s Weblog)

Microsoft launches Agent Store for AI assistants

Microsoft debuted the Agent Store, a marketplace within Microsoft 365 Copilot where users can discover and install AI agents built by Microsoft, partners, and customers. The store launches with over 70 agents designed to automate specific business processes, ranging from simple knowledge assistants to complex multi-modal orchestrators. Developers can build agents using either Microsoft Copilot Studio’s low-code tools or the Microsoft 365 Agents Toolkit for custom orchestration logic, then publish them to reach Microsoft 365 users. Microsoft’s store could make AI agents more accessible for workplace automation, complementing the company’s broader Copilot AI assistant strategy. The Agent Store is available now to both paid and free Microsoft 365 Copilot customers. (Microsoft)

Still want to know more about what matters in AI right now?

Read last week’s issue of The Batch for in-depth analysis of news and research.

Last week, Andrew Ng shared how large companies could move fast in the age of AI by creating sandbox environments that allowed small teams to innovate without needing constant permission.

“Dozens or hundreds of prototypes can be built and quickly discarded as part of the price of finding one or two ideas that turn out to be home runs.”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth: OpenAI introduced Codex, a new multi-agent, cloud-based software engineering tool integrated into ChatGPT; xAI attributed Grok’s controversial “white genocide” responses to an unnamed, unauthorized employee, raising concerns about internal safeguards; U.S. tech giants including Nvidia, AMD, and Amazon secured deals to supply chips and infrastructure to Middle Eastern companies like Saudi Arabia’s Humain and the UAE’s G42; and Microsoft researchers showed that 4-bit quantized versions of Llama models can match the accuracy of 16-bit models, offering major efficiency gains without compromising performance.
