In today’s edition of Data Points, you’ll learn more about:
- Google’s research preview for CodeMender
- Anthropic’s Petri, an open framework for automating alignment tests
- Music labels near a deal with AI companies
- Nano Banana becoming officially production available
But first:
OpenAI launches apps inside ChatGPT, with new Apps SDK
At its DevDay, OpenAI demonstrated a new feature that lets users work with third-party applications directly within ChatGPT conversations. Users can tag apps like Canva (to design posters) or Zillow (to search for homes) while ChatGPT provides context and advice throughout the process. The initial launch includes apps from Booking.com, Canva, Coursera, Expedia, Figma, Spotify, and Zillow, with DoorDash, OpenTable, Target, and Uber coming in the following weeks. Developers can access the new Apps Software Developer Kit in preview today, with app submission for review opening later this year alongside a browsable app directory. CEO Sam Altman says OpenAI plans to share monetization guidance soon. (OpenAI and The Verge)
Zhipu AI releases GLM-4.6 with stronger coding performance
Zhipu AI launched an updated version of its flagship GLM language model that expands the context window from 128,000 to 200,000 tokens and improves coding, reasoning, and agentic capabilities. The model shows gains over its predecessor GLM-4.5 across eight public benchmarks, performing competitively with DeepSeek-V3.2-Exp and Claude Sonnet 4, though it still trails Claude Sonnet 4.5 in coding tasks. In real-world evaluations, GLM-4.6 achieved near parity with Claude Sonnet 4 with a 48.6 percent win rate, while completing tasks with approximately 15 percent fewer tokens than GLM-4.5. The model is available through the Z.ai API platform, works with coding agents like Claude Code, and can be deployed locally using weights published on HuggingFace and ModelScope. (Z.ai)
Google unveils AI agent to find and patch security flaws in code
Google introduced CodeMender, an AI agent that automatically discovers and fixes security vulnerabilities in software. The system combines Gemini models with program analysis tools like static and dynamic analysis, fuzzing, and SMT solvers to identify security flaws and generate patches. CodeMender uses multi-agent systems and automatic validation to ensure code changes are correct, avoid regressions, and follow style guidelines before human review. The tool already contributed 72 security fixes to open source projects, including codebases with up to 4.5 million lines of code, and can proactively rewrite code to use more secure data structures and APIs. Google is introducing CodeMender as a research preview before making it publicly available. (Google)
Anthropic releases tool for automated AI safety testing
Anthropic released Petri, an open source framework that uses AI agents to automatically test frontier models for misaligned behaviors. The tool works by having an auditor agent interact with a target model across different scenarios, simulating environments and creating synthetic tools, while a judge component scores the resulting transcripts for concerning behaviors. When applied to 14 frontier models with 111 seed instructions, Petri elicited behaviors including autonomous deception, oversight subversion, and cooperation with harmful requests. In pilot evaluations, Claude Sonnet 4.5 and GPT-5 showed the strongest safety profiles, while Gemini 2.5 Pro, Grok-4, and Kimi K2 demonstrated concerning rates of user deception. Petri is available now on GitHub. (Anthropic)
Big Music close to AI licensing agreements with Big Tech
Universal Music and Warner Music are close to finalizing licensing deals with AI companies, including start-ups like ElevenLabs, Stability AI, Suno, and Udio, as well as tech giants like Google and Spotify, according to a new report by the Financial Times. The labels aim to establish payment structures similar to streaming services, where using a song triggers a micropayment, and they want AI companies to develop attribution technology to identify when their music is used. The talks cover licensing songs for AI-generated tracks and training large language models, with deals potentially coming within weeks. These agreements could set a precedent for how AI companies compensate the music industry, as labels seek to avoid the mistakes of the internet era that nearly destroyed their business in the early 2000s. The deals would potentially include settlements for past use of music, including for Suno and Udio, which the labels sued for copyright infringement in 2024. (Financial Times)
Google’s Gemini 2.5 Flash Image becomes generally available
Google released its Gemini 2.5 Flash Image model, aka “Nano Banana,” for general production use. New features include support for 10 different aspect ratios. The model allows developers to blend multiple images, maintain character consistency, perform natural language edits, and leverage Gemini’s knowledge base for image generation and modification. The model costs $0.039 per image and is available through the Gemini API on Google AI Studio and Vertex AI. (Google)
Want to know more about what matters in AI right now?
Read the latest issue of The Batch for in-depth analysis of news and research.
Last week, Andrew Ng talked about LandingAI's Agentic Document Extraction (ADE) tool, which transformed PDF files into LLM-ready markdown text for use in sectors like healthcare, financial services, and legal, emphasizing the importance of accurate data extraction from complex documents.
“How can we accurately extract information from large PDF files? Humans don’t just glance at a document and reach a conclusion on that basis. Instead, they iteratively examine different parts of the document to pull out information piece by piece. An agentic workflow can do the same.”
Read Andrew’s letter here.
Other top AI news and research stories covered in depth:
- Google’s AP2 provides developers with new tools to build agentic payments, in a bid to transform digital transactions.
- A recent study reveals that ChatGPT users are now more likely to be young, female, and seeking information, highlighting demographic shifts in AI use.
- Gambling sites are deploying AI tools that predict wins and track bets for sports fans, marking a new era in sports betting.
- Researchers have developed a new technique that auto-selects training examples to speed up fine-tuning, advancing the efficiency of reinforcement learning.