Updated Gemini Pro model builds interactive websites from prompts; OpenAI unveils new restructuring plan

Published May 9, 2025
4 min read

Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll learn more about:

  • Mistral’s new medium-sized language model
  • Claude developers gain access to web search
  • Alibaba uses RL to teach LLMs to search better
  • OpenAI’s plans to build AI infrastructure worldwide

But first:

Google updates Gemini 2.5 Pro’s coding and web design skills in surprise early release

Google launched early access to Gemini 2.5 Pro Preview (I/O edition), an updated version with significantly improved coding capabilities, particularly for building interactive web applications. The model now leads the WebDev Arena Leaderboard, surpassing its previous version by 147 Elo points, ranks first on Chatbot Arena for coding, and achieves 84.8 percent on the VideoMME benchmark for video understanding. Developers can access the updated model through the Gemini API via Google AI Studio and Vertex AI, while general users can experience it through the Gemini app. (Google)

OpenAI abandons plans to transition full control to for-profit company

OpenAI announced it would transform its for-profit arm into a Public Benefit Corporation (PBC) while keeping its nonprofit foundation in control of the company. This structural change replaces the company’s complex “capped-profit” model with a more standard arrangement in which employees own stock directly, similar to other AI companies like Anthropic and xAI. The nonprofit will become a major shareholder in the PBC, generating resources to fund initiatives ensuring AI benefits diverse communities. OpenAI made this decision after consulting with attorneys general and other officials in California and Delaware. The shift allows the company to raise more funding to pursue its goal of artificial general intelligence, but it is a less radical change than shifting full control of the company to the for-profit arm. (OpenAI)

Mistral AI releases Medium 3 language model

Mistral AI launched Mistral Medium 3, a new language model priced at $0.40 per million tokens for input and $2 for output. The company claims the model outperforms Llama 4 Maverick, is comparable to GPT-4o, and approaches Claude Sonnet 3.7 on benchmarks while being significantly less expensive. Mistral Medium 3 can be deployed on systems with four or more GPUs and is designed for coding and STEM tasks. The model includes enterprise features such as on-premises deployment options and custom training capabilities. It’s currently available on Mistral’s platform and Amazon SageMaker, with planned releases on IBM WatsonX, NVIDIA NIM, Azure AI Foundry, and Google Cloud Vertex. (Mistral)
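At the quoted rates, per-request costs are easy to estimate. A minimal sketch (the function name and example token counts are illustrative, not from Mistral):

```python
def mistral_medium_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD at the quoted Mistral Medium 3 rates:
    $0.40 per million input tokens, $2.00 per million output tokens."""
    INPUT_RATE = 0.40 / 1_000_000   # USD per input token
    OUTPUT_RATE = 2.00 / 1_000_000  # USD per output token
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# e.g. a request with 10,000 input tokens and 2,000 output tokens:
print(round(mistral_medium_cost(10_000, 2_000), 4))  # → 0.008
```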

Anthropic launches web search via its API

Anthropic added web search to its Claude API, allowing developers to build applications that access current information from the internet. When Claude receives a request that would benefit from up-to-date information, it can generate targeted search queries, retrieve relevant results, and provide comprehensive answers with source citations. The feature includes administrative controls like domain allow lists and block lists to help organizations maintain control over information sources. Web search is available for Claude 3.7 Sonnet, Claude 3.5 Sonnet, and Claude 3.5 Haiku at $10 per 1,000 searches plus standard token costs. (Anthropic)
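Based on the feature description, a request enabling web search might be configured like this. The tool "type" string, field names, and model id below are assumptions, so check Anthropic's API reference before relying on them:

```python
def build_search_request(question: str, max_searches: int = 3) -> dict:
    """Illustrative Messages API payload with the web search tool enabled.
    Field names and the tool type string are assumptions, not confirmed."""
    return {
        "model": "claude-3-7-sonnet-latest",  # assumed model identifier
        "max_tokens": 1024,
        "tools": [{
            "type": "web_search_20250305",      # assumed server-side tool type
            "name": "web_search",
            "max_uses": max_searches,           # cap searches per request
            "blocked_domains": ["example.com"], # admin control: block list
        }],
        "messages": [{"role": "user", "content": question}],
    }

# At $10 per 1,000 searches, each search adds one cent on top of token costs:
SEARCH_COST_USD = 10 / 1000
req = build_search_request("What changed in the Claude API this week?")
print(req["tools"][0]["max_uses"], SEARCH_COST_USD)  # → 3 0.01
```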

Alibaba develops ZeroSearch to improve LLM search capabilities

A new reinforcement learning framework called ZeroSearch enhances large language models’ search capabilities without requiring access to actual search engines. The system transforms an LLM into a retrieval module through supervised fine-tuning, then uses a curriculum-based strategy that introduces progressively more challenging retrieval scenarios during training, teaching the LLM to find the most relevant documents. Experiments show that a 7 billion parameter retrieval module achieves performance comparable to traditional search engines, while a 14 billion parameter module can surpass them. ZeroSearch could eliminate the high API costs typically associated with search-based LLM training while avoiding the unpredictable document quality issues that occur when using live search engines. (GitHub)
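The curriculum idea — begin with clean simulated retrievals and inject progressively noisier ones — can be sketched with a simple schedule. This is a hypothetical illustration of the general technique, not ZeroSearch's actual implementation:

```python
import random

def noise_probability(step: int, total_steps: int, p_max: float = 0.8) -> float:
    """Curriculum schedule: the fraction of low-quality simulated documents
    rises linearly from 0 to p_max over training (illustrative values)."""
    return p_max * min(step / total_steps, 1.0)

def simulate_retrieval(query: str, step: int, total_steps: int) -> str:
    """Stand-in for a fine-tuned LLM acting as the retrieval module:
    returns a noisy document with probability given by the schedule."""
    if random.random() < noise_probability(step, total_steps):
        return f"[noisy document loosely related to: {query}]"
    return f"[relevant document for: {query}]"

# Early in training retrievals are mostly clean; by the end, mostly noisy:
print(noise_probability(0, 1000), noise_probability(1000, 1000))  # → 0.0 0.8
```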

OpenAI launches program to develop global AI infrastructure

OpenAI announced “OpenAI for Countries,” a new initiative to help nations build AI infrastructure and capabilities. The program will partner with governments to develop in-country data centers, provide customized ChatGPT services to citizens, implement security controls, and create national startup funds to foster local AI ecosystems. OpenAI plans to pursue 10 initial projects with individual countries or regions, targeting nations that commit to using AI according to democratic principles. The initiative follows the Paris AI Action Summit, where multiple international leaders expressed interest in creating their own versions of the Stargate project, which aims to invest $500 billion in AI infrastructure within the US. (OpenAI)


Still want to know more about what matters in AI right now?

Read this week’s issue of The Batch for in-depth analysis of news and research.

This week, Andrew Ng announced that AI Fund has closed $190M for a new venture fund and shared key lessons on how speed drives success in AI startups.

“Many factors go into the success of a startup. But if I had to pick just one, it would be speed. Startups live or die based on their ability to make good decisions and execute fast.”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth: Alibaba released the Qwen3 family of open-source language models, offering optional reasoning capabilities that rival top models like DeepSeek-R1; OpenAI rolled back its GPT-4o update after users flagged overly flattering, sycophantic behavior; Johnson & Johnson unveiled a revised AI strategy, offering new insights into how big medical companies are using the technology; and researchers demonstrated that fine-tuning a language model with just 1,000 examples can significantly boost its reasoning abilities.


Subscribe to Data Points

Your accelerated guide to AI news and research