Perplexity unveils new Sonar model with Deep Research; Baidu says it will make Ernie open source and Ernie Bot free

Published
Feb 17, 2025

Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:

  • Early version of o3 wows on Codeforces and IOI tests
  • Adobe debuts integrated Firefly web app with new video model
  • Mistral’s Arabic-language Saba model scores high on benchmarks
  • LM2 updates the transformer architecture with dedicated memory

But first:

Perplexity gets a boost with search model and Deep Research tool

Perplexity unveiled two major upgrades to its AI-powered search platform for Pro users: an improved Sonar model and a new Deep Research feature. Sonar, now built on Llama 3.3 70B, outperforms similar models in user satisfaction tests and uses Cerebras’s inference platform to generate answers at 1,200 tokens per second. Perplexity’s new Deep Research tool conducts long-form analysis on complex topics, performing multiple searches and synthesizing the results into detailed reports faster than competing tools from Google and OpenAI. Both updates score well on accuracy and readability: Sonar outperforms comparably sized models on IFEval and MMLU, and Deep Research achieves high scores on industry benchmarks like Humanity’s Last Exam and SimpleQA. (Perplexity and Perplexity)

Baidu to offer Ernie Bot for free and open source its AI model

Baidu announced it will make its Ernie Bot chatbot free starting April 1 and make its forthcoming Ernie 4.5 model openly available from June 30, although the company did not disclose the specific license or terms. Company sources also said Ernie 5 would debut before the end of 2025. The Chinese search giant faces growing competition from startups like DeepSeek, which offers free AI services claimed to match OpenAI’s capabilities at lower costs. Offering Ernie Bot for free aims to boost Baidu’s market share in China’s AI sector, where it currently lags behind DeepSeek and ByteDance’s Doubao in monthly active users. (Reuters)

OpenAI reasoning models match elite human programmers

OpenAI’s large reasoning models demonstrated significant improvements in competitive programming and software engineering tasks. The o1 model achieved a Codeforces rating of 1673, placing it in the 89th percentile, while o1-ioi (a model specially designed to perform well on such tests) reached the 98th percentile with a rating of 2214 using specialized test-time strategies. But an early checkpoint of o3 surpassed both without relying on hand-engineered heuristics, using reinforcement learning alone: it achieved a rating of 2724, in the 99.8th percentile, and earned a gold-medal score on the 2024 International Olympiad in Informatics problems. These results suggest that AI systems can now match or exceed top human programmers in complex problem-solving tasks, potentially transforming software development and algorithmic research in a wide range of fields. (arXiv)

Adobe unveils Firefly Video Model and new paid plans

Adobe made its Firefly Video Model available, calling it an IP-friendly and commercially safe generative AI tool for video creation. The model allows users to generate video clips from text prompts or images, with advanced controls for camera angles, motion, and keyframes. Adobe also announced new Firefly Standard ($10/month) and Pro ($30/month) plans, offering tiered access to premium video and audio features and unlimited access to imaging and vector capabilities. Adobe’s offerings give users another choice in video generation while tying into its popular media editing tools, potentially making sophisticated video creation more accessible to a wider range of creators and businesses. (Adobe)

Mistral releases Arabic-focused language model for Middle East market

French AI startup Mistral launched Mistral Saba, a 24-billion-parameter language model designed for Arabic-speaking countries. The model outperforms Mistral’s general-purpose small model on Arabic content and, perhaps surprisingly, also performs well on Indian-origin languages. Saba’s release extends Mistral’s commitment to local language support and represents a strategic move to gain traction among users in the Middle East and potentially attract regional investors. (TechCrunch)

Dedicated memory module boosts transformer’s long-context reasoning

Researchers at Convergence Labs introduced a new memory-augmented transformer architecture called Large Memory Model (LM2) to enhance long-term reasoning capabilities. LM2 incorporates a dedicated memory module that interacts with input tokens via cross-attention and is updated through gating mechanisms, while maintaining the original transformer information flow. Experimental results show LM2 outperforms state-of-the-art memory models on long-context reasoning tasks by up to 37.1 percent, while also improving performance on general language tasks. (arXiv)
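
For readers who want a concrete picture of a memory module that reads through cross-attention and writes through a gate, here is a minimal sketch in plain Python. It is an illustration under simplifying assumptions, not the LM2 implementation: there are no learned weights, the single sigmoid gate stands in for the paper's gating mechanisms, and the names `cross_attention`, `memory_step`, and `gate_bias` are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key rows."""
    d = len(keys[0])
    keys_t = [list(col) for col in zip(*keys)]  # transpose keys to d x n
    scores = matmul(queries, keys_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, values)


def memory_step(tokens, memory, gate_bias=0.0):
    """One read/write cycle between a token sequence and a memory bank.

    Assumption: gates are computed directly from the candidate values;
    a real model would use learned projections here.
    """
    # Read: tokens query the memory, and the result is added to the
    # token stream as a residual, so the original flow is preserved.
    read = cross_attention(tokens, memory, memory)
    tokens_out = [[t + r for t, r in zip(t_row, r_row)]
                  for t_row, r_row in zip(tokens, read)]

    # Write: memory slots query the tokens to form candidate content,
    # then a sigmoid gate blends old memory with the candidate.
    candidate = cross_attention(memory, tokens, tokens)
    new_memory = []
    for m_row, c_row in zip(memory, candidate):
        gates = [sigmoid(c + gate_bias) for c in c_row]
        new_memory.append([(1.0 - g) * m + g * math.tanh(c)
                           for g, m, c in zip(gates, m_row, c_row)])
    return tokens_out, new_memory
```

Calling `memory_step(tokens, memory)` with three 2-dimensional token vectors and two memory slots returns an updated token sequence and memory bank with the same shapes. A strongly negative `gate_bias` closes the gate and leaves the memory essentially unchanged, which is the basic behavior a gated update is meant to provide.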


Still want to know more about what matters in AI right now?

Read last week’s issue of The Batch for in-depth analysis of news and research.

Last week, Andrew Ng advocated for shifting the conversation from “AI safety” to “responsible AI” at the Artificial Intelligence Action Summit in Paris and emphasized the importance of focusing on AI opportunities rather than hypothetical risks.

“In a world where AI is becoming pervasive, if we can shift the conversation away from ‘AI safety’ toward responsible [use of] AI, we will speed up AI’s benefits and do a better job of addressing actual problems. That will actually make people safer.”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth: OpenAI’s Deep Research agent generates detailed reports by analyzing web sources; Google revised its AI principles, lifting a self-imposed ban on weapons and surveillance applications; Alibaba debuted Qwen2.5-VL, a powerful family of open vision-language models; and researchers demonstrated how tree search enhances AI agents’ ability to browse the web and complete tasks.

