Atlas ushers in OpenAI’s browser era, DeepSeek’s efficient new OCR model

Published Oct 24, 2025
5 min read
[Image: A humanoid robot sits in a library, reading a book, surrounded by stacks, with a laptop displaying code.]

Welcome back! In today’s edition of Data Points, you’ll learn more about:

  • Claude Code’s launch on the web and mobile
  • Reddit’s lawsuit against Perplexity and web-scraping firms
  • Meta and Hugging Face’s secure environments for agents
  • GigaBrain’s use of world models to better train robots

But first:

OpenAI launches its own agentic web browser

OpenAI released ChatGPT Atlas, a new web browser that integrates ChatGPT and agent mode directly into the application. Atlas’s AI can understand page content, remember context across sessions, and complete tasks without users leaving their current page. The browser includes an optional “browser memories” feature that lets ChatGPT recall details from previously visited sites to provide more personalized assistance, but users control what information is stored or deleted. Atlas also features agent mode in preview for paid users, enabling ChatGPT to autonomously do web research, fill shopping carts, or compile documents in the browser. Atlas reflects OpenAI’s push toward agentic AI systems that can handle routine computing tasks, though the company acknowledges risks including mistakes and vulnerability to malicious instructions. ChatGPT Atlas is available now on macOS for Free, Plus, Pro, and Go users, with Windows, iOS, and Android versions coming soon. (OpenAI and X)

DeepSeek pilots text-compressing optical character recognition model

DeepSeek released DeepSeek-OCR, a vision-language model that converts text documents into compact visual representations using far fewer tokens than the original text. The model achieves 97 percent accuracy when compressing text at a 10-to-1 ratio and maintains 60 percent accuracy even at 20-to-1 compression by rendering text as images and encoding them into visual tokens that language models decode back into text. On the OmniDocBench benchmark, DeepSeek-OCR outperforms competing models while using significantly fewer tokens — just 100 tokens per page compared to 256 for GOT-OCR2.0 and fewer than 800 tokens versus over 6,000 for MinerU2.0. This compression technique could enable more efficient processing of long contexts in large language models by converting older conversation history into progressively smaller images, similar to how human memory fades over time. The model’s code and weights are publicly available on GitHub. (arXiv and GitHub)
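To put the cited per-page figures in perspective, here’s an illustrative back-of-the-envelope calculation — plain Python arithmetic, not DeepSeek-OCR’s API. The 1,000-page corpus is hypothetical, and MinerU2.0’s “over 6,000” tokens per page is treated as a lower bound:

```python
# Illustrative token-budget arithmetic using the per-page figures cited above.
pages = 1_000  # hypothetical corpus size
tokens_per_page = {
    "DeepSeek-OCR": 100,   # cited: ~100 tokens per page
    "GOT-OCR2.0": 256,     # cited: 256 tokens per page
    "MinerU2.0": 6_000,    # cited: over 6,000 per page; lower bound used here
}

# Total tokens each approach would spend on the corpus
totals = {name: pages * per_page for name, per_page in tokens_per_page.items()}

for name, total in totals.items():
    ratio = totals["MinerU2.0"] / total
    print(f"{name}: {total:,} tokens ({ratio:.0f}x fewer than MinerU2.0)")
```

At these figures, DeepSeek-OCR spends 100,000 tokens where MinerU2.0 would spend at least 6 million — the kind of savings that makes rendering old context as images plausible for long-context use.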

Claude Code launches on the web with parallel agents in the cloud

Anthropic released a web-based version of Claude Code that lets developers run multiple coding tasks simultaneously across different GitHub repositories from their browser. The service operates on Anthropic-managed cloud infrastructure, with each task running in an isolated sandbox environment that includes network and filesystem restrictions to protect code and credentials. As with the command-line and IDE versions, developers can use Claude Code’s web interface for bug fixes, routine tasks, testing, backend changes, pull requests, and documentation. The cloud-based approach, similar to OpenAI’s Codex, suggests a shift toward AI agents handling development work independently in managed environments, rather than requiring developers to run coding assistants locally on their own machines, potentially making development more accessible while introducing new security challenges. (Anthropic also launched an early mobile version of Claude Code in its iOS app.) Claude Code for Web is available now in research preview for Claude Pro and Max subscribers. (Anthropic)

Reddit accuses Perplexity AI and scraping firms of data theft

Reddit sued Perplexity AI and three other companies — Oxylabs, AWMProxy, and SerpApi — alleging they illegally scraped millions of user comments for commercial use. The lawsuit, filed in New York federal court, accuses the companies of bypassing Reddit’s anti-scraping protections and extracting content from Google’s search results when direct access was blocked. Reddit used a novel technique, creating a test post that could only be crawled by Google search, then showing that within hours, data from the post appeared on Perplexity. The lawsuit highlights tensions over how AI companies acquire training data, as Reddit has separately licensed its content to Google and OpenAI for payment and sued Anthropic alleging unauthorized scraping. Perplexity and the other defendants denied the allegations and said they would defend themselves in court. (Associated Press and The New York Times)
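Reddit hasn’t detailed the mechanism behind its test post, but the standard way to make a page crawlable by Google search alone is the Robots Exclusion Protocol. A hypothetical robots.txt along these lines (the /test-post path is invented for illustration) admits Googlebot while disallowing all other compliant crawlers:

```
# Hypothetical robots.txt sketch -- path is illustrative only
User-agent: Googlebot
Allow: /test-post

User-agent: *
Disallow: /test-post
```

A service that still surfaces the page’s content has either crawled it in violation of these directives or obtained it indirectly, for example from search results — which is the behavior Reddit says it observed.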

Meta and Hugging Face launch hub for shared agentic environments

OpenEnv Hub is a new community platform where developers can build, share, and explore standardized environments for AI agents. Agentic environments define the tools, APIs, credentials, and execution context an agent needs to perform specific tasks in secure, sandboxed settings that work for both training and deployment. The hub launches soon with initial environments that developers can test by interacting as human agents or enlisting models to solve tasks, while an OpenEnv 0.1 specification has already been released for community feedback. The initiative addresses a key challenge in AI agent development: large language models need access to appropriate tools, but exposing millions of tools directly isn’t safe or practical, requiring instead carefully defined environments with clear semantics and security guarantees. Meta is integrating OpenEnv with its TorchForge RL library and collaborating with open-source projects including verl, TRL, and SkyRL to expand compatibility. (Hugging Face)
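As a rough mental model of what such an environment definition bundles, here is a hypothetical sketch in Python. The class and field names are invented for illustration and are not the OpenEnv 0.1 specification:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of what an agentic environment definition bundles:
# tools, API endpoints, credentials, and an execution-context flag.
# Field names are illustrative, not the OpenEnv 0.1 spec.
@dataclass
class AgentEnvironment:
    name: str
    tools: list[str]                # tool identifiers the agent may call
    api_endpoints: dict[str, str]   # tool name -> base URL
    credentials: dict[str, str] = field(default_factory=dict)  # injected secrets
    sandboxed: bool = True          # run in an isolated sandbox by default

env = AgentEnvironment(
    name="web-research",
    tools=["search", "fetch_page"],
    api_endpoints={"search": "https://example.com/api/search"},
)
```

The point of such a definition is scoping: rather than exposing every tool to every model, each environment grants only the tools and credentials a specific task needs, with the same definition usable for both training and deployment.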

GigaBrain-0 uses synthetic data to train more capable robots

Researchers introduced GigaBrain-0, a vision-language-action model that trains robots using synthetic data generated by world models rather than expensive real-world demonstrations. The system generates training scenarios by altering object appearances, placements, lighting conditions, and camera viewpoints, producing more diverse training data than robots typically obtain from real-world observation alone. GigaBrain-0 incorporates depth sensing for spatial reasoning and uses “embodied Chain-of-Thought” supervision to break complex tasks into intermediate steps. Tests on arm manipulation, long-horizon tasks, and mobile manipulation showed GigaBrain-0 outperformed the baseline π0 model by 10–30 percent. The team also released GigaBrain-0-Small, a lightweight version that runs 10 times faster on edge devices while maintaining comparable performance. (arXiv and GitHub)


Still want to know more about what matters in AI right now? 

Read this week’s issue of The Batch for in-depth analysis of news and research.

This week, Andrew Ng talked about the importance of error analysis in agentic AI development, best practices for identifying and addressing performance gaps in AI workflows, and the evolving nature of workflow design due to rapid improvements in LLMs.

“A basic error analysis procedure might involve gathering a sample set of topics where the output is subpar, and reading the results of every step of the workflow — called the traces — to see which step most frequently generated results materially worse than a human would have. This is very valuable for deciding what step to focus on improving.” 
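The tally Andrew describes can be sketched in a few lines of Python. The traces, step names, and good/bad verdicts below are illustrative stand-ins for reviewer-labeled workflow outputs:

```python
from collections import Counter

# Minimal error-analysis sketch: given traces (per-step results from a
# multi-step workflow, labeled good/bad by a human reviewer), count which
# step most frequently produced a subpar result. All data is illustrative.
traces = [
    {"retrieve": "good", "draft": "bad", "review": "good"},
    {"retrieve": "good", "draft": "bad", "review": "bad"},
    {"retrieve": "bad",  "draft": "good", "review": "good"},
]

failures = Counter(
    step
    for trace in traces
    for step, verdict in trace.items()
    if verdict == "bad"
)

worst_step, count = failures.most_common(1)[0]
print(worst_step, count)  # the step to prioritize improving
```

Here the “draft” step fails most often, so that is where improvement effort would go first — exactly the prioritization decision the procedure is meant to support.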

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth:


Subscribe to Data Points


Your accelerated guide to AI news and research