AI Agents

18 Posts

CB Insights AI 100 2025 infographic showing top AI startups across sectors like healthcare, robotics, and infrastructure.

Up-and-Coming Startups: AI agents and infrastructure dominate CB Insights’ Top 100 AI Startups list

AI agents and infrastructure made a strong showing on CB Insights’ latest list of the top 100 AI startups.
Diagram of Model Context Protocol showing MCP client-server architecture, APIs, and local and remote data sources.

Open Standard for Tool Use and Data Access Gains Momentum: OpenAI adopts Model Context Protocol to boost LLM tool integration

OpenAI embraced Model Context Protocol, providing powerful support for an open standard that connects large language models to tools and data.
AI co-scientist workflow diagram showing a research goal assigned to specialized AI agents for hypothesis testing and ranking

Science Research Proposals Made to Order: AI Co-Scientist, an agent that generates research hypotheses, aiding drug discovery

An AI agent synthesizes novel scientific research hypotheses. It's already making an impact in biomedicine.
Table comparing Claude 3.7, 3.5, o1, o3-mini, DeepSeek R1, and Grok 3 Beta on reasoning, coding, tools, visuals, and math.

Budget for Reasoning to the Token: Claude 3.7 Sonnet adds extended thinking mode

Anthropic’s Claude 3.7 Sonnet implements a hybrid reasoning approach that lets users decide how much thinking they want the model to do before it renders a response.
A person typing a prompt in an AI-powered mobile app with a button to improve the input.

Mobile Apps to Order: Replit’s agent-powered mobile app expands to full app development

Replit, an AI-driven integrated development environment, updated its mobile app so that it can generate other mobile apps to order.
Diagram showing GPT-4o with and without search, highlighting task execution success and failure.

Tree Search for Web Agents: How tree search improves AI agents’ ability to browse the web and complete tasks

Browsing the web to achieve a specific goal can be challenging for agents based on large language models and even for vision-language models that can process onscreen images of a browser.
ChatGPT interface drafting a research report on retail trends, including AI, e-commerce, and inflation.

Agents Go Deep: OpenAI’s Deep Research agent generates detailed reports by analyzing web sources

OpenAI introduced a state-of-the-art agent that produces research reports by scouring the web and reasoning over what it finds.
Diagram illustrating Moshi’s use of an LLM to process user audio input, inner monologue, and output.

Okay, But Please Don’t Stop Talking: Moshi, an open alternative to OpenAI’s Realtime API for Speech

Even cutting-edge, end-to-end, speech-to-speech systems like ChatGPT’s Advanced Voice Mode tend to get interrupted by interjections like “I see” and “uh-huh” that keep human conversations going. Researchers built an open alternative that’s designed to go with the flow of overlapping speech.
Flowchart illustrating the automation of opening, editing, and saving a Word document using PyAutoGUI.

Training for Computer Use: UI-TARS shows strong computer use capabilities in benchmarks

As Anthropic, Google, OpenAI, and others roll out agents that are capable of computer use, new work shows how underlying models can be trained to do this.
AI assistant processes ‘Find me a family-friendly campsite’ and suggests options.

Computer Use Gains Momentum: OpenAI’s Operator automates online tasks with a new AI agent

OpenAI introduced an AI agent that performs simple web tasks on a user’s behalf.
Mustafa Suleyman

Mustafa Suleyman: Agents of action

In 2025, AI will have learned to see, it will be way smarter and more accurate, and it will start to do things on your behalf.
Santas in line with gifts and a ‘Photos with Santa’ sign.

Agents Ascendant: LLMs evolve with agentic workflows, enabling autonomous reasoning and collaboration

The AI community laid the foundation for systems that can act by prompting large language models iteratively, leading to much higher performance across a range of applications.
Berkeley Function Calling Leaderboard with metrics like accuracy, latency, and relevance.

Competitive Performance, Competitive Prices: Amazon introduces Nova models for text, image, and video

Amazon introduced a range of models that confront competitors head-on.
Flow diagram of an application using LLMs to process prompts and tools for responses.

Agents Open the Wallet: Stripe builds ecommerce agent toolkit for AI to securely spend money

One of the world’s biggest payment processors is enabling large language models to spend real money.
OpenDevin animation illustrating open-source AI model collaboration.

Free Agents: OpenHands launches as an open toolkit for advanced code generation and automation

An open source package inspired by the commercial agentic code generator Devin aims to automate computer programming and more.