What you'll learn

Learn the fundamentals of autonomous web agents: what they are, how they work, their limitations, and the decision-making strategies used to optimize their performance.

Build autonomous web agents that can perform tasks such as finding, scraping, and summarizing a webpage, filling out forms, and signing up for newsletters.

Explore the AgentQ framework, which combines Monte Carlo Tree Search (MCTS), a self-critique mechanism, and Direct Preference Optimization (DPO) to teach agents to self-correct.
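To make the DPO piece concrete, here is a minimal sketch of the standard DPO objective for a single preference pair. It is not AgentQ's training code. The function name, the default `beta`, and the idea of feeding in summed log-probabilities of a "chosen" and a "rejected" agent trajectory are assumptions made for illustration.

```python
import math


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative helper).

    Inputs are log-probabilities of the chosen and rejected outputs under
    the policy being trained and under a frozen reference model.
    """
    # Implicit reward margin: how much more the policy favors the chosen
    # output over the rejected one, relative to the reference model.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    z = beta * margin
    # -log(sigmoid(z)), computed in a numerically stable way.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
```

When the policy matches the reference model the loss equals log 2. It falls below that as the policy shifts probability toward the chosen output.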

About this course

Learn how to build AI agents that interact with websites in Building AI Browser Agents, taught by Div Garg and Naman Garg, Co-founders of AGI Inc, and built in partnership with AGI Inc.

AI browser agents can log into websites, fill out forms, click through web pages, or even place an online order for you. They use both visual information, like screenshots, and structural data, like the HTML or Document Object Model (DOM) of a web page, to reason and take actions.
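The observe, reason, act cycle described above can be sketched in a few lines. This is a toy illustration, not the course's code: `FakePage`, `Observation`, `Action`, `decide`, and `run_agent` are hypothetical names, the page is simulated in memory, and a simple rule stands in for the model's reasoning step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    screenshot: bytes  # visual information (empty placeholder in this sketch)
    dom: dict          # structural information: element name -> current value


@dataclass
class Action:
    kind: str                     # "type", "click", or "done"
    target: Optional[str] = None  # element the action applies to
    text: Optional[str] = None    # text to enter for "type" actions


class FakePage:
    """Hypothetical stand-in for a browser page with one sign-up form."""

    def __init__(self):
        self.fields = {"email": ""}
        self.submitted = False

    def observe(self) -> Observation:
        dom = dict(self.fields, submitted=self.submitted)
        return Observation(screenshot=b"", dom=dom)

    def apply(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def decide(obs: Observation, email: str) -> Action:
    """Rule-based placeholder for the model's reasoning step."""
    if obs.dom["submitted"]:
        return Action("done")
    if obs.dom["email"] != email:
        return Action("type", target="email", text=email)
    return Action("click", target="submit")


def run_agent(page: FakePage, email: str, max_steps: int = 5) -> list:
    """Observe, decide, and act until the task is done or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        action = decide(page.observe(), email)
        history.append(action.kind)
        if action.kind == "done":
            break
        page.apply(action)
    return history
```

Calling `run_agent(FakePage(), "a@b.com")` types the address, clicks submit, and then stops. A real agent would replace `decide` with a model call and `FakePage` with a live browser session.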

Because web pages are complex and many actions are possible at each step, it can be challenging for an AI browser agent to complete an assigned task. A single error, such as clicking the wrong button or misreading a field, can compound into unexpected outcomes.

In this course, you’ll understand how autonomous web agents work, their current limitations, and how AgentQ enables them to improve through self-correction.

In detail, you’ll:

Learn what web agents are, how they automate tasks online, their architecture, key components, limitations, and an overview of their decision-making strategies.
Build a web agent that can scrape the DeepLearning.AI website and return course recommendations as structured output based on natural language instructions.
Build an autonomous web agent that can execute multiple tasks, such as finding and summarizing webpages, filling out a form, and signing up for a newsletter.
Explore AgentQ, a framework that enables agents to self-correct through a combination of Monte Carlo Tree Search (MCTS), a self-critique mechanism, and Direct Preference Optimization (DPO), a reinforcement learning algorithm.
Dive deep into MCTS, see how it finds an optimal path through an animated Gridworld example, and use AgentQ to complete web tasks.
Understand the current state and future directions of AI agents—including key factors shaping their evolution, such as hardware, algorithms, and data availability.
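
As a rough preview of the MCTS item above, the sketch below runs a basic UCT-style search on a tiny Gridworld. It is not the course's implementation. The 3x3 grid, the goal position, the discount factor, the exploration constant, and all function names are assumptions chosen to keep the example small.

```python
import math
import random

ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
SIZE, GOAL, GAMMA = 3, (2, 2), 0.9  # assumed grid size, goal cell, discount


def step(state, action):
    """Deterministic move; bumping into a wall leaves the state unchanged."""
    dx, dy = ACTIONS[action]
    x, y = state[0] + dx, state[1] + dy
    return (x, y) if 0 <= x < SIZE and 0 <= y < SIZE else state


class Node:
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action
        self.children = []
        self.untried = list(ACTIONS)
        self.visits = 0
        self.total = 0.0  # sum of returns backed up through this node

    def ucb_child(self, c=1.4):
        """Pick the child with the highest UCB1 score (mean plus exploration bonus)."""
        return max(
            self.children,
            key=lambda n: n.total / n.visits
            + c * math.sqrt(math.log(self.visits) / n.visits),
        )


def rollout(state, rng, max_depth=20):
    """Random playout; returns the discounted reward measured from `state`."""
    depth = 0
    while state != GOAL and depth < max_depth:
        state = step(state, rng.choice(list(ACTIONS)))
        depth += 1
    return GAMMA ** depth if state == GOAL else 0.0


def mcts(root_state, iterations=500, seed=0):
    """Return the action MCTS prefers from `root_state` (must not be the goal)."""
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node, depth = root, 0
        # 1. Selection: descend through fully expanded nodes.
        while not node.untried and node.children:
            node, depth = node.ucb_child(), depth + 1
        # 2. Expansion: add one unexplored child unless the node is terminal.
        if node.untried and node.state != GOAL:
            action = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(step(node.state, action), parent=node, action=action)
            node.children.append(child)
            node, depth = child, depth + 1
        # 3. Simulation: random rollout, discounted back to the root.
        value = GAMMA ** depth * rollout(node.state, rng)
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total += value
            node = node.parent
    # Recommend the most-visited child, breaking ties by mean return.
    best = max(root.children, key=lambda n: (n.visits, n.total / n.visits))
    return best.action
```

For example, `mcts((1, 2))` returns `"right"`, because the goal at `(2, 2)` is one step to the right and every other first move reaches it later with a more heavily discounted return.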

By the end of this course, you will have hands-on experience building browser agents and a deeper understanding of how to make them more robust and reliable.
