Short Course・Intermediate・55 Minutes

Building AI Browser Agents

Instructors: Div Garg, Naman Garg

AGI Inc
  • Intermediate
  • 55 Minutes
  • 8 Video Lessons
  • 4 Code Examples

What you'll learn

  • Learn the fundamentals of autonomous web agents: what they are, how they work, their limitations, and the decision-making strategies used to optimize their performance.

  • Build autonomous web agents that can perform tasks such as finding, scraping, and summarizing a webpage, filling out forms, and signing up for newsletters.

  • Explore the AgentQ framework, which combines Monte Carlo Tree Search (MCTS), a self-critique mechanism, and Direct Preference Optimization (DPO) to teach agents to self-correct.
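
As a rough illustration of the DPO piece mentioned above, the sketch below computes the standard DPO loss for a single preference pair. It is not taken from the course or from AgentQ: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for illustration, and the inputs are assumed to be summed log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") trajectory under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of log-probabilities.

    All names and the default beta are illustrative assumptions.
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model.
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected

    # The loss is -log(sigmoid(x)) with x = beta * (margin difference).
    x = beta * (chosen_margin - rejected_margin)

    # Numerically stable form of log(1 + exp(-x)).
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))
```

The loss equals log 2 when the policy and the reference agree, and it falls as the policy assigns relatively more probability to the chosen response than the reference does.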

About this course

Learn how to build AI agents that interact with websites in Building AI Browser Agents, taught by Div Garg and Naman Garg, co-founders of AGI Inc, and built in partnership with AGI Inc.

AI browser agents can log into websites, fill out forms, click through web pages, or even place an online order for you. They use both visual information, like screenshots, and structural data, like the HTML or Document Object Model (DOM) of a web page, to reason and take actions.

Web pages are complex and offer many possible actions at each step, which makes it hard for an AI browser agent to complete an assigned task. A single error, such as clicking the wrong button or misreading a field, can compound into unexpected outcomes.

In this course, you’ll learn how autonomous web agents work, what their current limitations are, and how AgentQ enables them to improve through self-correction.

In detail, you’ll:

  • Learn what web agents are, how they automate tasks online, and what their architecture, key components, and limitations look like, with an overview of their decision-making strategies.
  • Build a web agent that can scrape the DeepLearning.AI website and return course recommendations in a structured output format based on natural language instructions.
  • Build an autonomous web agent that can execute multiple tasks, such as finding and summarizing webpages, filling out a form, and signing up for a newsletter.
  • Explore AgentQ, a framework that enables agents to self-correct through a combination of Monte Carlo Tree Search (MCTS), a self-critique mechanism, and Direct Preference Optimization (DPO), a preference-based fine-tuning algorithm.
  • Deep dive into MCTS, learn how it searches for the best path using an animated Gridworld example, and use AgentQ to complete web tasks.
  • Understand the current state and future directions of AI agents, including key factors shaping their evolution, such as hardware, algorithms, and data availability.
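
To give a feel for the MCTS idea listed above, here is a minimal sketch of Monte Carlo Tree Search on a tiny Gridworld. It is not the course's code or the AgentQ implementation: the 4x4 grid, the start and goal cells, the discount `GAMMA`, the depth limit, and every function name are assumptions made for illustration. Each iteration runs the four usual phases: selection with UCB1, expansion of one untried action, a random rollout, and backpropagation of the return.

```python
import math
import random

# Hypothetical 4x4 Gridworld; all constants here are illustrative assumptions.
SIZE = 4
START, GOAL = (0, 0), (3, 3)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
MAX_DEPTH = 20   # longest path the search will consider
GAMMA = 0.9      # discount, so shorter paths to the goal score higher


def step(state, action):
    """Move one cell; walking into a wall leaves the agent in place."""
    dr, dc = ACTIONS[action]
    row = min(max(state[0] + dr, 0), SIZE - 1)
    col = min(max(state[1] + dc, 0), SIZE - 1)
    return (row, col)


class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = {}   # action name -> Node
        self.visits = 0
        self.value = 0.0     # sum of returns backed up through this node


def ucb1(parent, child, c=1.4):
    """Average return plus an exploration bonus for rarely tried children."""
    exploit = child.value / child.visits
    explore = c * math.sqrt(math.log(parent.visits) / child.visits)
    return exploit + explore


def rollout(state, depth, rng):
    """Act randomly until the goal or the depth limit; return the discounted reward."""
    steps = 0
    while state != GOAL and depth + steps < MAX_DEPTH:
        state = step(state, rng.choice(list(ACTIONS)))
        steps += 1
    return GAMMA ** (depth + steps) if state == GOAL else 0.0


def search(root_state, iterations, rng):
    root = Node(root_state)
    for _ in range(iterations):
        node, depth = root, 0
        # Selection: descend through fully expanded, non-terminal nodes.
        while (node.state != GOAL and depth < MAX_DEPTH
               and len(node.children) == len(ACTIONS)):
            parent = node
            node = max(parent.children.values(), key=lambda ch: ucb1(parent, ch))
            depth += 1
        # Expansion: add one untried action, if the node can be expanded.
        if node.state != GOAL and depth < MAX_DEPTH:
            action = rng.choice([a for a in ACTIONS if a not in node.children])
            child = Node(step(node.state, action), parent=node)
            node.children[action] = child
            node, depth = child, depth + 1
        # Simulation, then backpropagation up to the root.
        reward = rollout(node.state, depth, rng)
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return root


def best_path(root):
    """Follow the most-visited child from the root until the tree runs out."""
    path, node = [root.state], root
    while node.children and node.state != GOAL:
        node = max(node.children.values(), key=lambda ch: ch.visits)
        path.append(node.state)
    return path
```

Calling `best_path(search(START, 5000, random.Random(0)))` returns the sequence of cells the search visited most often. With enough iterations this tends toward a short route to the goal, though a sketch this small makes no guarantee of finding the optimal one.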

By the end of this course, you will have hands-on experience building browser agents and a deeper understanding of how to make them more robust and reliable.

Who should join?

This course is ideal for learners with basic Python skills who want to explore how to build autonomous agents that interact with the web.

Course Outline

8 Lessons・4 Code Examples
  • Introduction

    Video・2 mins

  • Intro to Web Agents

    Video・11 mins

  • Building a Simple Web Agent

    Video with code examples・7 mins

  • Building an Autonomous Web Agent

    Video with code examples・9 mins

  • Agent Q

    Video・8 mins

  • Deep Dive into AgentQ and MCTS

    Video with code examples・9 mins

  • Future of AI Agents

    Video・5 mins

  • Conclusion

    Video・1 min

  • Appendix – Tips and Help

    Code examples・1 min

Instructors

Div Garg

Co-founder of AGI Inc

Naman Garg

Co-founder of AGI Inc

Course access is free for a limited time during the DeepLearning.AI learning platform beta!
