Claude Desktop Can Now Control Your Browser

Discover how AI can revolutionize your browser tasks with the "browser use" capability, automating actions like forms and data gathering. Easily integrate it into Claude Desktop using MCP technology. Follow our guide to set up and test this feature, transforming browser interactions for work and ...


Some time ago, I wrote about a new technology from Anthropic called “computer use,” which allows AI to control your entire computer.

This announcement generated a lot of excitement and inspired similar projects from companies like OpenAI and Manus (though using different underlying technology). The potential impact on various industries is huge if this becomes widely adopted. You can read my full article on that topic here:

Has Anthropic just wiped out an entire industry?
Anthropic’s new “Computer Use” feature in its Claude API marks a transformative leap in AI, automating tasks traditionally requiring human input. By interpreting commands and interacting with desktop environments, it promises to disrupt industries like customer service, healthcare, and software t…

More recently (around 5 months ago, I believe), the open-source community came up with a clever idea for controlling just the web browser using AI, often referred to as “browser use.”

The concept is straightforward:

Tasks you normally do manually in your browser, such as filling out forms or gathering data from different websites, can now potentially be automated by AI with a higher success rate than before.

In this post, I’ll show you how to integrate this “browser use” capability into your Claude Desktop application. We’ll let Claude collect information from a website like Meetup, organize it, and present it to you directly in the chat window.

You can apply the same method to search for jobs or apartments, or to fill out online forms for job applications, tenders, visas, and so on.

Background

Some background information might help you understand the setup better.

How Browser Use Works in a Nutshell

You can find details about the technology in the official documentation, but essentially, it uses the computer vision capabilities of large language models (LLMs) like those from OpenAI or open-source alternatives to analyze and control websites.
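To make the idea more concrete, here is a minimal sketch of the observe-decide-act loop that this kind of agent is generally built around. It is not the actual implementation of any browser-use library: `FakeBrowser`, `scripted_model`, `Action`, and `run_agent` are hypothetical names I made up for illustration, the "browser" only records actions, and the "model" is a hard-coded stub standing in for a real LLM call that would look at a screenshot or the page structure.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # hypothetical element label, e.g. "search box"
    text: str = ""     # text to type, or the final answer for "done"


@dataclass
class FakeBrowser:
    """Stand-in for a real browser: it records actions instead of performing them."""
    page: str = "search form"
    log: list = field(default_factory=list)

    def observe(self) -> str:
        # A real agent would capture a screenshot and/or the page structure here.
        return self.page

    def apply(self, action: Action) -> None:
        self.log.append((action.kind, action.target, action.text))
        if action.kind == "type":
            self.page = "search form (filled)"
        elif action.kind == "click":
            self.page = "results page"


def scripted_model(task: str, observation: str) -> Action:
    """Stub for the LLM call: maps what it 'sees' to the next action."""
    if observation == "search form":
        return Action("type", target="search box", text=task)
    if observation == "search form (filled)":
        return Action("click", target="search button")
    return Action("done", text=f"Finished on: {observation}")


def run_agent(task, browser, model, max_steps=10) -> str:
    """Repeat observe -> decide -> act until the model says it is done."""
    for _ in range(max_steps):
        action = model(task, browser.observe())
        if action.kind == "done":
            return action.text
        browser.apply(action)
    return "Stopped: step limit reached"


browser = FakeBrowser()
result = run_agent("AI meetups", browser, scripted_model)
print(result)
print(browser.log)
```

In a real setup, the stub would be replaced by a call to a vision-capable model and the fake browser by an actual automated browser. The step limit is a design choice I added so that a confused model cannot loop forever.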
