Has Anthropic Claude just wiped out an entire industry? (Part 3/3)

In the first part of this series, I wrote about an incredible feature called Computer Use that Anthropic released a few weeks ago. This could be a bombshell for the IT industry and many other sectors (I suspect the full impact is yet to come).

In the second part, we walked through a simple real-world use case: booking a doctor's appointment.

Overview of This Series

Part I: The announcement and the early implications on the economy
Part II: A real-life example - Letting Claude accomplish a daunting task all by itself
Part III: Do it yourself - Quickstart guide for non-techies, tips and tricks

In this third part, I will show you how to try it yourself with a step-by-step guide, including tips and important considerations.

So, let's jump in and get our hands dirty.

Outline of Steps:

  1. Get API Keys for Claude.
  2. Install Docker.
  3. Run the application.
  4. Configure the appliance.
  5. Start using Claude's Computer Use feature.

This tutorial is based on the GitHub repo and the appliance provided by Anthropic:

https://github.com/anthropics/anthropic-quickstarts/blob/main/computer-use-demo/README.md

For this setup, you need Docker up and running. Here is how to get there:

Step #1: Get the API Keys

If you haven't yet, sign up and add $10 to your Anthropic account to get started.

Then get API Keys for Claude.

https://console.anthropic.com/

Step #2: Install Docker

Install Docker if you haven't already.

Docker in a sentence: Docker lets you run so-called containerized applications in a sandboxed environment on your PC or Mac. It is the preferred method here mainly for safety: inside the container, the model cannot harm your actual machine (though it can still take real actions on the websites it visits).

After starting Docker, go to your console/shell and enter the command below, making sure to enter the API key you just copied from the Anthropic console:

# Replace %your_api_key% with the key you copied from the Anthropic console
export ANTHROPIC_API_KEY=%your_api_key%
docker run \
    -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
    -v "$HOME/.anthropic:/home/computeruse/.anthropic" \
    -p 5900:5900 \
    -p 8501:8501 \
    -p 6080:6080 \
    -p 8080:8080 \
    -it ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest
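
If the command fails right away, the usual suspects are a missing API key or a Docker daemon that is not running. Here is a minimal sketch of a pre-flight check you could run first. The function names are my own and not part of the Anthropic quickstart; the only Docker call used is `docker info`, which fails when the daemon is unreachable.

```shell
# Hypothetical pre-flight helpers; the function names are my own.

# Succeeds only if ANTHROPIC_API_KEY is set to a non-empty value.
check_api_key() {
    if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
        echo "ANTHROPIC_API_KEY is not set" >&2
        return 1
    fi
    echo "ANTHROPIC_API_KEY is set"
}

# Succeeds only if the docker CLI exists and can reach a running daemon.
check_docker() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker is not installed" >&2
        return 1
    fi
    if ! docker info >/dev/null 2>&1; then
        echo "Docker is installed but not running" >&2
        return 1
    fi
    echo "Docker is running"
}

# Usage: check_api_key && check_docker && echo "Ready to start the container"
```

Both checks only read state, so they are safe to run as often as you like.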

Step #3: Run the application

If the command above went smoothly, you should now be able to open the appliance and start chatting with Claude Computer Use.

Open your browser and go to:

http://localhost:8080

As you can see, the screen is divided into two parts. On the left, you chat and give instructions; on the right, you watch what Claude actually does on the virtual desktop.
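
The Docker command publishes four ports, and each one leads to a different view. The sketch below prints them all. The descriptions reflect my reading of the quickstart README at the time of writing, so treat them as assumptions and check the README if something does not load; `demo_url` is a hypothetical helper name.

```shell
# Print the URL behind each port published by the docker run command.
# Descriptions are assumptions based on the quickstart README.
demo_url() {
    case "$1" in
        8080) echo "http://localhost:8080 (combined chat and desktop view)" ;;
        8501) echo "http://localhost:8501 (Streamlit chat only)" ;;
        6080) echo "http://localhost:6080/vnc.html (desktop view only)" ;;
        5900) echo "vnc://localhost:5900 (direct VNC connection)" ;;
        *)    echo "unknown port: $1" >&2; return 1 ;;
    esac
}

for port in 8080 8501 6080 5900; do
    demo_url "$port"
done

# To test whether the combined view is reachable once the container is up:
#   curl -sf -o /dev/null http://localhost:8080 && echo "appliance is up"
```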

Step #4: Configure the appliance

Next, configure the model provider. The options are Anthropic, Bedrock (AWS), and Vertex (Google). I normally use Bedrock because, unlike the Anthropic API, it has fewer restrictions on token usage per minute and per day. On the other hand, it has no prompt caching, which leads to higher token costs.

For a quick start, however, getting an API key from Anthropic is the easier route, which is why I chose it here.

Remember, you can still get credentials from AWS later when you want to go one step further.

To configure the provider and enter the API key, click the arrow in the top left corner of the application.

Then select the provider (in this case, Anthropic) and enter the API key if it is not already filled in from the Docker command above.

Setup Screenshot

Side note:

The other arrow, in the top right corner, lets you regain control and operate the screen yourself. This is very useful if the model gets stuck or if you need to enter a username and password, for instance.

You can also stop the application at any time by pressing the stop button. This is actually a Streamlit app feature, not a Claude feature, but it is super helpful when using the appliance.
Stop Button Screenshot

Step #5: Let the magic begin

Now that the model is configured, we can start giving Claude instructions for tasks in the browser and beyond (including file operations like saving and reading files).

In this tutorial, we will only do browser-based tasks, but feel free to experiment in the sandbox environment as you like. I highly recommend testing with harmless websites first to get a feel for what Computer Use is all about before you dig deeper.

Note:
The model has some built-in restrictions around privacy-sensitive actions, but in my tests I always managed to get it to log in to various websites. However, I strongly recommend NOT doing this. Instead, log in yourself first and then instruct Claude to do the rest on its own. The side note in Step #4 explains how to stop the chat and take over control yourself (human in the loop) to log in. In this example, however, we will let Claude complete the whole task without any manual intervention.

Here is a prompt I used in the second part of this series. Feel free to use a prompt that matches interesting use cases for your domain.

Prompt:

Book a general practice appointment any time this month after 4 pm at doctap.co.uk.

Jane Smith
Date of Birth: 1990-04-25
Gender: Female
Address Line 1: 456 Elm Street
Address Line 2: Apt 12B
City: Manchester
Postcode: M1 2AB
Contact Number: +44 7911 123456
Email Address: jane.smith@example.com

And ... Go.

Watch and read carefully what Claude is doing.

Don't forget to keep an eye on the costs you are incurring in the Anthropic dashboard.


A few practical usage tips

If you want to go beyond a "hello world", here are a few practical tips:

  • The model performs pretty well with the default settings, but the tips below help you save money and time on lengthy tasks that go beyond a simple test.
  • If you want the task completed 100% autonomously, give Claude all the data it needs. In the example above, this was the user data (not needed if you are already logged in). If Claude reaches a step where it does not know what to do, it is likely to stop and ask you for input, which is usually better than guessing.
  • Many websites show an annoying "cookie" dialog. If you expect Claude to navigate through multiple websites, I highly recommend installing "I Don't Care About Cookies for Firefox" before it starts.
  • Log in in advance. In the example above, we used random test data. If you use existing accounts, however, I highly recommend logging in to all required sites before you start. This avoids sharing your credentials with Claude and speeds up the process (especially if you have two-factor authentication).
  • The Docker container exposes several ports. Port 8080 is the default, but you can also follow the session directly in the Streamlit app on port 8501. If you only do browser automation and have a limited window size, this might be the better option.
  • You can export the conversation at any time using the print feature, as shown in the screenshot below.
Export Conversation Screenshot
  • Reduce the number of recent screenshots sent to Claude to 5 or so. In most cases this is enough for it to keep track of the previous steps. Remember, Claude still has access to the whole text conversation at all times, so it knows quite a bit as it progresses.
History Screenshot
  • For lengthy conversations, let Claude save intermediate results, for example to a file. This comes in handy if the session is interrupted and you want it to continue where it left off. In my experience, this happened only with difficult tasks.
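
To get a feeling for why the screenshot history matters, here is a rough back-of-the-envelope sketch. It assumes one new screenshot per step, 1024x768 screenshots, and the rule of thumb of roughly (width x height) / 750 tokens per image; all three are simplifying assumptions on my part, and prompt caching is ignored entirely. The function names are hypothetical.

```shell
# Approximate tokens for one screenshot of the given width and height,
# using the (width * height) / 750 rule of thumb (an assumption).
tokens_per_image() {
    echo $(( $1 * $2 / 750 ))
}

# Total number of screenshots sent over a session of $1 steps when each
# request includes only the $2 most recent screenshots.
# Assumption: every step adds exactly one new screenshot.
images_sent() {
    n_steps=$1
    n_keep=$2
    total=0
    i=1
    while [ "$i" -le "$n_steps" ]; do
        if [ "$i" -lt "$n_keep" ]; then
            total=$(( total + i ))       # history not yet full
        else
            total=$(( total + n_keep ))  # history capped
        fi
        i=$(( i + 1 ))
    done
    echo "$total"
}

per_image=$(tokens_per_image 1024 768)
for keep in 10 5; do
    echo "keep=$keep: $(( $(images_sent 30 "$keep") * per_image )) image tokens over 30 steps"
done
```

Under these assumptions, a 30-step session sends about 267,000 image tokens with a history of 10 screenshots and about 147,000 with a history of 5, so halving the history cuts the image share of the input by a bit less than half.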

Recap of known limitations:

  • I discussed the general limitations of Claude in lengthy conversations in previous blog posts; similar limitations apply here.
  • Anthropic enforces hard per-minute and daily token limits. This can become cumbersome after a while, especially if you send all 10 screenshots with each iteration. Bedrock, on the other hand, does not have such rigid limits, which is why I always use it for Computer Use (at a higher cost because of the lack of prompt caching).
  • The virtualization in this appliance can take a significant amount of memory. Make sure you don't hit a memory limit on your device in the middle of a session, as you could lose the whole conversation.
  • The Anthropic API supports prompt caching, which can significantly reduce input token costs (at the cost of the daily limits). Bedrock does not have prompt caching yet, which costs more over time. It is up to you to decide what matters more: retry when you hit the limit and pay less, or let it run and pay more.
  • Security restrictions may kick in during steps like registration, as noted earlier, and for good reason: without them, people could create numerous fake accounts, which is harmful to everyone.
  • If the conversation gets too long, you might reach the context limit, as you may know from Claude Web. You can postpone this by reducing the number of screenshots sent per request and by asking Claude to save lengthy results to files instead of displaying them in the chat.
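
For the memory point above, you can keep an eye on the container with `docker stats`. The sketch below is a small helper that turns the percentage Docker reports into a warning. The helper name and the default threshold of 80 percent are arbitrary choices of mine, not a recommendation from Anthropic or Docker.

```shell
# Hypothetical helper: warn when a container's memory use crosses a threshold.
#   $1: memory percentage as printed by docker stats, e.g. "73.52%"
#   $2: warning threshold in whole percent (default 80, an arbitrary choice)
mem_warning() {
    pct=${1%\%}        # drop the trailing percent sign
    whole=${pct%%.*}   # keep the integer part only
    limit=${2:-80}
    if [ "$whole" -ge "$limit" ]; then
        echo "WARNING: container memory at $1, threshold ${limit}%"
        return 1
    fi
    echo "OK: container memory at $1"
}

# Usage (CONTAINER is the name or ID shown by `docker ps`):
#   mem_warning "$(docker stats --no-stream --format '{{.MemPerc}}' "$CONTAINER")"
```

If the warning shows up regularly, closing other applications or giving Docker more memory in its settings is probably easier than restarting a long session.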

Wrap Up:

This piece of technology is just the beginning, but it is already powerful enough to save us an incredible amount of time on boring tasks like booking a doctor's appointment or finding a flight.

But with great power comes great responsibility and care. So even if it can do all the magic on its own:

Don't share sensitive information.
Double-check before you submit.
And never trust the AI blindly.


How did Claude Computer Use perform on your task? Feel free to share with us.
