From Concept to Reality: Building a Semantic Highlighter with Claude-Dev and GPT-4

Dive into the transformative world of AI code generation with our hands-on tutorial! Learn to build an AI-powered document highlighting tool using Next.js and Claude-dev. We'll guide you through project bootstrapping, integrating AI seamlessly, and troubleshooting common issues. Unlock the potent...

The capability of code generation with AI is one of the most mind-blowing advances in AI we are facing. It is incredibly exciting because it is the foundation of everything around us. Software is eating the world, and those who master making this software will have immense power.

In the last two years or so, many solutions for code generation have emerged, some commercial and some open source. From Copilot to Tabnine to Claude-dev, they all do more or less the same: they are integrated into an IDE and help you analyze code, fix parts of the code, add new features, etc. It is a great aid without knowing much about how they work. However, there are a few caveats and knobs you can adjust to get the maximum out of these tools, and this is exactly what we are going to discuss in this tutorial.

At the bingging I thought of compiling a list o do's and don't but then I decieded to do a hands on tutorial with a real example to demonstrate when you might come cross which problem and how to possibly solve it.

The idea For simplicity's sake, I chose an application that I find useful: an AI-powered document highlighting tool.
The idea is that I use GPT to get answers to my questions, and most AI tools do a great job at that. But sometimes it's hard to put the results in context to validate the result and make sure it's really coming from the document at hand and not a hallucination. To solve this, we can reduce the scope of the search to a single document and locate the results within that document, allowing navigation between results. It's essentially like a normal browser search, but in natural language. Let's keep it simple and define the requirements.

Task: Build a NextJS application with the following features:

1. **Document Upload**:
   - The user uploads a document.

2. **Document Display**:
   - After the upload, the content of the document is displayed in the application.

3. **Chat Interaction**:
   - The user is provided with a chat interface where they can enter a prompt to search for specific highlights within the document using AI.

4. **AI Integration**:
   - The user input (prompt) is sent to GPT Mini.
   - GPT Mini searches the document for matches based on the prompt.
   - The results are returned in the same format as they appear in the document.

5. **Example Workflow**:
   - The user uploads a CV document.
   - The user enters the prompt: "Cloud experience".
   - The application sends the prompt to the API and receives a list of matches from the CV.
   - The application highlights the relevant areas in the document where "Cloud experience" or related content is mentioned.

6. **Additional Search**:
   - The user can continue entering new prompts and search for additional terms or phrases.

We should get something like this in the end.

Step 1: Bootstrapping the Project from the Chat Window

The first step is to bootstrap the project and generate the skeleton. I use the Claude web Chat Console to create a bash script that includes everything in one shot. I can then copy and paste this script into my IDE and run it from there.

Tip: It is certainly possible to bootstrap the project with all necessary files from the chat window, e.g., VSCode. However, if you already have a subscription, it might be cheaper to let Claude generate a script that does all the nitty-gritty of generating all necessary files for your application from the console. But keep in mind that if the code is too long, this might not work anymore, and you would have to start from the chat window.

Generate a bash script that creates all the necessary files and folder structures for a Next.js application.

< REQUIREMENTS GO HERE >

Step 2: Run the Project Generator

Now copy-paste the generated code to generate the new project.

vscode ➜ /workspaces/ai-highlighter $ touch generate-project.sh
vscode ➜ /workspaces/ai-highlighter $ chmod +x generate-project.sh
vscode ➜ /workspaces/ai-highlighter $ ./generate-project.sh

If all goes well, you should now see something like this:

npm warn deprecated @humanwhocodes/config-array@0.13.0: Use @eslint/config-array instead
npm warn deprecated rimraf@3.0.2: Rimraf versions prior to v4 are no longer supported
npm warn deprecated @humanwhocodes/object-schema@2.0.3: Use @eslint/object-schema instead
npm warn deprecated glob@7.2.3: Glob versions prior to v9 are no longer supported

added 361 packages, and audited 362 packages in 22s

137 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities
Initialized a git repository.

Success! Created nextjs-document-chat-app at /workspaces/ai-highlighter/nextjs-document-chat-app

added 10 packages, and audited 372 packages in 4s

138 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities
Next.js application structure created successfully!
vscode ➜ /workspaces/ai-highlighter $

You should now be able to start the app using npm run dev.

vscode ➜ /workspaces/ai-highlighter/nextjs-document-chat-app (main) $ npm run dev

> nextjs-document-chat-app@0.1.0 dev
> next dev

  ▲ Next.js 14.2.12
  - Local:        http://localhost:3004

 ✓ Starting...
 ✓ Ready in 1190ms

Now you can open it in your browser. I use port 3004 because the other 3000 ports are occupied. You can change this as you desire by setting the environmental variable PORT to whatever suits you.

When I open the app in the browser, it starts without any errors, but it does not look very appealing though 😄.

If we look into the code, we will see the reason for that: Claude created only the skeleton but did not actually implement anything.

This is okay; we can fix that.

Step 3: Implementing the Application

To start implementing, we can simply repeat the requirements to Claude-dev and let it do the magic of actually implementing the application.

Before we start, make sure to commit all changes in case Claude messes things up.

Claude sometimes generates code that does not meet your expectations, e.g., it creates placeholders instead of real code. In these cases, it is wise to have committed your code before in order to be able to roll back.

Fix the application to meet these requirements:

<ADD YOUR REQUIREMENTS HERE (AGAIN)>

Claude-dev will start doing the magic.

Keep an eye on the highlighted parameters here:

- Total number of tokens for the current chat (that might span multiple requirements and iterations).

- Cache: This is an incredible feature currently only Claude offers, which can save up to 90% of the token cost. I already discussed this in a previous post.

- API COST: This is the total cost of the current chat. Make sure it does not exceed your budget, though honestly, it rarely does 😄.

Now let's accept all the changes suggested by Claude-dev and inspect the changes before actually applying them. This will spawn a few files, so be patient.

While the updates are running, you can have a quick look at the website - note that it is being updated live, so you will see errors while the code is being updated/generated. Don't worry, we have to wait until the end and hope those errors will disappear.

In case you come across limit errors, you basically have two options:
1. Wait until the limits disappear either per minute or daily limits.
2. Create multiple accounts and switch back and forth between the APIs.
3. Use AWS Bedrock.
Note, however, that the third option is the most expensive, because as of today, it does not support caching yet, so you will have to pay the full price for all code generation. With sonnet -3.5, as of today, roughly $3 per million tokens, which is honestly still very decent. You can find the current prices here

After a while, the generation is finished, but we got an error in the console.

You can certainly analyze this error by inspecting the code OR if you are lazy, you can simply copy-paste it into Claude-dev and ask it to fix it.

There is a button "Start new task" which is very appealing. I advise you, however, NOT TO click it if you want to save a few pennies. Why? Because it resets the cache of Claude, causing you to pay to fill that cache in the new session. Note, however, that the cache also expires anyway (afaik after 5 minutes).

Another piece of advice: If you manually change any file, it will likely be overwritten by Claude-dev if it uses the Claude cache. Instead, either invalidate the cache by starting a new session or let Claude-dev make the change for you. Unfortunately, I haven't found any other workaround. If you know one, I'd be glad to hear it.

Claude-dev has found the cause of the error and seems to know how to fix it.

Accept all changes (note, you can still roll back with git checkout anytime) and boom... it looks much better now :)

Before actually running the application, I highly recommend understanding what Claude-dev actually did, especially how it integrated GPT if it did at all. If you search for GPT, you might see something like this (note that your generated code might look different because of the nature of unpredictability of LLMs) but the idea is the same: Try to understand the magic.

As you can see, it did not integrate GPT at all 😄.

So now let's fix that using Claude-dev.

We will create a new prompt and provide it an example of how to use GPT. For the sake of simplicity, I will use the code generated by OpenAI.

And ask Claude-dev to fix it.

And after some analysis, it already has a fix.

Especially when you work with LLM and GPT in particular, ensure that the generated code includes the model you intend to use, e.g., if you intend to use GPT-4o-mini and not the much more expensive GPT-4. Why? Because Claude does not necessarily have the latest model names and sometimes tends to use the last known powerful model which could become an expensive undertaking if you are not really planning to use it.

Note that this applies also later during code generation of other parts of the code that do not necessarily have anything to do with the LLM. Claude-dev might still change the model thinking that it is a necessary optimization. That's why it is very wise to inspect the changes before applying them or you will end up with an expensive experiment :)

Anyway, this suggestion looks good. I will accept it.

Before testing it, make sure you install any necessary libraries. Claude already recognized and suggested the installation of OpenAI.

Also, make sure to set the OPENAI_API_KEY as an environmental variable and restart the application.

npm install --save openai
export OPENAI_API_KEY=sk-xxxxx
npm run dev

Alternatively, you can simply add the environmental variable in the .env.local file.

Now the application should be up and running.

Unfortunately, it still does not. Instead, it loads the form and then disappears. In the browser console, I see this message... Oh... it seems Claude has messed it up and generated the actual call on the client side. This is, from a security perspective, a serious issue. We have to fix it.

If you come across any other errors (token limits, runtime errors, etc.), you can use the same strategies as described above. Let's ask Claude-dev for a fix.

The searchDocument and the call to the LLM must be on the server side.

After a few back and forth, it is fixed and we can eventually start using the app.

Step 4: Testing

I uploaded data about a fictive company "Mambo" and started asking for things to highlight.

And let's start asking what to highlight: note that this is semantic highlighting, not keyword highlighting. The result is exactly what I was expecting: everything related to cloud technology. I am impressed 😄.

This looks now pretty good. Let's do some final finishing to make it visually appealing and the typical stuff like menu items, etc.

Eventually, we are finished and can now actually use it. Not bad for a 3-hour project, don't you think?

Side note:

Claude-dev now has support for other models. I tried many of them, but I have to admit that Claude is by far the most reliable. If you have similar or different experiences, please share. Thanks.

Conclusion:

Claude-dev is incredibly powerful, but even more if you know how to use it properly. I shared here and in previous posts a few tips that might help you get more out of Claude and Claude-dev and code generation in general. and will do more in the future.

I hope you enjoyed it. If you have suggestions, corrections, or ideas for other projects, please don't hesitate to send me a PM.