From PDF to Podcast: The MIT Tool That Goes Beyond NotebookLM
Google's NotebookLM, known for transforming texts into engaging podcast-style conversations, faces limitations like lack of customization and API. However, open-source alternatives on platforms like Hugging Face offer customizable solutions to generate podcasts from documents, offering more contr...

NotebookLM is one of Google’s more creative AI products, introduced a couple of months ago. Many people were amazed by its abilities—especially the idea of turning a long text into an interesting conversation between two podcast hosts. NotebookLM offers more than that, such as chatting (Q&A) and even generating mind maps. If you haven’t tried NotebookLM yet, I highly encourage you to experience it yourself. It’s free and really easy to use.
The Problem
This is great, but some advanced users, including developers, want more and quickly run into limitations—at least in the free version:
- You can’t select the hosts’ character.
- You can’t change the prompt.
- You can’t select the length or depth of the conversation.
- … and pretty much anything else that goes beyond uploading the document and telling NotebookLM what to focus on, which was added recently.
- Most importantly, it has no API… yet (at least not in the free version)
---
The Solution
As we often see these days, for almost every commercial solution, open-source alternatives appear—if not many. The same goes for NotebookLM.
When you search on GitHub for NotebookLM, you will find plenty of projects.

If you just want to test a solution without installing or configuring anything, there is a Hugging Face space that can do exactly that.
Generates an engaging two-host podcast from any uploaded document
Many thanks to the developers of this lamm-mit space:
https://huggingface.co/spaces/lamm-mit/PDF2Audio
What You Need
- A Hugging Face account (free)
- An OpenAI API key. We will be using the TTS API for that.
The Cost
To get an idea of how affordable inference and TTS can be, in the experiment I share here, I made a podcast of a technology trend report with about 50 pages, resulting in a 15-minute podcast. It cost me around 26 cents.
Here is how it works in a nutshell:
- The user uploads a document (for example, a PDF).
- It uses OpenAI models (the user can choose which one) to generate the dialogue.
- Then it uses OpenAI TTS with user-selected characters (hosts) to turn it into a podcast. That’s it.
Let’s get started.
---
Step 1: Get the Space
Open the space and optionally clone it:
https://huggingface.co/spaces/lamm-mit/PDF2Audio
---
Step 2: Upload the Document and Customize
Upload any document you want to turn into a podcast. In this example, I used a 50-page tech report.
The best part is that you can customize everything:
- The prompt
- The model selection
- The characters
- … etc.

To avoid lengthy prompting attempts, you can also choose from a predefined set of instruction templates that set the prompts for you.

If you want to test the characters’ voices first, you can use the OpenAI playground to hear these voices: