By AiRabbit — 21 Nov 2024

High-Performance Document Editing Using GPT and Function Calling

Revamp your document editing with the Selective Processing Editor (SPE), an innovative approach that mirrors human efficiency. By focusing on relevant sections instead of processing entire documents, SPE cuts costs, boosts speed, and minimizes errors. Discover how this game-changing method stream...

Imagine you're tasked with editing a lengthy report. You don't start by reading every single word from start to finish or trying to memorize the entire document. Instead, your approach is intuitive and efficient:

Scan for Relevant Sections: You quickly skim through the headings and subheadings to locate the areas that need attention.
Jump Directly to Where You Need to Be: Without getting bogged down by unrelated content, you navigate straight to the specific section that requires editing.
Make Your Targeted Changes: Focus solely on modifying the necessary parts, ensuring precision and maintaining the document's overall integrity.
Move On to the Next Task: Once your edits are complete, you proceed to other tasks without unnecessary delays.

This natural editing process enhances productivity, reduces errors, and ensures that your attention is directed where it's most needed. However, when leveraging AI for document editing, we often overlook this human-centric approach. Instead, we default to having AI process entire documents to make even minor changes—a method that's inefficient and counterintuitive.

While humans excel at focusing on relevant sections and making precise edits, Large Language Models (LLMs) like GPT-4 tend to process entire documents in a brute-force manner. This approach not only escalates operational costs and processing time but also heightens the risk of introducing errors due to the sheer volume of content being handled.

In this blog post, I will discuss an innovative method called the Selective Processing Editor (SPE), which uses function calls to imitate human behavior, making the process more efficient, cost-effective, and reliable.

Previously, we discussed a feature called Predicted Outputs, which aims to reduce latency by up to 30% by anticipating portions of the response that remain unchanged. While Predicted Outputs offer faster responses, they still require reading and passing the entire document to OpenAI for processing, which can lead to higher processing costs and increased resource usage.

The Problem with Current AI Document Editing

Despite the impressive capabilities of Large Language Models (LLMs) like GPT-4, their application in document editing often falls short of human efficiency and precision. Instead of adopting a targeted approach, these models tend to handle entire documents in a one-size-fits-all manner. This brute-force method not only drains resources but also introduces several significant challenges:

Cost: Feeding the entire document means paying for every token.
Speed: Processing everything slows down the operation.
Reliability: More content increases the chance of errors.
Context Limits: Large documents may exceed the model's input limits.
Purpose Misalignment: LLMs are built to generate text, not to navigate and edit existing documents efficiently.

It's like forcing someone to read an entire book just to fix a typo on page 50.

The Human-Inspired Approach

To effectively address the shortcomings of traditional AI document editing, SPE adopts a strategy inspired by human editing practices. By combining fast, deterministic tools with the intelligent capabilities of AI, SPE ensures that edits are both efficient and precise. This hybrid method mirrors the natural workflow of human editors, allowing for targeted modifications without the overhead of processing entire documents.

SPE is an approach that pairs fast, deterministic tools with AI's understanding, mimicking how humans efficiently edit documents.

Tools (Fast, Free, Reliable):

# Find sections instantly (-n prints each heading's line number)
grep -n "^#" document.md

# Extract exact lines in milliseconds
sed -n "50,60p" document.md

# Perfect merging
cat prefix.md new_section.md suffix.md > result.md

AI (Smart but Used Sparingly):

# Only see section map
sections = {
    "# Introduction": 1,
    "## Setup": 50,
    "## Configuration": 100
}

# Pick the right section
target = ai.choose(sections)

# Edit just that piece
new_content = ai.modify(small_section)
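
The section map shown above can be built without any AI call. Here is a minimal sketch in Python, assuming ATX-style Markdown headings (lines that start with `#`). The helper name `map_sections` is hypothetical and is not part of the SPE code shown later in this post.

```python
import re

# Assumption: headings are ATX-style ("#" to "######" followed by a space).
HEADING = re.compile(r"^#{1,6}\s+\S")


def map_sections(text: str) -> dict[str, int]:
    """Return {heading line: 1-based line number} for every heading in text."""
    sections = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if HEADING.match(line):
            sections[line.strip()] = number
    return sections


doc = "# Introduction\nSome text.\n\n## Setup\nMore text.\n"
print(map_sections(doc))  # {'# Introduction': 1, '## Setup': 4}
```

A dict keeps the sketch close to the map above, but two sections with the same heading text would overwrite each other. A list of (line number, heading) pairs avoids that.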

Real-World Example

Scenario: You have a comprehensive technical manual that requires adding a warning about version compatibility in the "Prerequisites" section.

Traditional AI Approach:

# WRONG WAY: Expensive & slow
entire_doc = load_file("huge_doc.md")
new_doc = ai.edit(entire_doc)  # 💰💰💰

Issues:
- Cost: Paying for processing every token in the document.
- Speed: Slower due to the size of the document.
- Reliability: Higher chance of introducing errors.
- Context Limits: Large documents might exceed AI's processing capacity.

SPE Approach:

# 1. Map sections (instant)
sections = find_sections("huge_doc.md")
print(sections)
# {
#     "# Setup": 1,
#     "## Prerequisites": 50,
#     "## Installation": 100
# }

# 2. AI picks target (cheap - just seeing headers)
target = ai.navigate(sections, "add version warning")
# AI: "Need to modify ## Prerequisites section"

# 3. Extract just that section (instant)
section = extract_lines("huge_doc.md", start=50, end=99)

# 4. AI modifies small piece (cheap)
new_section = ai.modify(section, "add version warning")

# 5. Merge back (instant)
replace_section("huge_doc.md", 50, 99, new_section)

Advantages:
- Speed: Completion in approximately 3.19 seconds vs. 8.65 seconds.
- Cost Efficiency: Only processing relevant sections reduces token usage.
- Reliability: Focused edits minimize the risk of errors.
- Resource Usage: Fewer and smaller API calls compared to processing the entire document.

Detailed Breakdown

Section Mapping (Instant):
- Action: Quickly identify all sections in the document using a simple tool.
- Benefit: Immediate visibility of the document structure without any cost.
Target Selection (Cheap):
- Action: Use AI to determine which section needs editing based on the task.
- Benefit: AI processes only the section headers, significantly reducing token usage.
Content Extraction (Instant):
- Action: Extract the specific lines corresponding to the target section.
- Benefit: Rapid extraction without additional processing costs.
Content Modification (Cheap):
- Action: Instruct AI to modify only the extracted section.
- Benefit: Minimal token usage as AI handles a small, focused portion of the document.
Merging Back (Instant):
- Action: Seamlessly integrate the modified section back into the original document.
- Benefit: Maintains document integrity with no additional processing time.
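
The extraction and merge steps above are plain list slicing. The sketch below is one possible way to write them, working on a list of lines already in memory. All three function names and their signatures are assumptions for illustration. A section is taken to end on the line before the next heading of any level, which matches the 50 to 99 range in the example.

```python
def section_end(heading_lines: list[int], start: int, total_lines: int) -> int:
    """Last line of the section starting at `start`: the line before the
    next heading, or the end of the document for the final section."""
    later = [n for n in sorted(heading_lines) if n > start]
    return later[0] - 1 if later else total_lines


def extract_lines(lines: list[str], start: int, end: int) -> list[str]:
    """Return lines start..end (1-based, inclusive), like sed -n 'start,endp'."""
    return lines[start - 1:end]


def replace_section(lines: list[str], start: int, end: int,
                    new_lines: list[str]) -> list[str]:
    """Return a new document with lines start..end swapped for new_lines."""
    return lines[:start - 1] + new_lines + lines[end:]


doc = ["# Setup", "intro", "## Prerequisites", "Python 3",
       "## Installation", "pip install"]
headings = [1, 3, 5]  # 1-based line numbers of the headings in doc

start = 3
end = section_end(headings, start, len(doc))
section = extract_lines(doc, start, end)
# Stand-in for the AI edit: append a warning line to the section.
edited = section + ["Warning: requires version 2.0 or later."]
result = replace_section(doc, start, end, edited)
print(result)
```

Because the replacement only touches the slice between `start` and `end`, every line outside the target section is carried over unchanged.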

Key Takeaways

Efficiency: SPE drastically reduces the time required to perform specific edits by targeting only the necessary sections.
Cost Savings: By minimizing token usage, SPE offers a more economical solution, especially for large documents.
Reliability: Focused editing ensures that changes are precise, reducing the likelihood of unintended modifications elsewhere in the document.
Scalability: SPE's approach is highly scalable, making it suitable for documents of any size without compromising performance.

This real-world example highlights how SPE leverages a human-inspired, section-aware methodology to optimize document editing. By intelligently combining fast, deterministic tools with selective AI processing, SPE not only accelerates the editing workflow but also ensures cost-effectiveness and maintains high-quality outcomes.

Implementation Details

Here's the core SPE code:

class SPE:
    def edit(self, filepath: str, instruction: str) -> str:
        # Phase 1: Fast section mapping
        section_map = self.get_section_map(filepath)
        self.log("MAP", "Found sections:", section_map)

        # Phase 2: AI picks target section
        messages = [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Sections: {section_map}\nTask: {instruction}"}
        ]
        response = self.client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=self.tools()
        )
        
        # Phase 3: Extract target section
        target = get_target_from_response(response)
        section_content = self.extract_section(filepath, target.start, target.end)
        
        # Phase 4: AI modifies just that section
        new_content = self.get_modification(section_content, instruction)
        
        # Phase 5: Put it back
        self.replace_section(filepath, target.start, target.end, new_content)

        # Return the edited section so the declared str return type holds
        return new_content
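
The listing does not show `self.tools()` or `get_target_from_response`. Here is one way they might look, assuming the OpenAI Chat Completions tool-calling format. The tool name `select_section` and its `start`/`end` parameters are assumptions for illustration, and a stand-in object replaces the real API response so the sketch runs offline.

```python
import json
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class Target:
    start: int
    end: int


def tools() -> list[dict]:
    """Tool definition in the Chat Completions `tools` format.
    The name and parameters are hypothetical, not taken from the post."""
    return [{
        "type": "function",
        "function": {
            "name": "select_section",
            "description": "Choose the line range of the section to edit.",
            "parameters": {
                "type": "object",
                "properties": {
                    "start": {"type": "integer", "description": "First line (1-based)"},
                    "end": {"type": "integer", "description": "Last line (inclusive)"},
                },
                "required": ["start", "end"],
            },
        },
    }]


def get_target_from_response(response) -> Target:
    """Read the first tool call; its arguments arrive as a JSON string."""
    tool_calls = response.choices[0].message.tool_calls
    if not tool_calls:
        raise ValueError("model did not call a tool")
    args = json.loads(tool_calls[0].function.arguments)
    return Target(start=int(args["start"]), end=int(args["end"]))


# Stand-in with the same attribute layout as a real response object.
fake_response = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(
    tool_calls=[SimpleNamespace(function=SimpleNamespace(
        name="select_section", arguments='{"start": 50, "end": 99}'))]))])

target = get_target_from_response(fake_response)
print(target)  # Target(start=50, end=99)
```

Having the model return line numbers through a typed tool call, rather than free text, means the extraction step can use the result directly without parsing prose.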

Real-World Performance Example

Let's compare SPE's intelligent approach with the traditional naive method when adding an introduction to a technical document:

Smart Approach (Section-Aware)

Timing Breakdown:
1. Section Mapping: 0.00s
   - Found 18 sections instantly
   - Zero cost—pure text processing

2. Target Selection: 1.34s
   - AI identifies "# Docker App Manager Architecture"
   - Only processes section headers

3. Content Extraction: 0.00s
   - Instant extraction of target section
   - Zero cost—pure text processing

4. Content Modification: 1.85s
   - AI only processes the target section
   - Minimal token usage

Total Time: 3.19 seconds

Naive Approach (Full Document)

Single Operation:
- Processes entire document (270 lines)
- Includes all diagrams and code blocks
- Maximum token usage
- No selective processing

Total Time: 8.65 seconds

Key Observations

Roughly 2.7x Speed Difference: Smart approach completed in 3.19s vs. 8.65s for the naive approach.
Cost Efficiency: Smart approach only processed:
- Section headers for navigation (18 lines)
- Target section for modification (2 lines)
- Versus the entire document (270 lines) in the naive approach.
Resource Usage:
- Smart: Two small API calls (headers + single section)
- Naive: One large API call (entire document)
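
The line counts reported above give a rough sense of how much less text the model sees. The calculation below uses lines as a proxy for tokens, which is an assumption: actual token counts depend on line length and the tokenizer, and the instruction text itself is ignored.

```python
header_lines = 18      # section headers sent for navigation
section_lines = 2      # target section sent for modification
document_lines = 270   # full document sent by the naive approach

sent = header_lines + section_lines
fraction = sent / document_lines
print(f"{sent} of {document_lines} lines sent ({fraction:.1%})")
# 20 of 270 lines sent (7.4%)
```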

This real-world example demonstrates how SPE's section-aware approach delivers:

About a 63% Reduction in Processing Time (3.19s vs. 8.65s)
Significant Cost Savings through targeted processing
Maintained Quality through focused editing

Try It Yourself

You can try SPE directly using our demo application. Upload your markdown document and specify the edits you want to make.

Access the SPE Demo

I suggest testing it on a large document to see the difference. Here is an example: https://raw.githubusercontent.com/kittykatattack/thepoetrybook/refs/heads/master/book.markdown

Use Cases and Applications

SPE can be effectively utilized in various scenarios where precise and efficient document editing is essential:

Technical Documentation: Quickly update specific sections like prerequisites, installation guides, or configuration settings without processing the entire manual.
Academic Papers: Make targeted revisions to abstracts, methodologies, or conclusions without altering the entire document.
Legal Documents: Edit specific clauses or sections in contracts and agreements with minimal risk of introducing errors elsewhere.
Content Management: Update blog posts, articles, or website content by focusing on relevant sections, ensuring consistency and accuracy.
Software Documentation: Modify API references, user guides, or troubleshooting sections efficiently.

By focusing on specific sections, SPE ensures that updates are made swiftly and accurately, making it ideal for environments where time and precision are paramount.

How SPE Is Different from Predicted Outputs

While Predicted Outputs are incredibly powerful for more generic use cases and can reduce latency by up to 30%, they still require reading and passing the entire document to OpenAI for processing. In contrast, SPE leverages function calling to target only the relevant sections that need editing, effectively avoiding the need to process the whole document.
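
To make the contrast concrete, the sketch below builds the request arguments for each approach without calling the API. It assumes the `prediction` parameter of the Chat Completions API for Predicted Outputs. The prompt layout, the function names, and the character-count comparison are illustrative assumptions, not the demo's actual code.

```python
def predicted_output_request(document: str, instruction: str) -> dict:
    """Keyword arguments for a Predicted Outputs call: the whole document
    is sent in the prompt and again as the prediction."""
    return {
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": f"{instruction}\n\n{document}"}],
        "prediction": {"type": "content", "content": document},
    }


def spe_edit_request(section: str, instruction: str) -> dict:
    """Keyword arguments for an SPE-style modification call: one section only."""
    return {
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": f"{instruction}\n\n{section}"}],
    }


def payload_chars(request: dict) -> int:
    """Characters of document text carried by a request (a crude size proxy)."""
    total = sum(len(m["content"]) for m in request["messages"])
    total += len(request.get("prediction", {}).get("content", ""))
    return total


document = "some line of text\n" * 270
section = "some line of text\n" * 2
instruction = "Add a version warning."

full = payload_chars(predicted_output_request(document, instruction))
small = payload_chars(spe_edit_request(section, instruction))
print(full, small)
```

Either dict could be passed to `client.chat.completions.create(**request)`. The point of the comparison is only that the first payload grows with the document while the second grows with the section.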

Limitations

While SPE offers a streamlined approach to document editing, it's important to be aware of its current limitations:

Single Section Editing: Currently, SPE can handle one section at a time. Editing multiple sections simultaneously requires sequential operations.
Markdown-Only Support: The approach is optimized for markdown files. Support for other document formats like Word, PDF, or HTML is not yet implemented.
Demo App Limited to Markdown: The current demo application exclusively supports markdown files. However, the architecture of SPE is designed to be extensible, allowing future support for additional document formats such as Word, PDF, and HTML with further development.
Partial Context Awareness: SPE targets specific sections rather than processing the entire document. As a result, the LLM does not have the full context that it would if it were to read the entire document. While this means the AI might miss broader contextual cues, in many cases, the benefits of reduced cost and increased efficiency outweigh the drawbacks. For most targeted edits, the partial context provided is sufficient to achieve accurate and reliable results.
Dependency on Clear Sectioning: SPE relies on well-defined section headers. If sections are poorly structured or lack clear demarcations, the approach may fail or produce inaccurate results.

Acknowledging these limitations provides a roadmap for future enhancements, ensuring that SPE continues to evolve and expand its capabilities.
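
One concrete case of the sectioning dependency: a plain `grep "^#"` also matches comment lines inside fenced code blocks, which would create false sections. The sketch below shows one possible mitigation, assuming fences are lines that start with three or more backticks or tildes. The helper name `find_headings` is hypothetical.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+\S")
# Three or more backticks or tildes at the start of a line open or close a fence.
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")


def find_headings(lines: list[str]) -> list[tuple[int, str]]:
    """Return (1-based line number, heading) pairs, ignoring fenced code."""
    headings = []
    in_fence = False
    for number, line in enumerate(lines, start=1):
        if FENCE.match(line):
            in_fence = not in_fence  # simplification: any fence line toggles
            continue
        if not in_fence and HEADING.match(line):
            headings.append((number, line.strip()))
    return headings


fence = "`" * 3
doc = [
    "# Guide",
    fence + "bash",
    "# this is a shell comment, not a heading",
    fence,
    "## Usage",
]
print(find_headings(doc))  # [(1, '# Guide'), (5, '## Usage')]
```

This does not handle every Markdown edge case (indented code blocks or nested fences of different lengths, for example), so it narrows the limitation rather than removing it.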

Key Benefits

Speed: Tools handle navigation instantly.
Cost: AI only processes relevant parts.
Reliability: Tools maintain structural integrity without errors.
Scalability: Capable of handling massive documents.
Natural: Mirrors how humans actually edit.

Future Improvements

Multi-Section Edits: Enable editing multiple sections in a single operation.
Enhanced Section Boundary Detection: Improve the accuracy of section identification, especially in complex documents.
More Sophisticated Navigation Tools: Develop advanced tools for navigating and selecting sections based on context and relevance.
Support for Additional Document Formats: Extend compatibility to formats like Word, PDF, and HTML to broaden applicability.
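
For multi-section edits, one detail matters when merging: replacing a section can change the document's length, which shifts the line numbers of everything below it. A common way around this, sketched here as an assumption about how such a feature could work rather than a description of SPE, is to apply the edits from the bottom of the document upward.

```python
def apply_edits(lines: list[str],
                edits: list[tuple[int, int, list[str]]]) -> list[str]:
    """Apply (start, end, new_lines) edits with 1-based inclusive ranges.

    Ranges refer to the original document and must not overlap.
    """
    ordered = sorted(edits, key=lambda e: e[0])
    for (_, prev_end, _), (next_start, _, _) in zip(ordered, ordered[1:]):
        if next_start <= prev_end:
            raise ValueError("edit ranges overlap")
    result = list(lines)
    # Work from the bottom up so earlier line numbers are never shifted.
    for start, end, new_lines in reversed(ordered):
        result[start - 1:end] = new_lines
    return result


doc = ["# A", "a1", "# B", "b1", "# C", "c1"]
edits = [
    (1, 2, ["# A", "a1", "a2"]),        # section A grows by one line
    (5, 6, ["# C", "c1 (edited)"]),     # section C is edited in place
]
print(apply_edits(doc, edits))
```

With this ordering, each edit can keep using the line numbers from the original section map, so the map does not need to be rebuilt between edits.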

Conclusion

By teaching AI to edit like humans do—scanning, navigating, and making targeted changes—I combine the speed and reliability of traditional tools with the understanding and adaptability of AI. SPE represents a shift towards more efficient and intelligent document editing, eliminating the need for AI to process entire documents for minor edits. Instead, it empowers AI with the same efficient strategies I use myself, paving the way for smarter, faster, and more cost-effective editing solutions.