PageIndex: Vectorless Reasoning-Based RAG with Claude
White Paper

Jake McCluskey

Source post: @datasciencebrain Telegram/Instagram, topic "Vector databases are no longer required for RAG"

Scraped claim: "98.7% on FinanceBench", state-of-the-art on SEC filings and earnings reports

Tool: PageIndex by VectifyAI, MIT licensed, open source

Stack used here: PageIndex + Claude (swapping the default OpenAI for Claude via LiteLLM)

The core idea

Traditional RAG embeds chunks, runs a vector similarity search, and stuffs the top-k hits into the prompt. It often fails on long-form documents where context matters (financial reports, legal documents, scientific papers) because similarity isn't the same as relevance: a chunk can share vocabulary with the question without containing the answer.

PageIndex does retrieval the way a human would:

  1. Build a table-of-contents tree from the document (titles, sections, subsections, summaries per node)
  2. Let the LLM walk the tree to find relevant sections, reading summaries and descending into likely branches

No embeddings. No chunking. No vector DB. The LLM does the retrieval by reasoning over structure.
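
The two steps above can be sketched without any model at all. This is a minimal illustration, not PageIndex's implementation: `Node`, `walk`, and `keyword_choose` are hypothetical names, and the keyword-overlap chooser is only a stand-in for the decision the LLM makes in the real system.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    """One table-of-contents entry: a title, a summary, and optional children."""
    title: str
    summary: str
    children: List["Node"] = field(default_factory=list)


def words(text: str) -> set:
    """Lowercase alphabetic tokens, punctuation stripped."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_choose(question: str, options: List[Node]) -> Node:
    """Stand-in for the LLM: pick the option sharing the most words with the question."""
    q = words(question)
    return max(options, key=lambda n: len(q & words(n.title + " " + n.summary)))


def walk(question: str, roots: List[Node],
         choose: Callable[[str, List[Node]], Node] = keyword_choose) -> Tuple[Node, List[str]]:
    """Descend one level at a time until a leaf is reached; return the leaf and the path taken."""
    level, path = roots, []
    while True:
        node = choose(question, level)
        path.append(node.title)
        if not node.children:
            return node, path
        level = node.children


# Made-up toy tree, for illustration only
toy = [
    Node("Business", "operations products supply chain and risks overview", [
        Node("Products", "vehicles and energy storage"),
        Node("Risk Factors", "supply chain regulatory and competitive risks"),
    ]),
    Node("Financials", "revenue margins and cash flow"),
]

leaf, path = walk("What are the supply chain risks?", toy)
print(" -> ".join(path))
```

Swapping `keyword_choose` for a function that asks a model is the whole difference between this toy and reasoning-based retrieval. Unlike this sketch, a real walker can also stop at an intermediate node instead of always descending to a leaf.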

Why this works with Claude specifically

  • Claude's long context (200K tokens) suits tree walking: it can hold large portions of the tree at once.
  • Claude's reasoning on structured JSON is strong. Tree navigation is a natural format.
  • No embedding model lock-in. You only need a text LLM.
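
Whether a given tree fits in that window can be estimated before sending it. The sketch below is a rough check under a loud assumption: it uses a four-characters-per-token rule of thumb, which real tokenizers will not match exactly, and the names `approx_tokens` and `fits_in_context` are hypothetical.

```python
import json

CONTEXT_TOKENS = 200_000  # context size quoted in the text above
CHARS_PER_TOKEN = 4       # rough rule of thumb (assumption); real tokenizers vary


def approx_tokens(obj) -> int:
    """Very rough token estimate for any JSON-serializable object."""
    return len(json.dumps(obj)) // CHARS_PER_TOKEN


def fits_in_context(tree, reserve_tokens: int = 20_000) -> bool:
    """True if the serialized tree plus a reserve for prompt and answer fits the window."""
    return approx_tokens(tree) + reserve_tokens <= CONTEXT_TOKENS


small_tree = [{"title": "A", "summary": "x" * 400}]
huge_tree = [{"title": "B", "summary": "x" * 1_000_000}]
print(fits_in_context(small_tree), fits_in_context(huge_tree))
```

When the whole tree does not fit, the level-by-level walk in step 4 still works, because it only ever sends one level of nodes at a time.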

Complete setup

1. Install

git clone https://github.com/VectifyAI/PageIndex.git
cd PageIndex
pip install -r requirements.txt
pip install anthropic litellm pdfplumber  # Claude SDK, LiteLLM routing, PDF text extraction for step 4

2. Configure Claude

# .env
ANTHROPIC_API_KEY=sk-ant-...

PageIndex supports LiteLLM, which routes to Claude natively.
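
For reference, a direct LiteLLM call to Claude looks roughly like the sketch below. It is independent of how PageIndex wires LiteLLM internally, which the source post does not show, and the helper names `to_litellm_model` and `ask_claude` are hypothetical. LiteLLM picks the provider from the `anthropic/` model prefix and reads `ANTHROPIC_API_KEY` from the environment.

```python
def to_litellm_model(model: str) -> str:
    """Prefix a bare Claude model name with LiteLLM's "anthropic/" provider tag."""
    return model if model.startswith("anthropic/") else f"anthropic/{model}"


def ask_claude(prompt: str, model: str = "claude-opus-4-7") -> str:
    """Send one user message through LiteLLM and return the reply text."""
    import litellm  # imported lazily so the helper above works without the package installed

    response = litellm.completion(
        model=to_litellm_model(model),
        messages=[{"role": "user", "content": prompt}],
        max_tokens=500,
    )
    # LiteLLM returns an OpenAI-style response object regardless of provider
    return response.choices[0].message.content


print(to_litellm_model("claude-opus-4-7"))
```

The default model name is the one used elsewhere in this paper; substitute whichever Claude model your account has access to.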

3. Build the tree from a PDF

python3 run_pageindex.py \
  --pdf_path ./docs/tesla_10k_2024.pdf \
  --model claude-opus-4-7 \
  --max-pages-per-node 10 \
  --if-add-node-summary yes

Output (tesla_10k_2024.json):

{
  "doc_description": "Tesla 2024 10-K annual report...",
  "tree": [
    {
      "title": "Part I, Business",
      "node_id": "0001",
      "start_index": 3,
      "end_index": 28,
      "summary": "Overview of operations, products, manufacturing...",
      "nodes": [
        {
          "title": "Risk Factors",
          "node_id": "0002",
          "start_index": 15,
          "end_index": 24,
          "summary": "Supply chain, regulatory, competitive risks..."
        }
      ]
    },
    {
      "title": "Management Discussion & Analysis",
      "node_id": "0003",
      "start_index": 40,
      "end_index": 65,
      "summary": "Revenue growth YoY, margin compression...",
      "nodes": [...]
    }
  ]
}
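
Before querying, it can help to sanity-check the generated tree. The sketch below assumes the JSON shape shown above (a top-level `tree` list whose nodes carry `title`, `node_id`, `start_index`, `end_index`, and optional `nodes`). The helpers `iter_nodes`, `outline`, and `bad_ranges` are hypothetical, not part of PageIndex, and flagging a child range that falls outside its parent is a choice made for this sketch.

```python
from typing import Iterator, List, Optional, Tuple


def iter_nodes(nodes: List[dict], depth: int = 0) -> Iterator[Tuple[int, dict]]:
    """Yield (depth, node) pairs in document order."""
    for n in nodes:
        yield depth, n
        yield from iter_nodes(n.get("nodes") or [], depth + 1)


def outline(nodes: List[dict]) -> str:
    """Render the tree as an indented table of contents with page ranges."""
    return "\n".join(
        f"{'  ' * depth}[{n['node_id']}] {n['title']} (pp. {n['start_index']}-{n['end_index']})"
        for depth, n in iter_nodes(nodes)
    )


def bad_ranges(nodes: List[dict], parent: Optional[dict] = None) -> List[str]:
    """Return node_ids whose range is inverted or falls outside the parent's range."""
    bad = []
    for n in nodes:
        start, end = n["start_index"], n["end_index"]
        outside = parent is not None and not (
            parent["start_index"] <= start and end <= parent["end_index"]
        )
        if start > end or outside:
            bad.append(n["node_id"])
        bad.extend(bad_ranges(n.get("nodes") or [], n))
    return bad


# Same shape as the sample output above, with the elided children left empty
sample = {"tree": [
    {"title": "Part I, Business", "node_id": "0001", "start_index": 3, "end_index": 28,
     "nodes": [{"title": "Risk Factors", "node_id": "0002",
                "start_index": 15, "end_index": 24}]},
    {"title": "Management Discussion & Analysis", "node_id": "0003",
     "start_index": 40, "end_index": 65, "nodes": []},
]}

print(outline(sample["tree"]))
print("Suspicious ranges:", bad_ranges(sample["tree"]))
```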

4. Query via tree reasoning, using Claude directly

import json
import anthropic
import pdfplumber

client = anthropic.Anthropic()
MODEL = "claude-opus-4-7"

with open("tesla_10k_2024.json") as f:
    tree = json.load(f)

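def get_pages(pdf_path, start, end):
    """Return text for pages start..end, treating both as 1-based and inclusive."""
    with pdfplumber.open(pdf_path) as pdf:
        # pdf.pages is 0-based, so page `start` lives at index start-1
        return "\n\n".join(p.extract_text() or "" for p in pdf.pages[start-1:end])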

def navigate_tree(question: str, tree: list, pdf_path: str, max_hops: int = 3):
    """Let Claude walk the tree, then read the selected section."""
    current = tree
    trail = []

    for hop in range(max_hops):
        # Show Claude the current level's nodes with summaries
        options = [{
            "node_id": n["node_id"],
            "title": n["title"],
            "summary": n.get("summary", ""),
            "has_children": bool(n.get("nodes")),
            "page_range": [n["start_index"], n["end_index"]],
        } for n in current]

        resp = client.messages.create(
            model=MODEL,
            max_tokens=500,
            messages=[{
                "role": "user",
                "content": (
                    f"Question: {question}\n\n"
                    f"Current tree level:\n{json.dumps(options, indent=2)}\n\n"
                    f"Which node_id is most likely to contain the answer? "
                    f"If the node has children and the answer likely lies deeper, respond "
                    f'{{"action":"descend","node_id":"..."}}. If this is the right section, '
                    f'respond {{"action":"read","node_id":"..."}}.'
                ),
            }],
        )

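        try:
            decision = json.loads(resp.content[0].text.strip())
        except json.JSONDecodeError:
            break  # reply was not bare JSON; give up rather than crash
        chosen = next((n for n in current if n["node_id"] == decision.get("node_id")), None)
        if chosen is None:
            break  # Claude named a node that is not on this level
        trail.append(chosen["title"])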

        if decision["action"] == "read" or not chosen.get("nodes"):
            pages = get_pages(pdf_path, chosen["start_index"], chosen["end_index"])
            answer = client.messages.create(
                model=MODEL,
                max_tokens=1024,
                messages=[{
                    "role": "user",
                    "content": (
                        f"Using ONLY this section, answer: {question}\n\n"
                        f"Section '{chosen['title']}' (pages {chosen['start_index']}-{chosen['end_index']}):\n\n{pages}"
                    ),
                }],
            )
            return {
                "answer": answer.content[0].text,
                "path": trail,
                "pages": [chosen["start_index"], chosen["end_index"]],
            }

        current = chosen["nodes"]

    # Include "pages" so callers that print result["pages"] do not hit a KeyError
    return {"answer": "Could not locate section within max hops.", "path": trail, "pages": None}


# Usage
result = navigate_tree(
    "What are Tesla's primary supply chain risks?",
    tree["tree"],
    "./docs/tesla_10k_2024.pdf",
)
print(result["answer"])
print("Navigated:", " → ".join(result["path"]))
print("Cited pages:", result["pages"])
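
The `json.loads` call in `navigate_tree` expects the reply to be bare JSON. Models sometimes wrap JSON in a sentence or a code fence, so a more tolerant parser can be useful. The sketch below is one possible approach, not something PageIndex ships: `parse_decision` and `ALLOWED_ACTIONS` are hypothetical names, and it assumes decisions are flat objects with no nested braces.

```python
import json
import re
from typing import Optional

ALLOWED_ACTIONS = {"descend", "read"}


def parse_decision(text: str, valid_ids: set) -> Optional[dict]:
    """Extract {"action", "node_id"} from a model reply, or return None if unusable."""
    # Try each {...} span that contains no nested braces
    for candidate in re.findall(r"\{[^{}]*\}", text):
        try:
            obj = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        action, node_id = obj.get("action"), obj.get("node_id")
        # Accept only known actions and node_ids that exist on the current level
        if action in ALLOWED_ACTIONS and isinstance(node_id, str) and node_id in valid_ids:
            return {"action": action, "node_id": node_id}
    return None


reply = 'Sure, here it is:\n```json\n{"action": "descend", "node_id": "0001"}\n```'
print(parse_decision(reply, {"0001", "0003"}))
```

Returning `None` instead of raising lets the caller decide whether to retry the request or stop the walk.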

5. Agentic version (multi-question, tool-use style)

You can expose "list children" and "read section" as tools and let Claude plan multi-hop retrieval itself:

TOOLS = [
    {
        "name": "list_children",
        "description": "List child sections of a node in the document tree.",
        "input_schema": {
            "type": "object",
            "properties": {"node_id": {"type": "string"}},
            "required": ["node_id"],
        },
    },
    {
        "name": "read_section",
        "description": "Read the full text of a section by node_id.",
        "input_schema": {
            "type": "object",
            "properties": {"node_id": {"type": "string"}},
            "required": ["node_id"],
        },
    },
]

# Build a flat node lookup for fast dispatch
def flatten(nodes, out=None):
    out = out if out is not None else {}
    for n in nodes:
        out[n["node_id"]] = n
        if n.get("nodes"):
            flatten(n["nodes"], out)
    return out

NODE_BY_ID = flatten(tree["tree"])

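def handle_tool(name, args, pdf_path):
    node = NODE_BY_ID.get(args.get("node_id"))
    if node is None:
        # Return an error string so Claude can recover instead of crashing the loop
        return f"Unknown node_id: {args.get('node_id')!r}"
    if name == "list_children":
        return [{"node_id": c["node_id"], "title": c["title"], "summary": c.get("summary","")}
                for c in node.get("nodes", [])]
    if name == "read_section":
        return get_pages(pdf_path, node["start_index"], node["end_index"])
    return f"Unknown tool: {name!r}"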

def ask(question, pdf_path, max_turns=10):
    messages = [{"role": "user", "content": (
        f"Doc: {tree['doc_description']}\n"
        f"Top-level sections: {[{'id':n['node_id'],'title':n['title']} for n in tree['tree']]}\n\n"
        f"Question: {question}\n"
        f"Use tools to navigate and answer with citations."
    )}]

    for _ in range(max_turns):  # bound the loop so a confused model cannot spin forever
        resp = client.messages.create(
            model=MODEL, max_tokens=2048, tools=TOOLS, messages=messages
        )
        messages.append({"role": "assistant", "content": resp.content})

        # Any stop reason other than "tool_use" (end_turn, max_tokens, ...) leaves no tool calls to answer
        if resp.stop_reason != "tool_use":
            return next((b.text for b in resp.content if b.type == "text"), "")

        tool_results = []
        for block in resp.content:
            if block.type == "tool_use":
                result = handle_tool(block.name, block.input, pdf_path)
                # Section text is already a string; only JSON-encode structured results
                text = result if isinstance(result, str) else json.dumps(result)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": text[:15000],  # cap
                })
        messages.append({"role": "user", "content": tool_results})

    return "Stopped after max_turns without a final answer."

6. Drop-in alternative: PageIndex MCP server

If you're using Claude Desktop, there's an MCP server. Add it to your claude_desktop_config.json (Claude Code registers MCP servers through its own configuration rather than this file):

{
  "mcpServers": {
    "pageindex": {
      "command": "npx",
      "args": ["-y", "pageindex-mcp"],
      "env": {"ANTHROPIC_API_KEY": "sk-ant-..."}
    }
  }
}

Claude can then natively query PDFs via PageIndex with no custom code.
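
If the config file already lists other MCP servers, the new entry should be merged in rather than pasted over them. The sketch below shows one way to do that. The helper names `add_mcp_server` and `update_config_file` are hypothetical, the entry simply mirrors the JSON above, and the location of claude_desktop_config.json depends on your platform, so the path is left to the caller.

```python
import json
from pathlib import Path

# Mirrors the config fragment above; the key is the same placeholder
PAGEINDEX_ENTRY = {
    "command": "npx",
    "args": ["-y", "pageindex-mcp"],
    "env": {"ANTHROPIC_API_KEY": "sk-ant-..."},
}


def add_mcp_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry registered under mcpServers[name]."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, name: str, entry: dict) -> None:
    """Merge the entry into the JSON file at path, creating the file if needed."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_mcp_server(config, name, entry), indent=2))


existing = {"theme": "dark", "mcpServers": {"other": {"command": "x"}}}
merged = add_mcp_server(existing, "pageindex", PAGEINDEX_ENTRY)
print(sorted(merged["mcpServers"]))
```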

Why 98.7% on FinanceBench

FinanceBench questions often hinge on one specific section of an SEC filing (e.g. "What was the restructuring charge in Q3?"). Vector RAG tends to fail on them because:

  • "Restructuring charge" might appear in three unrelated sections (MD&A, footnotes, risk factors)
  • Embeddings can't tell "the actual number" from "a discussion of the concept"

PageIndex wins because the LLM reasons: "I need the MD&A, then operating expenses, then the restructuring line item," the same path a human analyst takes.
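
The ambiguity described above is easy to reproduce in miniature. The sketch below uses bag-of-words cosine similarity as a crude stand-in for embedding similarity, which is an assumption of this example: real embedding models behave differently in detail. The three section texts are made up for illustration.

```python
import math
import re
from collections import Counter


def bow(text: str) -> Counter:
    """Bag-of-words vector: lowercase alphanumeric token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Made-up snippets: only the first holds the kind of fact the question asks for
SECTIONS = {
    "MD&A": "Operating expenses included a restructuring charge recorded in Q3.",
    "Footnotes": "Accounting policy: a restructuring charge is recognized when the plan is approved.",
    "Risk Factors": "We may incur a restructuring charge in future periods.",
}

query = "What was the restructuring charge in Q3?"
scores = {name: cosine(bow(query), bow(text)) for name, text in SECTIONS.items()}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

Every section that mentions the term gets a nonzero score, so the retriever has to separate them on score alone. A tree walker can instead use each section's role in the filing to decide where a reported figure would live.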

When to use this vs. traditional RAG

Use PageIndex when | Use vector RAG when
Documents have clear structure (reports, textbooks, contracts) | Flat content (forum posts, tickets, chats)
Answers depend on which section | Answers depend on matching language
Traceability and citations matter | Speed matters more than explainability
You have Claude or GPT-4 class models | You need cheap retrieval at scale

Resume angle

"Built a vectorless RAG system using PageIndex: documents compiled into semantic tree indexes, Claude reasons over the tree via tool use to retrieve precise sections with page-level citations. Achieved FinanceBench-style accuracy on 10-K Q&A without embeddings or chunking."

Common questions

What is PageIndex and how does it differ from traditional vector-based RAG?

PageIndex is an open-source RAG system that builds a table-of-contents tree from documents and lets the LLM walk that tree to find relevant sections, the same way a human analyst would navigate a report. Unlike traditional RAG which embeds chunks and uses vector similarity search, PageIndex uses no embeddings, no chunking, and no vector database. The LLM does retrieval by reasoning over the document structure itself.

Why does PageIndex work well with Claude specifically?

Claude's 200K token context window is ideal for holding large portions of the document tree during navigation. Claude also has strong reasoning capabilities on structured JSON, which makes tree navigation a natural fit. Additionally, using PageIndex with Claude eliminates embedding model lock-in since you only need a text LLM.

What accuracy did PageIndex achieve on FinanceBench and why does it outperform vector RAG on financial documents?

PageIndex reportedly achieved 98.7% accuracy on FinanceBench, a benchmark for question answering over SEC filings and earnings reports. The argument for why it outperforms vector RAG is that embeddings cannot distinguish a discussion of a concept from the actual data, and similarity search struggles when the same term appears in multiple unrelated sections. PageIndex succeeds because the LLM reasons through the document structure the way a human analyst would, navigating directly to the correct section, such as MD&A or operating expenses.

When should I use PageIndex instead of traditional vector RAG?

Use PageIndex when documents have clear structure like reports, textbooks, or contracts, when answers depend on which specific section contains the information, when traceability and citations matter, and when you have access to Claude or GPT-4 class models. Use traditional vector RAG for flat content like forum posts or tickets, when answers depend on matching language rather than document location, when speed matters more than explainability, or when you need cheap retrieval at scale.

How does PageIndex navigation work in practice with Claude?

PageIndex presents Claude with the current tree level showing node titles, summaries, and metadata. Claude decides whether to descend deeper into child nodes or read the current section based on the question. Once Claude identifies the target section through this reasoning process, it retrieves the actual page content from that section and answers the question using only that grounded text, producing page-level citations.
