White Paper

PageIndex: Vectorless Reasoning-Based RAG with Claude

Jake McCluskey

Source post: @datasciencebrain Telegram/Instagram, topic "Vector databases are no longer required for RAG"

Scraped claim: "98.7% on FinanceBench", state-of-the-art on SEC filings and earnings reports

Tool: PageIndex by VectifyAI, MIT licensed, open source

Stack used here: PageIndex + Claude (swapping the default OpenAI for Claude via LiteLLM)

The core idea

Traditional RAG = embed chunks, run vector similarity search, stuff top-k into the prompt. It fails on long-form documents where context matters (financial reports, legal docs, scientific papers) because similarity isn't the same as relevance.

PageIndex does retrieval the way a human would:

  1. Build a table-of-contents tree from the document (titles, sections, subsections, summaries per node)
  2. Let the LLM walk the tree to find relevant sections, reading summaries and descending into likely branches

No embeddings. No chunking. No vector DB. The LLM does the retrieval by reasoning over structure.
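
Before wiring in an API, the loop is worth seeing in miniature. The sketch below walks a PageIndex-style tree with a keyword-overlap scorer standing in for the LLM's choice; the node fields mirror the JSON PageIndex emits, but the code itself is illustrative, not PageIndex's own:

```python
# Sketch of reasoning-based retrieval over a PageIndex-style tree.
# pick_node stands in for the LLM: it scores each node by keyword
# overlap between the question and the node's title + summary.
def pick_node(question, nodes):
    words = question.lower().replace("?", "").split()
    def score(n):
        text = (n["title"] + " " + n.get("summary", "")).lower()
        return sum(w in text for w in words)
    return max(nodes, key=score)

def walk(question, nodes, path=()):
    chosen = pick_node(question, nodes)
    path = path + (chosen["title"],)
    if chosen.get("nodes"):            # descend into children
        return walk(question, chosen["nodes"], path)
    return path, (chosen["start_index"], chosen["end_index"])

toc = [
    {"title": "Business", "summary": "operations, products, supply chain risks",
     "start_index": 3, "end_index": 28,
     "nodes": [{"title": "Risk Factors", "summary": "supply chain risks",
                "start_index": 15, "end_index": 24}]},
    {"title": "MD&A", "summary": "revenue and margins",
     "start_index": 40, "end_index": 65},
]

print(walk("What are the supply chain risks?", toc))
# (('Business', 'Risk Factors'), (15, 24))
```

The real system replaces `pick_node` with a model call, but the control flow is exactly this: score the current level, descend, stop at a leaf, return the page range.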

Why this works with Claude specifically

  • Claude's 200K-token context window is a natural fit for tree walking: it can hold large portions of the tree at once.
  • Claude's reasoning on structured JSON is strong. Tree navigation is a natural format.
  • No embedding model lock-in. You only need a text LLM.

Complete setup

1. Install

git clone https://github.com/VectifyAI/PageIndex.git
cd PageIndex
pip install -r requirements.txt
pip install anthropic litellm  # for Claude

2. Configure Claude

# .env
ANTHROPIC_API_KEY=sk-ant-...

PageIndex supports LiteLLM, which exposes an OpenAI-style interface and routes requests to Claude (or other providers) based on the model name.

3. Build the tree from a PDF

python3 run_pageindex.py \
  --pdf_path ./docs/tesla_10k_2024.pdf \
  --model claude-opus-4-7 \
  --max-page-num-each-node 10 \
  --if-add-node-summary yes

Output (tesla_10k_2024.json):

{
  "doc_description": "Tesla 2024 10-K annual report...",
  "tree": [
    {
      "title": "Part I — Business",
      "node_id": "0001",
      "start_index": 3,
      "end_index": 28,
      "summary": "Overview of operations, products, manufacturing...",
      "nodes": [
        {
          "title": "Risk Factors",
          "node_id": "0002",
          "start_index": 15,
          "end_index": 24,
          "summary": "Supply chain, regulatory, competitive risks..."
        }
      ]
    },
    {
      "title": "Management Discussion & Analysis",
      "node_id": "0003",
      "start_index": 40,
      "end_index": 65,
      "summary": "Revenue growth YoY, margin compression...",
      "nodes": [...]
    }
  ]
}
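
A quick sanity check on the generated index is to flatten it back into an indented table of contents. A minimal helper (mine, not part of PageIndex) over the field names shown above:

```python
def toc_lines(nodes, depth=0):
    """Render a PageIndex tree as indented table-of-contents lines."""
    lines = []
    for n in nodes:
        lines.append("  " * depth +
                     f"{n['node_id']} {n['title']} (pp. {n['start_index']}-{n['end_index']})")
        lines.extend(toc_lines(n.get("nodes", []), depth + 1))
    return lines

tree = {"tree": [
    {"title": "Part I", "node_id": "0001", "start_index": 3, "end_index": 28,
     "nodes": [{"title": "Risk Factors", "node_id": "0002",
                "start_index": 15, "end_index": 24}]},
]}

print("\n".join(toc_lines(tree["tree"])))
# 0001 Part I (pp. 3-28)
#   0002 Risk Factors (pp. 15-24)
```

If a section's page range looks wrong here, fix the index before querying; every downstream answer cites these ranges.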

4. Query via tree reasoning, using Claude directly

import json
import anthropic
import pdfplumber

client = anthropic.Anthropic()
MODEL = "claude-opus-4-7"

with open("tesla_10k_2024.json") as f:
    tree = json.load(f)

def get_pages(pdf_path, start, end):
    with pdfplumber.open(pdf_path) as pdf:
        return "\n\n".join(p.extract_text() or "" for p in pdf.pages[start-1:end])

def navigate_tree(question: str, tree: list, pdf_path: str, max_hops: int = 3):
    """Let Claude walk the tree, then read the selected section."""
    current = tree
    trail = []

    for hop in range(max_hops):
        # Show Claude the current level's nodes with summaries
        options = [{
            "node_id": n["node_id"],
            "title": n["title"],
            "summary": n.get("summary", ""),
            "has_children": bool(n.get("nodes")),
            "page_range": [n["start_index"], n["end_index"]],
        } for n in current]

        resp = client.messages.create(
            model=MODEL,
            max_tokens=500,
            messages=[{
                "role": "user",
                "content": (
                    f"Question: {question}\n\n"
                    f"Current tree level:\n{json.dumps(options, indent=2)}\n\n"
                    f"Which node_id is most likely to contain the answer? "
                    f"If the node has children and the answer likely lies deeper, respond "
                    f'{{"action":"descend","node_id":"..."}}. If this is the right section, '
                    f'respond {{"action":"read","node_id":"..."}}.'
                ),
            }],
        )

        # Assumes the reply is bare JSON; strip prose/fences first if needed
        decision = json.loads(resp.content[0].text.strip())
        chosen = next(n for n in current if n["node_id"] == decision["node_id"])
        trail.append(chosen["title"])

        if decision["action"] == "read" or not chosen.get("nodes"):
            pages = get_pages(pdf_path, chosen["start_index"], chosen["end_index"])
            answer = client.messages.create(
                model=MODEL,
                max_tokens=1024,
                messages=[{
                    "role": "user",
                    "content": (
                        f"Using ONLY this section, answer: {question}\n\n"
                        f"Section '{chosen['title']}' (pages {chosen['start_index']}-{chosen['end_index']}):\n\n{pages}"
                    ),
                }],
            )
            return {
                "answer": answer.content[0].text,
                "path": trail,
                "pages": [chosen["start_index"], chosen["end_index"]],
            }

        current = chosen["nodes"]

    return {"answer": "Could not locate section within max hops.", "path": trail}


# Usage
result = navigate_tree(
    "What are Tesla's primary supply chain risks?",
    tree["tree"],
    "./docs/tesla_10k_2024.pdf",
)
print(result["answer"])
print("Navigated:", " → ".join(result["path"]))
print("Cited pages:", result["pages"])

5. Agentic version (multi-question, tool-use style)

Expose "tree walk" and "read section" as tools, and Claude can plan multi-hop retrieval itself:

TOOLS = [
    {
        "name": "list_children",
        "description": "List child sections of a node in the document tree.",
        "input_schema": {
            "type": "object",
            "properties": {"node_id": {"type": "string"}},
            "required": ["node_id"],
        },
    },
    {
        "name": "read_section",
        "description": "Read the full text of a section by node_id.",
        "input_schema": {
            "type": "object",
            "properties": {"node_id": {"type": "string"}},
            "required": ["node_id"],
        },
    },
]

# Build a flat node lookup for fast dispatch
def flatten(nodes, out=None):
    out = out if out is not None else {}
    for n in nodes:
        out[n["node_id"]] = n
        if n.get("nodes"):
            flatten(n["nodes"], out)
    return out

NODE_BY_ID = flatten(tree["tree"])

def handle_tool(name, args, pdf_path):
    node = NODE_BY_ID[args["node_id"]]
    if name == "list_children":
        return [{"node_id": c["node_id"], "title": c["title"], "summary": c.get("summary","")}
                for c in node.get("nodes", [])]
    if name == "read_section":
        return get_pages(pdf_path, node["start_index"], node["end_index"])

def ask(question, pdf_path):
    messages = [{"role": "user", "content": (
        f"Doc: {tree['doc_description']}\n"
        f"Top-level sections: {[{'id':n['node_id'],'title':n['title']} for n in tree['tree']]}\n\n"
        f"Question: {question}\n"
        f"Use tools to navigate and answer with citations."
    )}]

    while True:
        resp = client.messages.create(
            model=MODEL, max_tokens=2048, tools=TOOLS, messages=messages
        )
        messages.append({"role": "assistant", "content": resp.content})

        if resp.stop_reason == "end_turn":
            return next(b.text for b in resp.content if b.type == "text")

        tool_results = []
        for block in resp.content:
            if block.type == "tool_use":
                result = handle_tool(block.name, block.input, pdf_path)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result)[:15000],  # cap
                })
        messages.append({"role": "user", "content": tool_results})
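
The tool dispatch is pure Python, so it can be exercised without an API call. Here `get_pages` is stubbed and the lookup table is built by hand from a toy tree (illustrative names only):

```python
# Stubbed, self-contained check of the tool dispatch logic.
NODE_BY_ID = {
    "0001": {"node_id": "0001", "title": "Part I", "start_index": 3, "end_index": 28,
             "nodes": [{"node_id": "0002", "title": "Risk Factors",
                        "summary": "supply chain, regulatory risks"}]},
    "0002": {"node_id": "0002", "title": "Risk Factors",
             "start_index": 15, "end_index": 24},
}

def get_pages(pdf_path, start, end):   # stub: no PDF needed for the check
    return f"<pages {start}-{end} of {pdf_path}>"

def handle_tool(name, args, pdf_path):
    node = NODE_BY_ID[args["node_id"]]
    if name == "list_children":
        return [{"node_id": c["node_id"], "title": c["title"],
                 "summary": c.get("summary", "")} for c in node.get("nodes", [])]
    if name == "read_section":
        return get_pages(pdf_path, node["start_index"], node["end_index"])

print(handle_tool("list_children", {"node_id": "0001"}, "doc.pdf"))
print(handle_tool("read_section", {"node_id": "0002"}, "doc.pdf"))
# <pages 15-24 of doc.pdf>
```

Testing the handlers in isolation like this catches schema mismatches (wrong field names, missing `node_id`s) before they surface as confusing tool errors mid-conversation.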

6. Drop-in alternative: PageIndex MCP server

If you're using Claude Desktop or Claude Code, there's an MCP server. Add it to your claude_desktop_config.json:

{
  "mcpServers": {
    "pageindex": {
      "command": "npx",
      "args": ["-y", "pageindex-mcp"],
      "env": {"ANTHROPIC_API_KEY": "sk-ant-..."}
    }
  }
}

Claude can then natively query PDFs via PageIndex with no custom code.

Why 98.7% on FinanceBench

FinanceBench has questions whose answers depend on reading a specific SEC filing section (e.g. "What was the restructuring charge in Q3?"). Vector RAG fails because:

  • "Restructuring charge" might appear in three unrelated sections (MD&A, footnotes, risk factors)
  • Embeddings can't tell "the actual number" from "a discussion of the concept"

PageIndex wins because the LLM reasons its way to the answer: "I need the MD&A, then operating expenses, then the restructuring line item." That is the same path a human analyst takes.

When to use this vs. traditional RAG

| Use PageIndex when | Use vector RAG when |
| --- | --- |
| Documents have clear structure (reports, textbooks, contracts) | Flat content (forum posts, tickets, chats) |
| Answers depend on which section | Answers depend on matching language |
| Traceability and citations matter | Speed matters more than explainability |
| You have Claude or GPT-4-class models | You need cheap retrieval at scale |

Resume angle

"Built a vectorless RAG system using PageIndex: documents compiled into semantic tree indexes, Claude reasons over the tree via tool use to retrieve precise sections with page-level citations. Achieved FinanceBench-style accuracy on 10-K Q&A without embeddings or chunking."