PageIndex: Vectorless Reasoning-Based RAG with Claude

- Source post: @datasciencebrain (Telegram/Instagram), topic "Vector databases are no longer required for RAG"
- Scraped claim: "98.7% on FinanceBench", state-of-the-art on SEC filings and earnings reports
- Tool: PageIndex by VectifyAI, MIT-licensed, open source
- Stack used here: PageIndex + Claude (swapping the default OpenAI models for Claude via LiteLLM)
The core idea
Traditional RAG = embed chunks, run vector similarity search, stuff the top-k hits into the prompt. It breaks down on long-form documents where context matters (financial reports, legal docs, scientific papers), because similarity is not the same thing as relevance.
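For contrast, here is the pipeline being replaced, reduced to a few lines of pseudocode (every helper name here is hypothetical):

# Classic vector RAG: match by similarity, not by reasoning (all helpers hypothetical)
chunks = split(document, size=512)                 # fixed-size chunking
store = [(embed(c), c) for c in chunks]            # embeddings into a vector DB
hits = top_k(store, embed(question), k=5)          # nearest-neighbor search
answer = llm(f"{question}\n\nContext:\n" + "\n".join(hits))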
PageIndex does retrieval the way a human would:
- Build a table-of-contents tree from the document (titles, sections, subsections, summaries per node)
- Let the LLM walk the tree to find relevant sections, reading summaries and descending into likely branches
No embeddings. No chunking. No vector DB. The LLM does the retrieval by reasoning over structure.
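The retrieval loop itself is tiny. Step 4 below implements it for real against the Anthropic API, but the essence is roughly this (pick_node, ask_llm, and read_pages are hypothetical helpers):

# Hypothetical sketch of reasoning-based retrieval over the tree
node = pick_node(llm, question, tree)              # LLM picks a top-level section by summary
while node.get("nodes"):                           # descend while the answer likely sits deeper
    node = pick_node(llm, question, node["nodes"])
answer = ask_llm(question, read_pages(node["start_index"], node["end_index"]))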
Why this works with Claude specifically
- Claude's long context (200K tokens) is a perfect fit for tree walking: it can hold large portions of the tree at once.
- Claude's reasoning on structured JSON is strong. Tree navigation is a natural format.
- No embedding model lock-in. You only need a text LLM.
Complete setup
1. Install
git clone https://github.com/VectifyAI/PageIndex.git
cd PageIndex
pip install -r requirements.txt
pip install anthropic litellm # for Claude
2. Configure Claude
# .env
ANTHROPIC_API_KEY=sk-ant-...
PageIndex supports LiteLLM, which routes to Claude natively.
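Before building an index, a one-off sanity check that LiteLLM actually reaches Claude with your key can save debugging later. A minimal sketch (the model ID is an example; substitute whichever Claude model your key can access):

from litellm import completion

# LiteLLM returns OpenAI-format responses regardless of provider
resp = completion(
    model="anthropic/claude-opus-4-20250514",  # example model ID
    messages=[{"role": "user", "content": "Reply with the single word: pong"}],
)
print(resp.choices[0].message.content)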
3. Build the tree from a PDF
# --model takes any Claude model ID your key can access
python3 run_pageindex.py \
    --pdf_path ./docs/tesla_10k_2024.pdf \
    --model claude-opus-4-20250514 \
    --max-pages-per-node 10 \
    --if-add-node-summary yes
Output (tesla_10k_2024.json):
{
  "doc_description": "Tesla 2024 10-K annual report...",
  "tree": [
    {
      "title": "Part I — Business",
      "node_id": "0001",
      "start_index": 3,
      "end_index": 28,
      "summary": "Overview of operations, products, manufacturing...",
      "nodes": [
        {
          "title": "Risk Factors",
          "node_id": "0002",
          "start_index": 15,
          "end_index": 24,
          "summary": "Supply chain, regulatory, competitive risks..."
        }
      ]
    },
    {
      "title": "Management Discussion & Analysis",
      "node_id": "0003",
      "start_index": 40,
      "end_index": 65,
      "summary": "Revenue growth YoY, margin compression...",
      "nodes": [...]
    }
  ]
}
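Before wiring up retrieval, it helps to eyeball the generated index. A small helper for that (my addition, using the field names from the JSON above):

import json

def print_outline(nodes, depth=0):
    """Print the tree as an indented outline with node IDs and page ranges."""
    for n in nodes:
        print("  " * depth + f"{n['node_id']}  {n['title']}  (pp. {n['start_index']}-{n['end_index']})")
        print_outline(n.get("nodes", []), depth + 1)

with open("tesla_10k_2024.json") as f:
    print_outline(json.load(f)["tree"])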
4. Query via tree reasoning, using Claude directly
import json

import anthropic
import pdfplumber

client = anthropic.Anthropic()
MODEL = "claude-opus-4-20250514"  # any Claude model ID works here

with open("tesla_10k_2024.json") as f:
    tree = json.load(f)

def get_pages(pdf_path, start, end):
    """Extract text for a 1-indexed, inclusive page range."""
    with pdfplumber.open(pdf_path) as pdf:
        return "\n\n".join(p.extract_text() or "" for p in pdf.pages[start - 1:end])

def navigate_tree(question: str, tree: list, pdf_path: str, max_hops: int = 3):
    """Let Claude walk the tree, then read the selected section."""
    current = tree
    trail = []
    for hop in range(max_hops):
        # Show Claude the current level's nodes with summaries
        options = [{
            "node_id": n["node_id"],
            "title": n["title"],
            "summary": n.get("summary", ""),
            "has_children": bool(n.get("nodes")),
            "page_range": [n["start_index"], n["end_index"]],
        } for n in current]
        resp = client.messages.create(
            model=MODEL,
            max_tokens=500,
            messages=[{
                "role": "user",
                "content": (
                    f"Question: {question}\n\n"
                    f"Current tree level:\n{json.dumps(options, indent=2)}\n\n"
                    f"Which node_id is most likely to contain the answer? "
                    f"If the node has children and the answer likely lies deeper, respond "
                    f'{{"action":"descend","node_id":"..."}}. If this is the right section, '
                    f'respond {{"action":"read","node_id":"..."}}. Respond with the JSON object only.'
                ),
            }],
        )
        # Assumes Claude returns bare JSON; see the tolerant parser below for a hardened version
        decision = json.loads(resp.content[0].text.strip())
        chosen = next(n for n in current if n["node_id"] == decision["node_id"])
        trail.append(chosen["title"])
        if decision["action"] == "read" or not chosen.get("nodes"):
            pages = get_pages(pdf_path, chosen["start_index"], chosen["end_index"])
            answer = client.messages.create(
                model=MODEL,
                max_tokens=1024,
                messages=[{
                    "role": "user",
                    "content": (
                        f"Using ONLY this section, answer: {question}\n\n"
                        f"Section '{chosen['title']}' (pages {chosen['start_index']}-{chosen['end_index']}):\n\n{pages}"
                    ),
                }],
            )
            return {
                "answer": answer.content[0].text,
                "path": trail,
                "pages": [chosen["start_index"], chosen["end_index"]],
            }
        current = chosen["nodes"]
    return {"answer": "Could not locate section within max hops.", "path": trail}

# Usage
result = navigate_tree(
    "What are Tesla's primary supply chain risks?",
    tree["tree"],
    "./docs/tesla_10k_2024.pdf",
)
print(result["answer"])
print("Navigated:", " → ".join(result["path"]))
print("Cited pages:", result["pages"])
5. Agentic version (multi-question, tool-use style)
Expose "list children" and "read section" as tools, and Claude can plan multi-hop retrieval itself:
TOOLS = [
    {
        "name": "list_children",
        "description": "List child sections of a node in the document tree.",
        "input_schema": {
            "type": "object",
            "properties": {"node_id": {"type": "string"}},
            "required": ["node_id"],
        },
    },
    {
        "name": "read_section",
        "description": "Read the full text of a section by node_id.",
        "input_schema": {
            "type": "object",
            "properties": {"node_id": {"type": "string"}},
            "required": ["node_id"],
        },
    },
]

# Build a flat node lookup for fast dispatch
def flatten(nodes, out=None):
    out = out if out is not None else {}
    for n in nodes:
        out[n["node_id"]] = n
        if n.get("nodes"):
            flatten(n["nodes"], out)
    return out

NODE_BY_ID = flatten(tree["tree"])

def handle_tool(name, args, pdf_path):
    node = NODE_BY_ID[args["node_id"]]
    if name == "list_children":
        return [{"node_id": c["node_id"], "title": c["title"], "summary": c.get("summary", "")}
                for c in node.get("nodes", [])]
    if name == "read_section":
        return get_pages(pdf_path, node["start_index"], node["end_index"])

def ask(question, pdf_path):
    messages = [{"role": "user", "content": (
        f"Doc: {tree['doc_description']}\n"
        f"Top-level sections: {[{'id': n['node_id'], 'title': n['title']} for n in tree['tree']]}\n\n"
        f"Question: {question}\n"
        f"Use tools to navigate and answer with citations."
    )}]
    while True:
        resp = client.messages.create(
            model=MODEL, max_tokens=2048, tools=TOOLS, messages=messages
        )
        messages.append({"role": "assistant", "content": resp.content})
        if resp.stop_reason == "end_turn":
            return next(b.text for b in resp.content if b.type == "text")
        tool_results = []
        for block in resp.content:
            if block.type == "tool_use":
                result = handle_tool(block.name, block.input, pdf_path)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result)[:15000],  # cap payload size
                })
        messages.append({"role": "user", "content": tool_results})
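Usage mirrors step 4:

# Usage
print(ask("What are Tesla's primary supply chain risks?", "./docs/tesla_10k_2024.pdf"))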
6. Drop-in alternative: PageIndex MCP server
If you're using Claude Desktop or Claude Code, there's an MCP server. Add it to your claude_desktop_config.json:
{
  "mcpServers": {
    "pageindex": {
      "command": "npx",
      "args": ["-y", "pageindex-mcp"],
      "env": { "ANTHROPIC_API_KEY": "sk-ant-..." }
    }
  }
}
Claude can then natively query PDFs via PageIndex with no custom code.
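If you're registering it with Claude Code instead of Claude Desktop, the equivalent is a single CLI command (assuming the same package name as above):

claude mcp add pageindex -e ANTHROPIC_API_KEY=sk-ant-... -- npx -y pageindex-mcp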
Why 98.7% on FinanceBench
FinanceBench has questions whose answers depend on reading a specific SEC filing section (e.g. "What was the restructuring charge in Q3?"). Vector RAG fails because:
- "Restructuring charge" might appear in three unrelated sections (MD&A, footnotes, risk factors)
- Embeddings can't tell "the actual number" from "a discussion of the concept"
PageIndex wins because the LLM reasons: "I need the MD&A, then operating expenses, then the restructuring line item", the same path a human analyst would take.
When to use this vs. traditional RAG
| Use PageIndex when | Use vector RAG when |
|---|---|
| Documents have clear structure (reports, textbooks, contracts) | Flat content (forum posts, tickets, chats) |
| Answers depend on which section | Answers depend on matching language |
| Traceability and citations matter | Speed matters more than explainability |
| You have Claude or GPT-4 class models | You need cheap retrieval at scale |
Resume angle
"Built a vectorless RAG system using PageIndex: documents compiled into semantic tree indexes, Claude reasons over the tree via tool use to retrieve precise sections with page-level citations. Achieved FinanceBench-style accuracy on 10-K Q&A without embeddings or chunking."