Claude Tool Use Fundamentals: The Foundation of Every Agent
White Paper

Jake McCluskey

Topic: The foundation behind every AI agent: teaching Claude to call your code.

Stack: anthropic Python SDK plus Claude (Opus 4.7 / Sonnet 4.6)

Why this is THE foundational skill

Every "agent," every "RAG system," every "AI assistant" boils down to one loop: the LLM decides when to call your functions, you run them, and you feed results back. Once you've got tool use down, every framework out there (LangGraph, CrewAI, LangChain) is just an implementation detail on top of it.

The loop: most agents look like this

1. User prompt + tool schemas → Claude
2. Claude returns either:
   a. text (done)
   b. tool_use blocks (wants you to run functions)
3. You execute the tools, package results
4. Send [user msg, assistant msg with tool_use, user msg with tool_result blocks] back to Claude
5. Repeat until Claude returns text only (stop_reason == "end_turn")
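
To make step 4 concrete, here is a minimal sketch of the message history after one tool round-trip, written as plain dictionaries. The tool ID and values are invented for illustration, and the roles helper is a hypothetical convenience, not part of the SDK (which returns content blocks as objects rather than dicts).

```python
# Hypothetical history after one tool round-trip (ID and values are invented).
history = [
    {"role": "user", "content": "What's the weather in Austin?"},
    {
        "role": "assistant",
        "content": [
            {"type": "tool_use", "id": "toolu_example_1", "name": "get_weather",
             "input": {"city": "Austin, TX"}},
        ],
    },
    {
        "role": "user",  # tool results travel back in a user-role message
        "content": [
            {"type": "tool_result", "tool_use_id": "toolu_example_1",
             "content": '{"temperature": 68, "units": "fahrenheit"}'},
        ],
    },
]

def roles(messages):
    """Return the sequence of roles, handy for eyeballing turn order."""
    return [m["role"] for m in messages]

print(roles(history))
```

The point to notice is that the tool_use_id in the last message repeats the id from the assistant's tool_use block.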

Minimal working example: a weather + calculator agent

1. Setup

pip install anthropic requests
export ANTHROPIC_API_KEY=sk-ant-...

2. Define your tools (JSON Schema)

import anthropic, json, requests

client = anthropic.Anthropic()
MODEL = "claude-opus-4-7"

TOOLS = [
    {
        "name": "get_weather",
        "description": "Get current temperature and conditions for a city.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Austin, TX'"},
                "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "fahrenheit"}
            },
            "required": ["city"],
        },
    },
    {
        "name": "calculate",
        "description": "Evaluate a math expression. Use for any arithmetic or unit conversion.",
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {"type": "string", "description": "e.g. '(72-32) * 5/9'"},
            },
            "required": ["expression"],
        },
    },
]
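
Schema mistakes tend to surface only as confusing model behavior, so a quick sanity check before the first API call can help. The sketch below is a hypothetical helper (check_tool_defs is not part of the SDK) that looks for a few common slips: missing names or descriptions, duplicate names, and required fields that were never declared under properties.

```python
def check_tool_defs(tools):
    """Return a list of human-readable problems found in tool definitions."""
    problems = []
    seen = set()
    for i, tool in enumerate(tools):
        name = tool.get("name")
        label = name or f"tool #{i}"
        if not name:
            problems.append(f"{label}: missing name")
        elif name in seen:
            problems.append(f"{label}: duplicate name")
        seen.add(name)
        if not tool.get("description"):
            problems.append(f"{label}: missing description")
        schema = tool.get("input_schema") or {}
        if schema.get("type") != "object":
            problems.append(f"{label}: input_schema type should be 'object'")
        props = schema.get("properties", {})
        for field in schema.get("required", []):
            if field not in props:
                problems.append(f"{label}: required field '{field}' not in properties")
    return problems

# Example definitions, made up for this sketch; the second one is deliberately broken.
example_tools = [
    {"name": "get_weather", "description": "Get current weather for a city.",
     "input_schema": {"type": "object",
                      "properties": {"city": {"type": "string"}},
                      "required": ["city"]}},
    {"name": "broken", "description": "",
     "input_schema": {"type": "object", "properties": {}, "required": ["query"]}},
]
for problem in check_tool_defs(example_tools):
    print(problem)
```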

3. Implement the tool dispatcher

def get_weather(city: str, units: str = "fahrenheit") -> dict:
    # Free Open-Meteo API, no key required
    geo = requests.get(
        "https://geocoding-api.open-meteo.com/v1/search",
        params={"name": city, "count": 1},
        timeout=10,  # don't let a stalled request hang the agent
    ).json()
    if not geo.get("results"):
        return {"error": f"city '{city}' not found"}
    loc = geo["results"][0]
    # Anything other than "fahrenheit" falls back to Celsius
    unit_param = "fahrenheit" if units == "fahrenheit" else "celsius"
    wx = requests.get(
        "https://api.open-meteo.com/v1/forecast",
        params={
            "latitude": loc["latitude"],
            "longitude": loc["longitude"],
            "current": "temperature_2m,weather_code",
            "temperature_unit": unit_param,
        },
        timeout=10,
    ).json()["current"]
    return {
        "city": loc["name"],
        "temperature": wx["temperature_2m"],
        "weather_code": wx["weather_code"],  # WMO code describing current conditions
        "units": unit_param,
    }

def calculate(expression: str) -> dict:
    try:
        # SAFETY: in real code use sympy or a proper parser, not eval
        return {"result": eval(expression, {"__builtins__": {}}, {})}
    except Exception as e:
        return {"error": str(e)}

DISPATCH = {"get_weather": get_weather, "calculate": calculate}
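
The safety comment in calculate deserves a concrete alternative, since emptying __builtins__ does not make eval safe. One option, sketched here with the standard library's ast module rather than sympy, is to parse the expression and evaluate only a whitelist of arithmetic nodes. The name safe_calculate is hypothetical, and exponentiation is left out on purpose because very large powers can stall the process.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def _eval_node(node):
    # Plain numbers only (bool is excluded because type() must match exactly)
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")

def safe_calculate(expression: str) -> dict:
    """Evaluate basic arithmetic without eval; return a structured error on failure."""
    try:
        tree = ast.parse(expression, mode="eval")
        return {"result": _eval_node(tree.body)}
    except Exception as e:
        return {"error": f"{type(e).__name__}: {e}"}

print(safe_calculate("(68-32) * 5/9"))      # → {'result': 20.0}
print(safe_calculate("__import__('os')"))   # rejected: function calls are not whitelisted
```

Swapping this in would only require pointing the "calculate" entry in DISPATCH at safe_calculate.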

4. The agent loop

def run_agent(user_prompt: str, max_turns: int = 10):
    messages = [{"role": "user", "content": user_prompt}]

    for turn in range(max_turns):
        resp = client.messages.create(
            model=MODEL,
            max_tokens=2048,
            tools=TOOLS,
            messages=messages,
        )

        # Append Claude's response to history (tool_use blocks included)
        messages.append({"role": "assistant", "content": resp.content})

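        if resp.stop_reason == "tool_use":
            # Execute every tool_use block and gather the results
            tool_results = []
            for block in resp.content:
                if block.type == "tool_use":
                    fn = DISPATCH.get(block.name)
                    try:
                        result = fn(**block.input) if fn else {"error": f"unknown tool '{block.name}'"}
                    except Exception as e:
                        # Report the failure to Claude instead of crashing the loop
                        result = {"error": f"{type(e).__name__}: {e}"}
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": json.dumps(result),
                    })
            # All results go back together in a single user message
            messages.append({"role": "user", "content": tool_results})
            continue

        # Any other stop_reason ("end_turn", "max_tokens", ...) ends the loop.
        # Join the text blocks; this is an empty string if Claude returned none.
        return "".join(b.text for b in resp.content if b.type == "text")

    return "Max turns reached."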

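# Try it
print(run_agent("What's the temperature in Austin in Celsius?"))
# One possible trace (illustrative; actual calls and values will vary):
# → Claude calls get_weather(city="Austin", units="fahrenheit")
#   → reads 68°F
#   → calls calculate("(68-32)*5/9")
#   → reads 20.0
#   → returns: "It's currently about 20°C (68°F) in Austin."
# Claude may instead call get_weather with units="celsius" and skip the calculator.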

Four patterns you'll use constantly

A. Force Claude to use a tool (tool_choice)

resp = client.messages.create(
    model=MODEL, max_tokens=1024, tools=TOOLS, messages=messages,
    tool_choice={"type": "tool", "name": "get_weather"},  # force this specific tool
)
# Other options: {"type": "any"} (must use SOME tool), {"type": "auto"} (default)

Use case: structured extraction. Force a record_fact tool and you get typed JSON out of prose, no parsing gymnastics.

B. Parallel tool calls

Claude often emits multiple tool_use blocks in one response when the calls are independent. The loop above already handles this: it iterates over resp.content, runs every requested tool, and returns all the results together in a single user message. Don't split them across separate messages or send them back one at a time.
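
When the tools are I/O-bound (HTTP calls, database queries), the independent calls can also be executed concurrently before the results are sent back in one message. The following is a minimal sketch using a thread pool; run_tools_concurrently is a hypothetical helper, and SimpleNamespace objects stand in for the SDK's content blocks so the example runs without an API call.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace

def run_tools_concurrently(blocks, dispatch, max_workers=8):
    """Run every tool_use block in a thread pool; return tool_result dicts in block order."""
    calls = [b for b in blocks if b.type == "tool_use"]
    if not calls:
        return []

    def run_one(block):
        fn = dispatch.get(block.name)
        try:
            result = fn(**block.input) if fn else {"error": f"unknown tool '{block.name}'"}
        except Exception as e:
            result = {"error": f"{type(e).__name__}: {e}"}
        return {"type": "tool_result", "tool_use_id": block.id, "content": json.dumps(result)}

    with ThreadPoolExecutor(max_workers=min(max_workers, len(calls))) as pool:
        return list(pool.map(run_one, calls))  # map preserves input order

# Stand-in blocks and a toy tool, invented for this sketch.
blocks = [
    SimpleNamespace(type="text", text="Checking both."),
    SimpleNamespace(type="tool_use", id="toolu_a", name="double", input={"x": 2}),
    SimpleNamespace(type="tool_use", id="toolu_b", name="double", input={"x": 5}),
]
results = run_tools_concurrently(blocks, {"double": lambda x: {"value": x * 2}})
print([r["tool_use_id"] for r in results])
```

Threads suit blocking network calls; CPU-heavy tools would need a different approach.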

C. Tool errors and retries

tool_results.append({
    "type": "tool_result",
    "tool_use_id": block.id,
    "content": json.dumps({"error": "API rate limited, retry in 60s"}),
    "is_error": True,   # <-- tells Claude this is a failure, not data
})

Claude will typically retry, adjust its plan, or explain the failure to the user. Always return a structured error rather than raising an exception.
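
One way to make that rule hard to forget is to route every call through a wrapper that converts exceptions into error results. This is a sketch under the assumption that tools return dicts and signal their own failures with an "error" key, as the tools in this paper do; execute_tool is a hypothetical name.

```python
import json

def execute_tool(block_id, name, tool_input, dispatch):
    """Run one tool and always return a tool_result dict, flagging failures with is_error."""
    fn = dispatch.get(name)
    if fn is None:
        payload, failed = {"error": f"unknown tool '{name}'"}, True
    else:
        try:
            payload = fn(**tool_input)
            # Treat a returned {"error": ...} dict as a failure too
            failed = isinstance(payload, dict) and "error" in payload
        except Exception as e:
            payload, failed = {"error": f"{type(e).__name__}: {e}"}, True
    result = {"type": "tool_result", "tool_use_id": block_id, "content": json.dumps(payload)}
    if failed:
        result["is_error"] = True
    return result

# Toy tool, invented for this sketch.
toy_dispatch = {"divide": lambda a, b: {"result": a / b}}
print(execute_tool("toolu_1", "divide", {"a": 6, "b": 3}, toy_dispatch))
print(execute_tool("toolu_2", "divide", {"a": 1, "b": 0}, toy_dispatch))
```

Inside the agent loop, the call would take block.id, block.name, and block.input as its first three arguments.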

D. Long-running tools and tool_result placement

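If a tool takes 30 seconds, just wait. The API is stateless, so no request is open while your tool runs and nothing is billed until the next messages.create call. Claude sees the result when you send it in that next request. On placement: put the tool_result blocks first in the user message's content list, and add any extra text after them.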

Structured output via tool use (instead of JSON mode)

Need reliably structured JSON? Define the schema as a tool and force it:

TOOLS = [{
    "name": "record_resume_data",
    "description": "Extract resume data into structured format.",
    "input_schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "email": {"type": "string"},
            "years_experience": {"type": "integer"},
            "skills": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["name", "email", "years_experience", "skills"],
    },
}]

resp = client.messages.create(
    model=MODEL, max_tokens=1024, tools=TOOLS,
    messages=[{"role": "user", "content": resume_text}],
    tool_choice={"type": "tool", "name": "record_resume_data"},
)
structured = next(b.input for b in resp.content if b.type == "tool_use")
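# structured is a dict shaped by the schema; validate it before trusting it

This tends to be more reliable than prompting for JSON in a text reply: Claude is trained heavily on tool use, and forcing the tool guarantees a tool_use block whose input is already parsed into a dict. The schema strongly guides that input, but by default the API does not validate it against the schema, and a response cut off by max_tokens can leave fields missing. Check required fields and types before using the result.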
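
Whatever the reliability in practice, a validation pass on the extracted dict is cheap insurance. The sketch below hand-rolls a check of required keys and top-level types only; validate_against_schema is a hypothetical helper, and a full validator such as the jsonschema package would cover nested structures that this does not.

```python
_JSON_TYPES = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "array": list, "object": dict,
}

def validate_against_schema(data, schema):
    """Check required keys and top-level types; return a list of problems (empty if OK)."""
    if not isinstance(data, dict):
        return ["expected an object"]
    problems = []
    for key in schema.get("required", []):
        if key not in data:
            problems.append(f"missing required field '{key}'")
    for key, spec in schema.get("properties", {}).items():
        if key not in data:
            continue
        type_name = spec.get("type")
        expected = _JSON_TYPES.get(type_name)
        if expected is None:
            continue
        value = data[key]
        # bool is a subclass of int in Python, so reject it for numeric fields
        is_bool_as_number = isinstance(value, bool) and type_name in ("integer", "number")
        if not isinstance(value, expected) or is_bool_as_number:
            problems.append(f"field '{key}' should be {type_name}")
    return problems

# Same shape as the record_resume_data input_schema; the sample data is invented.
resume_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
        "years_experience": {"type": "integer"},
        "skills": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "email", "years_experience", "skills"],
}
sample = {"name": "Ada", "years_experience": "five", "skills": ["python"]}
print(validate_against_schema(sample, resume_schema))
```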

Common pitfalls

  1. Forgetting to append the assistant's tool_use response. The message history has to include Claude's tool_use blocks, not just the final text. Otherwise the tool_result has nothing to reference and the API rejects the request.
  2. tool_use_id mismatch. Every tool_result MUST match a tool_use_id from the prior assistant message. Copy the ID straight from block.id.
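  3. Running tools during streaming. While streaming, a tool's input arrives as partial JSON fragments, so don't try to execute tools mid-stream. Wait until the message completes with stop_reason == "tool_use", then run them.
  4. Over-defining tools. Too many tools can confuse the model, and every definition adds input tokens to each call. As a rule of thumb, keep it to 10-15 per call. If you need more, cluster them into "get_data" dispatch tools that take an operation argument.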
  5. Missing descriptions. Tool descriptions are prompt real estate. "Get weather" is weak. "Get current temperature and conditions for a city. Use when user asks about weather or for planning outdoor activities." is what you want.
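
Pitfalls 1 and 2 can be caught mechanically before a request goes out. The following sketch checks that tool_use and tool_result IDs pair up across adjacent messages. It assumes the history holds plain dicts (the SDK's response blocks are objects, so they would need converting first), and check_tool_pairing is a hypothetical helper.

```python
def _ids(message, block_type, key):
    """Collect the given ID field from all blocks of one type in a message."""
    content = message.get("content")
    if not isinstance(content, list):
        return set()
    return {b[key] for b in content if isinstance(b, dict) and b.get("type") == block_type}

def check_tool_pairing(messages):
    """Return a list of pairing problems in a history that is about to be sent."""
    problems = []
    for i, msg in enumerate(messages):
        if msg["role"] == "assistant":
            requested = _ids(msg, "tool_use", "id")
            nxt = messages[i + 1] if i + 1 < len(messages) else {}
            for tool_id in sorted(requested - _ids(nxt, "tool_result", "tool_use_id")):
                problems.append(f"message {i}: tool_use {tool_id} has no tool_result in the next message")
        elif msg["role"] == "user":
            answered = _ids(msg, "tool_result", "tool_use_id")
            prev = messages[i - 1] if i > 0 else {}
            for tool_id in sorted(answered - _ids(prev, "tool_use", "id")):
                problems.append(f"message {i}: tool_result {tool_id} matches no tool_use in the previous message")
    return problems

# Invented history; dropping the assistant message reproduces pitfall 1.
good_history = [
    {"role": "user", "content": "Weather in Austin?"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_1", "name": "get_weather", "input": {"city": "Austin"}}]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_1", "content": "{}"}]},
]
broken_history = [good_history[0], good_history[2]]
print(check_tool_pairing(good_history))
print(check_tool_pairing(broken_history))
```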

Cost optimization: prompt caching for tools

Tool definitions are pure repeat tokens across turns. Cache them:

resp = client.messages.create(
    model=MODEL, max_tokens=2048,
    system=[{"type": "text", "text": "You are a helpful assistant.",
             "cache_control": {"type": "ephemeral"}}],
    tools=TOOLS,  # tools precede system in the cache prefix, so this breakpoint covers them
    messages=messages,
)

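After the first call, which writes the cache at a modest premium over the normal input price, later turns within the cache lifetime (about five minutes by default, refreshed on each hit) read the cached tool definitions and system prompt at roughly 10% of the normal input price. Prefixes shorter than the model's minimum cacheable length are not cached at all, so check cache_read_input_tokens in resp.usage to confirm you are getting hits. The discount applies only to the cached prefix, so the overall savings depend on how much of each request that prefix represents.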
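
A little arithmetic shows how the savings depend on the shape of the requests. The sketch below compares total input cost with and without caching, measured in base-price token units. The multipliers (1.25 for a cache write, 0.10 for a cache read) are assumptions to check against current pricing, and the model is simplified: it treats the uncached part of each request as a constant size, whereas a real conversation grows every turn.

```python
def input_cost_units(prefix_tokens, other_tokens_per_turn, turns,
                     write_multiplier=1.25, read_multiplier=0.10):
    """Return (uncached, cached) total input cost in base-price token units."""
    uncached = turns * (prefix_tokens + other_tokens_per_turn)
    cached = (
        prefix_tokens * write_multiplier                  # first call writes the cache
        + (turns - 1) * prefix_tokens * read_multiplier   # later calls read it
        + turns * other_tokens_per_turn                   # uncached remainder, every turn
    )
    return uncached, cached

# Two hypothetical agents: a modest tool prefix, and a prefix that dominates each request.
for prefix, other, turns in [(2000, 1000, 10), (10000, 200, 20)]:
    uncached, cached = input_cost_units(prefix, other, turns)
    print(f"prefix={prefix} other={other} turns={turns}: {uncached / cached:.2f}x cheaper")
```

Under these assumptions the first agent's input bill roughly halves, while the second, where tool definitions dominate each request, sees a several-fold reduction.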

Resume angle

"Implemented agentic tool-use loops directly against the Claude API, no framework overhead. Handled parallel tool calls, error recovery, forced tool selection for structured extraction, and prompt caching for multi-turn efficiency."

Common questions

What is the difference between stop_reason end_turn and stop_reason tool_use in Claude API responses?

When Claude returns stop_reason end_turn, it means the model has finished its response with text only and no further tool calls are needed. When stop_reason is tool_use, Claude has generated one or more tool_use blocks requesting you to execute specific functions and return the results before continuing.

How do you force Claude to use a specific tool instead of letting it choose?

Pass the tool_choice parameter to messages.create as {"type": "tool", "name": "<tool_name>"} to force a specific tool, or {"type": "any"} to require Claude to use some tool. This is especially useful for structured extraction, where you want JSON output shaped by a tool schema.

What happens if you forget to include Claude's tool_use blocks in the message history?

If you do not append the assistant message containing the tool_use blocks to your message history before sending tool_result messages back, the tool_result will have nothing to reference and the API rejects the request. Every tool_result must match a tool_use_id from the prior assistant message.

How does prompt caching reduce costs for multi-turn agent conversations?

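Tool definitions are repeated tokens across every turn. Tools come before the system prompt in the cached prefix, so a cache breakpoint on the system prompt covers the tool definitions too. After the first API call writes the cache, subsequent turns within the cache lifetime read that prefix at roughly 10% of the normal input price. The discount applies only to the cached prefix, so the overall savings depend on how large the prefix is relative to the rest of each request and on how many turns the agent runs.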

Why is tool use more reliable than JSON mode for structured output?

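Claude is heavily trained on tool use, and the input schema strongly guides what it produces. When you force a tool with tool_choice, the response is guaranteed to contain a tool_use block for that tool, with input already parsed as structured data, which is more dependable than asking Claude to produce JSON in its text response. By default the API does not validate that input against the schema, so check it before use.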

Claude Tool Use Fundamentals | Elite AI Advantage