Claude Tool Use Fundamentals: The Foundation of Every Agent

Topic: The foundation behind every AI agent, teaching Claude to call your code.

Stack: anthropic Python SDK plus Claude (Opus 4.7 / Sonnet 4.6)

Why this is THE foundational skill

Every "agent," every "RAG system," every "AI assistant" boils down to one loop: the LLM decides when to call your functions, you run them, and you feed results back. Once you've got tool use down, every framework out there (LangGraph, CrewAI, LangChain) is just an implementation detail on top of it.

The loop: 90% of agents look like this

1. User prompt + tool schemas → Claude
2. Claude returns either:
   a. text (done)
   b. tool_use blocks (wants you to run functions)
3. You execute the tools, package results
4. Append Claude's assistant message (tool_use blocks included) to the history, add the tool_results in a new user message, and send the whole history back to Claude
5. Repeat until Claude returns text only (stop_reason == "end_turn")
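
To make step 4 concrete, here is a sketch of what the message history might look like after one round trip. The tool_use id and the weather values are invented placeholders, and unmatched_tool_results is a hypothetical helper written for this illustration, not part of the SDK. In real code the assistant content comes straight from the API response.

```python
# Illustrative message history after one tool round trip.
# The id "toolu_example_01" and the weather payload are made up.
messages = [
    {"role": "user", "content": "What's the weather in Austin?"},
    {
        "role": "assistant",
        "content": [
            {"type": "text", "text": "Let me check."},
            {
                "type": "tool_use",
                "id": "toolu_example_01",
                "name": "get_weather",
                "input": {"city": "Austin, TX"},
            },
        ],
    },
    {
        # Tool results travel back in a user-role message.
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": "toolu_example_01",  # must match the id above
                "content": '{"temperature": 68, "units": "fahrenheit"}',
            }
        ],
    },
]


def unmatched_tool_results(history):
    """Return tool_result ids that have no tool_use in the preceding message."""
    missing = []
    for prev, cur in zip(history, history[1:]):
        if cur["role"] != "user" or not isinstance(cur["content"], list):
            continue
        issued = set()
        if isinstance(prev["content"], list):
            issued = {b["id"] for b in prev["content"] if b["type"] == "tool_use"}
        for block in cur["content"]:
            if block["type"] == "tool_result" and block["tool_use_id"] not in issued:
                missing.append(block["tool_use_id"])
    return missing
```

A check like this is cheap to run before each API call while you are debugging a new loop.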

Minimal working example: a weather + calculator agent

1. Setup

pip install anthropic requests
export ANTHROPIC_API_KEY=sk-ant-...

2. Define your tools (JSON Schema)

import anthropic, json, requests

client = anthropic.Anthropic()
MODEL = "claude-opus-4-7"

TOOLS = [
    {
        "name": "get_weather",
        "description": "Get current temperature and conditions for a city.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Austin, TX'"},
                "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "fahrenheit"}
            },
            "required": ["city"],
        },
    },
    {
        "name": "calculate",
        "description": "Evaluate a math expression. Use for any arithmetic or unit conversion.",
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {"type": "string", "description": "e.g. '(72-32) * 5/9'"},
            },
            "required": ["expression"],
        },
    },
]

3. Implement the tool dispatcher

def get_weather(city: str, units: str = "fahrenheit") -> dict:
    # Free Open-Meteo API, no key required
    geo = requests.get(
        "https://geocoding-api.open-meteo.com/v1/search",
        params={"name": city, "count": 1},
        timeout=10,  # don't let a hung request stall the agent loop
    ).json()
    if not geo.get("results"):
        return {"error": f"city '{city}' not found"}
    loc = geo["results"][0]
    unit_param = "fahrenheit" if units == "fahrenheit" else "celsius"
    wx = requests.get(
        "https://api.open-meteo.com/v1/forecast",
        params={
            "latitude": loc["latitude"],
            "longitude": loc["longitude"],
            "current": "temperature_2m,weather_code",
            "temperature_unit": unit_param,
        },
        timeout=10,
    ).json()["current"]
    return {
        "city": loc["name"],
        "temperature": wx["temperature_2m"],
        "units": units,
        # WMO weather code: the "conditions" half of the tool description
        "weather_code": wx["weather_code"],
    }

def calculate(expression: str) -> dict:
    try:
        # SAFETY: in real code use sympy or a proper parser, not eval
        return {"result": eval(expression, {"__builtins__": {}}, {})}
    except Exception as e:
        return {"error": str(e)}

DISPATCH = {"get_weather": get_weather, "calculate": calculate}
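
The SAFETY comment above suggests replacing eval with a proper parser. One way to do that with only the standard library is to walk the expression's AST and allow nothing but numeric literals and arithmetic operators. This is a minimal sketch, not a hardened implementation: safe_eval and calculate_safe are names invented here, and the exponent cap of 100 is an arbitrary assumption.

```python
import ast
import operator


def _bounded_pow(base, exp):
    # Guard against enormous results such as 9 ** 9 ** 9 (the cap is arbitrary).
    if abs(exp) > 100:
        raise ValueError("exponent too large")
    return operator.pow(base, exp)


# Whitelist of operators the evaluator accepts.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
    ast.Pow: _bounded_pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str):
    """Evaluate arithmetic over numeric literals; reject everything else."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        # Names, calls, attribute access, etc. all end up here.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))


def calculate_safe(expression: str) -> dict:
    """Drop-in alternative to calculate() that never calls eval."""
    try:
        return {"result": safe_eval(expression)}
    except (ValueError, SyntaxError, ZeroDivisionError, OverflowError) as e:
        return {"error": str(e)}
```

To try it, point the "calculate" entry in DISPATCH at calculate_safe. Function calls such as __import__(...) are rejected because Call nodes are not on the whitelist.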

4. The agent loop

def run_agent(user_prompt: str, max_turns: int = 10):
    messages = [{"role": "user", "content": user_prompt}]

    for turn in range(max_turns):
        resp = client.messages.create(
            model=MODEL,
            max_tokens=2048,
            tools=TOOLS,
            messages=messages,
        )

        # Append Claude's response to history (tool_use blocks included)
        messages.append({"role": "assistant", "content": resp.content})

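        if resp.stop_reason == "end_turn":
            # Return the final text (joined, in case there are several text blocks)
            return "".join(b.text for b in resp.content if b.type == "text")

        if resp.stop_reason == "tool_use":
            # Execute every tool_use block, gather results
            tool_results = []
            for block in resp.content:
                if block.type == "tool_use":
                    fn = DISPATCH.get(block.name)
                    try:
                        result = fn(**block.input) if fn else {"error": "unknown tool"}
                    except Exception as e:
                        # Bad arguments or a crashing tool: tell Claude instead of killing the loop
                        result = {"error": f"{type(e).__name__}: {e}"}
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": json.dumps(result),
                    })
            # All results go back together in a single user message
            messages.append({"role": "user", "content": tool_results})
            continue

        # Any other stop_reason (e.g. "max_tokens") means the reply is incomplete; stop here
        return f"Stopped early (stop_reason={resp.stop_reason})."

    return "Max turns reached."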

# Try it
print(run_agent("What's the temperature in Austin in Celsius?"))
# One possible trace (Claude may just as well pass units="celsius" and skip the math):
# → Claude calls get_weather(city="Austin", units="fahrenheit")
#   → reads 68°F
#   → calls calculate("(68-32)*5/9")
#   → reads 20.0
#   → returns: "It's currently about 20°C (68°F) in Austin."

Four patterns you'll use constantly

A. Force Claude to use a tool (`tool_choice`)

resp = client.messages.create(
    model=MODEL, max_tokens=1024, tools=TOOLS, messages=messages,
    tool_choice={"type": "tool", "name": "get_weather"},  # force this specific tool
)
# Other options: {"type": "any"} (must use SOME tool), {"type": "auto"} (default)

Use case: structured extraction. Force a record_fact tool and you get typed JSON out of prose, no parsing gymnastics.

B. Parallel tool calls

Claude often emits multiple tool_use blocks in one response when the calls are independent. The loop above already handles this: it iterates over resp.content and collects every result before replying. Send all the results back together in a single user message; don't split them into one message per tool call.
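
Because the calls in one response are independent, you can also execute them concurrently on your side. The sketch below uses a thread pool, which suits I/O-bound tools such as HTTP requests. run_tools_concurrently is a hypothetical helper, and the SimpleNamespace objects only stand in for the SDK's content blocks so the example runs without an API call.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace


def run_tools_concurrently(blocks, dispatch, max_workers=4):
    """Run every tool_use block in a thread pool and return tool_result
    dicts in the same order the tool_use blocks appeared."""
    calls = [b for b in blocks if b.type == "tool_use"]
    if not calls:
        return []

    def run_one(block):
        fn = dispatch.get(block.name)
        if fn is None:
            return {"error": f"unknown tool '{block.name}'"}
        try:
            return fn(**block.input)
        except Exception as e:  # report failures to the model instead of crashing
            return {"error": f"{type(e).__name__}: {e}"}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_one, calls))  # map preserves input order

    return [
        {"type": "tool_result", "tool_use_id": b.id, "content": json.dumps(r)}
        for b, r in zip(calls, results)
    ]


# Stand-ins for resp.content; real blocks come from the API response.
fake_blocks = [
    SimpleNamespace(type="text", text="Checking both cities."),
    SimpleNamespace(type="tool_use", id="toolu_a", name="double", input={"x": 2}),
    SimpleNamespace(type="tool_use", id="toolu_b", name="double", input={"x": 5}),
]
tool_results = run_tools_concurrently(fake_blocks, {"double": lambda x: {"value": x * 2}})
```

The returned list can be appended as the content of one user message, exactly as the sequential loop does.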

C. Tool errors and retries

tool_results.append({
    "type": "tool_result",
    "tool_use_id": block.id,
    "content": json.dumps({"error": "API rate limited, retry in 60s"}),
    "is_error": True,   # <-- tells Claude this is a failure, not data
})

Claude will typically retry, adjust its plan, or explain the failure to the user. Always return a structured error rather than raising an exception.
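
One way to apply that rule everywhere is to route every call through a single wrapper that turns exceptions into error results. This is a sketch under assumptions: execute_tool is a hypothetical helper, and treating any returned dict with an "error" key as a failure is a convention borrowed from the tools in this tutorial, not something the API requires.

```python
import json


def execute_tool(block_id, name, tool_input, dispatch):
    """Run one tool and always return a tool_result dict; never raise."""
    fn = dispatch.get(name)
    if fn is None:
        payload, failed = {"error": f"unknown tool '{name}'"}, True
    else:
        try:
            payload = fn(**tool_input)
            # Convention in this tutorial: a returned {"error": ...} is a failure too.
            failed = isinstance(payload, dict) and "error" in payload
        except Exception as e:
            payload, failed = {"error": f"{type(e).__name__}: {e}"}, True

    result = {
        "type": "tool_result",
        "tool_use_id": block_id,
        "content": json.dumps(payload),
    }
    if failed:
        result["is_error"] = True  # only set the flag on failures
    return result
```

Inside the agent loop this would be called as execute_tool(block.id, block.name, block.input, DISPATCH) for each tool_use block.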

D. Long-running tools and tool_result placement

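If a tool takes 30 seconds, just wait. The API is stateless: no request is open while your tool runs, and tokens are counted on each messages.create call, not on the wall-clock time between calls. Claude simply sees the result on the next call. On placement: tool_result blocks belong in the user message that immediately follows the assistant message containing the matching tool_use blocks, and they must come before any text in that message's content.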

Structured output via tool use (instead of JSON mode)

Need guaranteed JSON? Define the schema as a tool and force it:

TOOLS = [{
    "name": "record_resume_data",
    "description": "Extract resume data into structured format.",
    "input_schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "email": {"type": "string"},
            "years_experience": {"type": "integer"},
            "skills": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["name", "email", "years_experience", "skills"],
    },
}]

resp = client.messages.create(
    model=MODEL, max_tokens=1024, tools=TOOLS,
    messages=[{"role": "user", "content": resume_text}],
    tool_choice={"type": "tool", "name": "record_resume_data"},
)
structured = next(b.input for b in resp.content if b.type == "tool_use")
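# structured will usually match the schema, but validate it before trusting it

This tends to be more reliable than asking for JSON in the prompt, because Claude is trained heavily on tool use and the schema steers the output. It is not a hard guarantee, though: by default the input is not validated against your schema, so a field can occasionally be missing or mistyped, and a response cut off by max_tokens can leave the input incomplete. Check the result before you use it.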
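
A lightweight check on the extracted input might look like the sketch below. It covers only the "required" and "type" keywords used in the resume schema, so treat it as an illustration rather than a JSON Schema implementation; a dedicated validation library is the better choice in production. schema_problems and RESUME_SCHEMA are names invented for this example.

```python
# Same shape as the input_schema of record_resume_data above.
RESUME_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
        "years_experience": {"type": "integer"},
        "skills": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "email", "years_experience", "skills"],
}

_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def _matches(value, type_name):
    expected = _JSON_TYPES.get(type_name)
    if expected is None:
        return True  # unknown or missing "type": nothing to check
    if isinstance(value, bool) and type_name in ("integer", "number"):
        return False  # bool is a subclass of int in Python
    return isinstance(value, expected)


def schema_problems(data, schema):
    """Return a list of problems; an empty list means the basic checks passed."""
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = [
        f"missing required field '{key}'"
        for key in schema.get("required", [])
        if key not in data
    ]
    for key, spec in schema.get("properties", {}).items():
        if key not in data:
            continue
        if not _matches(data[key], spec.get("type")):
            problems.append(f"'{key}' should be of type {spec.get('type')}")
        elif spec.get("type") == "array" and "items" in spec:
            item_type = spec["items"].get("type")
            if not all(_matches(v, item_type) for v in data[key]):
                problems.append(f"'{key}' contains items that are not {item_type}")
    return problems
```

If schema_problems(structured, RESUME_SCHEMA) returns anything, one option is to send the problems back to Claude as an is_error tool_result and let it try again.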

Common pitfalls

Forgetting to append the assistant's tool_use response. The message history has to include Claude's tool_use blocks, not just the final text. Otherwise the tool_result has nothing to reference.
tool_use_id mismatch. Every tool_result MUST match a tool_use_id from the prior assistant message. Copy the ID straight from block.id.
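Running tools during streaming. Don't try to execute tools mid-stream: the tool input arrives as partial JSON fragments and is not usable until the block is complete. Wait for the message to finish with stop_reason == "tool_use", then run them.
Over-defining tools. Too many tools can confuse the model and bloat every request. As a rule of thumb, keep it to 10-15 per call. If you need more, cluster them into "get_data" dispatch tools that take an operation argument.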
Missing descriptions. Tool descriptions are prompt real estate. "Get weather" is weak. "Get current temperature and conditions for a city. Use when user asks about weather or for planning outdoor activities." is what you want.



Cost optimization: prompt caching for tools

Tool definitions are pure repeat tokens across turns. Cache them:

resp = client.messages.create(
    model=MODEL, max_tokens=2048,
    system=[{"type": "text", "text": "You are a helpful assistant.",
             "cache_control": {"type": "ephemeral"}}],
    tools=TOOLS,  # tools come before the system prompt in the cached prefix, so the breakpoint above covers them too
    messages=messages,
)

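After the first call, later turns that arrive within the cache lifetime (about five minutes by default, refreshed on each hit) read the cached tool definitions and system prompt at roughly 10% of the normal input price; the first call pays a small premium to write the cache. The saving applies only to the cached prefix, not to the growing message history, so the overall gain depends on how large your tool definitions are relative to the conversation. Caching also only kicks in once the prefix reaches a model-specific minimum length (on the order of a thousand tokens or more), which a two-tool example like this one may not reach.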
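
To get a feel for the numbers, the arithmetic can be sketched as below. The multipliers (1.25x for the cache write, 0.10x for cache reads) and the example price of $3 per million input tokens are assumptions for illustration; check current pricing before relying on them. The function also assumes every call after the first hits the cache.

```python
def prefix_cost(turns, prefix_tokens, price_per_mtok, cached=True,
                write_multiplier=1.25, read_multiplier=0.10):
    """Dollar cost of sending the shared prefix (tools + system prompt)
    on every one of `turns` calls. Message history is not included."""
    per_token = price_per_mtok / 1_000_000
    if not cached or turns == 0:
        return turns * prefix_tokens * per_token
    first_call = prefix_tokens * per_token * write_multiplier      # cache write
    later_calls = (turns - 1) * prefix_tokens * per_token * read_multiplier  # cache reads
    return first_call + later_calls


# Assumed scenario: 2,000 tokens of tool definitions, 10 turns, $3 per MTok.
uncached = prefix_cost(10, 2_000, 3.0, cached=False)
cached = prefix_cost(10, 2_000, 3.0, cached=True)
savings_ratio = uncached / cached
```

Under these assumptions the prefix costs about $0.06 uncached versus about $0.013 cached, a ratio of roughly 4.7x on the prefix alone. The ratio climbs toward 10x as the number of turns grows, but never applies to the uncached message history.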

Resume angle
"Implemented agentic tool-use loops directly against the Claude API, no framework overhead. Handled parallel tool calls, error recovery, forced tool selection for structured extraction, and prompt caching for multi-turn efficiency."