How Do I Write the Canonical Claude Tool-Calling Loop in Python?

If you understand one Claude pattern in Python, make it the tool-calling loop. Every agent, every workflow, every "Claude uses my API" integration is some variation of this same 40-line loop. Once you can write it from memory, you can build anything. Here's the canonical implementation with all the edge cases that trip people up the first time.
Why this matters
Tool use in Claude is conceptually simple — Claude responds with a tool_use block instead of text, you execute the tool, you send the result back, Claude continues. But the first time you write the loop, three things bite:
- Constructing the next messages array correctly (assistant content from the response, user content as tool results).
- Handling multiple tool calls in a single response.
- Knowing when to stop.
Get those right, and the loop runs forever. Get them wrong, and you get confusing 400 errors or infinite loops. This guide is the 40 lines and the gotchas.
Before you start
You need:
- Python 3.10+.
- An Anthropic API key (export ANTHROPIC_API_KEY=sk-ant-...).
- 10 minutes. The loop itself is short; the gotchas are where the time goes.
Step 1: Install the SDK
pip install anthropic

Step 2: Define one tool
Start with one. Resist the urge to define five until the loop is solid with one.
from anthropic import Anthropic

client = Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
            },
            "required": ["city"],
        },
    },
]

def execute_tool(name: str, tool_input: dict) -> str:
    if name == "get_weather":
        # In real code, call an API. For the example, return canned data.
        return f"The weather in {tool_input['city']} is 72F and sunny."
    raise ValueError(f"Unknown tool: {name}")

Step 3: Write the loop
def run(prompt: str, max_iterations: int = 10) -> str:
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_iterations):
        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        # Append assistant response to history regardless of stop reason
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            # Extract final text and return
            final = "".join(
                block.text for block in response.content if block.type == "text"
            )
            return final
        if response.stop_reason == "tool_use":
            # Execute every tool_use block in the response
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    try:
                        result = execute_tool(block.name, block.input)
                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result,
                        })
                    except Exception as e:
                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": f"Error: {e}",
                            "is_error": True,
                        })
            messages.append({"role": "user", "content": tool_results})
            continue
        # Any other stop reason (max_tokens, stop_sequence) — bail
        return f"Unexpected stop: {response.stop_reason}"
    raise RuntimeError(f"Exceeded {max_iterations} iterations")

That's the complete loop. Let's walk the gotchas.
Step 4: Gotcha — pass response.content verbatim to messages
The biggest mistake people make is reconstructing the assistant message from response.content[0].text or similar. Don't. Pass response.content directly. It already has the mixed text and tool_use blocks in the correct shape, and any reconstruction will lose the tool_use IDs you need to reference in the next turn.
Correct:
messages.append({"role": "assistant", "content": response.content})

Wrong:
# This loses tool_use blocks entirely and breaks the next turn.
text = "".join(b.text for b in response.content if b.type == "text")
messages.append({"role": "assistant", "content": text})

Step 5: Gotcha — tool results go as a user message with a list of tool_result blocks
The API wants tool results as a single user message with one or more tool_result blocks, not as multiple messages. Even if Claude called five tools in one response, you respond with one message that has five tool_result entries.
messages.append({
    "role": "user",
    "content": [
        {"type": "tool_result", "tool_use_id": "toolu_1", "content": "..."},
        {"type": "tool_result", "tool_use_id": "toolu_2", "content": "..."},
    ],
})

Every tool_use_id from the assistant turn must have a matching tool_result. Miss one, and the next API call errors with "unexpected tool_use_id."
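One way to catch a mismatch before the API does is a small guard run just before the next request. This helper is illustrative, not part of the SDK; it assumes you have already collected the tool_use IDs from the assistant turn:

```python
def check_tool_results(tool_use_ids: set[str], tool_results: list[dict]) -> None:
    """Raise before the next API call if any tool_use id lacks a tool_result."""
    provided = {r["tool_use_id"] for r in tool_results}
    missing = tool_use_ids - provided
    if missing:
        raise ValueError(f"Missing tool_result for: {sorted(missing)}")
```

Failing fast here turns a confusing 400 from the API into a stack trace that points at your loop.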
Step 6: Gotcha — tool_result content can be a string or a list
content in a tool_result can be a string (most common) or a list of content blocks (useful for returning images or multimodal output). Start with strings; move to the list form only if you have a concrete reason.
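For reference, the list form looks like this. The values are placeholders, and the shape assumes the standard content-block format from the Messages API:

```python
# List-form tool_result content: one or more content blocks instead of a string.
tool_result = {
    "type": "tool_result",
    "tool_use_id": "toolu_1",
    "content": [
        {"type": "text", "text": "Chart generated for Q3 revenue."},
        # Image blocks are also accepted here, e.g. a base64-encoded PNG:
        # {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": "..."}},
    ],
}
```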
For structured data, stringify it:
import json
tool_results.append({
    "type": "tool_result",
    "tool_use_id": block.id,
    "content": json.dumps(result),  # Claude parses the JSON in-conversation
})

Step 7: Gotcha — stopping conditions
The loop stops when response.stop_reason == "end_turn". Other stop reasons exist — max_tokens (you hit the output limit), stop_sequence (rare), or errors. Handle them explicitly:
if response.stop_reason == "max_tokens":
    # Output was cut off. Either raise max_tokens and retry, or accept partial.
    ...

Always cap total iterations. An agent can loop indefinitely if its prompt is underspecified. 10-20 iterations is plenty for most tasks.
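One way to handle the retry branch is to double the output budget up to a ceiling. This is a sketch, not a prescribed pattern; `create` here stands in for any callable wrapping client.messages.create, and the 16384 ceiling is an assumption you should set for your model:

```python
def call_with_retry(create, max_tokens: int = 4096, ceiling: int = 16384):
    """Retry a truncated request with a doubled max_tokens until it fits.

    `create` is any callable taking max_tokens and returning a response with a
    stop_reason attribute (e.g. a functools.partial of client.messages.create).
    """
    while True:
        response = create(max_tokens=max_tokens)
        # Stop retrying once the output fits or the budget is exhausted.
        if response.stop_reason != "max_tokens" or max_tokens >= ceiling:
            return response
        max_tokens = min(max_tokens * 2, ceiling)
```

If you hit the ceiling and still get max_tokens, accept the partial output or rethink the task.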
Step 8: Run it
if __name__ == "__main__":
    out = run("What's the weather in Austin, and should I bring a jacket?")
    print(out)

Claude should: call get_weather(city="Austin"), receive the result, and respond with both the weather and the jacket recommendation. If that works end to end, the loop is correct.
Verify it worked
1. One round trip works. The weather prompt above should produce exactly one tool call and one final answer. If it loops multiple times, your tool result content might be too vague.
2. Multiple tool calls in one turn work. Ask "What's the weather in Austin and Seattle?" Claude often calls get_weather twice in one response. Your loop must execute both and return both results in a single user message.
3. Errors are handled gracefully. Throw an exception in execute_tool. The is_error: true flag should let Claude recover — either retry differently or report the error in the final answer. If the loop crashes, wrap execute_tool properly.
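You can exercise check 3 without an API call by running the loop's try/except wrapping against a tool that always fails. failing_tool and the toolu_1 id are stand-ins for this test, not part of the example above:

```python
def failing_tool(name: str, tool_input: dict) -> str:
    raise RuntimeError("upstream API timed out")

tool_results = []
try:
    result = failing_tool("get_weather", {"city": "Austin"})
    tool_results.append({"type": "tool_result", "tool_use_id": "toolu_1", "content": result})
except Exception as e:
    # Same shape the loop sends back, so Claude can see and recover from the failure
    tool_results.append({
        "type": "tool_result",
        "tool_use_id": "toolu_1",
        "content": f"Error: {e}",
        "is_error": True,
    })
```

The loop keeps running and Claude sees the error text; a crash here means your except block is missing or too narrow.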
Where this breaks
- Mismatched tool_use_ids. Every tool_use in a turn needs a matching tool_result in the next. Missing one errors immediately. Iterate all tool_use blocks before moving on.
- Forgetting to stringify tool output. Raw Python objects in content fields produce confusing errors. Always str() or json.dumps().
- Infinite loops on vague prompts. "Research everything about X" with web-search tools can loop endlessly. Always cap iterations and include stop criteria in the system prompt ("stop when you have 3-5 sources").
- Exceeding the model's context on long chains. Every iteration appends to messages. After 15 tool calls with verbose results, you can blow through context. Strategies: summarize older turns into a single assistant message ("Earlier in the conversation, Claude searched for X and found Y"), or use prompt caching on the tool-definition prefix.
- Streaming mismatch. The example above uses non-streaming. Streaming tool use is supported but requires a different API — client.messages.stream() with event handlers. Start non-streaming; migrate only when you need it.
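A minimal sketch of the summarize-older-turns strategy. The keep_last threshold and the summary text are assumptions; a real implementation might generate the summary with a cheap model call, and you must place the cut so roles still alternate and no tool_use is separated from its tool_result:

```python
def compact_history(messages: list[dict], keep_last: int = 6) -> list[dict]:
    """Collapse all but the last `keep_last` messages into one summary message.

    The first user message (the original task) is always kept. Caution: pick
    keep_last so the cut never splits a tool_use turn from its tool_result turn.
    """
    if len(messages) <= keep_last + 1:
        return messages
    dropped = len(messages) - 1 - keep_last
    summary = {
        "role": "user",
        "content": f"[Summary: {dropped} earlier messages of tool calls and results omitted.]",
    }
    return [messages[0], summary] + messages[-keep_last:]
```

Run it between iterations once len(messages) crosses your threshold, before the next client.messages.create call.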
What to try next
- How Do I Build a Research Pipeline with the Agent SDK? — the loop above, scaled up with real tools and production concerns.
- How Do I Keep My Claude Prompt Cache Hit Rate High? — for production loops, caching the tool definitions and system prompt is where the real cost savings come from.
- How Do I Give My Claude Agent Persistent Memory Across Sessions? — the loop stops at end of run; memory is what makes it useful across runs.
Let's talk about your AI + SEO stack
If you'd rather skip the how-to and have it shipped for you, that's what I do. Start a conversation and we'll figure out the fastest path to results.