If you've shipped an LLM prototype that works on your laptop and dies the moment a real user touches it, AgentScope is the framework that's been missing. It's an open-source agent runtime built by Alibaba's research team for the part everyone skips, making agents survive in production. Here's how to install it, build your first ReAct agent, wire in tools and memory, and run a multi-agent workflow you can actually deploy.
Why this matters
Most agent code I see is one Python file, one OpenAI client, and a while True loop with a TODO comment that says "add error handling later." That's fine for a demo. It falls apart the second you need parallel tool calls, persistent memory, human approval, or a deployment story that isn't "I run it on my MacBook."
AgentScope handles those for you. It's a real framework, not a toy, not a wrapper around LangChain, designed to take an agent from notebook to Kubernetes without rewriting it.
Before you start
You need:
- Python 3.10+ installed.
- An LLM provider key, OpenAI, Anthropic, DashScope, or any OpenAI-compatible endpoint. I'll use Anthropic in the examples; swap as needed.
- A clean virtualenv. Don't pip install this into your system Python.
- About 30 minutes for the full walkthrough.
Step 1: Install AgentScope
python -m venv .venv
source .venv/bin/activate
pip install agentscopeThat gives you the core. If you want the development extras (Studio dashboard, evaluator, the works), install with the option group:
pip install "agentscope[full]"Set your API key as an env var so you don't paste it into code:
export ANTHROPIC_API_KEY=sk-ant-...Step 2: Build a single ReAct agent
ReAct (Reasoning + Acting) is the loop where the agent thinks, picks a tool, runs it, sees the result, and decides what to do next. AgentScope ships a ReActAgent class that handles the whole loop for you.
import agentscope
from agentscope.agents import ReActAgent
from agentscope.models import AnthropicChatModel
agentscope.init(
model_configs=[{
"config_name": "claude_sonnet",
"model_type": "anthropic_chat",
"model_name": "claude-sonnet-4-5-20250929",
}]
)
researcher = ReActAgent(
name="Researcher",
sys_prompt="You research topics and cite sources.",
model_config_name="claude_sonnet",
tools=[],
max_iters=8,
)
response = researcher(agentscope.message.Msg("user", "What's the cache hit rate sweet spot for Claude prompt caching?", role="user"))
print(response.content)Three things doing real work here:
agentscope.initregisters model configs once for the whole process, no client construction in every agent.ReActAgentruns the think → act → observe loop until the model stops calling tools or hitsmax_iters.- The
Msgobject is AgentScope's message envelope. Every interaction passes through it, which is what makes logging, persistence, and replay tractable later.
Step 3: Add tools the agent can call
A ReAct agent without tools is just a chatbot with extra steps. Pass plain Python functions as tools and AgentScope auto-generates the schema from the type hints and docstring.
from agentscope.service import ServiceToolkit, ServiceResponse, ServiceExecStatus
import requests
def get_pypi_downloads(package: str) -> ServiceResponse:
"""Get last-month download count for a PyPI package.
Args:
package (str): Exact package name on PyPI.
"""
r = requests.get(f"https://pypistats.org/api/packages/{package}/recent")
if r.status_code != 200:
return ServiceResponse(ServiceExecStatus.ERROR, f"HTTP {r.status_code}")
data = r.json()["data"]
return ServiceResponse(ServiceExecStatus.SUCCESS, data)
toolkit = ServiceToolkit()
toolkit.add(get_pypi_downloads)
researcher = ReActAgent(
name="Researcher",
sys_prompt="You research Python packages.",
model_config_name="claude_sonnet",
service_toolkit=toolkit,
max_iters=8,
)Now when the agent reasons "I need download stats for agentscope," it will call your function and observe the result before deciding what to say.
Step 4: Add memory so the agent remembers across turns
Out of the box, AgentScope agents have a temporary memory buffer that holds the current conversation. For longer-lived state, swap in a persistent store:
from agentscope.memory import TemporaryMemory
researcher.memory = TemporaryMemory(config={"max_size": 20})For production, point memory at Redis or a vector DB (Milvus, Qdrant). The memory interface is intentionally small, three methods (add, get, clear), so you can drop in your own.
Step 5: Run a multi-agent workflow
This is where AgentScope earns the install. Build a small team, researcher, writer, editor, and let a supervisor coordinate them with a workflow primitive.
from agentscope.agents import ReActAgent
from agentscope.pipelines import sequential_pipeline
researcher = ReActAgent(name="Researcher", sys_prompt="Pull facts and cite sources.", model_config_name="claude_sonnet", service_toolkit=toolkit)
writer = ReActAgent(name="Writer", sys_prompt="Turn research notes into a 300-word brief in Jake's voice.", model_config_name="claude_sonnet")
editor = ReActAgent(name="Editor", sys_prompt="Cut hype, tighten, return final.", model_config_name="claude_sonnet")
brief_topic = agentscope.message.Msg("user", "Best practices for monorepo Claude Code usage", role="user")
final = sequential_pipeline([researcher, writer, editor], brief_topic)
print(final.content)sequential_pipeline passes each agent's output to the next as input. AgentScope also has MsgHub (broadcast pattern), forlooppipeline (iterate), and a graph-based workflow if you need branching logic.
Verify it worked
Two quick checks:
- Run the single-agent script. You should see reasoning + tool calls printed, ending in a final answer that references the tool result.
- Open AgentScope Studio. If you installed the full extras, run
as_studioin a terminal and openhttp://localhost:5000. You'll see every agent run, every tool call, latency per step, and token usage. This is the production debugging surface most agent code is missing.
as_studioIf both work, you're set up.
Where this breaks
- Token bills. ReAct loops can run away, set
max_itersto a sane ceiling (8-12) and watch your token usage in Studio. - Tool function failures. If a tool returns garbage, the agent will reason itself in circles. Always return
ServiceExecStatus.ERRORwith a clear message, the agent uses it to course-correct. - Model drift between providers. A prompt that runs clean on Claude may loop on a smaller open model. Keep the model config in one place so you can swap and re-run benchmarks.
- Memory unbounded growth. TemporaryMemory keeps everything until cleared. For long-running agents, set
max_sizeor rotate.
What to try next
Let's talk about your AI + SEO stack
If you'd rather skip the how-to and have it shipped for you, that's what I do. Start a conversation and we'll figure out the fastest path to results.
Let's Talk