How Do I Build AI Agents at Scale with AgentScope?

If you've shipped an LLM prototype that works on your laptop and dies the moment a real user touches it, AgentScope is the framework that's been missing. It's an open-source agent runtime built by Alibaba's research team for the part everyone skips — making agents survive in production. Here's how to install it, build your first ReAct agent, wire in tools and memory, and run a multi-agent workflow you can actually deploy.
Why this matters
Most agent code I see is one Python file, one OpenAI client, and a while True loop with a TODO comment that says "add error handling later." That's fine for a demo. It falls apart the second you need parallel tool calls, persistent memory, human approval, or a deployment story that isn't "I run it on my MacBook."
AgentScope handles those for you. It's a real framework — not a toy, not a wrapper around LangChain — designed to take an agent from notebook to Kubernetes without rewriting it.
Before you start
You need:
- Python 3.10+ installed.
- An LLM provider key — OpenAI, Anthropic, DashScope, or any OpenAI-compatible endpoint. I'll use Anthropic in the examples; swap as needed.
- A clean virtualenv. Don't pip install this into your system Python.
- About 30 minutes for the full walkthrough.
Step 1: Install AgentScope
```
python -m venv .venv
source .venv/bin/activate
pip install agentscope
```

That gives you the core. If you want the development extras (Studio dashboard, evaluator, the works), install with the option group:

```
pip install "agentscope[full]"
```

Set your API key as an env var so you don't paste it into code:

```
export ANTHROPIC_API_KEY=sk-ant-...
```

Step 2: Build a single ReAct agent
ReAct (Reasoning + Acting) is the loop where the agent thinks, picks a tool, runs it, sees the result, and decides what to do next. AgentScope ships a ReActAgent class that handles the whole loop for you.
```python
import agentscope
from agentscope.agents import ReActAgent
from agentscope.models import AnthropicChatModel

agentscope.init(
    model_configs=[{
        "config_name": "claude_sonnet",
        "model_type": "anthropic_chat",
        "model_name": "claude-sonnet-4-5-20250929",
    }]
)

researcher = ReActAgent(
    name="Researcher",
    sys_prompt="You research topics and cite sources.",
    model_config_name="claude_sonnet",
    tools=[],
    max_iters=8,
)

response = researcher(
    agentscope.message.Msg(
        "user",
        "What's the cache hit rate sweet spot for Claude prompt caching?",
        role="user",
    )
)
print(response.content)
```

Three things doing real work here:
- `agentscope.init` registers model configs once for the whole process — no client construction in every agent.
- `ReActAgent` runs the think → act → observe loop until the model stops calling tools or hits `max_iters`.
- The `Msg` object is AgentScope's message envelope. Every interaction passes through it, which is what makes logging, persistence, and replay tractable later.
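The envelope pattern is worth internalizing on its own. Here's a minimal standalone sketch of the idea — not AgentScope's actual `Msg` class, and the field names are illustrative — showing why a structured envelope makes logging and replay nearly free:

```python
from dataclasses import dataclass, field, asdict
import json
import time

@dataclass
class Envelope:
    """Illustrative message envelope: every agent interaction becomes a
    structured record, so logging and replay reduce to serialization."""
    name: str       # who produced the message
    content: str    # the payload
    role: str       # "user", "assistant", or "system"
    timestamp: float = field(default_factory=time.time)

    def to_json(self) -> str:
        # One JSON line per interaction -> an append-only, replayable log
        return json.dumps(asdict(self))

msg = Envelope("user", "What's prompt caching?", role="user")
record = msg.to_json()
```

Because every message carries the same fields, a debugging UI (like Studio, below) can render any run without knowing anything about the agents that produced it.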
Step 3: Add tools the agent can call
A ReAct agent without tools is just a chatbot with extra steps. Pass plain Python functions as tools and AgentScope auto-generates the schema from the type hints and docstring.
```python
from agentscope.service import ServiceToolkit, ServiceResponse, ServiceExecStatus
import requests

def get_pypi_downloads(package: str) -> ServiceResponse:
    """Get last-month download count for a PyPI package.

    Args:
        package (str): Exact package name on PyPI.
    """
    r = requests.get(f"https://pypistats.org/api/packages/{package}/recent")
    if r.status_code != 200:
        return ServiceResponse(ServiceExecStatus.ERROR, f"HTTP {r.status_code}")
    data = r.json()["data"]
    return ServiceResponse(ServiceExecStatus.SUCCESS, data)

toolkit = ServiceToolkit()
toolkit.add(get_pypi_downloads)

researcher = ReActAgent(
    name="Researcher",
    sys_prompt="You research Python packages.",
    model_config_name="claude_sonnet",
    service_toolkit=toolkit,
    max_iters=8,
)
```

Now when the agent reasons "I need download stats for agentscope," it will call your function and observe the result before deciding what to say.
Step 4: Add memory so the agent remembers across turns
Out of the box, AgentScope agents hold the current conversation in a temporary memory buffer. You can cap its size explicitly:

```python
from agentscope.memory import TemporaryMemory

researcher.memory = TemporaryMemory(config={"max_size": 20})
```

For production, point memory at Redis or a vector DB (Milvus, Qdrant). The memory interface is intentionally small — three methods (`add`, `get`, `clear`) — so you can drop in your own.
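To make that interface concrete, here's a standalone sketch of a capped, rotating memory with the same three-method shape — an illustration of the pattern, not a subclass of AgentScope's actual memory base class, which a real drop-in would extend:

```python
from collections import deque

class RingMemory:
    """Bounded conversation memory: add/get/clear, oldest entries
    rotate out once max_size is reached."""

    def __init__(self, max_size: int = 20):
        self._buf = deque(maxlen=max_size)  # deque handles rotation for us

    def add(self, msg) -> None:
        self._buf.append(msg)

    def get(self) -> list:
        return list(self._buf)

    def clear(self) -> None:
        self._buf.clear()

mem = RingMemory(max_size=3)
for i in range(5):
    mem.add(f"turn-{i}")
# only the 3 most recent turns survive
recent = mem.get()
```

Swap the deque for a Redis list or a vector-store query and the agent code on top doesn't change — that's the point of keeping the interface small.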
Step 5: Run a multi-agent workflow
This is where AgentScope earns the install. Build a small team — researcher, writer, editor — and let a supervisor coordinate them with a workflow primitive.
```python
from agentscope.agents import ReActAgent
from agentscope.pipelines import sequential_pipeline

researcher = ReActAgent(name="Researcher", sys_prompt="Pull facts and cite sources.", model_config_name="claude_sonnet", service_toolkit=toolkit)
writer = ReActAgent(name="Writer", sys_prompt="Turn research notes into a 300-word brief in Jake's voice.", model_config_name="claude_sonnet")
editor = ReActAgent(name="Editor", sys_prompt="Cut hype, tighten, return final.", model_config_name="claude_sonnet")

brief_topic = agentscope.message.Msg("user", "Best practices for monorepo Claude Code usage", role="user")
final = sequential_pipeline([researcher, writer, editor], brief_topic)
print(final.content)
```

`sequential_pipeline` passes each agent's output to the next as input. AgentScope also has `MsgHub` (broadcast pattern), `forlooppipeline` (iterate), and a graph-based workflow if you need branching logic.
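Under the hood, a sequential pipeline is just a fold over agents. Here's a standalone sketch with stub agents (plain functions standing in for `ReActAgent` instances) — the shape of the idea, not AgentScope's implementation:

```python
from typing import Callable, Sequence

# Stand-in: an agent is anything that maps a message to a message
Agent = Callable[[str], str]

def sequential(agents: Sequence[Agent], msg: str) -> str:
    """Feed each agent's output to the next; return the last output."""
    for agent in agents:
        msg = agent(msg)
    return msg

# Stub agents standing in for researcher -> writer -> editor
research = lambda m: m + " [facts]"
write = lambda m: m + " [draft]"
edit = lambda m: m + " [final]"

out = sequential([research, write, edit], "topic")  # -> "topic [facts] [draft] [final]"
```

Once you see the pipeline as a fold, the other primitives follow: broadcast is a fan-out over the same message, and the graph workflow is the same fold over a DAG instead of a list.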
Verify it worked
Two quick checks:
- Run the single-agent script. You should see reasoning + tool calls printed, ending in a final answer that references the tool result.
- Open AgentScope Studio. If you installed the full extras, run `as_studio` in a terminal and open http://localhost:5000. You'll see every agent run, every tool call, latency per step, and token usage. This is the production debugging surface most agent code is missing.

If both work, you're set up.
Where this breaks
- Token bills. ReAct loops can run away — set `max_iters` to a sane ceiling (8–12) and watch your token usage in Studio.
- Tool function failures. If a tool returns garbage, the agent will reason itself in circles. Always return `ServiceExecStatus.ERROR` with a clear message — the agent uses it to course-correct.
- Model drift between providers. A prompt that runs clean on Claude may loop on a smaller open model. Keep the model config in one place so you can swap and re-run benchmarks.
- Unbounded memory growth. `TemporaryMemory` keeps everything until cleared. For long-running agents, set `max_size` or rotate.
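The tool-failure point deserves a pattern, not just a warning. Here's a standalone sketch of a defensive tool wrapper — it uses plain `(status, payload)` tuples for illustration; in AgentScope you'd return a `ServiceResponse` with the error status instead:

```python
import functools

def safe_tool(fn):
    """Wrap a tool so failures surface as a clear, readable error the
    agent can course-correct on, instead of an exception that kills
    the ReAct loop."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return ("SUCCESS", fn(*args, **kwargs))
        except Exception as exc:
            # The message is for the model, not the human: name the
            # tool and say what went wrong.
            return ("ERROR", f"{fn.__name__} failed: {exc}")
    return wrapper

@safe_tool
def divide(a: float, b: float) -> float:
    return a / b

status, payload = divide(1, 0)  # -> ("ERROR", "divide failed: division by zero")
```

Wrap every tool this way (or bake the try/except into the tool itself) and a flaky API becomes a recoverable observation instead of a crashed run.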
What to try next
Let's talk about your AI + SEO stack
If you'd rather skip the how-to and have it shipped for you, that's what I do. Start a conversation and we'll figure out the fastest path to results.
Let's Talk