AgentScope: Build AI Agents at Scale in Python

What it is
AgentScope is a production-ready open-source framework for building AI agents and multi-agent systems. It's not "another LangChain wrapper." It's a complete agent ecosystem purpose-built to take you from prototype to production without rewriting your stack when you scale.
In one sentence: if you've hit the wall where your toy agent breaks the moment you add a second agent, parallel tool calls, or real infrastructure, AgentScope is the framework designed for what comes next.
What you can actually do with it
- Build agents in minutes. ReAct loops, tool use, persistent memory, all first-class.
- Run multi-agent workflows. Orchestrator plus specialists, peer-to-peer, pipelines, fan-out/fan-in.
- Add human-in-the-loop control. Approval gates, pause/resume, mid-task input.
- Deploy anywhere. Local process, serverless function, Kubernetes cluster.
- Monitor, debug, evaluate. Built-in tracing, replay, eval harness, dashboards.
Why you should be doing this
Three reasons it belongs in your stack:
1. "Prototype to production" is where most agent projects die
Every team ships a LangChain demo in a week, then spends three months trying to make it reliable, observable, and multi-tenant. AgentScope bakes production concerns in from the first line of code: async-first runtime, typed messages, built-in tracing, and deployment adapters for serverless and K8s.
2. Multi-agent is the default, not an afterthought
The winning agent architectures in 2025 are multi-agent with a clear coordination pattern (orchestrator plus specialists usually beats single-agent on non-trivial tasks). AgentScope treats multi-agent as the primary API, not a plugin stapled onto a single-agent core.
3. Human-in-the-loop is a button, not a project
Every serious agent needs approval gates for irreversible actions (send email, place order, deploy code). In most frameworks, you build that yourself. AgentScope ships it.
How to do it, step by step
Step 1. Install
```bash
pip install agentscope
```
Requires Python 3.10+. Set your model key:
```bash
export ANTHROPIC_API_KEY=...  # or OPENAI_API_KEY / DASHSCOPE_API_KEY
```
Step 2. Your first agent in about 15 lines
```python
import agentscope
from agentscope.agents import ReActAgent
from agentscope.models import AnthropicChatModel
from agentscope.tools import BasicToolkit

agentscope.init(
    model_configs=[AnthropicChatModel(
        config_name="claude", model_name="claude-opus-4-7"
    )],
)

toolkit = BasicToolkit()  # includes bash, file I/O, web fetch

agent = ReActAgent(
    name="researcher",
    sys_prompt="You are a careful senior analyst. Gather, act, verify.",
    model_config_name="claude",
    toolkit=toolkit,
)

reply = agent("Summarize the top 3 news items about LLM agents this week.")
print(reply.content)
```
That's a complete ReAct agent with tools. No boilerplate.
Step 3. Add memory
```python
from agentscope.memory import TemporaryMemory

agent = ReActAgent(
    name="researcher",
    sys_prompt="...",
    model_config_name="claude",
    memory=TemporaryMemory(),  # in-process
    # or VectorMemory(persist_dir="./mem") for embedding-based long-term memory
)

agent("I'm building a SaaS in Next.js.")
agent("What stack did I just mention?")  # remembers
```
Step 4. Multi-agent workflow, orchestrator plus specialists
```python
from agentscope.pipelines import SequentialPipeline
from agentscope.message import Msg

planner = ReActAgent(name="planner", model_config_name="claude", ...)
writer = ReActAgent(name="writer", model_config_name="claude", ...)
reviewer = ReActAgent(name="reviewer", model_config_name="claude", ...)

pipeline = SequentialPipeline([planner, writer, reviewer])
result = pipeline(Msg("user", "Write a post on LLM agent patterns.", "user"))
```
For parallel fan-out (e.g., three research specialists, then a synthesizer):
```python
from agentscope.pipelines import parallel_pipeline

specialists = [finance_agent, tech_agent, regulatory_agent]
results = parallel_pipeline(specialists, Msg("user", "Research Apple", "user"))
synthesis = synthesizer(Msg("user", str(results), "user"))
```
Step 5. Human-in-the-loop for irreversible actions
```python
from agentscope.tools import require_approval

@require_approval(prompt="Send this email?")
def send_email(to: str, subject: str, body: str) -> str:
    ...  # real send

toolkit.register(send_email)
```
When the agent calls send_email, AgentScope pauses and surfaces the call plus args to a human via console, web UI, or Slack (configurable). Approve and it sends. Reject and the agent replans.
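Under the hood, an approval gate is just a decorator that intercepts the call. Here is a minimal, framework-free sketch of the idea; this is illustrative only, not AgentScope's implementation, and `ask_human` is a hypothetical stand-in for whatever channel (console, web UI, Slack) surfaces the request:

```python
# Illustrative sketch of an approval gate, NOT AgentScope's internals.
# `ask_human` is a hypothetical hook: it receives the prompt and a
# description of the pending call, and returns True to approve.
from functools import wraps

class Rejected(Exception):
    """Raised when the human reviewer declines the tool call."""

def require_approval(prompt, ask_human=None):
    """Wrap a tool so it only runs after explicit human approval."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            ask = ask_human or (lambda p, call: input(f"{p} [y/n] ") == "y")
            call_repr = f"{fn.__name__}(args={args}, kwargs={kwargs})"
            if ask(prompt, call_repr):
                return fn(*args, **kwargs)
            raise Rejected(call_repr)  # the agent catches this and replans
        return wrapper
    return decorator

# Auto-approve here so the example runs non-interactively.
@require_approval("Send this email?", ask_human=lambda p, call: True)
def send_email(to, subject, body):
    return f"sent to {to}"
```

The real framework version also has to serialize the pending call, park the agent's state while waiting, and resume it on a decision; the decorator shape is the same.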
Step 6. Deploy: local, serverless, or K8s
Local (dev):
```bash
python my_agent.py
```
Serverless (e.g., AWS Lambda): AgentScope ships an ASGI adapter.
```python
from agentscope.server import create_asgi_app

app = create_asgi_app(agent)  # wrap with Mangum for Lambda
```
Kubernetes: there's a Helm chart with a worker plus orchestrator split.
```bash
helm install my-agents agentscope/agents -f values.yaml
```
Same agent code runs in all three environments. You pick the deployment target per workload.
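To make the serverless path concrete, here is a hedged sketch of what an adapter like `create_asgi_app` does: wrap any callable agent in a plain ASGI app. This is an illustration of the pattern, not AgentScope's actual code:

```python
# Hedged sketch: a minimal ASGI wrapper around a callable agent.
# Illustrates the adapter pattern only; AgentScope's real adapter
# handles streaming, auth, and state that this omits.
import json

def create_asgi_app(agent):
    async def app(scope, receive, send):
        assert scope["type"] == "http"
        # Drain the request body (ASGI may deliver it in chunks).
        body = b""
        while True:
            event = await receive()
            body += event.get("body", b"")
            if not event.get("more_body"):
                break
        prompt = json.loads(body or b"{}").get("prompt", "")
        reply = agent(prompt)  # synchronous agent call for simplicity
        payload = json.dumps({"content": reply}).encode()
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"application/json")],
        })
        await send({"type": "http.response.body", "body": payload})
    return app
```

On Lambda you would hand this to Mangum (`handler = Mangum(app)`); locally, any ASGI server such as uvicorn can serve it unchanged.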
Step 7. Monitor, debug, evaluate
Tracing. Enable it once and you get a timeline of every agent turn, every tool call, every token.
```python
agentscope.init(
    ...,
    studio_url="http://localhost:5000",  # local observability UI
)
```
Open localhost:5000 and you see a live dashboard: agent graphs, call traces, token usage, latency histograms.
Evaluation. Attach an eval set to any agent:
```python
from agentscope.evals import Benchmark

bench = Benchmark.from_csv("support-cases.csv")
score = bench.run(agent, metrics=["exact_match", "llm_judge"])
print(score.summary())
```
Run it in CI. Block merges on regressions.
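The CI gate itself can be as simple as a threshold check. A hedged, framework-free sketch of the merge-blocking logic (a real `Benchmark` returns richer metrics; `gate` and `exact_match_rate` are hypothetical names for illustration):

```python
# Hedged sketch of an eval gate for CI. Hypothetical helpers, not
# AgentScope API: the point is failing the job when quality regresses.
def exact_match_rate(agent, cases):
    """cases: list of (prompt, expected) pairs."""
    hits = sum(1 for prompt, expected in cases if agent(prompt) == expected)
    return hits / len(cases)

def gate(agent, cases, threshold=0.9):
    """Exit non-zero (failing the CI job) if the score drops below threshold."""
    score = exact_match_rate(agent, cases)
    if score < threshold:
        raise SystemExit(f"eval regression: {score:.2%} < {threshold:.0%}")
    return score
```

Pin the threshold to your last known-good score so any regression blocks the merge instead of landing silently.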
Replay. Any trace can be replayed after a bug fix to confirm the same case now passes.
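Replay is conceptually simple: persist a failing trace's inputs and outputs, then re-run the same inputs after the fix and compare. A hedged, simplified sketch; `record` and `replay` are hypothetical names, and AgentScope's real trace format captures far more than this:

```python
# Hedged sketch of the replay idea, not AgentScope's trace format.
# record() saves one prompt/reply pair; replay() re-runs the prompt
# and reports whether the agent's answer changed.
import json

def record(agent, trace_path, prompt):
    reply = agent(prompt)
    with open(trace_path, "w") as f:
        json.dump({"prompt": prompt, "reply": reply}, f)
    return reply

def replay(agent, trace_path):
    with open(trace_path) as f:
        trace = json.load(f)
    new_reply = agent(trace["prompt"])
    # True means the agent now reproduces the recorded behavior.
    return new_reply == trace["reply"], new_reply
```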
Architecture at a glance
```
┌────────────────────────────────────────────────────────────┐
│  User request                                              │
│       │                                                    │
│       ▼                                                    │
│  ┌─────────────┐                                           │
│  │ Orchestrator│ ── plans                                  │
│  └─────────────┘                                           │
│       │                                                    │
│       ├─▶ ReActAgent(finance) ──uses──▶ Tools + Memory     │
│       ├─▶ ReActAgent(web)     ──uses──▶ Tools + Memory     │
│       └─▶ ReActAgent(writer)  ──uses──▶ Tools + Memory     │
│                                                            │
│  ┌─────────────┐                                           │
│  │ H-i-L gate  │ ◀── approval required for risky tools     │
│  └─────────────┘                                           │
│                                                            │
│  ┌─────────────┐                                           │
│  │ Tracer      │ ── all events ──▶ Studio UI / OTel        │
│  └─────────────┘                                           │
│                                                            │
│  Deploys to: local | serverless | K8s (same code)          │
└────────────────────────────────────────────────────────────┘
```
When to pick AgentScope vs. alternatives
| Use case | Best choice |
|---|---|
| Code-first multi-agent, Python-native, deploy to K8s | AgentScope |
| Visual drag-and-drop builder, OpenAI-only | AgentKit |
| Browser/web agent with human steering UX | Magentic-UI |
| Single-agent code assistant | Claude Agent SDK |
| LangChain ecosystem integration required | LangGraph |
One thing to watch out for
Multi-agent looks cool in a demo. In production, it's only worth the overhead when the task actually has parallelizable subtasks, a verification step, or specialization that pays off. A well-designed single ReAct agent beats a poorly designed five-agent swarm. Start single; add agents only when you measure a clear win. (This isn't an AgentScope-specific warning. It's the #1 lesson from every team that's shipped multi-agent systems.)
What changes once you have this
Before:
- Demo works. Prod crashes. You rebuild twice.
- Multi-agent means "copy the same class three times and hope."
- Human approval is a hacked-in `input("y/n")`.
- No traces when it fails. No evals when it regresses.
After:
- Same code path in laptop, Lambda, and K8s.
- Multi-agent is one import.
- H-i-L is a decorator.
- Studio UI shows the full trace with one env var.
- Evals block bad deploys in CI.
You stop building around your agent framework and start building with it.