AgentScope: Build AI Agents at Scale in Python

What it is

AgentScope is a production-ready open-source framework for building AI agents and multi-agent systems. It's not "another LangChain wrapper." It's a complete agent ecosystem purpose-built to take you from prototype to production without rewriting your stack when you scale.

In one sentence: if you've hit the wall where your toy agent breaks the moment you add a second agent, parallel tool calls, or real infrastructure, AgentScope is the framework designed for what comes next.

What you can actually do with it

Build agents in minutes. ReAct loops, tool use, persistent memory, all first-class.
Run multi-agent workflows. Orchestrator plus specialists, peer-to-peer, pipelines, fan-out/fan-in.
Add human-in-the-loop control. Approval gates, pause/resume, mid-task input.
Deploy anywhere. Local process, serverless function, Kubernetes cluster.
Monitor, debug, evaluate. Built-in tracing, replay, eval harness, dashboards.

Why you should be doing this

Three reasons it belongs in your stack:

1. "Prototype to production" is where most agent projects die

Every team ships a LangChain demo in a week, then spends three months trying to make it reliable, observable, and multi-tenant. AgentScope bakes production concerns in from the first line of code: async-first runtime, typed messages, built-in tracing, and deployment adapters for serverless and K8s.

2. Multi-agent is the default, not an afterthought

The winning agent architectures in 2025 are multi-agent with a clear coordination pattern (orchestrator plus specialists usually beats single-agent on non-trivial tasks). AgentScope treats multi-agent as the primary API, not a plugin stapled onto a single-agent core.

3. Human-in-the-loop is a button, not a project

Every serious agent needs approval gates for irreversible actions (send email, place order, deploy code). In most frameworks, you build that yourself. AgentScope ships it.

How to do it, step by step

Step 1. Install

pip install agentscope

Requires Python 3.10+. Set your model key:

export ANTHROPIC_API_KEY=...   # or OPENAI_API_KEY / DASHSCOPE_API_KEY

Step 2. Your first agent in about 15 lines

import agentscope
from agentscope.agents import ReActAgent
from agentscope.models import AnthropicChatModel
from agentscope.tools import BasicToolkit

agentscope.init(
    model_configs=[AnthropicChatModel(
        config_name="claude", model_name="claude-opus-4-7"
    )],
)

toolkit = BasicToolkit()           # includes bash, file I/O, web fetch
agent   = ReActAgent(
    name="researcher",
    sys_prompt="You are a careful senior analyst. Gather, act, verify.",
    model_config_name="claude",
    toolkit=toolkit,
)

reply = agent("Summarize the top 3 news items about LLM agents this week.")
print(reply.content)

That's a complete ReAct agent with tools. No boilerplate.

Step 3. Add memory

from agentscope.memory import TemporaryMemory

agent = ReActAgent(
    name="researcher",
    sys_prompt="...",
    model_config_name="claude",
    memory=TemporaryMemory(),   # in-process
    # or VectorMemory(persist_dir="./mem") for embedding-based long-term memory
)

agent("I'm building a SaaS in Next.js.")
agent("What stack did I just mention?")   # remembers

Step 4. Multi-agent workflow, orchestrator plus specialists

from agentscope.pipelines import SequentialPipeline
from agentscope.message import Msg

planner   = ReActAgent(name="planner",   model_config_name="claude", ...)
writer    = ReActAgent(name="writer",    model_config_name="claude", ...)
reviewer  = ReActAgent(name="reviewer",  model_config_name="claude", ...)

pipeline = SequentialPipeline([planner, writer, reviewer])
result   = pipeline(Msg("user", "Write a post on LLM agent patterns.", "user"))

For parallel fan-out (e.g., three research specialists, then a synthesizer):

from agentscope.pipelines import ForLoopPipeline, parallel_pipeline

specialists = [finance_agent, tech_agent, regulatory_agent]
results     = parallel_pipeline(specialists, Msg("user", "Research Apple", "user"))
synthesis   = synthesizer(Msg("user", str(results), "user"))

Step 5. Human-in-the-loop for irreversible actions

from agentscope.tools import require_approval

@require_approval(prompt="Send this email?")
def send_email(to: str, subject: str, body: str) -> str:
    ...  # real send

toolkit.register(send_email)

When the agent calls send_email, AgentScope pauses and surfaces the call plus args to a human via console, web UI, or Slack (configurable). Approve and it sends. Reject and the agent replans.

Step 6. Deploy: local, serverless, or K8s

Local (dev):

python my_agent.py

Serverless (e.g., AWS Lambda): AgentScope ships an ASGI adapter.

from agentscope.server import create_asgi_app
app = create_asgi_app(agent)   # wrap with Mangum for Lambda

Kubernetes: there's a Helm chart with a worker plus orchestrator split.

helm install my-agents agentscope/agents -f values.yaml

Same agent code runs in all three environments. You pick the deployment target per workload.

Step 7. Monitor, debug, evaluate

Tracing. Enable it once and you get a timeline of every agent turn, every tool call, every token.

agentscope.init(
    ...,
    studio_url="http://localhost:5000",   # local observability UI
)

Open localhost:5000 and you see a live dashboard: agent graphs, call traces, token usage, latency histograms.

Evaluation. Attach an eval set to any agent:

from agentscope.evals import Benchmark

bench = Benchmark.from_csv("support-cases.csv")
score = bench.run(agent, metrics=["exact_match", "llm_judge"])
print(score.summary())

Run it in CI. Block merges on regressions.

Replay. Any trace can be replayed after a bug fix to confirm the same case now passes.

Architecture at a glance

┌────────────────────────────────────────────────────────────┐
│                                                            │
│   User request                                             │
│        │                                                   │
│        ▼                                                   │
│   ┌─────────────┐                                          │
│   │ Orchestrator│ ─┐                                       │
│   └─────────────┘  │  plans                                │
│        │           │                                       │
│        ├─▶ ReActAgent(finance)  ──uses──▶ Tools + Memory   │
│        ├─▶ ReActAgent(web)      ──uses──▶ Tools + Memory   │
│        └─▶ ReActAgent(writer)   ──uses──▶ Tools + Memory   │
│                                                            │
│   ┌─────────────┐                                          │
│   │ H-i-L gate  │ ◀── approval required for risky tools    │
│   └─────────────┘                                          │
│                                                            │
│   ┌─────────────┐                                          │
│   │   Tracer    │ ── all events ──▶ Studio UI / OTel       │
│   └─────────────┘                                          │
│                                                            │
│   Deploys to: local | serverless | K8s (same code)         │
│                                                            │
└────────────────────────────────────────────────────────────┘

When to pick AgentScope vs. alternatives

Use case	Best choice
Code-first multi-agent, Python-native, deploy to K8s	AgentScope
Visual drag-and-drop builder, OpenAI-only	AgentKit
Browser/web agent with human steering UX	Magentic-UI
Single-agent code assistant	Claude Agent SDK
LangChain ecosystem integration required	LangGraph

One thing to watch out for

Multi-agent looks cool on demo. In production, it's only worth the overhead when the task actually has parallelizable subtasks, a verification step, or specialization that pays off. A well-designed single ReAct agent beats a poorly-designed five-agent swarm. Start single, add agents only when you measure a clear win. (This isn't an AgentScope-specific warning. It's the #1 lesson from every team that's shipped multi-agent systems.)

What changes once you have this

Before:

Demo works. Prod crashes. You rebuild twice.
Multi-agent means "copy the same class three times and hope."
Human approval is a hacked-in input("y/n").
No traces when it fails. No evals when it regresses.

After:

Same code path in laptop, Lambda, and K8s.
Multi-agent is one import.
H-i-L is a decorator.
Studio UI shows the full trace with one env var.
Evals block bad deploys in CI.

You stop building around your agent framework and start building with it.