Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
c2c16d2
test(claude-agent-sdk): add test scaffold and mock helpers
viniciusdsmello May 12, 2026
5ca2531
feat(claude-agent-sdk): traced_query emits root AGENT step with cost/…
viniciusdsmello May 12, 2026
6ed5e0e
feat(claude-agent-sdk): capture assistant turns as CHAT_COMPLETION steps
viniciusdsmello May 12, 2026
decc9bf
feat(claude-agent-sdk): capture tool calls via composed Pre/PostToolU…
viniciusdsmello May 12, 2026
1ecd6dc
test(claude-agent-sdk): MCP tool name parsing
viniciusdsmello May 12, 2026
be044ea
feat(claude-agent-sdk): subagent messages nest under their Agent Tool…
viniciusdsmello May 12, 2026
eb86625
feat(claude-agent-sdk): capture error subtypes and tool failures
viniciusdsmello May 12, 2026
ec60af9
test(claude-agent-sdk): user hooks compose with Openlayer hooks
viniciusdsmello May 12, 2026
bcdbd81
test(claude-agent-sdk): redact MCP server env/headers from trace meta…
viniciusdsmello May 12, 2026
362743b
feat(claude-agent-sdk): add trace_claude_agent_sdk() global init
viniciusdsmello May 12, 2026
44c64ac
feat(claude-agent-sdk): auto-instrument ClaudeSDKClient on trace_clau…
viniciusdsmello May 12, 2026
2f55d09
test(claude-agent-sdk): wrapper preserves stream identity and order
viniciusdsmello May 12, 2026
2b7d09b
test(claude-agent-sdk): add live integration test (gated on ANTHROPIC…
viniciusdsmello May 12, 2026
e672679
docs(claude-agent-sdk): add example notebook and public re-export
viniciusdsmello May 12, 2026
9662e22
chore(claude-agent-sdk): satisfy ruff (import order, ARG001, T201)
viniciusdsmello May 12, 2026
351c4b4
fix(claude-agent-sdk): use stable step name instead of prompt-derived…
viniciusdsmello May 12, 2026
bd27823
feat(claude-agent-sdk): capture options.system_prompt and options.age…
viniciusdsmello May 12, 2026
d645bb2
test(claude-agent-sdk): enrich live test with system_prompt + max_tur…
viniciusdsmello May 12, 2026
089ac76
feat(claude-agent-sdk): capture rawOutput, prompt-context, and camelC…
viniciusdsmello May 12, 2026
b9b5ff2
docs(claude-agent-sdk): add multi-agent example notebook
viniciusdsmello May 12, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
292 changes: 292 additions & 0 deletions examples/tracing/claude_agent_sdk/claude_agent_sdk_multi_agent.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,292 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openlayer-ai/openlayer-python/blob/main/examples/tracing/claude_agent_sdk/claude_agent_sdk_multi_agent.ipynb)\n",
"\n",
"# Multi-agent Claude Agent SDK tracing with Openlayer\n",
"\n",
"This notebook builds a richer agent than the basic example: a **codebase analyzer** that orchestrates two subagents and an in-process MCP tool. It's designed to exercise every step type the Openlayer wrapper captures so you can see a full trace tree in your dashboard:\n",
"\n",
"- **Root `AGENT` step** \u2014 `Claude Agent SDK query` with the user prompt, the resolved agent_config, cost, tokens, session_id, the full final ResultMessage as `rawOutput`.\n",
"- **Per-turn `CHAT_COMPLETION` steps** \u2014 one per assistant turn, with model, prompt/completion tokens, thinking content (if any), tool_calls list, and the full assistant message as `raw_output`.\n",
"- **`TOOL` steps** \u2014 one per tool invocation, bracketed by `PreToolUse` / `PostToolUse` hooks for precise timing. MCP tools get `mcp_server` and `mcp_tool_name` metadata parsed from the `mcp__server__tool` naming convention.\n",
"- **Nested subagent steps** \u2014 messages from a subagent run carry `parent_tool_use_id` pointing at the spawning `Agent` tool call, so the wrapper nests them under that ToolStep automatically."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# !pip install openlayer 'claude-agent-sdk>=0.1.81'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Environment variables"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENLAYER_API_KEY\"] = \"YOUR_OPENLAYER_API_KEY\"\n",
"os.environ[\"OPENLAYER_INFERENCE_PIPELINE_ID\"] = \"YOUR_INFERENCE_PIPELINE_ID\"\n",
"os.environ[\"ANTHROPIC_API_KEY\"] = \"YOUR_ANTHROPIC_API_KEY\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Turn on tracing (one line)\n",
"\n",
"`trace_claude_agent_sdk()` monkey-patches `claude_agent_sdk.query` and `ClaudeSDKClient` so every call is auto-traced. Your own hooks (if any) are preserved \u2014 ours are composed on top, never replacing yours."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from openlayer.lib import trace_claude_agent_sdk\n",
"\n",
"trace_claude_agent_sdk()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Define a custom MCP tool\n",
"\n",
"The SDK lets you expose any in-process Python function as an MCP tool via `@tool` + `create_sdk_mcp_server`. We define a `count_files` tool that takes a directory and returns a count of files by extension. In the trace it'll appear as a `TOOL` step named `mcp__file-stats__count_files` with `mcp_server: \"file-stats\"` and `mcp_tool_name: \"count_files\"` in metadata."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from collections import Counter\n",
"from pathlib import Path\n",
"\n",
"from claude_agent_sdk import tool, create_sdk_mcp_server\n",
"\n",
"\n",
"@tool(\"count_files\", \"Count files in a directory grouped by extension\", {\"directory\": str})\n",
"async def count_files(args):\n",
" target = Path(args[\"directory\"]).expanduser().resolve()\n",
" if not target.exists() or not target.is_dir():\n",
" return {\n",
" \"content\": [{\"type\": \"text\", \"text\": f\"No such directory: {target}\"}],\n",
" \"isError\": True,\n",
" }\n",
" counts = Counter()\n",
" for f in target.rglob(\"*\"):\n",
" if f.is_file():\n",
" counts[f.suffix or \"(no ext)\"] += 1\n",
" body = \"\\n\".join(f\"{ext}: {n}\" for ext, n in counts.most_common(20))\n",
" return {\"content\": [{\"type\": \"text\", \"text\": body or \"(empty)\"}]}\n",
"\n",
"\n",
"file_stats_server = create_sdk_mcp_server(\"file-stats\", \"1.0.0\", tools=[count_files])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Define two subagents\n",
"\n",
"Subagents are registered under the built-in `Agent` tool. When the main agent calls `Agent(name=\"code-reviewer\", \u2026)`, the SDK runs that subagent in its own context and the wrapper nests every message the subagent emits under the spawning `Agent` ToolStep via `parent_tool_use_id`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from claude_agent_sdk import AgentDefinition\n",
"\n",
"subagents = {\n",
" \"code-reviewer\": AgentDefinition(\n",
" description=\"Briefly reviews a code file for clarity, correctness, and style.\",\n",
" prompt=(\n",
" \"You are a senior code reviewer. The user will tell you which file to inspect. \"\n",
" \"Read that file once, then return exactly one observation about its quality \"\n",
" \"(strength or weakness). Be specific and concise \u2014 two sentences max.\"\n",
" ),\n",
" tools=[\"Read\"],\n",
" model=\"claude-haiku-4-5\",\n",
" ),\n",
" \"summary-writer\": AgentDefinition(\n",
" description=\"Writes a one-paragraph summary of an agent's findings.\",\n",
" prompt=(\n",
" \"You synthesize prior agent findings into a single one-paragraph summary \"\n",
" \"(3-5 sentences). Be specific and concise; do not invent details that weren't reported.\"\n",
" ),\n",
" tools=[],\n",
" model=\"claude-haiku-4-5\",\n",
" ),\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Wire it all together and run\n",
"\n",
"The main agent gets: the in-process MCP server, both subagents, and a small set of built-in tools. The prompt walks it through the orchestration in three steps so the resulting trace is easy to follow."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from textwrap import dedent\n",
"\n",
"from claude_agent_sdk import ClaudeAgentOptions, ResultMessage, query\n",
"\n",
"\n",
"options = ClaudeAgentOptions(\n",
" model=\"claude-haiku-4-5\",\n",
" system_prompt=(\n",
" \"You are a codebase analysis agent. Follow the user's three-step plan exactly, \"\n",
" \"in order. Be terse \u2014 the final answer should be a 4-line markdown report.\"\n",
" ),\n",
" allowed_tools=[\n",
" \"Glob\",\n",
" \"Read\",\n",
" \"Agent\",\n",
" \"mcp__file-stats__count_files\",\n",
" ],\n",
" mcp_servers={\"file-stats\": file_stats_server},\n",
" agents=subagents,\n",
" max_turns=15,\n",
")\n",
"\n",
"# Point the agent at this repository's integrations directory. Change to any\n",
"# directory you'd like to analyze.\n",
"target_dir = os.path.abspath(\"../../../src/openlayer/lib/integrations\")\n",
"\n",
"prompt = dedent(\n",
" f\"\"\"\\\n",
" Analyze the directory at: {target_dir}\n",
"\n",
" Follow this plan exactly:\n",
"\n",
" 1. Call the count_files tool with that directory to get a file-extension breakdown.\n",
" 2. Use Glob to list .py files under the directory and pick exactly ONE non-trivial file.\n",
" Dispatch the code-reviewer subagent to review that file briefly.\n",
" 3. Dispatch the summary-writer subagent to produce a one-paragraph summary of\n",
" (a) the extension counts and (b) the code-reviewer's finding.\n",
"\n",
" Output a 4-line markdown report: file counts, file reviewed, reviewer's observation,\n",
" and the summary-writer's paragraph.\n",
" \"\"\"\n",
")\n",
"\n",
"# Top-level ``await`` works in Jupyter; for plain Python scripts wrap in\n",
"# ``asyncio.run(...)``. The SDK can raise a trailing exception after the\n",
"# ResultMessage \u2014 we tolerate it so the trace still publishes cleanly.\n",
"result = None\n",
"try:\n",
" async for message in query(prompt=prompt, options=options):\n",
" if isinstance(message, ResultMessage):\n",
" result = message\n",
"except Exception as exc:\n",
" print(f\"(SDK raised after result: {exc})\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Final result"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(result.result if result else \"(no result)\")\n",
"print()\n",
"print(\"turns:\", result.num_turns)\n",
"print(\"cost:\", f\"${result.total_cost_usd:.4f}\")\n",
"print(\"session_id:\", result.session_id)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 8. What to look for in the Openlayer trace\n",
"\n",
"Open the trace in your inference pipeline and you should see:\n",
"\n",
"- A root **`Claude Agent SDK query`** AGENT step. On its metadata: `system_prompt`, `agents_defined` (both subagents with their prompts and tools), `agent_config` (resolved tools / mcp_servers / skills / plugins / permission_mode / cwd / model), `options` (model, max_turns, allowed_tools), `session_id`, `num_turns`, `stop_reason`, `model_usage` (per-model token + cost breakdown), and `rawOutput` (the full final ResultMessage as JSON).\n",
"- Multiple **`assistant turn N`** CHAT_COMPLETION steps under the root. Each has its own `prompt_tokens` / `completion_tokens`, `model`, thinking content (if present), tool_calls list, and `raw_output` containing the full assistant message.\n",
"- Multiple **TOOL** steps:\n",
" - `mcp__file-stats__count_files` \u2014 with `mcp_server`/`mcp_tool_name` parsed into metadata, and the directory listing as output.\n",
" - `Glob` and `Read` \u2014 the built-in file tools.\n",
" - `Agent` (twice) \u2014 one for `code-reviewer`, one for `summary-writer`. **Each Agent ToolStep contains nested CHAT_COMPLETION and TOOL steps** \u2014 those are the subagent's own assistant turns and tool calls, correlated via `parent_tool_use_id`.\n",
"\n",
"If you don't see some of the metadata fields in the trace UI's \"Other fields\" panel, click **+ Add field** on your inference pipeline and pick the columns you want to surface. The wrapper ships every field listed above; the pipeline decides which to render."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Optional: skills\n",
"\n",
"Claude Agent SDK loads skills from `.claude/skills/*/SKILL.md` files when `setting_sources=[\"project\"]` is enabled. They show up in `metadata.agent_config.skills` on the root step. If you want to test skill capture, add a `.claude/skills/example-skill/SKILL.md` file to this directory and re-run with:\n",
"\n",
"```python\n",
"options = ClaudeAgentOptions(..., setting_sources=[\"project\"])\n",
"```\n",
"\n",
"The skill names will appear in the agent_config metadata of the root step."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.10"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
Loading
Loading