Once your agent is running, validate it before going live. Storyboards exercise a specific workflow end-to-end: media buy creation, creative sync, signals discovery. Each storyboard defines the exact tool call sequence a buyer agent makes and validates every response shape. Storyboards are available from the command line and interactively through Addie. They are also published alongside schemas at /compliance/{version}/ and bundled into the per-version protocol tarball at /protocol/{version}.tgz; see Schemas and SDKs for how to fetch them offline.
The @adcp/sdk package also exports legacy TypeScript test runners under testing/scenarios/* (e.g. media-buy.ts, signals.ts). These predate comply() and are not the conformance specification. If you find yourself grepping those files to learn what AdCP requires, see Storyboards vs. scenarios for which surface is normative.
Wrapping an upstream platform (DSP, SSP, retail data warehouse, creative server, signal marketplace)? Storyboards check your AdCP wire contract; they cannot tell whether the adapter behind the wire actually integrates with the upstream or returns shape-valid responses with synthetic data. See Validate adapter agents with mock upstream fixtures: published mock fixtures plus traffic counters give you façade-resistant compliance for adapters in any language.
Storyboard taxonomy
Storyboards are organized into three layers so agents declare only what they actually support:
| Layer | Path | Who must pass it |
|---|---|---|
| Universal | /compliance/{version}/universal/ | Every AdCP agent (capability discovery, error handling, schema validation) |
| Protocol | /compliance/{version}/protocols/{protocol}/ | Any agent claiming a protocol (media-buy, creative, signals, governance, brand) |
| Specialism | /compliance/{version}/specialisms/{id}/ | Opt-in claims (e.g. sales-guaranteed, sales-broadcast-tv, creative-generative) — see the Compliance Catalog |
Declare supported_protocols and specialisms in get_adcp_capabilities; the runner picks the matching storyboards automatically. See the Compliance Catalog for the full taxonomy.
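To make the layering concrete, the sketch below maps an agent's declared capabilities to the storyboard directories listed in the table. It is illustrative only: storyboardPaths and the Capabilities type are hypothetical names, not part of @adcp/sdk, the version string is a placeholder, and the real runner may resolve storyboards differently.

```typescript
// Hypothetical shape: only the two fields named in the text are modeled.
interface Capabilities {
  supported_protocols: string[];
  specialisms: string[];
}

// Returns the compliance directories an agent with these declarations
// would be tested against, following the three-layer table.
function storyboardPaths(version: string, caps: Capabilities): string[] {
  const base = `/compliance/${version}`;
  return [
    `${base}/universal/`, // every agent must pass the universal layer
    ...caps.supported_protocols.map((p) => `${base}/protocols/${p}/`),
    ...caps.specialisms.map((id) => `${base}/specialisms/${id}/`),
  ];
}

// "3.0" is a placeholder version used only for illustration.
const paths = storyboardPaths("3.0", {
  supported_protocols: ["media-buy"],
  specialisms: ["sales-guaranteed"],
});
```

An agent that declares no specialisms simply gets the universal layer plus one directory per claimed protocol.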
Setup
Save your agent as a named alias so you can reference it by name, using npx @adcp/sdk@latest --save-auth <alias> <url>. The alias is saved to ~/.adcp/config.json. You only need to do this once. Built-in aliases test-mcp and test-a2a point to the public test agents and need no setup.
Run a storyboard
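1. List available storyboards
Run npx @adcp/sdk@latest storyboard list.
2. Preview what a storyboard tests
Run npx @adcp/sdk@latest storyboard show <id>.
3. Run the storyboard
Run npx @adcp/sdk@latest storyboard run <agent> <id>, for example npx @adcp/sdk@latest storyboard run my-agent media_buy_seller.
Add --json for machine-readable results. Pass --debug to see full request/response payloads for each step.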
4. Debug a failing step
If a step fails, run it individually with npx @adcp/sdk@latest storyboard step <agent> <id> <step>. Pass --context to provide state from earlier steps (account IDs, product IDs).
5. Run all storyboards
Run npx @adcp/sdk@latest storyboard run <agent> without a storyboard ID to test everything. The CLI discovers your agent’s tools via tools/list and selects matching storyboards automatically.
Add --json for structured output.
The storyboard runner operates in two modes depending on whether your agent implements the optional compliance test controller:
| Mode | When | What it tests |
|---|---|---|
| Observational | No test controller | Response schemas and buyer-initiated flows |
| Deterministic | Test controller present | Full lifecycle state machines, error codes, operation gates |
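The mode choice in the table can be pictured as a check against the agent's advertised tool list. This is a minimal sketch under assumptions: the tool name comply_test_controller and the idea that detection happens through the tool list are both assumptions for illustration, and selectRunnerMode is a hypothetical helper, not an SDK export.

```typescript
type RunnerMode = "observational" | "deterministic";

// Assumed tool name for the optional compliance test controller;
// check the Compliance test controller page for the actual name.
const TEST_CONTROLLER_TOOL = "comply_test_controller";

// Deterministic mode applies only when the controller tool is advertised.
function selectRunnerMode(advertisedTools: string[]): RunnerMode {
  return advertisedTools.indexOf(TEST_CONTROLLER_TOOL) !== -1
    ? "deterministic"
    : "observational";
}

const withoutController = selectRunnerMode(["get_products", "create_media_buy"]);
const withController = selectRunnerMode(["get_products", TEST_CONTROLLER_TOOL]);
```

Under these assumptions, an agent without the controller still gets schema and buyer-flow coverage, but not the lifecycle and error-code checks.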
Validate through Addie
Addie provides interactive testing without any CLI setup. Paste your agent URL in any conversation to get started.
Connectivity check
Ask Addie to check your agent. She’ll verify it’s online, list its advertised tools, and confirm the transport protocol (MCP or A2A). This is the quickest way to confirm your agent is reachable before running any tests.
Storyboard coaching
Addie runs the same storyboards as the CLI but walks you through each step interactively. When a step fails, she explains what went wrong, shows the expected vs actual response, and suggests specific code changes. This is the fastest way to iterate when you’re building.
RFP testing
Share a real RFP or campaign brief with Addie. She’ll parse it, call your agent’s get_products with the buyer’s actual requirements, and compare results against what your sales team would normally propose. This tests whether your agent can handle real buyer demand, not just synthetic briefs derived from your own inventory description.
IO execution testing
Share an insertion order with Addie. She’ll extract the line items, match them against your agent’s product catalog, and test whether create_media_buy can execute the deal. The output shows line-by-line matching quality (exact, close, weak, unmapped) and rate comparisons so you can see exactly where execution would break down.
Recommended testing sequence
- Connectivity — Is the agent online?
- Storyboards — Does it pass protocol compliance?
- RFP testing — Can it respond to real buyer demand?
- IO execution — Can it close real deals?
Sandbox mode
All storyboard runs use sandbox mode by default. The storyboard runner sets sandbox: true on every account reference, so your agent processes requests without real platform calls or spend.
Your agent should declare sandbox support in get_adcp_capabilities and include sandbox: true in success responses.
See Sandbox mode for full implementation details and the two account model paths (implicit vs explicit).
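The sandbox round trip described above can be sketched as two small helpers: one that marks an account reference the way the runner does, and one that checks a success response carries the flag back. The AccountRef and ToolResponse shapes, including the account_id and errors field names, are hypothetical simplifications, not the AdCP schemas.

```typescript
// Hypothetical minimal shapes; real AdCP payloads carry many more fields.
interface AccountRef {
  account_id: string;
  sandbox?: boolean;
}
interface ToolResponse {
  sandbox?: boolean;
  errors?: unknown[];
}

// Mirrors what the text says the runner does: mark the account reference as sandbox.
function withSandbox(account: AccountRef): AccountRef {
  return { ...account, sandbox: true };
}

// A success response (no errors) is expected to include sandbox: true.
// Error responses are not checked here, so they pass vacuously.
function echoesSandbox(res: ToolResponse): boolean {
  const isSuccess = !res.errors || res.errors.length === 0;
  return !isSuccess || res.sandbox === true;
}

const request = withSandbox({ account_id: "acct_test" });
const compliant = echoesSandbox({ sandbox: true });
const missingFlag = echoesSandbox({});
```

A check like echoesSandbox could run in your own integration tests before you point the storyboard runner at the agent.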
Verifying cross-instance state
The protocol requires that (brand, account)-scoped state survive across agent process instances: a media buy created on one replica must be readable from any other. Single-instance storyboard success does not by itself prove that invariant. Choose a verification approach that fits your deployment.
Verify by architecture. If you run on a managed serverless platform with a shared datastore — Lambda + DynamoDB, Cloudflare Workers + D1, Cloud Run + Firestore, Vercel + Neon — the invariant holds by construction. Storyboards that pass against your deployed endpoint are sufficient. Document your storage pattern so it’s discoverable.
Verify by multi-instance testing. If you deploy long-running processes (containers, VMs, a classic app server behind a load balancer), put ≥2 replicas behind round-robin routing and run storyboards against the shared endpoint.
This matters most for storyboards marked stateful: true, the write→read sequences most likely to catch in-process state. Stateless probes (capability discovery, auth rejection, schema validation) are unaffected.
A typical failure is a read step that cannot find a resource created by an earlier write step, because the two requests landed on different replicas.
The usual cause is state held in an in-process Map or module-level variable.
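The failure mode is easy to reproduce in miniature. The sketch below simulates two replicas behind round-robin routing, once with per-replica in-process state and once with a shared store standing in for a database. All names (Replica, roundRobin, writeThenRead) are hypothetical and the store is a plain Map, so this illustrates the invariant and is not a test harness.

```typescript
type Store = Map<string, string>;

// A replica that keeps media buys in whatever store it is handed.
class Replica {
  private store: Store;
  constructor(store: Store) {
    this.store = store;
  }
  createMediaBuy(id: string, status: string): void {
    this.store.set(id, status);
  }
  getMediaBuy(id: string): string | undefined {
    return this.store.get(id);
  }
}

// Round-robin router: each call returns the next replica in turn.
function roundRobin(replicas: Replica[]): () => Replica {
  let next = 0;
  return () => replicas[next++ % replicas.length];
}

// Write through one routed call, read through the next;
// returns true if the read sees the write.
function writeThenRead(route: () => Replica): boolean {
  route().createMediaBuy("mb_1", "active");
  return route().getMediaBuy("mb_1") === "active";
}

// In-process state: each replica owns its own Map, so the read misses.
const isolatedOk = writeThenRead(
  roundRobin([new Replica(new Map()), new Replica(new Map())])
);

// Shared state: both replicas use one Map, standing in for a shared datastore.
const shared: Store = new Map();
const sharedOk = writeThenRead(
  roundRobin([new Replica(shared), new Replica(shared)])
);
```

Here isolatedOk is false and sharedOk is true, which is the same signal a stateful storyboard gives when run against two real replicas.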
Preparing to test uniform error responses
The uniform-response MUST requires byte-equivalent responses for “the id exists but the caller lacks access” and “the id does not exist” across every observable channel — error body, transport status, headers, side effects, and telemetry. Verifying this needs a paired-probe runner (adcp fuzz) that compares two responses per tool. The runner has two modes, and you need to plan tenant setup before you can exercise the strong one.
Baseline mode — single tenant. One auth token, two fresh UUIDs probed per tool. Catches id-echo in error bodies, header divergence outside the allowlist, MCP isError / A2A task.status.state divergence, and gross latency deltas. Cannot catch cross-tenant existence leaks, because neither probe resolves to a real resource.
Cross-tenant mode — two tenants. Tenant A seeds a resource (e.g., a property list, content standard, media buy, creative); tenant B probes against the seeded id plus a fresh UUID. Catches the full MUST, because it exercises the (exists, unauthorized) vs (does not exist) pair that baseline cannot construct.
Both modes exercise spec MUSTs. Only the cross-tenant path verifies the whole invariant.
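The core of a paired probe is a comparison of two responses channel by channel. The sketch below covers only status, headers, and body; the real runner also looks at side effects, telemetry, and latency. ProbeResponse, HEADER_ALLOWLIST, and divergences are hypothetical names, the allowlist contents are an assumption rather than the SDK's list, and the error strings in the sample data are made up.

```typescript
// Hypothetical probe result covering three of the observable channels.
interface ProbeResponse {
  status: number;
  headers: Record<string, string>;
  body: string; // raw response body as text
}

// Headers allowed to differ between probes. Assumed list, not the SDK's allowlist.
const HEADER_ALLOWLIST = ["date", "x-request-id"];

// Normalizes headers to sorted "name: value" lines, dropping allowlisted names.
function comparableHeaders(h: Record<string, string>): string[] {
  return Object.keys(h)
    .filter((k) => HEADER_ALLOWLIST.indexOf(k.toLowerCase()) === -1)
    .map((k) => `${k.toLowerCase()}: ${h[k]}`)
    .sort();
}

// Returns the channels on which the two probes diverge.
// An empty array means they are indistinguishable on these channels.
function divergences(a: ProbeResponse, b: ProbeResponse): string[] {
  const out: string[] = [];
  if (a.status !== b.status) out.push("status");
  if (comparableHeaders(a.headers).join("\n") !== comparableHeaders(b.headers).join("\n")) {
    out.push("headers");
  }
  if (a.body !== b.body) out.push("body");
  return out;
}

// Sample data: a "does not exist" probe and a leaky "exists but unauthorized" probe.
const notFound: ProbeResponse = {
  status: 404,
  headers: { "content-type": "application/json", date: "Mon" },
  body: '{"error":"not_found"}',
};
const leaky: ProbeResponse = {
  status: 403,
  headers: { "content-type": "application/json", date: "Tue" },
  body: '{"error":"forbidden"}',
};
const leaks = divergences(notFound, leaky);
```

In this sample, leaks contains "status" and "body": the differing date header is ignored because it is on the assumed allowlist, but the 403 versus 404 split and the differing error body both reveal that the resource exists.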
Minimum tenant setup
Provision two isolated test accounts against your agent:
- Tenant A: can create resources the invariant seeds (property lists, content standards, media buys, creatives). Sandbox-mode accounts are fine.
- Tenant B: read-only against shared discovery surfaces. MUST NOT share any per-tenant state with A beyond what your platform makes globally visible (e.g., published product catalogs).
Runner invocation
Tokens can be supplied through ADCP_AUTH_TOKEN and ADCP_AUTH_TOKEN_CROSS_TENANT. See the @adcp/sdk uniform-error-response invariant guide for the full flag list, the header allowlist, and the list of tools currently probed.
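One way to picture how the two token variables relate to the two modes is the helper below. It is a sketch under a stated assumption: that the second token's presence is what enables the cross-tenant leg. selectFuzzMode is a hypothetical function, not the runner's actual logic; consult the invariant guide for how adcp fuzz really decides.

```typescript
type FuzzMode = "baseline" | "cross-tenant";

// Picks the probe mode from the two token variables named in the text.
// Assumption: a second token enables cross-tenant mode; one token means baseline.
// Returns undefined when no primary token is set, since neither mode can run.
function selectFuzzMode(env: Record<string, string | undefined>): FuzzMode | undefined {
  const primary = env["ADCP_AUTH_TOKEN"];
  const crossTenant = env["ADCP_AUTH_TOKEN_CROSS_TENANT"];
  if (!primary) return undefined;
  return crossTenant ? "cross-tenant" : "baseline";
}

const oneTenant = selectFuzzMode({ ADCP_AUTH_TOKEN: "token-a" });
const twoTenants = selectFuzzMode({
  ADCP_AUTH_TOKEN: "token-a",
  ADCP_AUTH_TOKEN_CROSS_TENANT: "token-b",
});
```

In a Node script you could pass process.env to such a helper and fail a CI job when the result is not "cross-tenant", so that a baseline-only run is never mistaken for full coverage.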
Testing with only one tenant
If you haven’t provisioned a second tenant yet, run baseline anyway. It still catches a meaningful class of leaks, and the CLI flags the run as baseline-only so operators can see coverage is partial. Treat single-tenant fuzz as a pre-check, not a conformance signal: a clean baseline run does not prove the MUST holds. Add the cross-tenant leg before you claim uniform-response conformance.
The build-validate-fix loop
The typical development workflow:
- Build: Point a coding agent at a skill file to generate your agent
- Run: Start the agent locally (npx tsx agent.ts)
- Validate: Run the matching storyboard (npx @adcp/sdk@latest storyboard run my-agent media_buy_seller)
- Fix: Address any failures (missing fields, wrong status values, invalid transitions)
- Repeat: Run the storyboard again until all steps pass
- Full check: Run npx @adcp/sdk@latest storyboard run my-agent (no storyboard ID) for a full assessment before going live
For Practitioner certification, passing storyboard validation is the capstone — it proves your agent handles the complete protocol workflow for your chosen role track.
CLI reference
| Command | Description |
|---|---|
| npx @adcp/sdk@latest storyboard list | List all available storyboards |
| npx @adcp/sdk@latest storyboard show <id> | Preview storyboard structure |
| npx @adcp/sdk@latest storyboard run <agent> [id] | Run one storyboard, or all matching if no ID given |
| npx @adcp/sdk@latest storyboard step <agent> <id> <step> | Run a single step |
| npx @adcp/sdk@latest <agent> [tool] [payload] | Call any tool directly |
| npx @adcp/sdk@latest --save-auth <alias> <url> | Save agent alias |
| npx @adcp/sdk@latest --list-agents | List saved aliases |
Common flags: --json, --debug, --auth TOKEN, and --protocol mcp|a2a.
When a storyboard fails
- Storyboard troubleshooting — Error patterns mapped to root causes and fixes (missing fixtures, signature challenges, envelope drift, context echo, capability mismatches)
- Known spec ambiguities — Open spec gaps that affect conformance, with workarounds and issue links
What’s next
- Compliance test controller — Implement deterministic testing for full lifecycle coverage
- Task lifecycle — Status values, transitions, and polling
- Error handling — Error categories, codes, and recovery