AdCP — Get Test-Ready

Storyboards are the versioned buyer-simulation suite that decides whether your agent is published as conformant. Buyer agents filter on that status — overclaiming or failing storyboards is a public, permanent signal, not a CI warning. This page is the checklist between “I built an agent” and “I can run npx @adcp/sdk@latest storyboard run.”

The three surfaces the runner needs

The runner drives your agent through the same public tools a buyer would call, plus one sandbox-only tool for fixture setup. Three surfaces must be in place:

Surface	What it tells the runner	Where it lives
`get_adcp_capabilities`	Which protocols and specialisms you claim, and that you support sandbox	Your agent’s capability response
`sync_accounts` (or `list_accounts`)	How to obtain a sandbox account to run tests against	Your agent’s account tool
`comply_test_controller`	How to seed fixtures and force seller-side transitions deterministically	Your agent, sandbox-only

You ship these three surfaces. The runner owns storyboard selection, fixture ordering, and response comparison.

Step 1 — Declare capabilities honestly

get_adcp_capabilities is how the runner picks which storyboards apply to you. It is also the conformance contract: you are promising to pass every storyboard that matches what you declare. The example below is for a full-service guaranteed seller (proposal lifecycle enabled). A direct-buy guaranteed seller would set media_buy.supports_proposals: false (or omit it). A broadcast-TV seller would claim sales-broadcast-tv; a creative-only agent would claim the creative protocol with creative-ad-server or creative-generative specialisms; a signals provider would claim signals. The pattern is the same: declare only what you actually implement.

{
  "supported_protocols": ["media_buy", "creative"],
  "specialisms": ["sales-guaranteed"],
  "media_buy": {
    "supports_proposals": true
  },
  "account": {
    "sandbox": true,
    "require_operator_auth": false
  }
}

supported_protocols — Pulls in the matching protocol storyboards from /compliance/{version}/protocols/.
specialisms — Pulls in opt-in specialism storyboards (see the Compliance Catalog for the full enumeration).
account.sandbox: true — Signals that you honor sandbox semantics (no real spend, no production side effects).
account.require_operator_auth — Determines your sandbox bootstrap path (step 2).

Claiming sales-guaranteed when you only run RTB ships you into storyboards you will fail on record. Conformance status is part of the Verified badge buyer agents use to filter sellers — overclaim once, lose inclusion everywhere.
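
One way to make overclaiming harder is to derive the declaration from what is actually wired up instead of hand-editing the JSON. The sketch below is illustrative only: `buildCapabilities` and the boolean registries are hypothetical helpers, not part of the SDK, and the field names simply mirror the example response above.

```typescript
// Hypothetical registry: each key maps to a flag that your own code sets
// once the corresponding handlers are really implemented.
type Implemented = Record<string, boolean>;

interface Capabilities {
  supported_protocols: string[];
  specialisms: string[];
  media_buy?: { supports_proposals: boolean };
  account: { sandbox: boolean; require_operator_auth: boolean };
}

function buildCapabilities(
  protocols: Implemented,
  specialisms: Implemented,
  opts: { supportsProposals: boolean; requireOperatorAuth: boolean },
): Capabilities {
  // Keep only the entries whose flag is true, in a stable order.
  const enabled = (m: Implemented) => Object.keys(m).filter((k) => m[k]).sort();
  const caps: Capabilities = {
    supported_protocols: enabled(protocols),
    specialisms: enabled(specialisms),
    // Assumption: this deployment honors sandbox semantics.
    account: { sandbox: true, require_operator_auth: opts.requireOperatorAuth },
  };
  // Only emit the media_buy block when the protocol itself is claimed.
  if (protocols["media_buy"]) {
    caps.media_buy = { supports_proposals: opts.supportsProposals };
  }
  return caps;
}

const caps = buildCapabilities(
  { media_buy: true, creative: true, signals: false },
  { "sales-guaranteed": true, "sales-broadcast-tv": false },
  { supportsProposals: true, requireOperatorAuth: false },
);
```

With this shape, a specialism appears in the response only after its flag is switched on, so the declaration cannot drift ahead of the implementation.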

Step 2 — Pick your sandbox bootstrap path

The runner must obtain a sandbox account before it can do anything. Your require_operator_auth flag chooses the path:

Implicit accounts (require_operator_auth: false). Your agent accepts sync_accounts from any authenticated buyer. The runner calls sync_accounts with sandbox: true to mint a test account on demand. Most new sales agents start here.
Explicit accounts (require_operator_auth: true). Accounts must be pre-provisioned by a human on your side. The runner calls list_accounts with a sandbox filter to discover pre-existing test accounts. Publish a short note telling operators how to request one: include the contact, the expected turnaround, and what credentials they’ll receive.

Full details and examples: Sandbox mode.
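
The choice between the two paths can be sketched as a small decision function. This mirrors the description above and is not the runner's actual code; `sandboxBootstrap` is a hypothetical name, and treating an omitted require_operator_auth as false is an assumption.

```typescript
type BootstrapCall = {
  tool: "sync_accounts" | "list_accounts";
  reason: string;
};

// Decide which account tool a runner would call, based on the `account`
// block of the capability response.
function sandboxBootstrap(account: {
  sandbox?: boolean;
  require_operator_auth?: boolean;
}): BootstrapCall {
  if (account.sandbox !== true) {
    throw new Error("agent does not declare account.sandbox: true");
  }
  return account.require_operator_auth === true
    ? { tool: "list_accounts", reason: "explicit accounts: discover pre-provisioned sandbox accounts" }
    : { tool: "sync_accounts", reason: "implicit accounts: mint a sandbox account on demand" };
}

const implicitPath = sandboxBootstrap({ sandbox: true, require_operator_auth: false });
const explicitPath = sandboxBootstrap({ sandbox: true, require_operator_auth: true });
```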

Step 3 — Implement the compliance test controller

Without a compliance test controller, the runner tests only buyer-initiated flows (observational mode): schema conformance, auth rejection, happy-path buyer calls. That is enough for a first pass and for capability discovery, but conformance treats deterministic mode (full lifecycle walks enabled by the controller) as the bar for specialism coverage.

comply_test_controller is a single sandbox-only tool with a scenario parameter covering three families:

Scenario family	What it does	When you need it
`seed_*`	Create fixtures (products, pricing options, creatives, plans, media buys) with caller-supplied IDs	Almost every storyboard — this replaces hardcoded-ID discovery
`force_*`	Drive entities through state transitions that are normally seller-initiated	Any storyboard that tests a state machine (creative approval, account suspension, etc.)
`simulate_*`	Inject delivery data or budget spend	Reporting and budgeting storyboards

See the Compliance test controller reference for scenario-by-scenario parameters and response shapes, and the Compliance Catalog for which scenarios each specialism requires.
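
When planning which scenarios to implement, it can help to group the names a storyboard requires by family prefix. A minimal sketch: `scenarioFamily` and `groupByFamily` are hypothetical helpers, and the scenario names other than seed_product are inferred from the family prefixes rather than taken from the reference.

```typescript
type Family = "seed" | "force" | "simulate";

// Classify a scenario name by its prefix; anything else is "unknown".
function scenarioFamily(scenario: string): Family | "unknown" {
  const match = /^(seed|force|simulate)_.+$/.exec(scenario);
  return match ? (match[1] as Family) : "unknown";
}

// Group a list of required scenario names by family.
function groupByFamily(scenarios: string[]): Record<Family | "unknown", string[]> {
  const groups: Record<Family | "unknown", string[]> = {
    seed: [],
    force: [],
    simulate: [],
    unknown: [],
  };
  for (const s of scenarios) {
    groups[scenarioFamily(s)].push(s);
  }
  return groups;
}

const plan = groupByFamily([
  "seed_product",
  "seed_creative",
  "force_creative_status",
  "simulate_delivery",
]);
```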

Wiring the SDK scaffold

@adcp/sdk (6.x is the production GA on AdCP 3.0) ships createComplyController so you wire your data layer to the controller without reimplementing tool registration, param validation, error envelopes, or re-seed idempotency.

npm install @adcp/sdk

The scaffold is TypeScript/JavaScript. Python, Go, and Java sellers implement the tool directly against the schema — the contract below (adapters, error codes, idempotency semantics) applies the same way. SDKs for other languages are tracked in Choose your SDK.

import { createComplyController, TestControllerError } from '@adcp/sdk/testing';
// `server` is your AdcpServer or MCP server instance (see `createAdcpServer` in
// `@adcp/sdk/server` if you need a reference setup).
// `productRepo` and `creativeRepo` stand in for your own sandbox data layer.

const controller = createComplyController({
  seed: {
    product: async ({ product_id, fixture }) => {
      await productRepo.upsert(product_id, fixture);
    },
    creative: async ({ creative_id, fixture }) => {
      await creativeRepo.upsert(creative_id, fixture);
    },
    // Add pricing_option, plan, media_buy as your claimed storyboards require.
  },

  force: {
    creative_status: async ({ creative_id, status, rejection_reason }) => {
      const previous = await creativeRepo.getStatus(creative_id);
      if (previous == null) {
        throw new TestControllerError('NOT_FOUND', `creative ${creative_id} not found`);
      }
      const result = await creativeRepo.transition(creative_id, status, rejection_reason);
      if (result.kind === 'invalid_transition') {
        throw new TestControllerError('INVALID_TRANSITION', result.message, previous);
      }
      return { success: true, previous_state: previous, current_state: result.status };
    },
    // Add account_status, media_buy_status, session_status as needed.
  },

  // simulate: { delivery, budget_spend } — add if you claim reporting/budget specialisms.
});

// Primary gate: register the tool only in sandbox deployments, so it never
// appears in production `tools/list`.
if (process.env.ADCP_SANDBOX === '1') {
  controller.register(server);
}

What the scaffold handles for you:

Tool registration and schema. controller.toolDefinition stays in sync with the published spec version.
Dispatch and UNKNOWN_SCENARIO. Scenarios you do not register return UNKNOWN_SCENARIO automatically — never a schema error.
Param validation. Invalid params produce INVALID_PARAMS with a readable error_detail without reaching your adapter.
Seed idempotency. Calling seed_product twice with the same product_id and an equivalent fixture returns previous_state: "existing"; a divergent fixture returns INVALID_PARAMS. Your adapter is only invoked on the first seed.
Typed error envelopes. Throw TestControllerError(code, message, currentState?) with code in 'INVALID_TRANSITION' | 'NOT_FOUND' | 'FORBIDDEN' | 'INVALID_PARAMS' from any adapter.
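
For sellers implementing the tool directly against the schema, the seed idempotency rule above can be sketched as follows. This illustrates the described semantics and is not the SDK's implementation: `makeSeeder` and `canonical` are hypothetical helpers, the adapter is synchronous for brevity (a real one would likely be async), and the "none" value returned on a first seed is a placeholder assumption.

```typescript
type SeedResult =
  | { success: true; previous_state: "none" | "existing" }
  | { success: false; error: "INVALID_PARAMS"; error_detail: string };

// Serialize with sorted object keys so equivalent fixtures compare equal
// regardless of key order.
function canonical(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonical).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonical(obj[k])}`)
      .join(",");
    return `{${body}}`;
  }
  return String(JSON.stringify(value));
}

function makeSeeder<F>(adapter: (id: string, fixture: F) => void) {
  const seen = new Map<string, string>(); // id -> canonical fixture
  return (id: string, fixture: F): SeedResult => {
    const fingerprint = canonical(fixture);
    const prior = seen.get(id);
    if (prior === undefined) {
      adapter(id, fixture); // the adapter runs on the first seed only
      seen.set(id, fingerprint);
      return { success: true, previous_state: "none" };
    }
    if (prior === fingerprint) {
      return { success: true, previous_state: "existing" };
    }
    return {
      success: false,
      error: "INVALID_PARAMS",
      error_detail: `fixture for ${id} diverges from the one already seeded`,
    };
  };
}

let adapterCalls = 0;
const seedProduct = makeSeeder<Record<string, unknown>>(() => {
  adapterCalls += 1;
});
const first = seedProduct("p-1", { name: "Hero", cpm: 12 });
const repeat = seedProduct("p-1", { cpm: 12, name: "Hero" }); // same content, different key order
const divergent = seedProduct("p-1", { name: "Hero", cpm: 15 });
```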

The scaffold does not own the state machine. Transition rules live in your adapters, so compliance testing and production share one source of truth. The section Avoiding the teach-to-test trap depends on exactly that property.
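
A single transition table that both production code and the force adapter call is one way to keep that shared source of truth. The sketch below assumes hypothetical creative statuses and allowed moves; substitute the states your platform actually uses. The result shape matches the `result.kind === 'invalid_transition'` check in the scaffold example above.

```typescript
// Hypothetical statuses, for illustration only.
type CreativeStatus = "pending_review" | "approved" | "rejected" | "archived";

// One table of allowed moves, consulted by production and sandbox alike.
const ALLOWED: Record<CreativeStatus, readonly CreativeStatus[]> = {
  pending_review: ["approved", "rejected"],
  approved: ["archived"],
  rejected: ["pending_review"],
  archived: [],
};

type TransitionResult =
  | { kind: "ok"; status: CreativeStatus }
  | { kind: "invalid_transition"; message: string };

function transition(from: CreativeStatus, to: CreativeStatus): TransitionResult {
  return ALLOWED[from].includes(to)
    ? { kind: "ok", status: to }
    : { kind: "invalid_transition", message: `cannot move creative from ${from} to ${to}` };
}

const approved = transition("pending_review", "approved");
const blocked = transition("archived", "approved");
```

Because the force adapter goes through the same `transition` function as a production review would, a storyboard that forces an illegal move gets the same rejection a real workflow would produce.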

Two layers of sandbox gating

The scaffold supports two gates. Ship both in any deployment that serves both sandbox and production traffic from the same process:

Registration gate (primary). Wrap controller.register(server) in an environment check. This is what keeps comply_test_controller out of production tools/list entirely. Without it, a leaked sandbox credential on a production endpoint exposes seller-side state-forcing.
Per-request gate (defense-in-depth). Pass a sandboxGate: (input) => boolean to createComplyController. The scaffold calls it on every request and returns FORBIDDEN when it returns false. Use this on shared-process deployments where the tool IS registered but some requests might still reference a production account.

sandboxGate receives the raw tool input (Record<string, unknown>). The SDK does not plumb auth context onto it — you decide what to inspect. A typical pattern is to pull the referenced entity ID out of params and verify it belongs to a sandbox account in your own data layer:

sandboxGate: async (input) => {
  const params = input.params as { account_id?: string; media_buy_id?: string } | undefined;
  const accountRef = params?.account_id
    ?? (params?.media_buy_id && await mediaBuyRepo.getAccountId(params.media_buy_id));
  return typeof accountRef === 'string' && await accountRepo.isSandbox(accountRef);
}

For custom MCP wrappers — AsyncLocalStorage for per-request auth, transport-level sandbox gating, session-backed stores — compose the lower-level handleTestControllerRequest, toMcpResponse, and TOOL_INPUT_SHAPE from @adcp/sdk/server directly.

Step 4 — Run the storyboard runner

Once the three surfaces are in place, the runner takes over:

npx @adcp/sdk@latest --save-auth my-agent http://localhost:3001/mcp
npx @adcp/sdk@latest storyboard run my-agent

The runner discovers your capabilities, obtains a sandbox account, seeds fixtures via the controller, and walks each matching storyboard. See Validate Your Agent for the full CLI, debug flags, and Addie workflows.

Avoiding the teach-to-test trap

Storyboards hardcode fixture IDs such as "test-product", "campaign_hero_video", and "acmeoutdoor.example". A controller that special-cases those strings passes the suite while silently failing on every real buyer. That is exactly the cost conformance exists to prevent: every post-conformance integration failure burns seller reputation, inflates buyer agent skepticism, and slows protocol adoption.

The SDK scaffold already points you in the right direction: adapters receive product_id, creative_id, etc. as values, not as conditions. If your adapter branches on product_id === "test-product", you have regressed.

Two rules of thumb:

Implement seed scenarios generically. seed_product accepts any product_id and persists a product with that ID in your sandbox data layer. Your adapter is a thin wrapper over a real upsert against your sandbox store.
The fixture object is the contract, the ID is not. Storyboard authors set fixture to the minimum shape the test needs. Everything beyond that — discovery, filtering, authorization — is your normal code path, exercised on fixture-seeded data the same way it runs on production data.

To check: swap a storyboard’s fixture IDs for random UUIDs and rerun. If the run still passes, your controller does not depend on specific IDs. If it breaks, you have hardcoded behavior to fix.
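
The ID swap can be scripted. The sketch below treats a storyboard as plain JSON and replaces every string that exactly matches a known fixture ID with a random UUID, using the same UUID for every occurrence of a given ID. The storyboard shape shown is hypothetical, `remapFixtureIds` is an illustrative helper rather than an SDK function, and IDs embedded inside longer strings are not rewritten.

```typescript
import { randomUUID } from "node:crypto";

// Recursively copy `node`, swapping known fixture IDs for UUIDs.
// `mapping` records each replacement so repeated IDs stay consistent.
function remapFixtureIds(
  node: unknown,
  ids: ReadonlySet<string>,
  mapping: Map<string, string> = new Map<string, string>(),
): unknown {
  if (typeof node === "string") {
    if (!ids.has(node)) return node;
    let replacement = mapping.get(node);
    if (replacement === undefined) {
      replacement = randomUUID();
      mapping.set(node, replacement);
    }
    return replacement;
  }
  if (Array.isArray(node)) {
    return node.map((item) => remapFixtureIds(item, ids, mapping));
  }
  if (node !== null && typeof node === "object") {
    return Object.fromEntries(
      Object.entries(node).map(([key, value]) => [key, remapFixtureIds(value, ids, mapping)]),
    );
  }
  return node;
}

// Hypothetical storyboard fragment, for illustration only.
const storyboard = {
  steps: [
    { scenario: "seed_product", params: { product_id: "test-product", fixture: { name: "Hero video" } } },
    { scenario: "seed_creative", params: { creative_id: "campaign_hero_video", fixture: {} } },
    { step: "discover", expect_ids: ["test-product"] },
  ],
};

const mapping = new Map<string, string>();
const swapped = remapFixtureIds(
  storyboard,
  new Set(["test-product", "campaign_hero_video"]),
  mapping,
);
```

Keeping the mapping consistent matters: a later step that expects the seeded product must still refer to the same (now random) ID, otherwise the rerun fails for reasons unrelated to hardcoding.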

Readiness checklist

Before your first full storyboard sweep:

get_adcp_capabilities returns only protocols and specialisms you actually implement
account.sandbox: true is declared and honored — sandbox requests produce no real spend, no production platform calls, no persisted production state
sync_accounts (implicit) or list_accounts (explicit) handles sandbox requests per step 2
comply_test_controller is absent from tools/list on any production endpoint
Requests that reference a non-sandbox account are rejected with FORBIDDEN
Every seed scenario your claimed storyboards depend on persists fixtures generically, with no ID special-cases
Every force scenario uses the same state-transition rules as production, returning typed errors on invalid transitions
A full storyboard sweep still passes when fixture IDs are swapped for random UUIDs

What’s next

Validate Your Agent — CLI, Addie workflows, and multi-instance verification
Compliance test controller reference — Full scenario-by-scenario spec
Sandbox mode — The two account model paths in depth
Conformance — What “conformant” and “verified” mean once your runs pass
