Compliance test controller

The compliance test controller is a dev/staging-only affordance, not a production-time concept. AAO grading does NOT require or use it. The AAO compliance heartbeat drives storyboards against the seller’s registered production URL with account.sandbox: true on every request, and the seller’s prod stack is responsible for honoring the flag; no controller endpoint is needed.

Sellers MAY implement the controller in their dev or staging environment to support their own integration testing: walking lifecycle state machines deterministically, seeding fixtures, forcing transitions that would otherwise require waiting for real time. That’s its purpose. It MUST NOT be exposed on production deployments (see Sandbox gating below).

Confused about how the controller relates to AAO Verified (Sandbox)? See #4379 for the framing decision: (Sandbox) attests “real production endpoint correctly handles sandbox-flagged traffic across the full storyboard suite.” The controller is the developer-side affordance for your testing, not the AAO-side grading mechanism.

AdCP defines lifecycle state machines for accounts, creatives, media buys, SI sessions, and delivery reporting. Many transitions in these state machines are seller-initiated: creative approval, account suspension, budget depletion, delivery accrual. A storyboard runner can only exercise buyer-initiated flows, leaving seller-initiated transitions untested.

The compliance test controller is an optional tool sellers expose in their dev/staging environment to support deterministic local testing. It allows a runner to trigger seller-side state transitions on demand, enabling end-to-end lifecycle verification during development.

Motivation

Without a test controller, compliance testing is observational: fire an action, read back whatever state exists, move on. This catches schema violations but not behavioral ones.

| Track | Observational (today) | Deterministic (with controller) |
| --- | --- | --- |
| Creative | Sync → observe initial status | Walk processing → approved → archived; force rejected with reason |
| Account | Read existing statuses | Force suspended → verify operation gates → reactivate |
| SI sessions | Initiate → message → terminate | Force terminated with timeout reason → verify SESSION_NOT_FOUND on next call |
| Reporting | Call get_media_buy_delivery → hope data exists | Simulate delivery → verify rollups |
| Budgeting | Create buy with budget → read back | Simulate spend to threshold → verify alerts and payment_required |
| Media buy | Create → pause → resume | Force seller-initiated rejected → verify terminal state |

Sandbox gating

Sellers MUST NOT expose comply_test_controller on production deployments, to anyone, on any surface. On a production deployment:
  • The tool MUST be absent from tools/list (MCP) and from the agent card’s skills[] (A2A).
  • The compliance_testing block MUST be absent from get_adcp_capabilities.
  • Dispatch MUST return the transport’s standard unknown-tool error (e.g., JSON-RPC -32601 Method not found for MCP, the unknown-skill rejection for A2A), indistinguishable from the same-transport response of a seller that does not implement the tool.
A production deployment that exposes the tool on any of these surfaces is non-conformant regardless of whether dispatch is gated.

The canonical pattern is two deployments: one production (no controller wired), one sandbox/staging (controller wired for all comers). Sellers expose comply_test_controller only on sandbox/staging deployments; any principal that can authenticate to such a deployment can call it.

Sellers MAY instead run a single deployment with mixed sandbox/live principals and project the tool per-principal, gating on the resolved account’s mode. This is an implementation pattern, not the canonical model. Sellers picking this pattern MUST gate all three surfaces consistently: tools/list (or skills[]), the compliance_testing capability block, and dispatch. Partial projection is non-conformant because it reopens the discovery side channel that deployment-scoping closes. Examples of partial projection: gating tools/list but leaving the compliance_testing block visible to live principals, or returning FORBIDDEN (rather than unknown-tool) to a live principal who probes by name.

FORBIDDEN is reserved for the in-sandbox case where the caller is authorized to call the controller but params reference a non-sandbox account. Sandbox gating is enforced per-request on the account reference, not just at tool registration time. The mechanism for provisioning sandbox credentials and for separating production from sandbox/staging deployments is seller-specific and out of scope for this spec.
Sellers MUST document their sandbox access mechanism so storyboard runners can connect appropriately. The storyboard runner MUST treat the presence of comply_test_controller in tools/list (or skills[]) or the presence of the compliance_testing block in get_adcp_capabilities on a connection it believes is production as a hard conformance failure.
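The runner-side check in the paragraph above can be sketched as follows. This is a minimal illustration, not the reference runner: the function name is hypothetical, and it assumes the compliance_testing block appears as a top-level key of the get_adcp_capabilities result, which this page does not specify.

```python
from typing import Any

CONTROLLER_TOOL = "comply_test_controller"


def production_surface_violations(
    tool_names: list[str], capabilities: dict[str, Any]
) -> list[str]:
    """Return the discovery surfaces on which the controller is visible."""
    violations = []
    if CONTROLLER_TOOL in tool_names:
        violations.append("tools/list (or skills[]) advertises comply_test_controller")
    # Assumption: the capability block is a top-level key of the capabilities result.
    if "compliance_testing" in capabilities:
        violations.append("get_adcp_capabilities includes a compliance_testing block")
    return violations


# Hypothetical discovery results for a connection the runner believes is production.
clean = production_surface_violations(["get_products", "create_media_buy"], {})
leaky = production_surface_violations(
    ["get_products", CONTROLLER_TOOL], {"compliance_testing": {"scenarios": []}}
)
print(clean)
print(leaky)
```

Any non-empty result would be treated as a hard conformance failure. The dispatch surface (the unknown-tool error) needs a live call and is left out of this sketch.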

Tool definition

Schemas: comply-test-controller-request.json | comply-test-controller-response.json

Sellers that implement the compliance test controller MUST:
  • Only expose the tool in sandbox mode (see sandbox gating above)
  • Enforce the same state transition rules as production — invalid transitions MUST return errors
  • Reflect forced state changes in subsequent reads (list_creatives, get_media_buys, etc.)
{
  "name": "comply_test_controller",
  "description": "Triggers seller-side state transitions for compliance testing. Sandbox only.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "scenario": {
        "type": "string",
        "enum": [
          "list_scenarios",
          "force_creative_status",
          "force_account_status",
          "force_media_buy_status",
          "force_create_media_buy_arm",
          "force_task_completion",
          "force_session_status",
          "simulate_delivery",
          "simulate_budget_spend",
          "seed_product",
          "seed_pricing_option",
          "seed_creative",
          "seed_plan",
          "seed_media_buy"
        ],
        "description": "The seller-side transition or fixture-seed to trigger."
      },
      "params": {
        "type": "object",
        "description": "Scenario-specific parameters. Omit for list_scenarios. force_creative_status: {creative_id, status, rejection_reason?}. force_account_status: {account_id, status}. force_media_buy_status: {media_buy_id, status, rejection_reason?}. force_create_media_buy_arm: {arm, task_id?, message?} — task_id required when arm = submitted. force_task_completion: {task_id, result}. force_session_status: {session_id, status, termination_reason?}. simulate_delivery: {media_buy_id, impressions?, clicks?, reported_spend?, conversions?}. simulate_budget_spend: {account_id|media_buy_id, spend_percentage}. seed_product: {product_id, fixture?}. seed_pricing_option: {product_id, pricing_option_id, fixture?}. seed_creative: {creative_id, fixture?}. seed_plan: {plan_id, fixture?}. seed_media_buy: {media_buy_id, fixture?}."
      }
    },
    "required": ["scenario"]
  }
}
The params description inlines param shapes for each scenario because MCP clients (including LLMs) read descriptions, not conditional schema branches. For formal validation schemas suitable for SDK code generation, see the per-scenario definitions below.
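Because the input schema only requires scenario, a caller may want to check the per-scenario params on its own side before sending. The sketch below transcribes the required fields from the params description and the per-scenario tables on this page; build_request and the two lookup tables are hypothetical names, not part of any SDK. Unrecognized scenario names pass through unvalidated, in keeping with the open-for-extension enum.

```python
from typing import Any, Optional

# Unconditionally required params per scenario, transcribed from this page.
REQUIRED: dict[str, tuple[str, ...]] = {
    "list_scenarios": (),
    "force_creative_status": ("creative_id", "status"),
    "force_account_status": ("account_id", "status"),
    "force_media_buy_status": ("media_buy_id", "status"),
    "force_create_media_buy_arm": ("arm",),
    "force_task_completion": ("task_id", "result"),
    "force_session_status": ("session_id", "status"),
    "simulate_delivery": ("media_buy_id",),
    "simulate_budget_spend": ("spend_percentage",),
    "seed_product": ("product_id",),
    "seed_pricing_option": ("product_id", "pricing_option_id"),
    "seed_creative": ("creative_id",),
    "seed_plan": ("plan_id",),
    "seed_media_buy": ("media_buy_id",),
}

# Conditionally required params: (scenario, trigger field, trigger value) -> field.
CONDITIONAL: dict[tuple[str, str, str], str] = {
    ("force_creative_status", "status", "rejected"): "rejection_reason",
    ("force_media_buy_status", "status", "rejected"): "rejection_reason",
    ("force_session_status", "status", "terminated"): "termination_reason",
    ("force_create_media_buy_arm", "arm", "submitted"): "task_id",
}


def build_request(scenario: str, params: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Build a controller request, raising ValueError when known params are missing."""
    params = dict(params or {})
    missing = [name for name in REQUIRED.get(scenario, ()) if name not in params]
    for (name, field, value), needed in CONDITIONAL.items():
        if name == scenario and params.get(field) == value and needed not in params:
            missing.append(needed)
    if scenario == "simulate_budget_spend" and not (params.keys() & {"account_id", "media_buy_id"}):
        missing.append("account_id or media_buy_id")
    if missing:
        raise ValueError(f"{scenario} is missing params: {', '.join(missing)}")
    request: dict[str, Any] = {"scenario": scenario}
    if params:
        request["params"] = params  # omitted entirely for list_scenarios
    return request


print(build_request("force_account_status", {"account_id": "acct-456", "status": "suspended"}))
```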

Scenarios

force_creative_status

Transitions a creative to the specified status. The seller MUST enforce valid transitions per the creative lifecycle state machine.

Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| creative_id | string | Yes | Creative to transition |
| status | one of: processing, approved, rejected, pending_review, archived | Yes | Target status |
| rejection_reason | string | When status = rejected | Reason for rejection |

Example:
{
  "scenario": "force_creative_status",
  "params": {
    "creative_id": "cr-123",
    "status": "rejected",
    "rejection_reason": "Brand safety policy violation"
  }
}

force_account_status

Transitions an account to the specified status. The seller MUST enforce the account lifecycle rules: terminal states (rejected, closed) cannot be exited.

Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| account_id | string | Yes | Account to transition |
| status | one of: active, pending_approval, rejected, payment_required, suspended, closed | Yes | Target status |

Example:
{
  "scenario": "force_account_status",
  "params": {
    "account_id": "acct-456",
    "status": "payment_required"
  }
}

force_media_buy_status

Transitions a media buy to the specified status. The seller MUST enforce the media buy lifecycle: rejected is only valid from pending_creatives or pending_start.

Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| media_buy_id | string | Yes | Media buy to transition |
| status | one of: pending_creatives, pending_start, active, paused, completed, rejected, canceled | Yes | Target status |
| rejection_reason | string | When status = rejected | Reason for rejection |

Example:
{
  "scenario": "force_media_buy_status",
  "params": {
    "media_buy_id": "mb-789",
    "status": "rejected",
    "rejection_reason": "Policy violation"
  }
}

force_create_media_buy_arm

Shapes the next create_media_buy call from the caller’s authenticated sandbox account into a specific response arm. v1 supports two arms: submitted (the async task envelope, no media_buy_id yet) and input-required (the errors-branch). Unlike force_media_buy_status, no entity transitions (there is no media buy yet), so the response carries forced.arm rather than previous_state/current_state.

The submitted-arm wire shape is otherwise implementation-dependent: most sellers route most buys synchronously and no buyer-side request shape reliably triggers async. This scenario lets storyboards pin the arm so a regressed seller (e.g., emitting media_buy_id under status: submitted) cannot pass conformance silently.

Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| arm | one of: submitted, input-required | Yes | Target response arm for the next create_media_buy call |
| task_id | string | When arm = submitted | Deterministic task handle (max 128 chars) the seller MUST emit verbatim on the submitted envelope and MUST accept on subsequent tasks/get polls. Sandbox task_ids are caller-opaque strings; production task-id format rules do not apply. |
| message | string | No | Human-readable explanation surfaced verbatim on the seller’s create_media_buy response. Plain text, max 2000 characters. Buyers consuming the resulting response MUST apply the prompt-injection sanitization documented for message on the submitted envelope. This scenario is the natural place for a runner to inject adversarial strings to test that buyer-side sanitization. |

Example:
{
  "scenario": "force_create_media_buy_arm",
  "params": {
    "arm": "submitted",
    "task_id": "task_async_signed_io_q2",
    "message": "Awaiting IO signature from sales team; typical turnaround 2–4 hours"
  }
}
Response. A ForcedDirectiveSuccess shape carrying the registered directive:
{
  "success": true,
  "forced": {
    "arm": "submitted",
    "task_id": "task_async_signed_io_q2"
  },
  "message": "Next create_media_buy call will return the submitted arm with task_id task_async_signed_io_q2"
}
forced.task_id is present only when arm: submitted.

Consumption and idempotency. The directive is keyed to the caller’s authenticated sandbox account (account + principal pair) and is consumed by the next create_media_buy call from that account. Subsequent calls without a fresh directive return the seller’s default arm. Buyer-side idempotency_key semantics are unchanged: if the caller replays a create_media_buy request that already consumed a directive, the seller MUST replay the cached response (the request idempotency cache wins) and MUST NOT re-evaluate against the now-empty directive slot. Sellers MUST NOT match a directive against a create_media_buy call from a different account or principal, even within the same transport connection. A second force_create_media_buy_arm call before the directive is consumed overwrites the prior one.
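The consumption rules above amount to a one-slot store per caller plus an idempotency cache that is consulted first. The following is a minimal sketch under those assumptions. DirectiveStore, the cache layout, and the response dicts are hypothetical stand-ins and do not reproduce the real create_media_buy wire shapes.

```python
from typing import Any, Optional

Key = tuple[str, str]  # (account_id, principal_id)


class DirectiveStore:
    """One pending create_media_buy directive per authenticated sandbox caller."""

    def __init__(self) -> None:
        self._slots: dict[Key, dict[str, Any]] = {}

    def register(self, caller: Key, directive: dict[str, Any]) -> None:
        # A second registration before consumption overwrites the prior one.
        self._slots[caller] = directive

    def consume(self, caller: Key) -> Optional[dict[str, Any]]:
        # pop() empties the slot, so only the next call from this caller is shaped.
        return self._slots.pop(caller, None)


def create_media_buy(
    store: DirectiveStore,
    idempotency_cache: dict[tuple[Key, str], dict[str, Any]],
    caller: Key,
    idempotency_key: str,
) -> dict[str, Any]:
    cache_key = (caller, idempotency_key)
    if cache_key in idempotency_cache:
        # The idempotency cache wins: replay without touching the directive slot.
        return idempotency_cache[cache_key]
    directive = store.consume(caller)
    # Stand-in responses: a real seller would emit the arm's actual envelope.
    response = dict(directive) if directive is not None else {"arm": "default"}
    idempotency_cache[cache_key] = response
    return response


store = DirectiveStore()
cache: dict[tuple[Key, str], dict[str, Any]] = {}
alice, bob = ("acct-1", "principal-a"), ("acct-1", "principal-b")
store.register(alice, {"arm": "input-required"})
store.register(alice, {"arm": "submitted", "task_id": "task_async_signed_io_q2"})

other = create_media_buy(store, cache, bob, "key-b")     # different principal: default arm
first = create_media_buy(store, cache, alice, "key-1")   # consumes the overwriting directive
replay = create_media_buy(store, cache, alice, "key-1")  # served from the idempotency cache
later = create_media_buy(store, cache, alice, "key-2")   # slot is empty: default arm
```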

force_task_completion

Resolves a previously-submitted async task to completed with a buyer-supplied result payload. It is the companion to force_create_media_buy_arm: that scenario drives the seller into the submitted envelope; this one closes the loop by transitioning the task store entry to completed and stamping the registered result. The buyer observes completion via the seller’s push notification to push_notification_config.url (the canonical 3.0 delivery path for completion payloads) and via subsequent tasks/get calls reporting status: "completed". A typed result projection on the polling response is tracked for 3.1 in #3123.

The submitted → completed lifecycle is otherwise non-deterministic: real task completions ride on out-of-band signals (IO countersignature, batch processor cron, governance human review). Storyboards cannot wait. This scenario lets a runner pin the completion deterministically immediately after registering the directive, so the buyer-side polling assertion fires on the same wire shape buyers will observe in production.

Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| task_id | string | Yes | Task to resolve. MUST resolve within the caller’s authenticated sandbox account; sellers MUST return NOT_FOUND (not FORBIDDEN, per the multi-tenant convention under Error codes) for task_ids belonging to other accounts. Typically captured from the prior create_media_buy submitted-envelope response (or registered via force_create_media_buy_arm). |
| result | async-response-data | Yes | Completion payload to record. Validates against the same anyOf union the push-notification webhook and tasks/get polling responses use. For create_media_buy, this is a CreateMediaBuyResponse with media_buy_id and packages. Sellers MUST emit INVALID_PARAMS if result does not validate against the response branch for the task’s original method. Sellers MAY reject result payloads exceeding 256 KB with INVALID_PARAMS; storyboards MUST stay below this. |

Example:
{
  "scenario": "force_task_completion",
  "params": {
    "task_id": "task_async_signed_io_q2",
    "result": {
      "media_buy_id": "mb_async_signed_io_q2",
      "status": "active",
      "packages": [
        { "package_id": "pkg-0", "product_id": "async_signed_io_q2", "budget": 30000 }
      ]
    }
  }
}
Response. Returns a state-transition success shape:
{
  "success": true,
  "previous_state": "submitted",
  "current_state": "completed",
  "message": "Task task_async_signed_io_q2 transitioned from submitted to completed"
}
Source state MUST be submitted, working, or input-required; any other source returns INVALID_TRANSITION. Sellers MUST emit NOT_FOUND if task_id is unknown to the caller’s account, and INVALID_TRANSITION if the task is already terminal (completed / failed / canceled). Forcing a task to failed is out of scope for this scenario; the input-required arm of force_create_media_buy_arm covers the buyer-input-needed failure path.

Replay semantics. Replays with identical params before the task is terminal are idempotent no-ops. Replays with diverging params before the task is terminal MUST overwrite the registered result (last-write-wins), following the same precedent as force_create_media_buy_arm’s “second call overwrites.” After the task is terminal, every replay returns INVALID_TRANSITION regardless of params.

Cross-protocol obligations.
  • Push notifications. If the buyer registered push_notification_config.url on the original create_media_buy, forcing completion MUST fire the webhook with the registered result payload (the canonical 3.0 delivery path for completion data). Otherwise the storyboard can only test polling for terminal status, not push delivery of the result.
  • simulate_delivery / simulate_budget_spend. Once forced to completed with a valid CreateMediaBuyResponse carrying media_buy_id, the resulting media buy MUST be addressable by those scenarios. Round-tripping through force_task_completion is the supported path for storyboards that need a media buy without going through the synchronous flow.
Buyer-side observation. After this scenario runs, the registered result is delivered to the buyer’s push_notification_config.url (3.0 canonical path) with all caller-supplied fields preserved. Sellers MAY augment with seller-controlled fields (e.g., created_at, dsp_* IDs, normalized currency casing) but MUST NOT overwrite caller-supplied values. A subsequent tasks/get(task_id) MUST return status: "completed". The result payload is buyer-controlled in sandbox and round-trips through the seller’s store — buyers receiving it via webhook MUST treat the payload as untrusted seller output (per AdCP convention) regardless of the fact that they originated the bytes. This makes force_task_completion the natural place for a runner to inject adversarial payloads when testing buyer-side sanitization on the webhook delivery path.
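The source-state and tenancy rules for force_task_completion can be sketched as a small handler over an in-memory task store. The store layout and function signature are hypothetical. The sketch assumes completion is applied synchronously, so there is no pre-terminal replay window, and it skips result validation and webhook delivery.

```python
from typing import Any

FORCEABLE = {"submitted", "working", "input-required"}

Tasks = dict[tuple[str, str], dict[str, Any]]  # keyed by (account_id, task_id)


def force_task_completion(
    tasks: Tasks, account_id: str, task_id: str, result: dict[str, Any]
) -> dict[str, Any]:
    task = tasks.get((account_id, task_id))
    if task is None:
        # Unknown task and another tenant's task are indistinguishable: NOT_FOUND.
        return {"success": False, "error": "NOT_FOUND",
                "error_detail": f"Task {task_id} not found", "current_state": None}
    previous = task["status"]
    if previous not in FORCEABLE:
        # Covers terminal tasks (completed / failed / canceled) and replays after completion.
        return {"success": False, "error": "INVALID_TRANSITION",
                "error_detail": f"Cannot complete a task in state {previous}",
                "current_state": previous}
    task["status"] = "completed"
    task["result"] = result  # a real seller would validate this payload first
    return {"success": True, "previous_state": previous, "current_state": "completed",
            "message": f"Task {task_id} transitioned from {previous} to completed"}


tasks: Tasks = {("acct-1", "task_async_signed_io_q2"): {"status": "submitted"}}
payload = {"media_buy_id": "mb_async_signed_io_q2", "status": "active", "packages": []}
done = force_task_completion(tasks, "acct-1", "task_async_signed_io_q2", payload)
again = force_task_completion(tasks, "acct-1", "task_async_signed_io_q2", payload)
foreign = force_task_completion(tasks, "acct-2", "task_async_signed_io_q2", payload)
```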

force_session_status

Transitions an SI session to a terminal status. Enables testing timeout and termination scenarios that would otherwise require waiting for real timeouts. The termination_reason param simulates the cause so the storyboard runner can verify sellers report the correct reason in subsequent responses.

Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| session_id | string | Yes | Session to transition |
| status | one of: complete, terminated | Yes | Target terminal status |
| termination_reason | string | When status = terminated | Reason for termination (e.g., session_timeout, host_terminated, policy_violation) |

Example:
{
  "scenario": "force_session_status",
  "params": {
    "session_id": "sess-abc",
    "status": "terminated",
    "termination_reason": "session_timeout"
  }
}

simulate_delivery

Injects synthetic delivery data for a media buy. Subsequent calls to get_media_buy_delivery MUST reflect this data. Delivery simulation is additive: each call adds to existing delivery totals.

Delivery and budget are independent systems. simulate_delivery records what the ad server would report. simulate_budget_spend records what the billing system would track. A seller’s production system may or may not couple these; the test controller does not assume coupling.

Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| media_buy_id | string | Yes | Media buy to add delivery to |
| impressions | integer | No | Impressions to simulate |
| clicks | integer | No | Clicks to simulate |
| reported_spend | object | No | { amount: number, currency: string }. Spend as reported in delivery data; does not affect budget |
| conversions | integer | No | Conversions to simulate |

Example:
{
  "scenario": "simulate_delivery",
  "params": {
    "media_buy_id": "mb-789",
    "impressions": 10000,
    "clicks": 150,
    "reported_spend": { "amount": 150.00, "currency": "USD" }
  }
}
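The additive behaviour can be sketched with an in-memory ledger that reports both the values injected by the current call and the running totals. The ledger structure and function signature are hypothetical, and the sketch assumes every call for a given media buy uses the same currency.

```python
from typing import Any

COUNTERS = ("impressions", "clicks", "conversions")


def simulate_delivery(ledger: dict[str, dict[str, Any]], params: dict[str, Any]) -> dict[str, Any]:
    """Add this call's delivery to the running totals for the media buy."""
    totals = ledger.setdefault(params["media_buy_id"], {})
    simulated: dict[str, Any] = {}
    for name in COUNTERS:
        if name in params:
            simulated[name] = params[name]
            totals[name] = totals.get(name, 0) + params[name]
    spend = params.get("reported_spend")
    if spend is not None:
        simulated["reported_spend"] = dict(spend)
        # Assumption: one currency per media buy, so amounts can simply be summed.
        so_far = totals.get("reported_spend", {"amount": 0, "currency": spend["currency"]})
        totals["reported_spend"] = {
            "amount": so_far["amount"] + spend["amount"],
            "currency": spend["currency"],
        }
    cumulative = {k: dict(v) if isinstance(v, dict) else v for k, v in totals.items()}
    return {"success": True, "simulated": simulated, "cumulative": cumulative}


ledger: dict[str, dict[str, Any]] = {}
simulate_delivery(ledger, {
    "media_buy_id": "mb-789", "impressions": 15000, "clicks": 230,
    "reported_spend": {"amount": 225.0, "currency": "USD"},
})
second = simulate_delivery(ledger, {
    "media_buy_id": "mb-789", "impressions": 10000, "clicks": 150,
    "reported_spend": {"amount": 150.0, "currency": "USD"},
})
print(second["cumulative"])
```

Because each call adds to the totals, a runner that blindly retries a call that actually succeeded would double-count delivery.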

simulate_budget_spend

Simulates budget consumption to a specified percentage. Enables testing budget threshold alerts and payment_required transitions without waiting for real spend. This is the only scenario that affects account-level financial state. After calling simulate_budget_spend, the seller MUST reflect the simulated consumption in get_account_financials. Specifically:
  • total_spend (or equivalent) MUST reflect the simulated amount
  • remaining_budget (or equivalent) MUST be reduced accordingly
  • Budget utilization percentages MUST match spend_percentage
Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| account_id | string | No | Account (for account-level budget) |
| media_buy_id | string | No | Media buy (for buy-level budget) |
| spend_percentage | number | Yes | Spend to this % of budget (0–100) |

At least one of account_id or media_buy_id is required. The target entity MUST have a non-zero budget configured; the controller SHOULD return INVALID_PARAMS if it does not.

Example:
{
  "scenario": "simulate_budget_spend",
  "params": {
    "media_buy_id": "mb-789",
    "spend_percentage": 95
  }
}
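The spend computation and the no-budget precondition can be sketched as below. The budgets store, its keys, and the helper names are hypothetical. When both identifiers are supplied the sketch arbitrarily prefers media_buy_id, which this page does not prescribe.

```python
from typing import Any


def error(code: str, detail: str) -> dict[str, Any]:
    return {"success": False, "error": code, "error_detail": detail}


def simulate_budget_spend(
    budgets: dict[str, dict[str, Any]], params: dict[str, Any]
) -> dict[str, Any]:
    target = params.get("media_buy_id") or params.get("account_id")
    pct = params.get("spend_percentage")
    if target is None:
        return error("INVALID_PARAMS", "account_id or media_buy_id is required")
    if isinstance(pct, bool) or not isinstance(pct, (int, float)) or not 0 <= pct <= 100:
        return error("INVALID_PARAMS", "spend_percentage must be a number from 0 to 100")
    entry = budgets.get(target)
    if entry is None:
        return error("NOT_FOUND", f"{target} not found")
    budget = entry["budget"]
    if budget["amount"] <= 0:
        return error("INVALID_PARAMS", f"{target} has no budget configured")
    spend = round(budget["amount"] * pct / 100, 2)
    # Replace, do not add: the call sets consumption to the requested level.
    entry["spent"] = spend
    return {
        "success": True,
        "simulated": {
            "spend_percentage": pct,
            "computed_spend": {"amount": spend, "currency": budget["currency"]},
            "budget": dict(budget),
        },
    }


budgets = {
    "mb-789": {"budget": {"amount": 1000.0, "currency": "USD"}},
    "mb-empty": {"budget": {"amount": 0, "currency": "USD"}},
}
ok = simulate_budget_spend(budgets, {"media_buy_id": "mb-789", "spend_percentage": 95})
no_budget = simulate_budget_spend(budgets, {"media_buy_id": "mb-empty", "spend_percentage": 95})
```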

seed_product

Creates (or upserts) a product fixture with a caller-supplied product_id so subsequent storyboard steps can reference the product by stable ID. The controller MUST make the seeded product discoverable via get_products under the authenticated account unless the fixture explicitly marks it hidden.

Why this scenario exists. Storyboards hardcode fixture IDs like "test-product" and expect the seller to have a matching product. Without a seed scenario, every implementer rediscovers which IDs the conformance suite expects and has to alias them by hand. seed_product replaces that discovery with an explicit, storyboard-authored contract.

Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| product_id | string | Yes | Stable identifier the storyboard will reference |
| fixture | object | No | Product shape. Minimum useful fields: delivery_type, channels, pricing_options[], format_ids[]. Sellers MAY fill in defaults for omitted fields. |

Example:
{
  "scenario": "seed_product",
  "params": {
    "product_id": "test-product",
    "fixture": {
      "delivery_type": "non_guaranteed",
      "channels": ["display"],
      "pricing_options": [
        { "pricing_option_id": "test-pricing", "pricing_model": "cpm", "currency": "USD", "floor_price": 1.0 }
      ],
      "format_ids": [{ "id": "display_300x250" }]
    }
  }
}

seed_pricing_option

Adds (or upserts) a pricing option on an existing seeded product. Use this when a storyboard needs a specific pricing option that wasn’t included in the initial seed_product call, or when the option’s attributes need to diverge from the seller’s default.

Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| product_id | string | Yes | Parent product (must already exist; seed it first) |
| pricing_option_id | string | Yes | Stable identifier for the pricing option |
| fixture | object | No | Pricing option shape per the PricingOption schema (pricing_model, currency, floor_price for auction-based, fixed_price for fixed, etc.) |

Example:
{
  "scenario": "seed_pricing_option",
  "params": {
    "product_id": "test-product",
    "pricing_option_id": "default",
    "fixture": {
      "pricing_model": "cpm",
      "floor_price": 5.0,
      "currency": "USD"
    }
  }
}

seed_creative

Creates a creative fixture at a specific lifecycle status. Lets governance and delivery storyboards reference a pre-approved creative without round-tripping sync_creatives first.

Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| creative_id | string | Yes | Stable identifier |
| fixture | object | No | Creative shape. Typical fields: status, format_id, assets, click_through_url. |

Example:
{
  "scenario": "seed_creative",
  "params": {
    "creative_id": "campaign_hero_video",
    "fixture": {
      "status": "approved",
      "format_id": { "id": "video_30s" },
      "assets": [{ "type": "video", "url": "https://example.com/hero.mp4" }]
    }
  }
}

seed_plan

Creates a media plan fixture. Used by governance storyboards that assert against a specific plan without running the full briefing + proposal flow first.

Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| plan_id | string | Yes | Stable identifier |
| fixture | object | No | Plan shape. Typical fields: budget, brand, flight, line_items[]. |

Example:
{
  "scenario": "seed_plan",
  "params": {
    "plan_id": "gov_acme_q2_2027",
    "fixture": {
      "budget": { "total": 30000, "currency": "USD" },
      "brand": { "domain": "acmeoutdoor.example" },
      "flight": { "start": "2027-04-01", "end": "2027-06-30" }
    }
  }
}

seed_media_buy

Creates a media buy fixture at a specified lifecycle state, bypassing the create_media_buy flow. Used by storyboards that need to assert governance or delivery behavior against a pre-existing buy.

Params:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| media_buy_id | string | Yes | Stable identifier |
| fixture | object | No | Media buy shape. Typical fields: status, packages[], budget, flight. |

Example:
{
  "scenario": "seed_media_buy",
  "params": {
    "media_buy_id": "mb_acme_q2_2026_auction",
    "fixture": {
      "status": "active",
      "packages": [{ "package_id": "pkg_001", "product_id": "test-product" }]
    }
  }
}

Seeding semantics and ordering

  • Fixture shape. fixture is kept permissive (additionalProperties: true) so storyboard authors can declare the minimum shape each test needs. Fixtures SHOULD conform to the corresponding domain schema (core/product.json for seed_product, core/pricing-option.json for seed_pricing_option, media-buy/sync-creatives-request.json creative-item shape for seed_creative, core/media-buy.json for seed_media_buy, the plan schema for seed_plan). Sellers MAY reject clearly malformed fixtures with INVALID_PARAMS.
  • Idempotency on re-seed. A second call with the same primary ID and a fixture equivalent to the first SHOULD succeed and return success: true with previous_state: "existing". A second call with a diverging fixture MUST return INVALID_PARAMS with error_detail explaining which fields diverged — sellers MUST NOT merge or update silently. Storyboards that need to change fixture state mid-run MUST use force_* scenarios, not a re-seed. This keeps the same storyboard deterministic across sellers.
  • Foreign-key ordering. The runner seeds fixtures in dependency order so sellers receive referenced parents before their children. The dependency DAG:
    product ──┬─→ pricing_option
              ├─→ plan
              └─→ media_buy
    creative ────→ media_buy
    plan ────────→ media_buy
    
    Concretely: seed_product before seed_pricing_option; seed_product, seed_creative, and seed_plan all before seed_media_buy when the fixture references them. Storyboards that declare a fixtures: block MUST list entries in an order the runner can topologically sort — sellers that receive a seed_pricing_option for a product that does not exist, or a seed_media_buy referencing a creative/product/plan that was not seeded first, MUST return INVALID_PARAMS rather than auto-create the parent.
  • Sandbox scope. Seeded fixtures exist only for the authenticated sandbox account. NOT_FOUND applies the same way as for force_* — a seller that cannot see the parent product for the caller’s account MUST return NOT_FOUND, not silently fall back to another tenant.
  • Capability advertisement. Sellers that do not implement a given seed scenario MUST return UNKNOWN_SCENARIO for that scenario name. The runner treats UNKNOWN_SCENARIO on a seed_* as a coverage gap for storyboards whose prerequisites.controller_seeding requires the scenario — those storyboards are graded not_applicable, not failed. This applies to unfamiliar seed_* names as well: a runner may emit a scenario the seller has never seen because the enum is open-for-extension (see below). Sellers and runners MUST respond with UNKNOWN_SCENARIO rather than schema-reject an unrecognized scenario value.
  • Open-for-extension enum. The scenario enum adds new values over time (new seed scenarios land as specialisms demand them). Runners and sellers MUST accept scenario strings they do not recognize and respond with UNKNOWN_SCENARIO rather than hard-fail schema validation — otherwise every new enum value becomes a breaking change for stale implementations.
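The foreign-key ordering rule can be sketched on the runner side with Python's standard graphlib module. This is an illustration only: it orders seed requests conservatively by fixture kind using the dependency DAG above, whereas a real runner might only order fixtures that actually reference each other. PREDECESSORS and order_seed_requests are hypothetical names.

```python
from graphlib import TopologicalSorter
from typing import Any

# Each key lists the fixture kinds that must be seeded before it (the DAG above).
PREDECESSORS: dict[str, set[str]] = {
    "product": set(),
    "creative": set(),
    "pricing_option": {"product"},
    "plan": {"product"},
    "media_buy": {"product", "creative", "plan"},
}

# static_order() yields every kind after all of its predecessors.
KIND_RANK = {
    kind: rank for rank, kind in enumerate(TopologicalSorter(PREDECESSORS).static_order())
}


def order_seed_requests(requests: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Stable-sort seed_* requests so parents are sent before their children."""

    def rank(request: dict[str, Any]) -> int:
        kind = request["scenario"].removeprefix("seed_")
        # Kinds this sketch does not know about are sent last.
        return KIND_RANK.get(kind, len(KIND_RANK))

    return sorted(requests, key=rank)


unordered = [
    {"scenario": "seed_media_buy", "params": {"media_buy_id": "mb_acme_q2_2026_auction"}},
    {"scenario": "seed_pricing_option", "params": {"product_id": "test-product", "pricing_option_id": "default"}},
    {"scenario": "seed_plan", "params": {"plan_id": "gov_acme_q2_2027"}},
    {"scenario": "seed_creative", "params": {"creative_id": "campaign_hero_video"}},
    {"scenario": "seed_product", "params": {"product_id": "test-product"}},
]
ordered = [request["scenario"] for request in order_seed_requests(unordered)]
print(ordered)
```

The relative order of kinds with no dependency between them (for example product and creative) is not significant.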

Response shape

State transition responses (force_*)

Success:
{
  "success": true,
  "previous_state": "processing",
  "current_state": "approved",
  "message": "Creative cr-123 transitioned from processing to approved"
}
Failure (invalid transition):
{
  "success": false,
  "error": "INVALID_TRANSITION",
  "error_detail": "Cannot transition from archived to processing — archived is terminal",
  "current_state": "archived"
}
Failure (unknown entity):
{
  "success": false,
  "error": "NOT_FOUND",
  "error_detail": "Creative cr-unknown not found",
  "current_state": null
}

Simulation responses (simulate_*)

simulate_delivery response:
{
  "success": true,
  "simulated": {
    "impressions": 10000,
    "clicks": 150,
    "reported_spend": { "amount": 150.00, "currency": "USD" }
  },
  "cumulative": {
    "impressions": 25000,
    "clicks": 380,
    "reported_spend": { "amount": 375.00, "currency": "USD" }
  },
  "message": "Delivery simulated for mb-789: 10000 impressions, 150 clicks, $150.00 spend"
}
The simulated field echoes back the values injected by this call. The cumulative field returns running totals across all simulation calls for this media buy, so callers can verify expected state before checking get_media_buy_delivery.

simulate_budget_spend response:
{
  "success": true,
  "simulated": {
    "spend_percentage": 95,
    "computed_spend": { "amount": 950.00, "currency": "USD" },
    "budget": { "amount": 1000.00, "currency": "USD" }
  },
  "message": "Budget for mb-789 set to 95% consumed ($950.00 of $1000.00)"
}

Error codes

Controllers MUST use structured error codes so the storyboard runner can assert on specific failure modes:

| Error code | When |
| --- | --- |
| INVALID_TRANSITION | Requested state-machine transition is not valid (e.g., archived → processing, canceled → paused) |
| INVALID_STATE | Operation is not permitted for the resource’s current status (e.g., re-seeding a fixture that already exists with a diverging shape) |
| NOT_FOUND | Entity does not exist or caller does not have access (multi-tenant sandboxes SHOULD treat “not yours” as “not found”) |
| UNKNOWN_SCENARIO | Scenario not implemented by this seller |
| INVALID_PARAMS | Missing or malformed params, or precondition not met (e.g., simulate_budget_spend on an entity with no budget configured) |
| FORBIDDEN | Production account referenced from a sandbox connection |
| INTERNAL_ERROR | Transient seller-side failure (e.g., sandbox database unavailable). The runner SHOULD retry once before treating as a failure. |

Controller-specific enum. The error field on controller responses uses a controller-specific vocabulary defined in comply-test-controller-response.json, distinct from the canonical seller-response error-code.json enum that governs task-level errors. INVALID_TRANSITION is controller-specific (state-machine primitives expose the transition-vs-state distinction that seller-level error codes collapse into INVALID_STATE). Storyboard assertions on controller responses use path: "error" or direct field_value checks, not check: error_code; the shape-agnostic error_code check is for task-response errors (adcp_error / payload errors[]), not the controller’s own response schema.
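The retry guidance for INTERNAL_ERROR can be sketched as a thin wrapper around whatever function sends a controller request. The call parameter is a hypothetical transport function that takes a request dict and returns the response dict. The sketch assumes a call that reported INTERNAL_ERROR recorded nothing on the seller side, which matters for additive scenarios.

```python
from typing import Any, Callable

Call = Callable[[dict[str, Any]], dict[str, Any]]


def call_with_retry(call: Call, request: dict[str, Any]) -> dict[str, Any]:
    """Send a controller request, retrying exactly once on INTERNAL_ERROR."""
    response = call(request)
    if response.get("error") == "INTERNAL_ERROR":
        # Single retry; a second INTERNAL_ERROR is returned to the caller as a failure.
        response = call(request)
    return response


def make_flaky_seller(failures: int) -> Call:
    """Hypothetical seller stub that fails a fixed number of times, then succeeds."""
    remaining = {"count": failures}

    def call(request: dict[str, Any]) -> dict[str, Any]:
        if remaining["count"] > 0:
            remaining["count"] -= 1
            return {"success": False, "error": "INTERNAL_ERROR",
                    "error_detail": "sandbox database unavailable"}
        return {"success": True, "scenarios": ["force_creative_status"]}

    return call


recovered = call_with_retry(make_flaky_seller(1), {"scenario": "list_scenarios"})
still_failing = call_with_retry(make_flaky_seller(2), {"scenario": "list_scenarios"})
```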

Idempotency

State transition scenarios (force_*) are idempotent: forcing a status that matches the current state returns success with previous_state equal to current_state. This avoids flaky tests when the runner retries after transient failures. Simulation scenarios (simulate_*) are NOT idempotent — simulate_delivery adds to existing totals, while simulate_budget_spend replaces the current spend level.
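The idempotent behaviour of force_* scenarios, together with the failure shapes shown under Response shape, can be sketched for creatives as follows. The ALLOWED table is an illustrative assumption and is not the normative creative lifecycle; the in-memory store and function name are likewise hypothetical.

```python
from typing import Any, Optional

# Illustrative transitions only; the creative lifecycle state machine is authoritative.
ALLOWED: dict[str, set[str]] = {
    "processing": {"pending_review", "approved", "rejected"},
    "pending_review": {"approved", "rejected"},
    "approved": {"archived"},
    "rejected": set(),
    "archived": set(),
}


def force_creative_status(
    creatives: dict[str, str], creative_id: str, status: str
) -> dict[str, Any]:
    current: Optional[str] = creatives.get(creative_id)
    if current is None:
        return {"success": False, "error": "NOT_FOUND",
                "error_detail": f"Creative {creative_id} not found", "current_state": None}
    if status == current:
        # Idempotent retry: succeed without changing anything.
        return {"success": True, "previous_state": current, "current_state": current,
                "message": f"Creative {creative_id} already {current}"}
    if status not in ALLOWED.get(current, set()):
        return {"success": False, "error": "INVALID_TRANSITION",
                "error_detail": f"Cannot transition from {current} to {status}",
                "current_state": current}
    creatives[creative_id] = status
    return {"success": True, "previous_state": current, "current_state": status,
            "message": f"Creative {creative_id} transitioned from {current} to {status}"}


creatives = {"cr-123": "processing", "cr-old": "archived"}
approved = force_creative_status(creatives, "cr-123", "approved")
retried = force_creative_status(creatives, "cr-123", "approved")
blocked = force_creative_status(creatives, "cr-old", "processing")
unknown = force_creative_status(creatives, "cr-unknown", "approved")
```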

Test surfaces

Where a seller’s state-of-record lives determines how the storyboard test loop closes. State-local sellers (typically SSPs, creative agents) write to the seller’s DB via the seed_* scenarios above; the seller’s read handlers consume the same store, and the seed→read loop closes naturally. Upstream-proxy sellers (DSPs proxying to platforms, retail-media networks reading retailer catalogs, signals brokers) cannot close the loop that way because their read handlers reach a system the seller does not control; the TypeScript SDK ships a TestControllerBridge that runs the real adapter call first, then merges seeded fixtures into the response.

Either path earns the wire-format pass that AAO Verified (Spec) attests. Neither path is what (Sandbox) attests; that is a separate axis covering whether the seller’s production stack honors account.sandbox: true without real-world side effects.

The cross-page framing for both implementations of this pattern, the SDK’s _bridge advisory marker, and the runtime-signals disambiguation table all live in the Conformance Specification → Test surfaces and the storyboard loop.

Compliance testing modes

The presence of comply_test_controller in a seller’s tool list determines which mode a compliance tester uses: observational when the tool is absent, deterministic when it is present.

Capability discovery

A seller may implement the test controller without supporting every scenario. The storyboard runner SHOULD call comply_test_controller with scenario: "list_scenarios" as the first interaction. Sellers that support this return the list of implemented scenarios:
{
  "success": true,
  "scenarios": [
    "force_creative_status",
    "force_account_status",
    "force_media_buy_status"
  ]
}
Sellers that implement list_scenarios MUST respond with scenario names that appear verbatim in the scenario enum of comply-test-controller-request.json. Custom seller-specific scenario names are not part of the compliance contract; storyboard runners will not dispatch to scenarios outside the canonical enum, so listing them serves no purpose. A seller that supports seed_product MUST respond with the string "seed_product", not "create_test_product" or any other variant.

Sellers that do not implement list_scenarios SHOULD return an error with UNKNOWN_SCENARIO. When this happens, the runner tries each scenario individually and treats UNKNOWN_SCENARIO responses as coverage gaps (not failures). Early implementers who skip list_scenarios are therefore not penalized: the runner discovers supported scenarios through trial.
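The discovery flow, including the probing fallback, might look like the sketch below. Several things here are assumptions for illustration: the error field on the reply (the text above only says the error carries UNKNOWN_SCENARIO), the synchronous CallController hook standing in for a real async tool call, and the rule that any probe reply other than UNKNOWN_SCENARIO counts as "implemented".

```typescript
// Assumed reply shape: `success` and `scenarios` come from the example above;
// the `error` field name is a placeholder for however the error code is carried.
type ControllerReply =
  | { success: true; scenarios?: string[] }
  | { success: false; error: string };

// Hypothetical transport hook; a real runner would issue an async tool call.
type CallController = (scenario: string) => ControllerReply;

interface Discovery {
  supported: string[];
  gaps: string[]; // coverage gaps, not failures
}

function discoverScenarios(call: CallController, canonical: string[]): Discovery {
  const listing = call("list_scenarios");
  const supported: string[] = [];
  const gaps: string[] = [];
  for (const scenario of canonical) {
    let implemented: boolean;
    if (listing.success && listing.scenarios !== undefined) {
      // Names outside the canonical enum are never dispatched, so only
      // canonical names are looked up in the advertised list.
      implemented = listing.scenarios.indexOf(scenario) !== -1;
    } else {
      // Fallback: probe the scenario; only UNKNOWN_SCENARIO marks it unsupported.
      const probe = call(scenario);
      implemented = probe.success || probe.error !== "UNKNOWN_SCENARIO";
    }
    if (implemented) {
      supported.push(scenario);
    } else {
      gaps.push(scenario);
    }
  }
  return { supported, gaps };
}
```

Passing the canonical enum in as a parameter keeps the sketch independent of any particular schema version.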

Observational mode (default)

When comply_test_controller is not available:
  • The runner executes buyer-initiated flows and validates response schemas
  • State machine transitions that require seller action are skipped
  • Advisory observations note what could not be tested

Deterministic mode

When comply_test_controller is available:
  • The runner walks every reachable state in each lifecycle
  • Forces edge cases: terminal states, invalid transitions, error codes
  • Validates that forced state changes are reflected in subsequent reads
  • Tests operation gates (e.g., create_media_buy blocked when account is suspended)
The runner distinguishes three outcome categories in deterministic mode:
  • Scenario not supported: the scenario is absent from the list_scenarios response, or the controller returned UNKNOWN_SCENARIO. Reported as a coverage gap, not a failure.
  • Transition correctly rejected: controller returned INVALID_TRANSITION for an invalid state change. This is a pass.
  • Unexpected failure: controller returned an error for a transition that should be valid, or succeeded on a transition that should fail. This is a compliance failure.
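The three categories can be expressed as a small classification function. This is a sketch under stated assumptions: the Attempt shape and the outcome labels are hypothetical, and treating an invalid transition rejected with some other error code as a failure is one reading of the rules above, not something the text states.

```typescript
// Hypothetical labels for the three outcome categories.
type Outcome = "coverage_gap" | "pass" | "compliance_failure";

interface Attempt {
  expectedValid: boolean; // does the spec's state machine allow this transition?
  success: boolean;       // did the controller report success?
  errorCode?: string;     // error code when success is false
}

function classify(attempt: Attempt): Outcome {
  // Unsupported scenario: a coverage gap, never a failure.
  if (!attempt.success && attempt.errorCode === "UNKNOWN_SCENARIO") {
    return "coverage_gap";
  }
  if (attempt.expectedValid) {
    // A valid transition must succeed; any error is unexpected.
    return attempt.success ? "pass" : "compliance_failure";
  }
  // An invalid transition must be rejected with INVALID_TRANSITION.
  // Assumption: rejection with any other code counts as a failure.
  return !attempt.success && attempt.errorCode === "INVALID_TRANSITION"
    ? "pass"
    : "compliance_failure";
}
```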

Example: creative lifecycle in deterministic mode

1. sync_creatives(creative)
2. list_creatives() → verify status = "processing"
3. force_creative_status(creative_id, "pending_review")
4. force_creative_status(creative_id, "approved")
5. list_creatives() → verify status = "approved"
6. force_creative_status(creative_id, "archived")
7. list_creatives() → verify status = "archived"
8. sync_creatives(same creative) → verify unarchive (→ approved or pending_review)
9. force_creative_status(creative_id, "rejected", reason)
10. list_creatives() → verify rejection_reason persisted
11. sync_creatives(same creative) → verify resubmission (rejected → processing)
12. force_creative_status(creative_id, "approved") → expect INVALID_TRANSITION (must go through pending_review)

Example: account operation gates in deterministic mode

1. sync_accounts(account) → active
2. force_account_status(account_id, "suspended")
3. create_media_buy() → expect ACCOUNT_SUSPENDED
4. get_media_buys() → expect existing buys still readable
5. force_account_status(account_id, "active")
6. create_media_buy() → expect success
7. force_account_status(account_id, "payment_required")
8. update_media_buy(add packages) → expect ACCOUNT_PAYMENT_REQUIRED
9. get_media_buys() → expect existing buys still readable
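On the seller side, the gates exercised by this walkthrough could be modeled as a single check in front of the tool handlers. This is a minimal sketch: the GATED_TOOLS list is an assumption drawn only from the steps above, the AccountStatus type covers only the statuses the walkthrough uses, and checkGate is a hypothetical helper name.

```typescript
// Subset of account statuses used in the walkthrough above.
type AccountStatus = "active" | "suspended" | "payment_required";

type GateResult = { allowed: true } | { allowed: false; error: string };

// Assumption: only these spend-committing tools are gated; reads always pass.
const GATED_TOOLS = ["create_media_buy", "update_media_buy"];

function checkGate(accountStatus: AccountStatus, tool: string): GateResult {
  if (GATED_TOOLS.indexOf(tool) === -1 || accountStatus === "active") {
    return { allowed: true };
  }
  const error =
    accountStatus === "suspended" ? "ACCOUNT_SUSPENDED" : "ACCOUNT_PAYMENT_REQUIRED";
  return { allowed: false, error };
}
```

Because the check takes the current status as input, a forced status change is reflected on the very next call, which is what steps 3, 6, and 8 verify.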

Example: media buy lifecycle in deterministic mode

1. create_media_buy() → status = "pending_creatives"
2. force_media_buy_status(media_buy_id, "rejected", reason) → expect success
3. get_media_buys() → verify status = "rejected", rejection_reason persisted
4. force_media_buy_status(media_buy_id, "active") → expect INVALID_TRANSITION (rejected is terminal)
5. create_media_buy() → new buy, status = "pending_creatives"
6. force_media_buy_status(media_buy_id, "pending_start")
7. force_media_buy_status(media_buy_id, "active")
8. force_media_buy_status(media_buy_id, "rejected") → expect INVALID_TRANSITION (rejected only valid from pending_creatives or pending_start)
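The rules this walkthrough checks can be captured in a transition table. The sketch below is deliberately partial: it encodes only the statuses and transitions that appear in the steps above (the media buy lifecycle specification defines the full state machine), and the reply shape for rejected transitions is an assumption.

```typescript
// Only the statuses that appear in the walkthrough above.
type MediaBuyStatus = "pending_creatives" | "pending_start" | "active" | "rejected";

// Partial table: transitions not shown in the walkthrough are omitted.
const ALLOWED_NEXT: Record<MediaBuyStatus, MediaBuyStatus[]> = {
  pending_creatives: ["pending_start", "rejected"],
  pending_start: ["active", "rejected"],
  active: [],   // rejected is not reachable from active
  rejected: [], // terminal
};

type ForceReply =
  | { success: true; previous_state: MediaBuyStatus; current_state: MediaBuyStatus }
  | { success: false; error: "INVALID_TRANSITION"; current_state: MediaBuyStatus };

function forceMediaBuyStatus(current: MediaBuyStatus, target: MediaBuyStatus): ForceReply {
  // Forcing the current status is an idempotent success.
  if (current === target || ALLOWED_NEXT[current].indexOf(target) !== -1) {
    return { success: true, previous_state: current, current_state: target };
  }
  return { success: false, error: "INVALID_TRANSITION", current_state: current };
}
```

In a real seller implementation the table would live in the production state machine, with the controller calling into it rather than keeping its own copy.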

Example: delivery and budget verification

1. create_media_buy(budget: $1000)
2. simulate_delivery(impressions: 10000, reported_spend: $500)
3. get_media_buy_delivery() → verify delivery reflects simulated data
   (reported_spend is delivery-only; does not affect account budget)
4. simulate_budget_spend(spend_percentage: 95)
5. get_account_financials() → verify total_spend reflects 95% ($950, not $500 from delivery)
6. simulate_budget_spend(spend_percentage: 100)
7. force_account_status(account_id, "payment_required")
8. create_media_buy() → expect ACCOUNT_PAYMENT_REQUIRED
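The arithmetic in steps 2 through 5 depends on delivery spend and account spend being tracked separately. A minimal sketch, assuming spend_percentage is measured against the media buy budget (consistent with the $950 figure in step 5); the SandboxLedger shape and function names are hypothetical.

```typescript
interface SandboxLedger {
  budget: number;                // media buy budget in dollars
  deliveryImpressions: number;   // read back via get_media_buy_delivery
  deliveryReportedSpend: number; // read back via get_media_buy_delivery
  accountTotalSpend: number;     // read back via get_account_financials
}

// simulate_delivery: touches delivery figures only, never account spend.
function recordDelivery(ledger: SandboxLedger, impressions: number, reportedSpend: number): void {
  ledger.deliveryImpressions += impressions;
  ledger.deliveryReportedSpend += reportedSpend;
}

// simulate_budget_spend: sets account spend to a percentage of the budget.
function setBudgetSpend(ledger: SandboxLedger, spendPercentage: number): void {
  ledger.accountTotalSpend = (ledger.budget * spendPercentage) / 100;
}
```

With a $1000 budget, recording $500 of delivery and then setting spend to 95% leaves the delivery figure at $500 and the account total at $950.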

Certification tiers

| Tier | Requirement | What it proves |
| --- | --- | --- |
| Functional compliance | Pass all storyboards in observational mode | Tools exist, respond correctly, and complete buyer-initiated flows |
| Stateful compliance | Pass all storyboards in deterministic mode | State machines enforce correct transitions, error codes match spec, operation gates block correctly |
Specialism-scoped seed requirements. Stateful compliance also requires that sellers implement the seed_* scenarios covering the specialisms they certify against. The UNKNOWN_SCENARIO → not_applicable grading is for honest coverage reporting on missing surface area, not a blanket opt-out from conformance. A seller certifying sales-non-guaranteed MUST implement at least seed_product and seed_pricing_option; a seller certifying creative-ad-server MUST implement seed_creative; a seller certifying governance-delivery-monitor MUST implement seed_plan (and seed_media_buy where the storyboard requires it). The storyboard authors in static/compliance/source/specialisms/ declare the fixtures their storyboards need; sellers match that list to the specialisms on their cert.

Implementation guidance

For sellers

  1. Gate comply_test_controller at the deployment level — it MUST NOT appear in tools/list (or A2A skills[]), MUST NOT be advertised via the compliance_testing capability block, and MUST dispatch to unknown-tool on production deployments. See Sandbox gating for the full rule.
  2. Reuse your production state machine logic — the controller should call the same internal transition functions, not bypass them
  3. Enforce transition rules — if rejected is terminal in production, force_media_buy_status(rejected → active) must fail via the controller too
  4. Reflect changes immediately — after a forced transition, the next list_* or get_* call must return the updated state
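Deployment-level gating (item 1) can be sketched as two small functions: one that decides what to advertise and one that dispatches. Everything beyond the tool name is an assumption for illustration, including the Environment type, the handler registry, and the shape of the unknown-tool error.

```typescript
type Environment = "production" | "staging" | "development";

const CONTROLLER_TOOL = "comply_test_controller";

// The controller is advertised only outside production.
function advertisedTools(baseTools: string[], env: Environment): string[] {
  return env === "production" ? baseTools.slice() : baseTools.concat([CONTROLLER_TOOL]);
}

type ToolHandler = (args: unknown) => unknown;

// On production the controller dispatches exactly like a tool that does not exist.
function dispatchTool(
  tool: string,
  args: unknown,
  env: Environment,
  handlers: Record<string, ToolHandler>
): unknown {
  const gated = tool === CONTROLLER_TOOL && env === "production";
  const registered = Object.prototype.hasOwnProperty.call(handlers, tool);
  if (gated || !registered) {
    return { error: "unknown_tool", tool }; // hypothetical error shape
  }
  return handlers[tool](args);
}
```

Gating at dispatch as well as at advertisement matters: a caller that guesses the tool name on production should get the same answer as for any other unregistered tool.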

For compliance testers

  1. Detect the tool during profile discovery via tools/list
  2. Call list_scenarios to discover which scenarios are supported
  3. Run observational mode as the baseline — it works everywhere
  4. Layer deterministic scenarios on top when the controller is available
  5. Report which mode was used and distinguish coverage gaps from failures
  6. Test the controller’s transition validation itself — invalid transitions should return INVALID_TRANSITION, not silently succeed
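Items 1 and 5 of this list, detecting the tool and reporting the mode with gaps kept apart from failures, might be sketched as follows. The RunSummary shape and the outcome labels are hypothetical; the only spec-derived detail is the comply_test_controller tool name.

```typescript
type RunMode = "observational" | "deterministic";
type StepOutcome = "pass" | "coverage_gap" | "compliance_failure";

interface RunSummary {
  mode: RunMode;
  passed: number;
  coverageGaps: number; // reported separately, never counted as failures
  failures: number;
}

// Observational mode is the baseline; the controller's presence upgrades it.
function selectMode(toolNames: string[]): RunMode {
  return toolNames.indexOf("comply_test_controller") !== -1 ? "deterministic" : "observational";
}

function summarize(mode: RunMode, outcomes: StepOutcome[]): RunSummary {
  const count = (kind: StepOutcome): number => outcomes.filter((o) => o === kind).length;
  return {
    mode,
    passed: count("pass"),
    coverageGaps: count("coverage_gap"),
    failures: count("compliance_failure"),
  };
}
```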

Design decisions

  1. Sellers validate transition ordering. The controller enforces the same state machine rules as production. Calling force_creative_status(approved) on a creative that was never processing is an error — the controller rejects it just as production would. The lifecycle state machines referenced here are defined in the respective protocol specifications (see creative lifecycle, account lifecycle, media buy lifecycle, SI session lifecycle).
  2. Tests are self-contained. Each test SHOULD create dedicated entities (media buys, creatives, accounts) rather than reusing existing ones. This ensures additive simulation calls (simulate_delivery) start from a known-zero state, so no reset scenario is needed. Compliance testers SHOULD use unique identifiers (e.g., UUIDs) for test entities to avoid collisions when multiple storyboard runner instances run against the same sandbox concurrently. Sandbox entity cleanup (e.g., TTL-based expiration) is the seller’s responsibility.
  3. Delivery simulation uses a synthetic marker. simulate_delivery records MAY include a synthetic: true field that sellers can use internally for bookkeeping. The runner ignores this marker — it validates get_media_buy_delivery responses against the same schema regardless. This lowers the implementation bar for sellers without affecting test correctness.
  4. One tool, many scenarios. The single-tool design keeps context window cost to ~500 tokens vs ~1,400 for seven separate tools. Sellers implement one sandbox gate. The runner detects one tool. The list_scenarios introspection handles partial implementations without requiring per-tool presence detection.