Impression Tracker Implementation Reference

This page is non-normative reference content for the impression tracker that sits behind the Frequency-Cap Data Flow boundary. The protocol only constrains:

The wire spec — see the TMP specification.
The conformance invariants the Identity Match service must satisfy — also normative in the TMP specification.
The cap-fire boundary contract — defined in Frequency-Cap Data Flow.

Everything on this page is buyer-internal: how the impression tracker counts impressions, deduplicates across resolved identities, evaluates windows, and decides when a cap fires. Buyers running a conformant impression tracker may pick any approach that produces correct cap-fire events at the boundary. This page documents one such approach — the one implemented in adcp-go/targeting — so other implementers have a worked reference.

The cross-identity dedup problem

A single impression on a user is often resolved to multiple identities (RampID, ID5, MAID, UID2, publisher-issued tokens, etc.) inside the same TMPX. A naive impression tracker that counts per identity will count one impression two or three times against the user’s caps. If the buyer runs an identity graph, the buyer can canonicalize identities before counting; if the buyer is graphless or partially graphed (common: Scope3’s hosted Identity Match is graphless), no canonical id exists.

Counter-based approaches paper over this with a merge_rule (MAX / OR / SUM) when reading per-identity counters. None of the merge rules is correct in general. The pathological case is identity resolution toggling across impressions: some impressions resolve rampid only, some resolve id5 only, and some resolve both. A MAX-merged counter under-counts whenever every identity has missed at least one impression; SUM over-counts every impression that resolves more than one identity; OR can’t represent more than one. The cap fires at the wrong time in each case.

The reference impl avoids the merge-rule problem entirely with an impression_id scheme: one id per impression, written to every resolved identity’s log, deduplicated by id at read time. The count is exact regardless of whether identities are canonicalized upstream.
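The difference between merge rules and id-based dedup can be illustrated with a small simulation. This is a minimal sketch, not the reference implementation: the Impression type, the helper names, and the four-impression stream are hypothetical, and the stream is chosen so that each identity misses at least one impression.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Impression:
    impression_id: str
    resolved: tuple  # identity names resolved for this impression


def per_identity_counters(impressions):
    """Counter-based tracking: one counter per identity, no impression ids."""
    counters = {}
    for imp in impressions:
        for identity in imp.resolved:
            counters[identity] = counters.get(identity, 0) + 1
    return counters


def dedup_count(impressions):
    """Log-based tracking: write the id to every resolved identity's log,
    then count distinct ids across all logs at read time."""
    logs = {}
    for imp in impressions:
        for identity in imp.resolved:
            logs.setdefault(identity, []).append(imp.impression_id)
    distinct = set()
    for entries in logs.values():
        distinct.update(entries)
    return len(distinct)


# Hypothetical stream: resolution toggles between rampid-only, id5-only, and both.
stream = [
    Impression("imp-1", ("rampid",)),
    Impression("imp-2", ("id5",)),
    Impression("imp-3", ("rampid", "id5")),
    Impression("imp-4", ("rampid",)),
]

counters = per_identity_counters(stream)  # rampid=3, id5=2
max_merge = max(counters.values())        # under-counts the 4 impressions
sum_merge = sum(counters.values())        # over-counts the 4 impressions
exact = dedup_count(stream)               # distinct impression_ids
```

With four real impressions, the MAX merge reports 3, the SUM merge reports 5, and the id-based count reports 4.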

impression_id rules

The impression tracker generates one impression_id per impression at TMPX decode time and writes it to every resolved identity’s log. At read time, scanning all of a user’s identity logs and deduplicating by impression_id recovers the distinct-impression count exactly. Required properties:

Globally unique across all sellers, sources, and time. A buyer agent serves impressions sourced from many sellers. Collisions across sellers would silently merge distinct impressions and under-count the cap. Use UUIDv4 (≥122 bits randomness) or an equivalent collision-resistant generator.
Generated by the buyer’s impression tracker at TMPX decode — not by the seller, the publisher, the router, or the TMPX nonce. The TMPX nonce is per-Identity-Match-evaluation and shared across all impressions in the serve window; seller- or publisher-supplied IDs would collide.
One id per impression, written to ALL of the user’s resolved identity logs for that impression. Generating a different id per identity breaks the dedup contract — the same impression would count once per resolved identity.
Pixel retries are a separate concern. The same pixel firing twice (network retry, page refresh, etc.) must not mint two impression_ids, because each new id counts as a new impression against the cap. Either dedupe incoming requests by an idempotency key in the pixel URL or an Idempotency-Key header, or accept a small over-count from retries as benign for fcap purposes. Cross-identity dedup and per-pixel idempotency are different problems with different mitigations. (The lowercase “must not” is deliberate: this page is non-normative, and the boundary contract on the Frequency-Cap Data Flow page is what conformance tests cite.)
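The rules above can be sketched as follows. This is an illustrative outline under stated assumptions: ImpressionIdMinter and record_impression are hypothetical names, the idempotency cache is an in-process dict (a real deployment would presumably need a shared store with a TTL), and the logs hold bare ids rather than full exposure entries.

```python
import uuid


class ImpressionIdMinter:
    """Mints one UUIDv4 per impression; replays of the same pixel reuse the id."""

    def __init__(self):
        # Hypothetical in-memory idempotency cache keyed by the pixel's
        # idempotency key. Not part of the reference implementation.
        self._seen = {}

    def mint(self, idempotency_key=None):
        """Return (impression_id, is_new)."""
        if idempotency_key is not None and idempotency_key in self._seen:
            return self._seen[idempotency_key], False  # retry, not a new impression
        impression_id = str(uuid.uuid4())
        if idempotency_key is not None:
            self._seen[idempotency_key] = impression_id
        return impression_id, True


def record_impression(logs, minter, resolved_identities, idempotency_key=None):
    """Write ONE impression_id to every resolved identity's log."""
    impression_id, is_new = minter.mint(idempotency_key)
    if is_new:
        for identity in resolved_identities:
            logs.setdefault(identity, []).append(impression_id)
    return impression_id
```

A retried pixel carrying the same idempotency key returns the original id and writes nothing, so it cannot count twice against the cap.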

fcap_keys label model

Caps are tagged with dimension:value labels at impression-write time. Packages declare which labels they map to; fcap policies attach a window and a max_impression_count to each label.

package 2342:                   fcap_keys ["campaign:42", "campaign_group:7", "advertiser:13"]
policy "campaign:42":           {window: {interval: 10, unit: "minutes"}, max_impression_count: 5}
policy "campaign_group:7":      {window: {interval: 1,  unit: "days"},    max_impression_count: 50}
policy "advertiser:13":         {window: {interval: 1,  unit: "days"},    max_impression_count: 20}

When the impression tracker writes an exposure for an impression on package 2342, the entry’s fcap_keys is ["campaign:42", "campaign_group:7", "advertiser:13"]. When evaluating whether a cap has fired, it scans the log for entries matching each label within that policy’s window.

Window unit is load-bearing, not just human-readable shorthand. The reference impl uses unit as the sliding-window bucket size: unit: "hours" evaluates against hourly buckets; unit: "minutes" evaluates against minute buckets. Two policies that look duration-equivalent, {interval: 2, unit: "hours"} and {interval: 120, unit: "minutes"}, have the same window length but different post-cap re-evaluation cadence. After a user hits the cap under the hourly-bucket policy, the next eligibility check that can admit new traffic happens at the next hour boundary; under the minute-bucket policy, it happens at the next minute boundary. Pick unit to match the re-evaluation cadence you want, not merely the unit that expresses the duration most conveniently.

Charset constraint. Each segment matches [a-zA-Z0-9_-]+ so the : delimiter is unambiguous. URL-bearing or otherwise colon-bearing values must be hashed or shortened. Multi-tenant operators typically adopt a tenant prefix (buyer-acme:campaign:42) as a deployment convention to prevent key collisions across advertiser orgs on shared state. This is operator policy, not protocol.

Why labels, not hierarchy. Cap dimensions are heterogeneous across customers: some cap at creative, some at line item, some at advertiser roll-up. A fixed schema either over-prescribes or under-serves. Labels also make cross-seller caps automatic: any policy whose key is shared across sellers (e.g., buyer-acme:advertiser:13) enforces across all of them with no extra mode. Cross-cutting policies are explicit: a campaign that needs both per-campaign and per-advertiser caps declares both keys and gets two policy lookups.
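The charset constraint and the tenant-prefix convention can be sketched as a small key builder. The function names are hypothetical, the requirement of at least two segments is an assumption drawn from the dimension:value shape, and the choice of a 16-character SHA-256 hex prefix for out-of-charset values is only one possible way to "hash or shorten".

```python
import hashlib
import re

# Per-segment charset from the label model.
SEGMENT = re.compile(r"[a-zA-Z0-9_-]+")


def is_valid_fcap_key(key):
    """True when the key has at least two ':'-separated segments
    (an assumption) and every segment matches the charset."""
    segments = key.split(":")
    return len(segments) >= 2 and all(SEGMENT.fullmatch(s) for s in segments)


def safe_segment(value):
    """Return the value unchanged if it fits the charset; otherwise replace
    it with a short hex digest (hypothetical shortening scheme)."""
    if SEGMENT.fullmatch(value):
        return value
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]


def make_fcap_key(dimension, value, tenant=None):
    """Build 'dimension:value', optionally prefixed with a tenant segment."""
    parts = [dimension, safe_segment(value)]
    if tenant is not None:
        parts.insert(0, tenant)
    key = ":".join(parts)
    if not is_valid_fcap_key(key):
        raise ValueError(f"invalid fcap_key: {key!r}")
    return key
```

For example, make_fcap_key("advertiser", "13", tenant="buyer-acme") yields buyer-acme:advertiser:13, while a URL-valued segment is replaced by a digest so the delimiter stays unambiguous.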

Reference data model (valkey-backed, log-based)

The layout below is what adcp-go/targeting uses. Any backend (Aerospike, DynamoDB, in-memory, anything) is fine; the data shape is the reference, not a requirement.

Exposure log (per identity)

type:  STRING  (binary-encoded []ExposureEntry, lazy-pruned to window)
key:   user:exposures:{HashToken(uid_type + ":" + user_token)}
value: [
  { impression_id, fcap_keys[], timestamp },
  ...
]

HashToken is a 16-byte SHA-256 prefix, hex-encoded. Binary entry encoding keeps the log compact (exposure_binary.go) — a 30-day log for a typical user is a few KB. Each entry records:

impression_id — generated at TMPX decode. Same value across all of this impression’s identity logs.
fcap_keys[] — the labels this impression counts toward.
timestamp — unix seconds.
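A minimal sketch of the key derivation described above, assuming UTF-8 encoding of the input and reading "16-byte SHA-256 prefix, hex-encoded" as the first 16 digest bytes rendered as 32 hex characters. The function names are illustrative and are not taken from adcp-go.

```python
import hashlib


def hash_token(uid_type, user_token):
    """First 16 bytes of SHA-256 over 'uid_type:user_token', hex-encoded."""
    material = f"{uid_type}:{user_token}".encode("utf-8")  # encoding is an assumption
    return hashlib.sha256(material).digest()[:16].hex()


def exposure_log_key(uid_type, user_token):
    """Store key for one identity's exposure log."""
    return f"user:exposures:{hash_token(uid_type, user_token)}"
```

Because the hash covers uid_type as well as the token, the same token value under two identity types maps to two separate logs.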

Fcap policy (per fcap_key)

type:  STRING  (JSON-encoded FcapPolicy)
key:   fcap_policy:{fcap_key}
value: { window: {interval, unit}, max_impression_count, active, updated_at }

The sliding window is applied at read time by counting log entries that fall in the current and prior buckets that span the window. Bucket size is derived from window.unit (minutes/hours/days/weeks/months); window length is interval × unit. Production uses this bucket-level filter rather than a per-second >= filter on entry timestamps, which makes re-evaluation cadence after a cap fires predictable from the policy’s unit.
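One way to read the bucket semantics is sketched below. It rests on an assumption that the page does not spell out: the window is taken to cover the current bucket plus the interval - 1 buckets before it, with buckets aligned to the unix epoch. Months are omitted because calendar-length handling is not specified here. The names window_bounds and count_in_window are hypothetical.

```python
# Fixed-length units only; "months" is left out of this sketch.
UNIT_SECONDS = {"minutes": 60, "hours": 3600, "days": 86400, "weeks": 604800}


def window_bounds(now, interval, unit):
    """Return (start, end) in unix seconds for the bucket-aligned window.

    Assumption: the window is the current bucket plus the (interval - 1)
    buckets before it. `end` is exclusive and equals the end of the
    current bucket.
    """
    size = UNIT_SECONDS[unit]
    current_bucket_start = (now // size) * size
    start = current_bucket_start - (interval - 1) * size
    end = current_bucket_start + size
    return start, end


def count_in_window(timestamps, now, interval, unit):
    """Count entries whose timestamp falls inside the bucket-aligned window."""
    start, end = window_bounds(now, interval, unit)
    return sum(1 for ts in timestamps if start <= ts < end)
```

At now = 9000 (02:30 after the epoch), the policy {interval: 2, unit: "hours"} spans 3600 to 10800, while {interval: 120, unit: "minutes"} spans 1860 to 9060. Both windows are 7200 seconds long, but the first rolls forward at the next hour boundary and the second at the next minute boundary.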

Package configuration (per package)

type:  STRING  (JSON-encoded PackageConfig)
key:   package:identity:{package_id}
value: {
  fcap_keys: ["campaign:42", "advertiser:13"],
  active:    true,
  updated_at: <unix seconds>
}

Maps package → fcap_keys. The impression tracker reads this to figure out which labels to tag a new exposure with.

Write path: pixel → log

On a TMPX-bearing pixel fire, the impression tracker:

Decodes the TMPX (HPKE decrypt + binary parse) → resolved identities + (seller_agent_url, package_id) context.
Looks up the package’s fcap_keys.
Generates one impression_id.
For each resolved identity, appends {impression_id, fcap_keys, timestamp} to user:exposures:{hash(identity)}. Prunes entries older than the longest active window (default 30 days).

The read-modify-write per identity is not atomic in the reference impl (engine.go:478) — concurrent writes for the same user can lose an exposure. The reference impl explicitly accepts this; under-counting under contention is benign for fcap purposes. Atomic append via Lua or a Store.Append extension is a deferred optimization.
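The write path can be sketched against a plain dict standing in for the store. This is an illustration only: TMPX decoding is omitted (the resolved identity log keys are passed in directly), entries are dicts rather than the binary encoding, and the function signature is hypothetical. The read-modify-write is deliberately non-atomic, mirroring the trade-off described above.

```python
import time
import uuid

THIRTY_DAYS = 30 * 86400  # default pruning horizon, in seconds


def write_exposure(store, package_fcap_keys, package_id, identity_log_keys,
                   now=None, retention=THIRTY_DAYS):
    """Append one exposure entry to every resolved identity's log.

    store:             dict standing in for the key-value backend
    package_fcap_keys: dict mapping package_id -> list of fcap_keys
    identity_log_keys: store keys of the resolved identities' logs
    """
    now = int(time.time()) if now is None else now
    fcap_keys = package_fcap_keys[package_id]   # look up the package's labels
    impression_id = str(uuid.uuid4())           # one id for the whole impression
    for log_key in identity_log_keys:
        # Non-atomic read-modify-write: a concurrent writer could lose an entry.
        log = store.get(log_key, [])
        log = [e for e in log if e["timestamp"] > now - retention]  # prune old entries
        log.append({
            "impression_id": impression_id,
            "fcap_keys": list(fcap_keys),
            "timestamp": now,
        })
        store[log_key] = log
    return impression_id
```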

Evaluating whether this impression exhausted a cap

After writing the exposure, the impression tracker decides whether any cap just fired. A package typically maps to multiple fcap_keys (campaign, campaign_group, advertiser, …), each with its own policy. Policies are evaluated independently, and the cap fires when any one of them reaches max_impression_count within its window. A user can be capped on a package by the per-campaign policy without ever approaching the per-advertiser policy, or vice versa.

For each fcap_key on the exposure, the impression tracker scans the user’s identity logs:

Read user:exposures:{h} for every resolved identity.
Filter entries to those that fall in the current+prior buckets spanning policy.window and where fcap_key ∈ entry.fcap_keys.
Deduplicate by impression_id across all the user’s identity logs.
Compare the deduped count to policy.max_impression_count.

If any policy’s deduped count is >= max_impression_count, the cap fired on this impression. The impression tracker then writes a cap-fire entry to the Identity Match cap-state store for every (user_identity, package_id) whose package maps to the exhausted fcap_key. The expiration is the end of the current bucket of policy.window (which is when the oldest in-scope exposure ages out under bucket semantics).

For a cap on an advertiser-level label (advertiser:13) that maps to multiple packages on multiple sellers, the impression tracker emits one cap-fire entry per (user_identity, seller_agent_url, package_id) affected. The boundary contract is package-scoped, so cross-dimensional caps fan out at write time.
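The evaluation and fan-out steps can be sketched as below. The data shapes are assumptions made for illustration: each policy carries a precomputed (start, end) window in unix seconds instead of {interval, unit}, logs are lists of dicts, and packages_by_fcap_key is a hypothetical reverse index from label to (seller_agent_url, package_id) pairs.

```python
def deduped_count(logs, fcap_key, window_start, window_end):
    """Distinct impression_ids tagged with fcap_key inside [start, end)."""
    seen = set()
    for entries in logs:
        for e in entries:
            in_window = window_start <= e["timestamp"] < window_end
            if in_window and fcap_key in e["fcap_keys"]:
                seen.add(e["impression_id"])
    return len(seen)


def evaluate_caps(identity_logs, exposure_fcap_keys, policies):
    """Return {fcap_key: expire_at} for every policy that has fired.

    Each policy is evaluated independently; expire_at is taken to be the
    end of the policy's current window.
    """
    fired = {}
    for key in exposure_fcap_keys:
        policy = policies.get(key)
        if policy is None:
            continue
        start, end = policy["window"]
        count = deduped_count(identity_logs.values(), key, start, end)
        if count >= policy["max_impression_count"]:
            fired[key] = end
    return fired


def fan_out(fired, identities, packages_by_fcap_key):
    """One cap-fire entry per (identity, seller_agent_url, package_id)."""
    entries = []
    for key, expire_at in fired.items():
        for identity in identities:
            for seller, package_id in packages_by_fcap_key.get(key, []):
                entries.append((identity, seller, package_id, expire_at))
    return entries
```

A label shared by packages on several sellers produces one entry per package per resolved identity, which is how a single advertiser-level cap reaches every affected package.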

SDK primitives

The SDK ships impression handling as two composable functions, not one bundled call. Production tracking endpoints typically decode at intake and let a downstream worker write the store at its own pace; bundling decode+write into a single function would force synchronous topology and prevent buffering.

decodeTmpx(raw_tmpx) -> DecodedExposures
  Decrypts HPKE ciphertext, parses the published TMPX binary format
  (/docs/trusted-match/specification#binary-format), returns the resolved
  identity entries in a structured form ready for serialization onto a
  topic or for direct write. The persistent per-identity exposure log
  is a separate, store-resident structure — see Reference data model above.

writeExposure(decoded, fcap_keys, store_context) -> { ok, fired_caps }
  Appends entries to each resolved identity's exposure log with a fresh
  impression_id and the supplied fcap_keys. Prunes entries older than the
  longest active window. Returns the set of caps that fired on this
  impression — the caller fans these out to the Identity Match cap-state
  store.

Plus the buyer-side management plane:

upsertPackage(seller_agent_url, package_id, fcap_keys, opts)
upsertFcapPolicy(fcap_key, {window: {interval, unit}, max_impression_count})
inspectExposures(uid_type, user_token, fcap_key?)   // debugging helper

Plus HPKE encrypt/decrypt as net-new SDK primitives (X25519 KEM, ChaCha20-Poly1305, HKDF-SHA256 per RFC 9180 mode_base). Encrypt is needed by the Identity Match service emitting TMPX; decrypt by the impression tracker invoking decodeTmpx. The same surface ships in @adcp/client (TS), adcp-go, and adcp (Python).

Primitive names are illustrative. decodeTmpx, writeExposure, upsertPackage, upsertFcapPolicy, and inspectExposures describe the shape of the SDK surface; canonical signatures land with the corresponding SDK RFCs and may differ in naming or argument order. Treat this section as the impression-tracker decomposition, not as an API contract.

Production topology pattern

A typical Scope3-style deployment:

publisher pixel fires {TMPX} → tracking endpoint
                                      │
                          decodeTmpx (synchronous, at intake)
                                      │
                                      ▼
                              pub/sub topic
                                      │
                          frequency_writer worker
                                      │
                          writeExposure (asynchronous)
                                      │
                                      ▼
                              valkey (exposure log)
                                      │
                          if cap fired → RecordCap to
                                         Identity Match cap-state store

Decode at intake; emit to pub/sub for buffering; downstream worker writes the exposure log and emits any cap-fire events. Buffering, retries, dedup, observability, and abuse protection live at the queue layer — none of that is the SDK’s job. A simpler synchronous pipeline (decode + write in the same handler) is also valid for low-volume deployments.
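The decode-then-buffer shape can be sketched with an in-process queue standing in for the pub/sub topic. Everything here is a stand-in: decode_stub parses a made-up "identities|package" string instead of performing HPKE decryption, the impression ids are sequential placeholders rather than UUIDs, and the store is a dict.

```python
import queue
import threading


def decode_stub(raw_tmpx):
    """Stand-in for decodeTmpx. Parses a hypothetical 'id1,id2|package'
    string; a real tracker would HPKE-decrypt and parse the binary format."""
    identities, package_id = raw_tmpx.split("|")
    return {"identities": identities.split(","), "package_id": package_id}


def intake(raw_tmpx, topic):
    """Tracking endpoint: decode synchronously, then hand off to the topic."""
    topic.put(decode_stub(raw_tmpx))


def frequency_writer(topic, store):
    """Downstream worker: drains the topic and writes exposure logs."""
    counter = 0
    while True:
        decoded = topic.get()
        if decoded is None:  # sentinel: shut down
            break
        counter += 1
        impression_id = f"imp-{counter}"  # placeholder id for the sketch
        for identity in decoded["identities"]:
            store.setdefault(identity, []).append(
                (impression_id, decoded["package_id"]))


def run_pipeline(raw_events):
    """Wire intake and worker together and return the resulting store."""
    topic = queue.Queue()
    store = {}
    worker = threading.Thread(target=frequency_writer, args=(topic, store))
    worker.start()
    for raw in raw_events:
        intake(raw, topic)
    topic.put(None)
    worker.join()
    return store
```

Because the queue sits between decode and write, the worker can fall behind or restart without the endpoint blocking on store latency.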

Conformance scenarios

These walk through impression-tracker behavior end-to-end. They are buyer-internal mechanics; the on-wire observable is whatever cap-fire entries land in the Identity Match cap-state store, which surfaces as eligibility decisions in later identity_match_request calls.

Setup for Scenario A (Scenario B declares its own packages and policy): package = "pkg-42" on seller-a.example, fcap_keys: ["campaign:42"], policy campaign:42 = {window: {interval: 1, unit: "days"}, max_impression_count: 5}.

Scenario A — multi-identity dedup

User has two resolved identities across the impression stream: rampid:abc and id5:def. Identity resolution toggles: most impressions resolve both, but one resolves rampid only.

For imp-001, imp-002, and imp-003, TMPX resolves both identities. Each impression writes the same impression_id to both logs:

user:exposures:<hash(rampid:abc)> = [ imp-001, imp-002, imp-003 ]
user:exposures:<hash(id5:def)>    = [ imp-001, imp-002, imp-003 ]

imp-004 — TMPX resolves rampid only (id5 lookup fails). imp-004 is written to rampid’s log only:

user:exposures:<hash(rampid:abc)> = [ imp-001..imp-004 ]
user:exposures:<hash(id5:def)>    = [ imp-001..imp-003 ]    unchanged

For imp-005, TMPX resolves both identities again, and imp-005 is written to both logs.

The impression tracker then evaluates the cap by reading both resolved-identity logs:

rampid:abc log: { imp-001, imp-002, imp-003, imp-004, imp-005 }   = 5 entries
id5:def log:    { imp-001, imp-002, imp-003,           imp-005 }   = 4 entries

Union the entries across logs, deduplicate by impression_id:

{ imp-001, imp-002, imp-003, imp-004, imp-005 } = 5 distinct impressions

5 = max_impression_count → the cap is exhausted by this impression. Since both identities are resolved on imp-005, the impression tracker emits cap-fire entries for both:

RecordCap(rampid:abc, [{seller-a.example, pkg-42}], expire_at)
RecordCap(id5:def,    [{seller-a.example, pkg-42}], expire_at)

Two things are demonstrated:

Dedup matters. Naively summing per-identity counts gives 5 + 4 = 9, well over max_impression_count. Dedup by impression_id recovers the correct count of 5.
Identity-resolution stability isn’t required. imp-004 missed id5’s log entirely; dedup at evaluation time still produces the right answer when both identities are next resolved together.

A counter-based tracker with a MAX merge_rule would see counters max(rampid=5, id5=4) = 5 here. That is coincidentally correct at this point, but only because rampid happened to be resolved on every impression. A later impression that resolves id5 only would leave rampid at 5 and bring id5 to 5 while the true count is 6; MAX would still report 5, under-counting by one and over-serving accordingly. SUM (= 9 here) errs in the opposite direction and over-counts. The log + impression_id dedup is correct by construction.

A consequence to flag for the implementer: if a future query resolves only id5:def, the cap-state lookup hits the id5:def entry written at imp-005 and the user is correctly suppressed. If neither identity gets resolved in a future query, no cap-state lookup happens at all. That is an identity-resolution problem upstream of fcap, not an fcap correctness problem.
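Scenario A can be replayed in a few lines to check the arithmetic. The helper names are hypothetical, and the sixth impression (resolving id5 only) is an illustrative extension rather than part of the scenario as stated.

```python
def replay(resolutions):
    """Build per-identity logs from (impression_id, [identities]) pairs."""
    logs = {}
    for impression_id, identities in resolutions:
        for identity in identities:
            logs.setdefault(identity, []).append(impression_id)
    return logs


def merged_counts(logs):
    """Compare SUM and MAX merges of per-identity counts with id-based dedup."""
    per_identity = [len(entries) for entries in logs.values()]
    distinct = set().union(*logs.values())
    return {"sum": sum(per_identity), "max": max(per_identity), "dedup": len(distinct)}


BOTH = ["rampid:abc", "id5:def"]
scenario_a = [
    ("imp-001", BOTH),
    ("imp-002", BOTH),
    ("imp-003", BOTH),
    ("imp-004", ["rampid:abc"]),  # id5 lookup fails
    ("imp-005", BOTH),
]

after_five = merged_counts(replay(scenario_a))

# Hypothetical sixth impression that resolves id5 only.
after_six = merged_counts(replay(scenario_a + [("imp-006", ["id5:def"])]))
```

After five impressions the counts are sum 9, max 5, dedup 5. After the hypothetical sixth they are sum 10, max 5, dedup 6, so only the dedup count tracks the true number of impressions.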

Scenario B — cross-seller advertiser cap

Two packages on different sellers, both mapped to the same advertiser-level label:

package:identity:pkg-A = { fcap_keys: ["advertiser:13"], active: true }   // seller-a
package:identity:pkg-B = { fcap_keys: ["advertiser:13"], active: true }   // seller-b
fcap_policy:advertiser:13 = { window: {interval: 1, unit: "days"}, max_impression_count: 10 }

Ten impressions are served on pkg-A from seller-a. Each exposure entry’s fcap_keys includes advertiser:13. At the 10th write, the deduped count for advertiser:13 reaches max_impression_count. The impression tracker emits cap-fire entries for every package mapped to advertiser:13 across all sellers, for every resolved identity:

RecordCap(<identity>, [
  {seller-a.example, pkg-A},
  {seller-b.example, pkg-B},
], expire_at)

A subsequent identity_match_request from seller-b for pkg-B returns eligible_package_ids: [] because the cap-state entry is present. The advertiser-level cap enforces across sellers because the fcap_key is shared. No cross-seller coordination is required at the Identity Match service: the buyer agent’s impression tracker is the single source of truth, and the cap-state store is the publication channel.
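The query-side effect of Scenario B can be sketched as a presence check against the cap-state store. The store layout is not described on this page, so the dict keyed by (identity, seller_agent_url, package_id) and the function names below are assumptions made for illustration only.

```python
def record_cap(cap_state, identity, targets, expire_at):
    """Write one cap-state entry per (seller_agent_url, package_id) target."""
    for seller, package_id in targets:
        cap_state[(identity, seller, package_id)] = expire_at


def eligible_package_ids(cap_state, identities, seller, package_ids, now):
    """Packages with no unexpired cap-state entry for any resolved identity."""
    eligible = []
    for package_id in package_ids:
        capped = any(
            cap_state.get((identity, seller, package_id), 0) > now
            for identity in identities
        )
        if not capped:
            eligible.append(package_id)
    return eligible


# Replay of the fan-out: impressions served on seller-a cap pkg-B on seller-b too.
cap_state = {}
record_cap(
    cap_state,
    "rampid:abc",
    [("seller-a.example", "pkg-A"), ("seller-b.example", "pkg-B")],
    expire_at=2000,
)
during_cap = eligible_package_ids(cap_state, ["rampid:abc"], "seller-b.example", ["pkg-B"], now=1000)
after_expiry = eligible_package_ids(cap_state, ["rampid:abc"], "seller-b.example", ["pkg-B"], now=2000)
```

While the entry is live, the seller-b query gets an empty list; once expire_at passes, pkg-B becomes eligible again without any further write.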

Performance reference

Numbers below are from targeting/scale_test.go against the in-memory mock store, single goroutine. They isolate CPU from network. They describe the impression tracker’s evaluation cost: the cost of scanning logs and deciding whether this impression just fired a cap. The Identity Match service’s at-query-time cost is a separate, much smaller cap-state presence check.

Per-eval at write time, varying log size, single identity, single fcap_key:

Prior exposures in user’s log	Eval latency
0	368 ns
100	5.3 µs
1,000	53 µs
10,000	118 µs

Linear scan with binary lazy dedup; sub-millisecond at 10K entries.

Combined load (multi-identity, multi-package eval), varying all dimensions:

packages mapped via fcap_keys	log entries / id	identities	CPU/eval
100	1,000	3	1.0 ms
1,000	1,000	3	7.5 ms ← realistic Scope3-shape load
1,000	10,000	3	58 ms ← pathological tail (heavy users)

CPU scales with the product packages × log_entries × identities. The pathological tail is addressed by the algorithmic optimization in adcp-go#103 (heuristic-gated prefilter bucket; gated at numPackages > 50 to avoid regressions on small requests):

packages	log entries	identities	Before	After	Speedup
1,000	100	3	784 µs	71 µs	11.0×
1,000	1,000	3	7,566 µs	287 µs	26.4×
1,000	10,000	3	57,861 µs	1,500 µs	~38×

Production sizing also depends on valkey round-trip latency, tail behavior under load, and the heavy-user impression-distribution shape. Mock-store CPU is the floor, not the production number.
