CardSearch AI
Methodology · v2026.04.0 · Macro

Macro market-sizing methodology

How we build the macro ladder — what counts as "the US TCG market", where the figures come from, how the monthly research cron revises them, and why no figure is a single point estimate.

1. Category taxonomy

The macro layer is structured as a tree of categories with a parent/child relationship. Pokémon US singles is a child of Pokémon US secondary, which is a child of US TCG secondary, and so on. The tree is the source of truth for which figures can be stacked and which cannot.

Category codes are assigned once and remain stable. Adding a category is a methodology MINOR bump; renaming or restructuring is MAJOR (the UI surfaces the change everywhere the category appears).
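The parent/child relationship can be sketched as a lookup table plus an ancestor walk. The category codes below are illustrative stand-ins, not the real taxonomy:

```python
# Hypothetical category codes; the real taxonomy is the stored tree.
PARENTS = {
    "us-tcg-secondary": None,
    "pokemon-us-secondary": "us-tcg-secondary",
    "pokemon-us-singles": "pokemon-us-secondary",
}

def is_ancestor(child: str, ancestor: str) -> bool:
    """Walk the parent chain to check whether two categories share a branch."""
    node = PARENTS.get(child)
    while node is not None:
        if node == ancestor:
            return True
        node = PARENTS.get(node)
    return False

# Figures stack only along a branch like this one:
is_ancestor("pokemon-us-singles", "us-tcg-secondary")
```

An ancestor check like this is what makes "which figures can be stacked" a mechanical question rather than an editorial one.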

2. Three figure types per category
  • Primary — sealed product moving from publishers/distributors into retail.
  • Secondary (GMV) — the resale market: eBay, TCGplayer, dealers, auction houses, LCS storefronts, online consignment.
  • Revenue / consumer spend — the publisher- or industry-level figure used for context (TPCI revenue, Newzoo global games consumer spend).

The three are stored in distinct rows and never summed. The ladder UI only ever stacks figures of the same type within a parent/child branch.
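A minimal sketch of the stacking rule, assuming hypothetical row dicts with a type field (the real storage schema may differ):

```python
# Hypothetical figure rows; the three figure types are never summed together.
figures = [
    {"category": "pokemon-us-singles", "type": "secondary_gmv", "value": 1.2e9},
    {"category": "pokemon-us-secondary", "type": "secondary_gmv", "value": 2.0e9},
    {"category": "pokemon-us-secondary", "type": "primary", "value": 0.9e9},
]

def stackable(rows) -> bool:
    """A set of figures may stack only if they all share one figure type."""
    return len({r["type"] for r in rows}) == 1

stackable(figures[:2])  # same type: may stack
stackable(figures[1:])  # mixed primary + secondary: never summed
```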

3. Sourced vs modeled

Each figure carries a trust label:

  • Measured — reserved for direct publisher disclosures we ingest (TPCI revenue, eBay 10-K segment).
  • Modeled — everything else, including most secondary GMV figures, since no third party publishes the US TCG secondary GMV. The model is documented and rerunnable.
  • Backcast — reserved for vintage-card historical estimates; not used at the macro layer.
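In code, the trust labels could be modeled as an enum; the member names and string values below are assumptions, not the actual schema:

```python
from enum import Enum

class TrustLabel(Enum):
    """Assumed enum mirroring the three trust labels described above."""
    MEASURED = "measured"   # direct publisher disclosures we ingest
    MODELED = "modeled"     # documented, rerunnable model output
    BACKCAST = "backcast"   # vintage-card historical estimates (not macro)
```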

4. Modeled figures — the bottom-up cross

For US Pokémon Secondary (and the analogous TCG categories), we compute a bottom-up estimate from observed marketplace activity:

  1. Sum measured eBay sold-listing GMV per game per month.
  2. Add observed TCGplayer GMV (derived from TCGCSV daily price tape × volume estimates per SKU).
  3. Add observed graded-card GMV from 130point + auction houses.
  4. Apply a documented coverage factor (eBay covers ~46–55% of total US secondary by our measurement; documented per period and stored on the figure).
  5. Triangulate against any top-down figure that does exist (Circana headline TCG numbers, ICv2 rankings) and widen the confidence band where the two diverge.

The result is a range (low / base / high). The published "value" is always the base; UI components default to showing the band when one exists.
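One plausible reading of steps 1–5, sketched in Python. It simplifies by treating the coverage band as applying to the combined observed channels, whereas the documented factor is anchored to eBay's share; all numbers are illustrative:

```python
def secondary_gmv_range(observed_gmv: float, cov_low: float, cov_high: float):
    """Gross observed marketplace GMV up to a total-market (low, base, high).

    observed_gmv: summed GMV from the channels we can see (steps 1-3).
    cov_low/cov_high: documented coverage band, e.g. 0.46-0.55 -- the share
    of the total market those channels capture. A high coverage share implies
    a small unseen remainder, so it bounds the total from below; a low share
    bounds it from above.
    """
    low = observed_gmv / cov_high
    high = observed_gmv / cov_low
    base = (low + high) / 2
    return low, base, high

# Illustrative only: $500M eBay + $200M TCGplayer + $100M graded observed.
low, base, high = secondary_gmv_range(500e6 + 200e6 + 100e6, 0.46, 0.55)
```

Step 5's triangulation then widens the band: if a top-down figure (Circana, ICv2) falls outside (low, high), the stored band is stretched to cover it rather than averaged away.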

5. The monthly GPT-5 research cron

On the first of every month a job runs /api/cron/csm/macro/refresh. It:

  1. Iterates every (category, year) row not flagged isLocked.
  2. Calls a GPT-5 research agent with a structured prompt asking for any new public publisher figure that updates that row.
  3. If the agent returns a citation, the new figure is auto-applied. The prior value, the new value, the agent's reasoning, and the cited URL are written to csm_macro_figure_revisions as an audit row.
  4. If a figure is operator-locked (e.g. an analyst pinned the row) the cron skips it and notes the skip.

The audit log is exposed under §7 Revision history and is exportable as JSON via /api/market/macro/revisions.
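The cron's control flow can be sketched as follows. Field and function names are hypothetical; the real job runs against the database behind /api/cron/csm/macro/refresh:

```python
def refresh_macro_figures(rows, research_agent, audit_log):
    """Sketch of the monthly refresh loop (schema names are illustrative).

    rows: (category, year) figure rows as dicts.
    research_agent: callable returning {"value", "citation", "reasoning"}
                    or None when no new public figure was found.
    audit_log: list standing in for csm_macro_figure_revisions.
    """
    for row in rows:
        if row.get("isLocked"):
            # Operator-locked rows are skipped, and the skip is noted.
            audit_log.append({"row": row["id"], "action": "skipped_locked"})
            continue
        result = research_agent(row["category"], row["year"])
        if result and result.get("citation"):
            # Auto-apply only when a citation backs the new figure.
            audit_log.append({
                "row": row["id"],
                "prior": row["value"],
                "new": result["value"],
                "reasoning": result.get("reasoning"),
                "citation": result["citation"],
            })
            row["value"] = result["value"]
```

The key property is that no value changes without a citation, and no change happens without an audit row recording both the prior and new values.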

6. Confidence bands

Every figure stores a confidence in [0, 1] and the UI bins it into Very high / High / Moderate / Low / Very low. Confidence reflects:

  • How well-corroborated the figure is by independent sources
  • How recent the underlying disclosure or measurement is
  • How wide the low/high band is relative to the base value
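The binning might look like this; the cut-points are assumptions, since only the five labels are specified:

```python
def confidence_bin(c: float) -> str:
    """Map a confidence in [0, 1] to the five UI bins (cut-points assumed)."""
    bins = [(0.9, "Very high"), (0.7, "High"), (0.5, "Moderate"), (0.3, "Low")]
    for cutoff, label in bins:
        if c >= cutoff:
            return label
    return "Very low"
```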

7. Revision history

Every macro figure revision creates a row in csm_macro_figure_revisions with the prior value, new value, source citation, and methodology version. The full log is exposed under the admin macro page; a public summary appears here on the methodology doc when revisions ship.