Metric Decomposer

Turn vague business metrics into evaluable sub-metrics that autonomous agents can optimize

"If you can't evaluate it, you can't auto-research it." — Karpathy

Core Insight

"Revenue" is not auto-researchable. But "checkout conversion rate at stage 3" is. The metric-decomposer is the systematic process for getting from the first to the second. Decomposition creates evaluability.

Business metrics are too vague for autonomous optimization

Top-level metrics like "Revenue" or "MAU" fail the MART test — they're measurable but not actionable or timely enough for an agent to optimize. The skill decomposes them into leaves that pass all four checks.

M: Measurable. Can we compute it?
A: Actionable. Can we pull a lever?
R: Relevant. Does it answer the question?
T: Timely. Fast enough to iterate?
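
The four checks can be made concrete by scoring each property and flagging the weak ones. The sketch below is only an illustration: it assumes a 1-5 scale per property (suggested by the 11/20 total shown for Marketplace Purchases further down) and a hypothetical pass mark of 4. The scores given for "Revenue" are invented, and `mart_report` is not part of the skill's scripts.

```python
MART = ("measurable", "actionable", "relevant", "timely")


def mart_report(scores, pass_mark=4):
    """Return (total, failing properties) for one metric.

    scores: dict mapping each MART property to an integer 1-5 (assumed scale).
    pass_mark: hypothetical threshold; a property scoring below it fails.
    """
    missing = [p for p in MART if p not in scores]
    if missing:
        raise ValueError(f"missing MART scores: {missing}")
    total = sum(scores[p] for p in MART)
    failing = [p for p in MART if scores[p] < pass_mark]
    return total, failing


# Illustrative scores only: easy to compute, hard to act on or iterate against.
revenue = {"measurable": 5, "actionable": 2, "relevant": 4, "timely": 2}
total, failing = mart_report(revenue)
print(f"Revenue MART: {total}/20, decompose because of: {failing}")
```

A metric with any failing property is a candidate for decomposition; a leaf is done when the failing list is empty.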

Six decomposition techniques from the book

Each technique trades one hard metric for several easier ones. Pick the technique that matches your metric's structure.

Funnel Analytics

M = (M/m3) × (m3/m2) × ... × E

Sequential stages with conversion rates between each.

When: onboarding flows, sales pipelines, checkout
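
As a minimal sketch of the funnel technique, the code below turns stage counts into an entry volume plus stage-to-stage conversion rates, then checks that multiplying them back together reproduces the final count. The stage names and counts are hypothetical.

```python
def funnel_rates(stage_counts):
    """Split a funnel into its entry volume and stage-to-stage conversion rates.

    stage_counts: ordered list of (stage name, count), entry stage first.
    """
    entry = stage_counts[0][1]
    rates = []
    for (prev_name, prev_n), (name, n) in zip(stage_counts, stage_counts[1:]):
        rates.append((f"{prev_name} -> {name}", n / prev_n))
    return entry, rates


# Hypothetical checkout funnel counts.
stages = [("visit", 10_000), ("cart", 2_000), ("checkout", 800), ("purchase", 600)]
entry, rates = funnel_rates(stages)

# Identity check: entry volume times every conversion rate gives the final count.
rebuilt = entry
for _, rate in rates:
    rebuilt *= rate
assert abs(rebuilt - stages[-1][1]) < 1e-6

for step, rate in rates:
    print(f"{step}: {rate:.1%}")
```

Each conversion rate is a separate leaf that can be measured and moved on its own.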

Stock-Flow

MAU_t = MAU_{t-1} + In - Out

Accumulating quantities split into inflows and outflows.

When: MAU, inventory, customer base, balances
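
The stock-flow identity can be rolled forward period by period, as in this small sketch. The opening balance and the monthly inflows and outflows are hypothetical.

```python
def roll_forward(opening, flows):
    """Apply stock_t = stock_{t-1} + inflow_t - outflow_t, one period at a time."""
    stock = opening
    history = []
    for inflow, outflow in flows:
        stock = stock + inflow - outflow
        history.append(stock)
    return history


# Hypothetical monthly flows: (new + reactivated users, churned users).
mau = roll_forward(50_000, [(6_000, 4_000), (5_500, 4_500), (5_000, 5_200)])
print(mau)  # [52000, 53000, 52800]
```

In the third month MAU falls even though 5,000 users arrived, which is the kind of signal the top-level number hides and the inflow/outflow leaves expose.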

P × Q

Revenue = Price × Quantity

Value = unit rate × volume. Nonlinear relationship.

When: revenue, GMV, throughput, cost analysis

Additive

g_y = w1·g1 + w2·g2

Growth = weighted sum of segment growth rates.

When: total = sum of segments (regions, products)
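
A short sketch of the additive decomposition follows. It assumes the weights are each segment's share of the previous-period total, which is the choice that makes the identity exact. The segment names and values are hypothetical.

```python
def additive_growth(prev, curr):
    """Decompose total growth into per-segment contributions w_i * g_i.

    prev, curr: dicts of segment -> value in the previous and current period.
    w_i is the segment's share of the previous-period total (assumed weighting).
    """
    total_prev = sum(prev.values())
    contributions = {}
    for seg in prev:
        weight = prev[seg] / total_prev
        growth = curr[seg] / prev[seg] - 1
        contributions[seg] = weight * growth
    return contributions


prev = {"EMEA": 600.0, "APAC": 400.0}
curr = {"EMEA": 630.0, "APAC": 480.0}
contrib = additive_growth(prev, curr)
total_growth = sum(curr.values()) / sum(prev.values()) - 1

# The contributions add up to the growth rate of the total.
assert abs(sum(contrib.values()) - total_growth) < 1e-12
for seg, c in contrib.items():
    print(f"{seg}: {c:.1%} of {total_growth:.1%} total growth")
```

Here the smaller segment contributes most of the growth, because its higher growth rate outweighs its smaller share.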

Multiplicative

g_y = g1 + g2 + g1·g2

Growth of a product = sum of the factor growth rates plus their interaction term (g1·g2), which is negligible only when both rates are small.

When: Revenue = ARPU × MAU growth analysis
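
The identity is easy to check numerically. This sketch uses hypothetical growth rates for ARPU and MAU and confirms that the three terms match the growth of the product.

```python
def multiplicative_growth(g1, g2):
    """Growth of y = x1 * x2 given the factor growth rates g1 and g2.

    Returns (total growth, interaction term).
    """
    interaction = g1 * g2
    return g1 + g2 + interaction, interaction


# Hypothetical: ARPU grows 10%, MAU grows 20%.
g_revenue, interaction = multiplicative_growth(0.10, 0.20)

# Same answer as compounding the two factors directly.
assert abs(g_revenue - ((1 + 0.10) * (1 + 0.20) - 1)) < 1e-12
print(f"revenue growth {g_revenue:.1%}, of which interaction {interaction:.1%}")
```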

Mix-Rate

Δy = Rate + Mix + Combined

Separates "each segment improved" from "segment mix shifted."

When: weighted averages, Simpson's Paradox risk
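
The sketch below shows one common convention for the three terms, assuming the metric is a weighted average y = Σ w_i·x_i: Rate holds the base-period mix fixed, Mix holds the base-period rates fixed, and Combined is the cross term. The segment data is hypothetical and chosen so that both segment rates improve while the overall rate falls.

```python
def mix_rate(w0, x0, w1, x1):
    """Split the change in y = sum(w_i * x_i) into rate, mix, and combined effects.

    w0, w1: segment weights (shares) in the base and current period.
    x0, x1: segment rates in the base and current period.
    """
    rate = sum(w0[s] * (x1[s] - x0[s]) for s in w0)
    mix = sum(x0[s] * (w1[s] - w0[s]) for s in w0)
    combined = sum((w1[s] - w0[s]) * (x1[s] - x0[s]) for s in w0)
    return rate, mix, combined


# Hypothetical conversion rates for two segments whose traffic shares flip.
w0, x0 = {"A": 0.8, "B": 0.2}, {"A": 0.50, "B": 0.10}
w1, x1 = {"A": 0.2, "B": 0.8}, {"A": 0.55, "B": 0.12}

rate, mix, combined = mix_rate(w0, x0, w1, x1)
y0 = sum(w0[s] * x0[s] for s in w0)
y1 = sum(w1[s] * x1[s] for s in w1)

# The three effects sum to the total change.
assert abs((rate + mix + combined) - (y1 - y0)) < 1e-12
print(f"rate {rate:+.3f}, mix {mix:+.3f}, combined {combined:+.3f}, total {y1 - y0:+.3f}")
```

The rate effect is positive (every segment improved) but the mix effect is large and negative, so the overall metric drops. Reading only the top-level number would suggest the opposite of what happened inside each segment.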

Marketplace Purchases → Decomposition Tree

A marketplace combines funnel (supply side) with P×Q (demand side). The top-level metric decomposes into five leaves, each scored for auto-research readiness.

Decomposition Tree with Evaluability Scores
Marketplace Purchases (MART: 11/20)
Purchases = (P/V) × (V/B) × B × (L/S) × S
P / V · Checkout efficiency · Q1 · 0.92
V / B · Buyer engagement · Q2 · 0.75
B · Buyer volume · Q2 · 0.75
L / S · Seller engagement · Q3 · 0.42
S · Seller volume · Q2 · 0.67
Q1: Auto-Research Ready
P/V — Hand to an agent. Define fitness function, set constraints, let it run autonomously.
Q2: Human-in-the-Loop
V/B, B, S — Agent scaffolds experiments. Human reviews design and approves before execution.
Q3: Danger Zone
L/S — Needs further decomposition or better verification before any optimization.

The Evaluability-Automation Matrix

Every leaf metric lands in one of four quadrants. The skill's job is to push metrics from Q4 toward Q1 through decomposition (horizontal) and tooling (vertical).

High Automation, Low Evaluability

Q3: Danger Zone

Agent CAN produce output but you CANNOT verify quality. Goodhart's Law lives here. Do not auto-optimize.

High Automation, High Evaluability

Q1: Auto-Research

Fully autonomous. Define objective, set boundaries, let the agent run. This is the sweet spot.

Low Automation, Low Evaluability

Q4: Human Domain

Metrics design itself, causal reasoning, business judgment. Agent gathers info; human decides.

Low Automation, High Evaluability

Q2: Human-in-Loop

Agent scaffolds and calculates. Human provides judgment, reviews, and approves before execution.

Decomposition moves metrics → right (improves evaluability)
Skills & tooling move metrics ↑ up (improves automation)
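
The quadrant assignment can be written as a two-axis lookup. The following is a sketch only: it assumes both axes are scored from 0 to 1 and uses a hypothetical cut-off of 0.5 on each, and it is not the logic of scripts/score_evaluability.py.

```python
def classify(evaluability, automation, threshold=0.5):
    """Map two 0-1 scores to a quadrant of the Evaluability-Automation Matrix.

    threshold is a hypothetical cut-off applied to both axes.
    """
    high_eval = evaluability >= threshold
    high_auto = automation >= threshold
    if high_eval and high_auto:
        return "Q1: Auto-Research"
    if high_eval:
        return "Q2: Human-in-Loop"
    if high_auto:
        return "Q3: Danger Zone"
    return "Q4: Human Domain"


# Hypothetical scores: easy to automate, hard to verify.
print(classify(evaluability=0.42, automation=0.80))  # Q3: Danger Zone
```

Decomposition raises the first argument and tooling raises the second, so a leaf reaches Q1 only when both have been addressed.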

8 steps from business question to auto-research candidate

1. Gather business context

Identify the top-level metric and the business question it serves

scripts/new_decomposition.py --metric "..."

2. Score MART properties

Rate on Measurable, Actionable, Relevant, Timely. Low scores mean the metric needs decomposing.

references/01-mart-checklist.md

3. Select decomposition technique

Decision tree: funnel, stock-flow, P×Q, additive, multiplicative, or mix-rate

references/02-07

4. Apply decomposition

Break metric into sub-metrics. Verify the identity holds: recombining the sub-metrics must reproduce the original metric M exactly. Recurse if needed.

5. Identify proxy metrics

For unmeasurable leaves, find proxies and document causal assumptions

references/08-proxy-identification.md

6. Score evaluability

6 dimensions → score 0-1 → classify Q1/Q2/Q3/Q4 per leaf

references/09-evaluability-scoring.md · scripts/score_evaluability.py

7. Check for gotchas

Screen against 6 failure modes: proxy gaps, non-linear funnels, circularity, over-decomposition, Simpson's Paradox, interaction effects

references/10-gotchas.md

8. Produce decomposition tree

Final output: tree with evaluability scores. Q1 leaves get program.md configs for auto-research.
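
As a rough picture of the final output, the sketch below stores the marketplace leaves from the example tree and picks out the ones ready for auto-research. The `Leaf` class and the rendering are hypothetical; the format of the real tree and of program.md is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    term: str        # sub-metric as it appears in the identity
    label: str       # human-readable name
    quadrant: str    # Q1..Q4
    score: float     # evaluability score, 0-1


# Leaves and scores copied from the Marketplace Purchases example.
leaves = [
    Leaf("P / V", "Checkout efficiency", "Q1", 0.92),
    Leaf("V / B", "Buyer engagement", "Q2", 0.75),
    Leaf("B", "Buyer volume", "Q2", 0.75),
    Leaf("L / S", "Seller engagement", "Q3", 0.42),
    Leaf("S", "Seller volume", "Q2", 0.67),
]


def render(root, leaves):
    """Return the tree as indented text lines, one per leaf."""
    lines = [root]
    for leaf in leaves:
        lines.append(f"  {leaf.term:<6} {leaf.label:<20} {leaf.quadrant}  {leaf.score:.2f}")
    return lines


print("\n".join(render("Marketplace Purchases", leaves)))

# Only Q1 leaves would be handed to an agent.
q1_terms = [leaf.term for leaf in leaves if leaf.quadrant == "Q1"]
print("auto-research candidates:", q1_terms)
```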

Six gotchas that break decompositions

Proxy-Causality Gap

Proxy scores well on MART but optimizing it doesn't improve the real outcome

Non-Linear Funnels

Users skip stages or loop back, breaking conversion rate math

Circular P × Q

Price depends on quantity (volume discounts), creating feedback loops

Over-Decomposition

50 leaf metrics = 50 optimization targets = high false discovery rate

Simpson's Paradox

The overall metric moves in the opposite direction from every segment's metric because the segment mix shifted

Interaction Effects

Optimizing one leaf degrades an adjacent leaf, nullifying gains
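
The over-decomposition risk can be put in numbers. The sketch below assumes each leaf is tested independently at a significance level of 0.05 and that no leaf has a real effect; under those assumptions it computes the chance that at least one leaf shows a spurious win. Strictly this is the family-wise error rate, used here as a simple stand-in for the false-discovery concern.

```python
def chance_of_false_win(n_metrics, alpha=0.05):
    """Probability that at least one of n independent null tests looks significant."""
    return 1 - (1 - alpha) ** n_metrics


for n in (1, 5, 50):
    print(f"{n:>2} leaf metrics: {chance_of_false_win(n):.0%} chance of a spurious win")
```

With 50 leaves the chance is roughly 92%, so a tree that wide needs either fewer optimization targets or a multiple-comparison correction before any result is trusted.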