sofascore-shot-xg v1: first derived artifact committed (944KB)
Real proof the getOrCompute() pipeline works end-to-end on production data:
- Inputs: sofascore_shots (156,794 rows in Supabase) + xg_pre_shot_v1_compact.json
- Computed inside the cron docker container on supabase_default network
- Output: data/derived/sofascore-shot-xg/v1-b1859a472f21.json (944KB)
- Manifest: v1-b1859a472f21.manifest.json — content-hashed inputs, portable repo-relative paths
- 6,187 matches × {homeXG, awayXG, homeShots, awayShots}
- Mean xG/shot: 0.074 (sane for soccer; the v1 model, which lacks Tier 2 features, under-predicts slightly)
Sample: event=12000882 home=2.44 (24 shots), away=0.77 (12 shots) — typical EPL match.
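For reference, the per-match shape above could be produced from per-shot predictions roughly as follows. This is a minimal sketch, not the code in this commit: the `ShotXg` fields (`eventId`, `isHome`, `xg`) and both function names are assumptions for illustration and may not match the real schema.

```typescript
// Hypothetical per-shot record; field names are assumptions, not the real table columns.
interface ShotXg {
  eventId: number;
  isHome: boolean;
  xg: number;
}

// Matches the {homeXG, awayXG, homeShots, awayShots} shape named in this commit.
interface MatchXg {
  homeXG: number;
  awayXG: number;
  homeShots: number;
  awayShots: number;
}

// Sum xG and count shots per side, keyed by match (event) id.
function aggregateByMatch(shots: ShotXg[]): Map<number, MatchXg> {
  const out = new Map<number, MatchXg>();
  for (const s of shots) {
    let m = out.get(s.eventId);
    if (!m) {
      m = { homeXG: 0, awayXG: 0, homeShots: 0, awayShots: 0 };
      out.set(s.eventId, m);
    }
    if (s.isHome) {
      m.homeXG += s.xg;
      m.homeShots += 1;
    } else {
      m.awayXG += s.xg;
      m.awayShots += 1;
    }
  }
  return out;
}

// Mean xG per shot across all matches; returns 0 when there are no shots.
function meanXgPerShot(matches: Iterable<MatchXg>): number {
  let xg = 0;
  let shots = 0;
  for (const m of matches) {
    xg += m.homeXG + m.awayXG;
    shots += m.homeShots + m.awayShots;
  }
  return shots === 0 ? 0 : xg / shots;
}
```

The mean is weighted by shot count, so matches with many shots are not under-represented relative to an average of per-match means.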
Inventory regenerated to catalog the new derived source. 17 sources tracked.
This commit closes the loop on the inventory system: a session asking "can I get per-shot xG for sofascore data?" now finds:
- INVENTORY.md row showing sofascore-shot-xg as a derived source
- data/derived/sofascore-shot-xg/v1-*.manifest.json showing model version + inputs
- The actual aggregates JSON (no recompute needed unless inputs change)
To re-run with new shots / model:
  npx tsx scripts/compute-sofascore-shot-xg.ts
First call computes (~80ms with cached node + DB access), subsequent calls hit cache in <10ms.
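The compute-then-cache behaviour described here can be illustrated with a content-hashed lookup. This is a hedged sketch only: `getOrComputeSketch`, `inputHash`, their signatures, the synchronous design, and the 12-character SHA-256 prefix are all assumptions, and the real getOrCompute() may hash inputs and name files differently.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical helper: hash the input contents into a short hex id.
// The 12-character prefix is an assumption chosen for illustration.
function inputHash(inputs: string[]): string {
  const h = createHash("sha256");
  for (const input of inputs) {
    h.update(input);
    h.update("\0"); // separator so ["ab", "c"] and ["a", "bc"] hash differently
  }
  return h.digest("hex").slice(0, 12);
}

// Hypothetical stand-in for getOrCompute(): return the cached JSON if a file
// for this version + input hash exists, otherwise compute and write it.
function getOrComputeSketch<T>(
  dir: string,
  version: string,
  inputs: string[],
  compute: () => T,
): { value: T; cacheHit: boolean } {
  const file = join(dir, `${version}-${inputHash(inputs)}.json`);
  if (existsSync(file)) {
    return { value: JSON.parse(readFileSync(file, "utf8")) as T, cacheHit: true };
  }
  const value = compute();
  mkdirSync(dir, { recursive: true });
  writeFileSync(file, JSON.stringify(value));
  return { value, cacheHit: false };
}
```

Because the file name depends on the input hash, new shots or a new model file yield a different name and trigger a recompute, while unchanged inputs reuse the existing artifact.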
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>