Realign Squad/GK readers to registered squad-history source
The inventory discipline check surfaced that data/availability/raw/ is an unregistered orphan directory: the Championship signal coverage has been running off 24 hand-placed files in a directory that generate-data-inventory.ts explicitly ignores. On a fresh clone those files don't exist.
The canonical source for this data is squad-history, produced by scripts/scrape-tm-squad-history.py → data/squad-history/raw/{tmId}-{year}.json. About 94MB is already scraped locally (876 files for the 2025 season alone), but the directory is gitignored, so the deployed container never sees it.
Changes:
- squad-strength.ts + gk-change.ts: rewrote the loaders to consume
  data/squad-history/raw/{tmId}-{year}.json directly. Added
  squadHistoryToAvailability() to convert to the internal
  AvailabilityClub shape so the per-round scanning logic stays
  untouched.
- Team-name → tmId resolution uses the same 3-convention fuzzy lookup
  (canonicalToMI → miToCanonicalName → stripped suffix) that
  findTeamInMap uses, so "Coventry City"/"Coventry"/"Derby County"/
  "Derby" all resolve to the right TM ID.
- .gitignore narrowed with a !*-2025.json exception: the 876
  current-season files (~28MB) are now committed so the image has
  Squad/GK data for the current season.
- Added scripts/smoke-squad-gk.ts, a tiny verifier that hits each league.
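For reviewers, the name → tmId fallback order described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this commit: resolveTmId, tmIdByName, the SUFFIX pattern, the map contents and the numeric IDs are all hypothetical placeholders; only the names canonicalToMI and miToCanonicalName and the lookup order come from this change.

```typescript
// Hypothetical lookup tables; the real ones live in the repo and are larger.
const canonicalToMI = new Map<string, string>([["Coventry City", "Coventry"]]);
const miToCanonicalName = new Map<string, string>([["Derby", "Derby County"]]);

// Placeholder IDs for illustration only, not real TM IDs.
const tmIdByName = new Map<string, number>([
  ["Coventry", 101],
  ["Derby County", 102],
  ["Chelsea", 103],
]);

// Assumed suffix list; the actual stripping rule may differ.
const SUFFIX = /\s+(City|County|United|Town|FC)$/;

function resolveTmId(name: string): number | undefined {
  // Try the name as given, then each naming convention in order.
  const candidates: Array<string | undefined> = [
    name,
    canonicalToMI.get(name),
    miToCanonicalName.get(name),
    name.replace(SUFFIX, ""),
  ];
  for (const candidate of candidates) {
    if (candidate === undefined) continue;
    const id = tmIdByName.get(candidate);
    if (id !== undefined) return id;
  }
  // No convention matched; callers treat this as "no data available".
  return undefined;
}
```

With these placeholder tables, both spellings of each club land on the same ID, and an unknown name yields undefined rather than throwing.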
Coverage impact (measured locally):
- EPL 19/19, Championship 21/21, Bundesliga 14/14, Serie A 14/14 teams
  all have populated files.
- La Liga 5/15 populated (scraper produced empty files for 10 teams).
- Ligue 1 0/17 populated (full re-scrape needed — separate task).
Smoke test confirms:
- Chelsea R20 correctly flags backup GK (Sánchez → Jørgensen)
- Arsenal R15 correctly returns 20.6% squad missing
- Real Madrid R15 available=false (empty file — expected, re-scrape needed)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>