Imprevista

Deploy Log

← Back to Deploy Log
|Sports Dashboard|DEPLOYED

Cloud-lab v3 retro + Phase 0/1 fixes

Phase 0 (cloud-lab VM): fix LAST=0 idle-check bug that was killing the VM 5 min after every boot. /root/auto-shutdown.sh and /root/watchdog.sh now handle missing /tmp/last-job-activity explicitly (initialize + exit 0)

Phase 0 (cloud-lab VM): fix LAST=0 idle-check bug that was killing the VM 5 min after every boot. /root/auto-shutdown.sh and /root/watchdog.sh now handle missing /tmp/last-job-activity explicitly (initialize + exit 0) instead of computing IDLE = NOW - 0 = 55 years. Scripts + @reboot cron checked in under scripts/cloud-lab-bootstrap/ so VM rebuilds can re-install them.

Phase 1 (compute-worker): solver-cache/backtest/model-sweep-solve timeouts 12h → 24h with measured-workload comment. SSH key path now CLOUD_LAB_SSH_KEY_PATH env var (default unchanged). waitForSSH first-try window 15s → 60s to cover cloud-init startup. Dockerfile installs rsync alongside openssh-client — syncCloudLabResults was silently failing with "rsync: not found" on every cloud-lab job.

Smoke test (scripts/cloud-lab-smoke-test.ts): end-to-end validation of the three things that broke on Apr 11 — SSH reachability from the compute-worker container, rsync binary presence, and fire+sync round-trip. Run before letting production traffic onto any further compute-worker changes.

Full retro + rebuild plan at docs/specs/cloud-lab-apr11-retro-and-rebuild.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Files Changed

Commit:8629255