Recovery Playbook
Recovery Playbook
Eval below threshold
HITL prompt fires automatically. Reject + add reason -> run terminates, recovery row on Monday.
Buffer publish failure
Automatic retry 3x (exponential). If still failing, alert in #recovery. Manual options:
- Reschedule for later
- Post a correction
- Mark as cancelled
Orphan run
runner.py detects stale state.json lock (>1h old) on next tick -> logs orphan, alerts.
Image generation failure
Run marked failed with image_unavailable. Re-run from Step 03 manually.
Wrong-country copy
Triggers country_copy_check failure -> HITL -> reject -> re-draft.
TODO: add SOP for each failure mode after W6 simulation pass.