# HTTP 5xx Spike Runbook ## Overview This runbook covers the response procedure for HTTP 5xx error rate spikes. ## Detection - Alert triggers when 5xx rate > 5% for 5 minutes - Check Grafana dashboard: HTTP Error Rate ## Diagnosis Steps 1. Check recent deployments 2. Review application logs for stack traces 3. Check database connection pool utilization 4. Verify downstream service health ## Mitigation 1. If bad deploy: rollback immediately 2. Restart affected pods 3. Scale up if load-related ## Prevention Checklist - [ ] Run smoke tests before promoting to production - [ ] Set up canary deployments - [ ] Monitor error budgets