Deployment Strategies (Canary, Blue-Green)
How you release a change is as important as the change itself. Releasing to everyone at once means a bad change hits everyone at once. Progressive strategies (canary, blue-green, rolling) expose a new version gradually, watch it, and keep rollback fast, so a problem is caught on a slice of traffic instead of becoming a full outage.
The goal is to make releases low-risk and reversible. Rolling updates replace instances gradually. Canary sends a small percentage of traffic to the new version, watches the metrics, then ramps up if it is healthy. Blue-green runs two environments and switches traffic over, with an instant switch back. Combined with feature flags, you can fully separate deploying code from releasing a feature.
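One common way to send "a small percentage of traffic" to a canary is to hash a stable identifier into a bucket, so the same user stays on the same version while the percentage ramps. The following is a minimal sketch, not a prescribed implementation; the function names, the salt value, and the choice of a user ID as the routing key are all assumptions made for illustration.

```python
import hashlib

# Hypothetical salt: changing it per release reshuffles which users land in the canary.
DEFAULT_SALT = "example-release"


def bucket_for(user_id: str, salt: str = DEFAULT_SALT) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def route(user_id: str, canary_percent: int, salt: str = DEFAULT_SALT) -> str:
    """Return "canary" for users whose bucket falls below the canary percentage."""
    return "canary" if bucket_for(user_id, salt) < canary_percent else "stable"
```

Because a user's bucket does not change, anyone routed to the canary at 5% is still on it at 25%, which keeps the experience consistent as the rollout ramps.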
These build on CI/CD (the pipeline that delivers them), Feature Flags (release control), DORA (they improve change-failure rate and recovery time), Observability (you must watch to know it is healthy), and Backward Compatibility (old and new run side by side, so changes must be compatible).
Release gradually and reversibly
- Do: Prefer progressive rollout (canary or rolling) over big-bang releases, so a bad change is caught on a small slice before it reaches everyone.
- Do: Gate each rollout step on health: watch error rates, latency, and key business metrics, and automatically halt or roll back on a regression (see Observability, SLOs).
- Always: Have a fast, tested rollback (or roll-forward) path for every deployment. Recovery speed is what makes frequent releasing safe (see Incident Readiness, DORA Metrics).
- Do: Keep each release backward-compatible so old and new versions can run side by side during rollout (expand/contract for schema and contracts) (see Backward Compatibility, Schema Versioning).
- Do: Separate deploy from release using feature flags: ship code dark, then turn it on gradually and turn it off instantly if needed (see Feature Flags).
- Consider: Blue-green for changes that are hard to roll forward, so you can switch traffic back instantly.
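The health gate described in the list above can be sketched as a comparison between the canary and the stable baseline. This is only an illustration under stated assumptions: the `Snapshot` fields, the threshold defaults, and the three-way decision are hypothetical, and a real gate would also include the key business metrics the list mentions.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PROMOTE = "promote"    # healthy: move on to the next rollout step
    HOLD = "hold"          # not enough canary traffic yet to judge
    ROLLBACK = "rollback"  # regression detected: revert automatically


@dataclass(frozen=True)
class Snapshot:
    """Metrics for one version over the observation window (hypothetical shape)."""
    requests: int
    errors: int
    p95_latency_ms: float

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def evaluate_gate(
    baseline: Snapshot,
    canary: Snapshot,
    min_requests: int = 500,              # assumed minimum sample size
    max_error_rate_delta: float = 0.01,   # assumed: canary may be at most 1 point worse
    max_latency_ratio: float = 1.2,       # assumed: canary p95 at most 20% slower
) -> Decision:
    """Compare the canary against the baseline and decide the next action."""
    if canary.requests < min_requests:
        return Decision.HOLD
    if canary.error_rate - baseline.error_rate > max_error_rate_delta:
        return Decision.ROLLBACK
    if (baseline.p95_latency_ms > 0
            and canary.p95_latency_ms / baseline.p95_latency_ms > max_latency_ratio):
        return Decision.ROLLBACK
    return Decision.PROMOTE
```

Comparing against the live baseline rather than a fixed threshold means a platform-wide blip is less likely to be blamed on the canary. The `HOLD` outcome keeps the rollout from advancing faster than monitoring can detect problems.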
Do it safely through the pipeline
- Do: Automate the strategy in the pipeline (gates, ramps, rollback) so releases are repeatable and do not depend on manual steps (see CI/CD & Deployment).
- Do: Run database migrations so they work with both versions during the rollout window. Never make a breaking schema change mid-canary (see Schema Versioning).
- Do: Pick canary traffic and segments deliberately, and make sure tenancy and data isolation hold across mixed versions (see Multi-Tenancy).
- Do: Record what was deployed, when, and to whom, for traceability and incident response (see Audit Trails, Runbooks & On-Call).
- Avoid: Releasing everything to 100% at once for anything risky, or rolling out faster than your monitoring can actually detect problems.
- Never: Deploy by hand outside the pipeline, or skip the health checks or rollback path to push something out quickly (see CI/CD & Deployment).
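To illustrate what "work with both versions during the rollout window" can look like, here is a small sketch of the expand phase of an expand/contract change. The scenario is hypothetical: it assumes a new `full_name` field is being introduced next to the existing `first_name` and `last_name` fields, and plain dictionaries stand in for database rows.

```python
def write_user(first_name: str, last_name: str) -> dict:
    """New-version writer during the expand phase: populate old AND new fields."""
    return {
        "first_name": first_name,                 # still needed by old-version readers
        "last_name": last_name,
        "full_name": f"{first_name} {last_name}",  # new field, added alongside
    }


def read_full_name(row: dict) -> str:
    """New-version reader: prefer the new field, fall back to the old ones."""
    if row.get("full_name"):
        return row["full_name"]
    return f"{row['first_name']} {row['last_name']}"


def legacy_read_full_name(row: dict) -> str:
    """Old-version reader: only knows the old fields and ignores anything extra."""
    return f"{row['first_name']} {row['last_name']}"
```

The old fields would be dropped (the contract step) only after the rollout has reached 100% and no old-version instance can still read or write them.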
// deploy new version to 100% at once; no canary; no health gate;
// rollback means a manual redeploy taking 30+ minutes
A bug now hits every user instantly, nobody is watching a slice to catch it early, and recovery is slow and manual. A small regression becomes a long, full outage.
// deploy -> 5% canary -> watch error rate/latency/key metric 15 min
// healthy? ramp 25% -> 50% -> 100%. Regression? auto-rollback.
// feature stays behind a flag until fully validated
A bad change is caught on 5% of traffic and rolled back automatically, while a healthy one ramps safely. The blast radius stays small and recovery is fast, which is exactly what improves the DORA stability metrics (change-failure rate and recovery time).
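The ramp above (5% -> 25% -> 50% -> 100%, with automatic rollback) can be sketched as a small loop. This is a simplified illustration, not a real pipeline API: `set_traffic` and `is_healthy` are hypothetical callbacks. In practice `is_healthy` would wait out the observation window (15 minutes in the example above) and then evaluate the metrics.

```python
from typing import Callable, Iterable, List, Tuple


def run_rollout(
    steps: Iterable[int],
    set_traffic: Callable[[int], None],
    is_healthy: Callable[[int], bool],
) -> Tuple[bool, List[int]]:
    """Ramp traffic through the given percentages, rolling back on a regression.

    Returns (succeeded, history), where history lists every traffic
    percentage that was applied, in order.
    """
    history: List[int] = []
    for percent in steps:
        set_traffic(percent)
        history.append(percent)
        if not is_healthy(percent):
            # Regression: send all traffic back to the stable version and stop.
            set_traffic(0)
            history.append(0)
            return False, history
    return True, history
```

For example, `run_rollout([5, 25, 50, 100], set_traffic, is_healthy)` stops and reverts to 0% at the first step whose health check fails, so a regression detected at 25% never reaches the remaining users.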
Self-review checklist
- Ask: Is this rolling out gradually, or hitting everyone at once?
- Ask: What is watched during rollout, and does a regression halt or roll back automatically?
- Ask: Can I roll back fast, and do old and new versions run together safely (schema included)?
- Ask: Is the whole thing automated through the pipeline, not manual?