Canary deployment releases a new version to a small fraction of production traffic before exposing everyone to it. The new version runs beside the stable one, and a router sends it a few percent of requests. Automated analysis compares the canary’s health against the stable baseline: if it holds, traffic shifts over in stages until the canary serves everyone; if it regresses, all traffic returns to the stable version and the canary is withdrawn. That rollback is immediate for traffic; restoring correctness also depends on the canary’s writes to shared state staying compatible.

Figure: Canary deployment. A router sends most live traffic to the stable v1 instances and a small share to the canary v2; automated analysis compares canary metrics against the baseline and either promotes the rollout or routes traffic back.

How It Works

  • Deploy the new version beside the stable one; both read and write the same datastore.
  • Configure the router or service mesh to send a small share — often 1–5% — of traffic to the canary.
  • Compare canary error rate, latency, and saturation against the stable baseline over a fixed bake window.
  • Raise the canary’s share in stages while metrics hold; route all traffic back to stable on any regression.
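
The steps above can be sketched as a small control loop. This is a minimal illustration, not a real router or mesh API: the stage list `STAGES`, the functions `pick_version` and `run_rollout`, and the `is_healthy` callback (a stand-in for automated analysis over one bake window) are all hypothetical names chosen for the example.

```python
import random

# Hypothetical canary shares per stage; real rollouts choose their own steps.
STAGES = [0.01, 0.05, 0.25, 0.50, 1.00]


def pick_version(canary_share, rng=random):
    """Route one request: 'canary' with probability canary_share, else 'stable'."""
    return "canary" if rng.random() < canary_share else "stable"


def run_rollout(is_healthy, stages=STAGES):
    """Walk the stages in order and return (final_canary_share, outcome).

    is_healthy(share) stands in for the automated analysis of one bake
    window at that share; it returns True while canary metrics hold.
    """
    for share in stages:
        if not is_healthy(share):
            # Any regression sends all traffic back to the stable version.
            return 0.0, "rolled_back"
    # Every stage held, so the canary now serves everyone.
    return 1.0, "promoted"
```

For example, `run_rollout(lambda share: True)` promotes the canary, while `run_rollout(lambda share: share < 0.25)` rolls back as soon as the 25% stage reports a regression.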

Failure Modes

  • Low traffic or a tiny canary share yields too few samples, so the analysis either reads noise as signal or must stretch the bake window for hours to collect enough data.
  • The canary and stable versions share a datastore, and an incompatible schema change from the canary corrupts data the stable version reads.
  • Session-affinity gaps route a user between versions mid-flow, exposing inconsistent behaviour.
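
One common way to narrow the session-affinity gap is to assign versions deterministically from a stable user identifier rather than per request. The sketch below assumes such an identifier is available at the router; `BUCKETS`, `bucket`, and `version_for` are hypothetical names for this example, not part of any particular router or mesh.

```python
import hashlib

BUCKETS = 10_000  # hypothetical resolution: 0.01% of users per bucket


def bucket(user_id: str) -> int:
    """Map a user ID to a fixed bucket in [0, BUCKETS) using a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def version_for(user_id: str, canary_share: float) -> str:
    """Send a user to the canary if their bucket falls below the current share."""
    return "canary" if bucket(user_id) < canary_share * BUCKETS else "stable"
```

Because the bucket never changes, a user sees the same version on every request at a given share. As the share grows, users already on the canary stay there, so each user crosses from stable to canary at most once during a rollout that only moves forward.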

Verification

  • A seeded bad build (e.g. an injected 5% error rate) trips automated canary analysis and rolls back before the share passes the first step.
  • The canary collects enough samples before each promotion step, and the analysis compares normalized rates or an equal-sized baseline cohort — not raw fleet totals — so the decision has statistical power.
  • A rollback drill confirms traffic returns to the stable version within the recovery-time objective (e.g. under 60 s).
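
The sample-size and normalized-rate checks can be illustrated with a one-sided two-proportion z-test on error rates. This is one possible analysis, swapped in here for illustration; the source does not say which statistical method is used. The function name `canary_regressed`, the `min_samples` floor of 1000, and the threshold of 2.33 (roughly a 1% one-sided significance level) are assumptions for the example.

```python
import math


def canary_regressed(canary_errors, canary_total, base_errors, base_total,
                     z_threshold=2.33, min_samples=1000):
    """Compare error rates, not raw counts.

    Returns True if the canary's error rate is significantly higher than
    the baseline's, False if it is not, and None if either cohort has too
    few requests to decide.
    """
    if canary_total < min_samples or base_total < min_samples:
        return None  # not enough data yet; keep baking

    p_canary = canary_errors / canary_total
    p_base = base_errors / base_total

    # Pooled rate under the hypothesis that both versions behave the same.
    pooled = (canary_errors + base_errors) / (canary_total + base_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / canary_total + 1 / base_total))
    if se == 0:
        return False  # both rates are identical (all successes or all errors)

    z = (p_canary - p_base) / se
    return z > z_threshold
```

With a baseline at 50 errors in 5000 requests, a seeded bad build at 250 errors in 5000 requests is flagged, a canary at 55 in 5000 is not, and a canary with only 100 requests returns `None` so that promotion waits for more samples.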

Related Patterns

  • Blue-green deployment switches all traffic at once rather than in graded steps: simpler, but with no progressive exposure.
  • Feature flags gate a release per user or cohort inside one running version, complementing traffic-level canarying.
  • Shadow traffic mirrors live requests to the new version without serving its responses, validating behaviour at zero user risk.

References