Blue-green deployment runs two identical application environments in production. At any moment one of them (say, blue) serves all live traffic while the other (green) sits idle; the roles swap with each release. A new release goes onto the idle environment, where it is migrated, smoke-tested, and warmed up away from user traffic. When it passes, a single routing change cuts all traffic over. The old environment stays running as the rollback target: flipping back restores traffic in seconds, and restores correctness as long as the new version’s data changes stayed backward-compatible.

Figure: Blue-green deployment. A load balancer routes live traffic to the Blue (v1) environment while the validated Green (v2) one stays idle; flipping the router cuts traffic over to Green. Both share one database.

How It Works

  • Place blue and green behind one router or load balancer so traffic targets either on demand.
  • Deploy the new version to the idle environment; run migrations, smoke tests, and warm-up there.
  • Flip the router to send all production traffic to the validated environment in one atomic switch.
  • Keep the previous environment running untouched, so a second flip rolls back in seconds.
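
The steps above can be sketched as a toy in-process router. This is a minimal illustration, not a real load balancer: the class name BlueGreenRouter, its methods, and the backend URLs are all hypothetical, and a production switch would normally be a load balancer or service-mesh configuration change.

```python
import threading


class BlueGreenRouter:
    """Toy router: holds two backends and points live traffic at one of them."""

    def __init__(self, blue_url: str, green_url: str, live: str = "blue") -> None:
        # Hypothetical backend URLs; both environments stay registered at all times.
        self._backends = {"blue": blue_url, "green": green_url}
        if live not in self._backends:
            raise ValueError(f"unknown environment: {live!r}")
        self._live = live
        self._lock = threading.Lock()

    @property
    def live(self) -> str:
        """Name of the environment currently serving traffic."""
        with self._lock:
            return self._live

    @property
    def idle(self) -> str:
        """Name of the environment that receives the next release."""
        with self._lock:
            return "green" if self._live == "blue" else "blue"

    def route(self) -> str:
        """Return the backend URL that should receive the next request."""
        with self._lock:
            return self._backends[self._live]

    def flip(self) -> str:
        """Swap live and idle in one step; calling it again rolls back."""
        with self._lock:
            self._live = "green" if self._live == "blue" else "blue"
            return self._live


router = BlueGreenRouter("http://blue.internal", "http://green.internal")
router.flip()  # cut over: all new requests now route to green
router.flip()  # roll back: blue was left untouched, so it resumes immediately
```

The lock makes each flip a single step from the point of view of concurrent callers of route(), which mirrors the "one atomic switch" described above. Requests already in flight are outside this sketch and still need connection draining.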

Failure Modes

  • A shared database takes an incompatible schema change, and the migration breaks the version still serving live traffic.
  • Sessions or connections pinned to the old environment drop at cut-over when the switch happens without connection draining.
  • The idle environment drifts from production config, so a release passes validation yet fails under real traffic.
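
The shared-database failure mode can be guarded against with a compatibility check before a migration runs. The sketch below is one possible approach under simplifying assumptions: schemas are modelled as plain table-to-column-set mappings, and the function names and the users table are hypothetical. A real check would also need to cover types, constraints, and defaults.

```python
from __future__ import annotations


def missing_columns(
    schema: dict[str, set[str]], required: dict[str, set[str]]
) -> dict[str, set[str]]:
    """Return, per table, the columns an app version needs that the schema lacks."""
    missing: dict[str, set[str]] = {}
    for table, columns in required.items():
        gap = columns - schema.get(table, set())
        if gap:
            missing[table] = gap
    return missing


def migration_is_safe(
    schema_after: dict[str, set[str]],
    old_needs: dict[str, set[str]],
    new_needs: dict[str, set[str]],
) -> bool:
    """A migration is safe only if BOTH app versions can run on the result."""
    return not missing_columns(schema_after, old_needs) and not missing_columns(
        schema_after, new_needs
    )


# Hypothetical example: v1 reads users.name, v2 reads users.full_name.
old_needs = {"users": {"id", "name"}}
new_needs = {"users": {"id", "full_name"}}

# Adding the new column while keeping the old one works for both versions.
expand_step = {"users": {"id", "name", "full_name"}}
# Renaming in one step removes a column the live version still reads.
rename_step = {"users": {"id", "full_name"}}

print(migration_is_safe(expand_step, old_needs, new_needs))
print(migration_is_safe(rename_step, old_needs, new_needs))
```

The design choice here is to split a breaking change into an additive step first and a removal step later, once the old environment is no longer a rollback target.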

Verification

  • Synthetic probes against the newly live environment keep passing through the switch window, with zero failed checks.
  • Connection draining lets in-flight requests on the old environment complete at cut-over, with zero requests dropped during the switch.
  • A rollback drill flips traffic back and confirms the previous version resumes within the recovery-time objective (e.g. under 60 s).
  • Schema-compatibility tests pass against both old and new application versions before any migration runs.

Alternatives

  • Canary release shifts a small traffic share first, trading the all-at-once switch for graded exposure.
  • Rolling deployment replaces instances in place without a duplicate environment: lower cost, but no instant rollback.
  • DNS-based switching repoints a DNS record instead of a router. No shared routing layer is needed, but TTLs and resolver caches make the cut-over gradual and rollback slow rather than atomic.
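
To make the contrast with a canary release concrete, the sketch below models the routing decision as a per-request weighted choice. It is an illustration only: the function name choose_environment and the green_share parameter are hypothetical, and real routers usually express this as backend weights in configuration.

```python
import random


def choose_environment(green_share: float, rng: random.Random) -> str:
    """Pick an environment for one request; green_share is the fraction sent to green."""
    if not 0.0 <= green_share <= 1.0:
        raise ValueError("green_share must be between 0 and 1")
    # rng.random() is in [0.0, 1.0), so a share of 0.0 never picks green
    # and a share of 1.0 always does.
    return "green" if rng.random() < green_share else "blue"


rng = random.Random(0)  # seeded only so the sketch is repeatable

# Blue-green: the share jumps from 0.0 straight to 1.0 at cut-over.
before = choose_environment(0.0, rng)
after = choose_environment(1.0, rng)

# Canary: the share is raised in steps, e.g. roughly 10% of requests first.
canary_picks = [choose_environment(0.1, rng) for _ in range(10_000)]
```

Seen this way, blue-green is the special case where the share takes only the values 0 and 1, which is what keeps the switch and the rollback all-or-nothing.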

References