Feature toggles (also known as feature flags) allow teams to modify system behavior at runtime without changing or redeploying code. By wrapping new logic in a conditional check, developers can ship “dark” code to production and enable it only when ready.

How It Works

Every toggle is a named boolean (or multi-variant) condition evaluated at runtime. The application checks the toggle store and takes the appropriate code path. Toggle state is managed outside the codebase — enabling or disabling a feature requires no redeployment.

  • Toggle Check: A request arrives. The application checks the toggle store for the named flag, passing the current user context if targeting rules (e.g., country, user ID) apply.
  • Branching: If the flag is ON, the new code path executes. If OFF, the existing path runs. The caller sees no difference in the interface.
  • Runtime Control: An operator can flip the flag in the toggle store at any time to enable a feature, start an experiment, or kill a misbehaving service.
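
The three steps above can be sketched in a few lines. This is a minimal illustration rather than a real toggle library: ToggleStore, Toggle, the "new-pricing" flag, the country allow-list, and the 10% discount are all hypothetical names and values chosen for the example, and the store is a plain in-memory dictionary standing in for an external service.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Toggle:
    """A named flag with an optional country allow-list (hypothetical rule format)."""
    enabled: bool = False
    countries: set = field(default_factory=set)  # empty set means no targeting


class ToggleStore:
    """In-memory stand-in for an external toggle store."""

    def __init__(self) -> None:
        self._toggles = {}

    def set(self, name: str, toggle: Toggle) -> None:
        # Runtime control: an operator changes state here, not in application code.
        self._toggles[name] = toggle

    def is_enabled(self, name: str, context: Optional[dict] = None) -> bool:
        # Toggle check: unknown or disabled flags evaluate to OFF.
        toggle = self._toggles.get(name)
        if toggle is None or not toggle.enabled:
            return False
        # Targeting rule: only applied when the flag defines an allow-list.
        if toggle.countries:
            return (context or {}).get("country") in toggle.countries
        return True


def checkout_total(store: ToggleStore, user: dict, prices: list) -> float:
    # Branching: both paths share the same signature and return type.
    if store.is_enabled("new-pricing", user):
        return round(sum(prices) * 0.9, 2)  # new path (illustrative discount)
    return round(sum(prices), 2)  # existing path
```

Calling store.set("new-pricing", Toggle(enabled=True, countries={"DE"})) switches German users to the new path on their next request, while every other caller keeps the existing behavior.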

Toggle lifecycle: A toggle moves through four stages.

  1. Create: Wrap new code behind a toggle; default OFF in production.
  2. Validate: Enable for internal users, then for a small canary percentage of traffic.
  3. Roll out: Increase rollout percentage; monitor error rates and business metrics.
  4. Clean up: Once fully released and stable, remove the toggle and the dead code path to avoid technical debt. To make sure this happens, set a calendar reminder or configure a “stale flag” alert at the moment the toggle is created.
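
One way to implement the “stale flag” alert from step 4 is a small check that runs in CI against a toggle registry. The sketch below assumes a hypothetical registry of ToggleRecord entries with a creation date and a maximum age; the record fields and the 90-day default are illustrative assumptions, not a standard format.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ToggleRecord:
    """One registry entry (hypothetical schema)."""
    name: str
    owner: str
    created: date
    max_age_days: int = 90  # assumed default lifespan


def stale_toggles(registry: list, today: date) -> list:
    """Return the names of toggles that have outlived their allowed lifespan."""
    return [
        record.name
        for record in registry
        if today - record.created > timedelta(days=record.max_age_days)
    ]
```

A CI step could call stale_toggles(registry, date.today()) and exit with a non-zero status when the returned list is not empty. Long-lived toggles, such as permission toggles, would be registered with a larger max_age_days.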

Toggle categories: Toggles differ by lifespan and ownership.

Category          | Lifespan      | Who controls?  | Example
Release toggle    | Days to weeks | Engineering    | Enable incomplete feature on main branch
Experiment toggle | Weeks         | Product / Data | A/B test a UI variant
Ops toggle        | Hours to days | Operations     | Kill switch for a misbehaving service
Permission toggle | Long-lived    | Product        | Beta access for paying customers

Failure Modes

  • Toggle debt: Toggles that are never cleaned up accumulate over time. Because n independent boolean toggles allow up to 2^n combinations of code paths, the system soon becomes impractical to test exhaustively or reason about.
  • Stale toggles in tests: Tests that hard-code toggle states become misleading over time: new code paths are never exercised, or tests for paths that no longer run in production linger in the suite.
  • Evaluation latency: Checking a remote toggle store (e.g., an external SaaS) on every evaluation adds a network round trip to the request path unless flag values are cached locally with a short TTL.
  • Inconsistent evaluation: Toggle state evaluated more than once while serving a single request (e.g., once in the UI, once in the API) may differ if the store changes in between, causing incoherent behavior.
  • Configuration drift: Toggle state in staging diverges from production; a feature passes QA but breaks in production because the default flag values differ.
  • Toggle collision: Two independent toggles affecting the same code area create unexpected side effects when enabled together (e.g., Toggle A changes the UI layout, Toggle B changes the data format).
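
Two of these failure modes, evaluation latency and inconsistent evaluation, are commonly mitigated together: cache flag values locally with a short TTL, and read them once per request. The following is a sketch under assumptions, not the API of any particular toggle product. CachedToggles, fetch_all, handle_request, and the "new-layout" flag are hypothetical names, and the 30-second TTL is an arbitrary example value.

```python
import time
from typing import Callable


class CachedToggles:
    """Caches flag values fetched from a remote store for a short TTL."""

    def __init__(
        self,
        fetch_all: Callable[[], dict],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_all = fetch_all  # hypothetical call to the remote store
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._values = {}
        self._expires_at = float("-inf")  # forces a fetch on first use

    def snapshot(self) -> dict:
        """Return a copy of all flag values, refreshing the cache if expired."""
        now = self._clock()
        if now >= self._expires_at:
            self._values = dict(self._fetch_all())
            self._expires_at = now + self._ttl
        return dict(self._values)


def handle_request(toggles: CachedToggles) -> tuple:
    # Take one snapshot per request and reuse it for every decision, so the
    # UI and API layers cannot disagree even if the store changes meanwhile.
    flags = toggles.snapshot()
    ui = "new-ui" if flags.get("new-layout", False) else "old-ui"
    api = "v2" if flags.get("new-layout", False) else "v1"
    return ui, api
```

The trade-off is propagation delay: a flipped flag takes up to one TTL to reach each process, so the TTL has to stay well inside whatever recovery window a kill switch is expected to meet.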

Verification

  • Automated Inventory: Keep a registry of all active toggles with their expected expiry dates; fail the build or send an alert if any toggle is older than its maximum allowed lifespan (e.g., 90 days).
  • Dual-Path Testing: For critical toggles, run the automated test suite with the toggle both ON and OFF in CI to ensure no regressions in either path.
  • Canary Monitoring: After each increment of a percentage rollout, monitor error rates, p95 latency, and key business metrics (e.g., conversion) for a defined period (e.g., 30 minutes) before proceeding.
  • Kill-switch Drill: Periodically verify that an ops toggle can be flipped to OFF and the change propagates across the system within the SLA recovery window (e.g., ≤ 5 minutes) without a code deployment.

Related Techniques

  • Dark launching: The new code path executes in production (often reading data or making calls), but its output is discarded. This validates performance and correctness before users are exposed to the feature.
  • Percentage rollout: The toggle is ON for a configurable fraction of users (e.g., 5%, then 25%, then 100%), allowing gradual exposure and early detection of issues at scale.
  • User-segment targeting: Toggles scoped to specific user attributes (country, plan, cohort) for localized releases and targeted experiments.
  • Branch by abstraction: A technique for making large-scale changes by introducing an abstraction layer behind which a toggle selects the old or the new implementation.
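
A percentage rollout needs each user to get the same answer on every request, and users who were ON at 5% should stay ON at 25%. A common way to achieve this is to hash the flag name and user ID into a fixed bucket. The sketch below assumes string user IDs and 100 buckets; bucket and in_rollout are hypothetical helper names, and SHA-256 is just one reasonable choice of stable hash.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    # Including the flag name gives each flag its own independent user ordering.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def in_rollout(flag_name: str, user_id: str, percentage: int) -> bool:
    """True if the user falls inside the first `percentage` buckets."""
    return bucket(flag_name, user_id) < percentage
```

Because a user's bucket never changes, raising the percentage only adds users to the ON group. Python's built-in hash() is deliberately avoided here, since string hashing is randomized per process by default and would assign different buckets after every restart.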
