Hello developers! How do you use feature flags to ship safely? What do you flag, how do you roll out, and how do you clean up afterward? Please share:
• What you flag: release toggles, ops/kill switches, experiments.
• Rollout method: cohorts/percentages, internal → beta → GA, rollback plan.
• Guardrails: defaults/fallbacks, timeouts, dependency rules, fail-open/closed.
• Observability: metrics, alerts, dashboards, audit logs.
• Governance: ownership, naming, config-as-code, expiry dates/cleanup policy.
• Cleanup: when you remove flags, how you track & enforce removals.
Note your tooling (homegrown or vendor), top do/don’ts, and the key metric you watch to call it safe.
Thanks for sharing your experience!
Here’s the quick version I share with teams when we’re ramping up flags: use them to make every deploy reversible, observable, and boring. Small toggles, measured ramps, instant rollback, and a calendar reminder to delete the darn things. 👇
What we flag
Release toggles for incomplete code paths
Ops/kill switches (e.g., force_readonly, disable_payments)
Experiments (A/B/MVT)
Risky integrations & pricing/entitlements
Rollout
Cohorts: staff → dogfood → beta → % ramp (1 → 5 → 20 → 50 → 100)
Progression gates: advance to the next step only if SLOs hold and error-budget burn stays below the agreed threshold
Rollback: one-click off; dark path callable; documented plan
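A minimal sketch of the % ramp and the progression gate, in case it helps. The function names, the SHA-256 bucketing, and the "hold at the current step when a gate fails" behavior are my own assumptions for illustration, not any vendor's API.

```python
import hashlib

RAMP_STEPS = [1, 5, 20, 50, 100]  # the percentage ramp from the list above


def bucket(flag_key: str, user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) for this flag."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).hexdigest()
    return int(digest[:8], 16) % 100


def is_enabled(flag_key, user_id, percent,
               allow_cohorts=frozenset(), user_cohorts=frozenset()):
    """Cohorts (staff, dogfood, beta) win first; everyone else goes by bucket."""
    if set(allow_cohorts) & set(user_cohorts):
        return True
    return bucket(flag_key, user_id) < percent


def next_step(current, slo_ok, budget_burn, burn_threshold):
    """Advance one ramp step only when both gates pass; otherwise hold."""
    if not slo_ok or budget_burn >= burn_threshold:
        return current
    for step in RAMP_STEPS:
        if step > current:
            return step
    return current  # already at 100
```

Because the bucket depends only on the flag key and user ID, a user who is ON at 5% stays ON at 20%, so ramping up never flips anyone back and forth.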
Guardrails
Defaults off, config turns on
Timeouts + fallbacks (circuit breaker around flagged code)
Dependencies: depends_on: billing_v2 && auth_sso
Fail-closed for security/billing; fail-open for UX
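Roughly how those guardrails can look in code. This is a sketch under assumptions: `run_flagged`, `deps_satisfied`, and `evaluate` are hypothetical helper names, and the timeout-plus-fallback wrapper is a simpler stand-in for a real circuit breaker (it does not track failure counts or open/half-open state).

```python
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)


def deps_satisfied(flags, depends_on):
    """A flag may only turn on if every flag it depends on is on."""
    return all(flags.get(name, False) for name in depends_on)


def run_flagged(new_path, fallback, timeout_s):
    """Run the flagged code path; on timeout or any error, use the fallback.

    Note: a timed-out call keeps running in its worker thread, so the
    flagged path should be safe to abandon.
    """
    future = _pool.submit(new_path)
    try:
        return future.result(timeout=timeout_s)
    except Exception:  # includes the timeout raised by result()
        return fallback()


def evaluate(fetch, flag_key, fail_open):
    """Look up a flag; if the flag store is unreachable, use the safe default.

    fail_open=False (fail-closed) for security/billing flags,
    fail_open=True for UX flags.
    """
    try:
        return bool(fetch(flag_key))
    except Exception:
        return fail_open
```

For example, `deps_satisfied({"billing_v2": True}, ["billing_v2", "auth_sso"])` is false, so a flag with the `depends_on` rule above would stay off until `auth_sso` is also on.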
Observability
Per-variant metrics: error rate, p95/p99, throughput, conversion
Flag-scoped alerts: errors{flag=on} > 1% for 5m
Dashboards comparing ON vs OFF; audit logs (who/why/when)
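To make the alert rule concrete, here is a small sketch of "errors{flag=on} > 1% for 5m" evaluated over per-minute samples. The `MinuteSample` shape and the function names are assumptions; in practice this logic would live in your alerting system rather than in application code.

```python
from dataclasses import dataclass


@dataclass
class MinuteSample:
    """One minute of traffic for requests served with the flag ON."""
    requests: int
    errors: int


def error_rate(sample: MinuteSample) -> float:
    return sample.errors / sample.requests if sample.requests else 0.0


def alert_fires(samples_on, threshold=0.01, window=5):
    """Fire only if every one of the last `window` minutes is over threshold."""
    if len(samples_on) < window:
        return False
    return all(error_rate(s) > threshold for s in samples_on[-window:])
```

Requiring the whole window to breach avoids paging on a single noisy minute, at the cost of reacting a few minutes later.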
Governance
Owner per flag, naming: <domain>.<feature>.<purpose>
Config-as-code via PRs; link to ticket/rollback notes
Expiry dates + bot/CI nags at T+14/30/60 days
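A sketch of how the naming rule and the nags could be checked in CI. Two assumptions to flag loudly: the regex (lowercase segments with digits and underscores) is just one way to read <domain>.<feature>.<purpose>, and I'm treating T as the flag's creation date.

```python
import re
from datetime import date

# Assumed convention: exactly three lowercase dot-separated segments.
NAME_RE = re.compile(r"^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*){2}$")
NAG_DAYS = (14, 30, 60)


def valid_name(name: str) -> bool:
    """Check a flag key against <domain>.<feature>.<purpose>."""
    return NAME_RE.fullmatch(name) is not None


def nags_due(created: date, today: date):
    """Return every nag threshold (in days) the flag has already passed."""
    age = (today - created).days
    return [d for d in NAG_DAYS if age >= d]
```

Something like `valid_name("checkout.new_flow.release")` passes, while a bare `force_readonly` would be rejected unless you carve out an exception for kill switches.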
Cleanup
Release flags: remove after 7–14 stable days at 100%
Experiments: merge winner, archive results
Kill switches: keep; quarterly review
Enforcement: nightly expired-flag report; CI fails on stale refs
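The enforcement piece can be as small as this. It is a sketch with an assumed registry shape (flag key mapped to a `kind` and an `expires` date) and hypothetical function names; a real job would read the registry and source files from disk.

```python
import re
from datetime import date


def expired_flags(registry, today: date):
    """Nightly report: flags past expiry. Kill switches are kept on purpose."""
    return sorted(
        key for key, meta in registry.items()
        if meta["kind"] != "kill_switch" and meta["expires"] < today
    )


def stale_refs(sources, expired):
    """Find lines in source text that still reference an expired flag key."""
    hits = []
    for path, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), 1):
            for key in expired:
                # Match the whole key, not a prefix of a longer key.
                pattern = r"(?<![\w.])" + re.escape(key) + r"(?![\w.])"
                if re.search(pattern, line):
                    hits.append((path, lineno, key))
    return hits


def ci_exit_code(hits) -> int:
    """Non-zero exit fails the CI job when stale references remain."""
    return 1 if hits else 0
```

Failing the build is blunt, so some teams start with a warning-only mode and flip to hard failure once the backlog is cleared.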
Tooling
Vendor: LaunchDarkly / Flagsmith / Split
Standard: OpenFeature
Homegrown: JSON + S3/Redis + cache + webhooks
Do/Don’t
Do: keep flags thin, test both paths, document risk/rollback
Don’t: nest flags, let them live forever, or bundle unrelated changes
“Safe to ship” metric
Primary: Δ error rate + Δ p95 (ON vs OFF) within budget for ≥30 min
Secondary: KPI guardrails hold.
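If it helps, the primary check boils down to something like the sketch below. The `Window` fields, the one-sample-per-minute granularity, and the budget parameters are assumptions; plug in whatever your metrics pipeline actually emits.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """One minute of ON vs OFF metrics for the same flag."""
    err_on: float      # error rate with the flag ON (0.0 to 1.0)
    err_off: float     # error rate with the flag OFF
    p95_on_ms: float   # p95 latency with the flag ON
    p95_off_ms: float  # p95 latency with the flag OFF


def safe_to_ship(windows, max_err_delta, max_p95_delta_ms, min_minutes=30):
    """True if the most recent `min_minutes` samples are all within budget.

    `windows` is ordered oldest to newest, one sample per minute.
    """
    streak = 0
    for w in windows:
        within_budget = (
            (w.err_on - w.err_off) <= max_err_delta
            and (w.p95_on_ms - w.p95_off_ms) <= max_p95_delta_ms
        )
        streak = streak + 1 if within_budget else 0
    return streak >= min_minutes
```

Any breach resets the streak, so the flag has to hold steady for a full 30 minutes after the last bad sample before this returns true.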