Feature flags and trunk-based development
Trunk-based development without feature flags collapses back into long-lived branches within a quarter. Flags are how you ship code that is not finished yet without users seeing it. This article covers the patterns, the flag types, and how to keep flag debt from eating your config.
In one paragraph
Feature flags decouple "code is in main" from "users see this feature." That decoupling is what lets trunk-based development keep branches short even when features take weeks to build. The hard part is not adding flags; it is removing them. Most teams accumulate flag debt because the work to retire a flag falls off the backlog the moment the feature is live.
Why trunk-based development needs flags
Without flags, an engineer working on a multi-week feature has two options. Either they hold the work locally on a long-lived branch (the failure mode trunk-based development was designed to fix), or they refuse to merge anything until everything is done (same problem, different appearance).
With flags, a third option appears: merge each piece of the feature to main as soon as it is reviewable, behind a flag that is off in production. The branch closes in hours. The feature ships in weeks. The two timelines stop fighting each other.
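As a minimal sketch of that third option, the snippet below guards an unfinished code path with a release flag that defaults to off. The flag store `FLAGS`, the flag name `new_billing_flow`, and both `charge_*` functions are hypothetical names invented for illustration; a real system would read flag state from config or a flag service.

```python
# Hypothetical in-memory flag store; real flag state would come from
# a config file or a flag service.
FLAGS = {"new_billing_flow": False}  # off in production while work is unfinished


def is_enabled(name: str) -> bool:
    """Return the flag's state; unknown flags default to off so new code stays dark."""
    return FLAGS.get(name, False)


def charge_legacy(order_id: str) -> str:
    # Existing behavior that users see today.
    return f"legacy:{order_id}"


def charge_new(order_id: str) -> str:
    # Work in progress: merged to main, unreachable until the flag flips.
    return f"new:{order_id}"


def charge(order_id: str) -> str:
    """Route to the new path only when the release flag is on."""
    if is_enabled("new_billing_flow"):
        return charge_new(order_id)
    return charge_legacy(order_id)
```

Each small PR can extend `charge_new` and merge immediately, because nothing in production reaches it until the flag is turned on.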
A typical rollout
flowchart LR
    C["Code merged<br/>flag OFF"]
    D["Internal users<br/>flag ON"]
    P1["1% of traffic"]
    P10["10% of traffic"]
    P100["100% of traffic"]
    K["Flag deleted"]
    C --> D --> P1 --> P10 --> P100 --> K
    style C fill:#F2F4F7,stroke:#43A7E5,color:#1A1D24
    style D fill:#F2F4F7,stroke:#43A7E5,color:#1A1D24
    style P1 fill:#FFF4E5,stroke:#F27B2A,color:#1A1D24
    style P10 fill:#FFF4E5,stroke:#F27B2A,color:#1A1D24
    style P100 fill:#E6F8F2,stroke:#1CB893,color:#1A1D24
    style K fill:#E6F8F2,stroke:#1CB893,color:#1A1D24
Code merges with the flag off. Internal users flip it on. Production traffic ramps in stages until 100%. The flag gets deleted. Total time: typically two to six weeks.
The flag is off when the code lands. Engineers and internal users flip it on for themselves to dogfood. Once the feature is reasonably stable, the flag rolls out to a small fraction of production traffic, then a larger fraction, then all of it. When the rollout is complete and rollback is no longer plausible, the flag and the old code path get deleted.
That last step is the one teams skip.
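One common way to implement the staged ramp is to hash each user into a stable bucket and compare it to the current rollout percentage. The sketch below assumes that approach; the function names, the `internal_users` parameter, and the choice of SHA-256 are illustrative, not taken from any particular flag library.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled_for(
    flag: str,
    user_id: str,
    rollout_percent: int,
    internal_users: frozenset = frozenset(),
) -> bool:
    """Decide whether this user sees the feature at the current rollout stage."""
    # Internal users dogfood the feature before any production traffic does.
    if user_id in internal_users:
        return True
    # A user whose bucket is below the percentage is in the rollout.
    return bucket(flag, user_id) < rollout_percent
```

Because the bucket depends only on the flag name and the user ID, a user who is included at 1% stays included at 10% and 100%, so nobody flips back and forth as the percentage grows. Including the flag name in the hash keeps the same users from always being first in line for every feature.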
The four kinds of flags
Pete Hodgson's feature toggle taxonomy on martinfowler.com is the standard reference. Four kinds, with very different lifetimes and very different retirement strategies.
| Type | Purpose | Lifetime | Retirement |
|---|---|---|---|
| Release toggle | Hide unfinished work | Days to weeks | Delete after rollout |
| Experiment toggle | A/B test variants | Weeks to months | Delete when experiment ends |
| Ops toggle | Kill switches, circuit breakers | Long-lived | Stays as ops control |
| Permission toggle | Per-customer entitlements | Indefinite | Becomes permanent config |
Trunk-based development uses the first kind heavily and the second kind regularly. The third and fourth are valid but separate concerns, and they have to be tracked separately or they pollute the codebase with what looks like flag debt.
Flag debt and how to avoid it
Release toggles are supposed to be temporary. In practice, the work to retire one falls off the backlog the moment the feature is live. The flag stays. New code gets written behind `if (newBillingFlow)` branches. Old code paths stay alive forever in case anyone ever flips the flag back. The codebase grows two implementations of every feature with a flag deciding which one is real.
The patterns that prevent it:
- Expiry dates on every release flag. The flag library tracks when each flag was created. Anything past 90 days fails CI or shows up on a weekly report. Treat the flag as a TODO with a deadline.
- Retirement PR opens at the same time as the rollout PR. When the flag goes to 100%, the PR that deletes the flag is already drafted, sitting in review.
- One owner per flag. If nobody owns the flag, nobody retires it. The owner is on the hook for cleanup, same as any code.
- Categorize at creation. Release vs experiment vs ops vs permission goes into the flag definition. Release flags get the deletion clock. Permission flags do not.
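The patterns above can be sketched as a flag definition that records kind, owner, and creation date, plus a check that CI could run. Everything here is an assumption for illustration: the `FlagDefinition` fields, the `check_flags` function, and applying the 90-day clock to release flags only.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class FlagKind(Enum):
    RELEASE = "release"
    EXPERIMENT = "experiment"
    OPS = "ops"
    PERMISSION = "permission"


# Assumed policy: only release flags carry the 90-day deletion clock.
MAX_RELEASE_FLAG_AGE = timedelta(days=90)


@dataclass(frozen=True)
class FlagDefinition:
    name: str
    kind: FlagKind   # categorized at creation
    owner: str       # one owner per flag, responsible for cleanup
    created: date


def expired_release_flags(flags: list[FlagDefinition], today: date) -> list[FlagDefinition]:
    """Return release flags older than the allowed age."""
    return [
        f for f in flags
        if f.kind is FlagKind.RELEASE and today - f.created > MAX_RELEASE_FLAG_AGE
    ]


def check_flags(flags: list[FlagDefinition], today: date) -> int:
    """Print each overdue flag with its owner; return a process exit code."""
    overdue = expired_release_flags(flags, today)
    for f in overdue:
        print(f"{f.name}: created {f.created}, owner {f.owner}, past 90 days")
    return 1 if overdue else 0
```

A CI job could pass the result of `check_flags(...)` to `sys.exit` so the build fails while an overdue release flag exists, or the same list could feed the weekly report instead.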
Tooling, briefly
Three buckets, depending on team size and tolerance for managed services.
- In-house, config-file based. A YAML or JSON file in the repo, read at startup, with simple targeting. Free. Works for small teams. Updates require a deploy.
- In-house, runtime-evaluated. A small service that reads from a database and evaluates rules. Targeting by user, percentage, environment. Slightly more work, no deploy needed to flip a flag.
- Managed (LaunchDarkly, Statsig, GrowthBook, Unleash). Targeting, analytics, audit logs, kill switches. Real money once the team grows.
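For the first bucket, a config-file flag system can be very small. The sketch below assumes a hypothetical JSON schema in which each flag lists the environments where it is on under an `enabled_in` key; the schema and the `load_flags` function are illustrative only.

```python
import json


def load_flags(config_text: str, environment: str) -> dict[str, bool]:
    """Parse the flag file once at startup and resolve each flag for this environment."""
    raw = json.loads(config_text)
    return {
        name: environment in spec.get("enabled_in", [])
        for name, spec in raw.items()
    }


# Example contents of a flag file checked into the repo (hypothetical schema).
EXAMPLE_CONFIG = """
{
  "new_billing_flow": {"enabled_in": ["dev", "staging"]},
  "new_search": {"enabled_in": []}
}
"""

flags = load_flags(EXAMPLE_CONFIG, environment="production")
```

Because the file is read at startup, changing a flag means changing the file and deploying again, which is the trade-off the first bucket accepts.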
The choice matters less than the discipline around retirement. A team with a homegrown flag file and a strict 90-day expiry policy ends up with cleaner code than a team using LaunchDarkly with no retirement process.
Flags ship code dark. The merge queue keeps main green.
Trunk-based development needs both. Mergify is the merge queue piece, and it works with whichever feature flag system you already use.