Thomas Berdy
May 20, 2026 · 5 min read

We Spent Years Hardening Jinja2 in User Config. We're Removing It Instead.

After years of patching a Jinja2 sandbox against hostile templates, we pulled every customer config, learned what users actually did with the feature, and replaced it with a narrow declarative schema.

Mergify’s config lets users template merge commit messages with Jinja2. Keeping that safe turned out to be a permanent job. We’re removing the feature.

The cost we kept paying

When a user writes commit_message_template in their .mergify.yml, Jinja2 renders that string against a context full of pull request data. Jinja2 ships with a SandboxedEnvironment for exactly this case, but its defaults aren’t enough for hostile input, so we built more around it. We are not going into specifics here for the same reason we don’t publish the layout of our front door locks.

What we will say is that the maintenance never ended. Every Jinja2 release and every customer report of an unexpected edge case was a reason to revisit the same module. Shipping the feature in the first place was the easy part. What kept us busy was the slow drip of follow-up work any user-supplied template engine forces on you.

Switching engines wouldn’t have saved us. Every templating engine has its own catalog of flaws and edge cases that have to be patched as they emerge. The choice to accept user-supplied templating is what creates the cost.

Jinja2 also has quirks of its own that are unrelated to security: upstream filter bugs, behavior shifts between major releases. None of these is a dealbreaker on its own, but stacked over years they are not free either.

What our users actually do with it

Complicated templates mostly do not exist in the wild. We pulled every customer config in April. Reading every config by hand to spot patterns is the kind of task an LLM is good at, so we had Claude do the first pass: cluster the commit_message_template values by shape, and flag the outliers. Then a human went through the clusters and asked the second question for each shape: is this something we want to keep supporting in the product, or is it a clever use we accidentally enabled and would rather not?
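For readers who want to try a similar survey on their own configs, here is a minimal mechanical stand-in for the "cluster by shape" step. It is not what we ran (that first pass was done by an LLM); it is a sketch that assumes a template's shape can be approximated by replacing each Jinja2 expression with its leading variable name and each statement with its keyword. The function names and placeholder syntax are hypothetical.

```python
import re
from collections import Counter

# Hypothetical normalisation rules: keep only the leading variable name of an
# expression ({{ body | trim }} -> <body>) and the keyword of a statement
# ({% for c in commits %} -> <%for>).
_EXPR = re.compile(r"\{\{-?\s*([A-Za-z_][\w.]*).*?-?\}\}", re.DOTALL)
_STMT = re.compile(r"\{%-?\s*(\w+).*?-?%\}", re.DOTALL)


def template_shape(template: str) -> str:
    """Reduce a template to a whitespace-insensitive 'shape' string."""
    shape = _EXPR.sub(r"<\1>", template)
    shape = _STMT.sub(r"<%\1>", shape)
    # Collapse all runs of whitespace so formatting differences do not matter.
    return " ".join(shape.split())


def cluster_by_shape(templates):
    """Count how many templates share each shape."""
    return Counter(template_shape(t) for t in templates)


if __name__ == "__main__":
    sample = [
        "{{ title }}",
        "{{title}}",
        "{{ title }}\n\n{{ body }}",
        "{{ title }}\n\n{{ body | trim }}",
    ]
    for shape, count in cluster_by_shape(sample).most_common():
        print(count, shape)
```

Shapes that appear once or twice are the outliers worth a human read; the regexes above are deliberately crude and will leave anything they do not recognise untouched, which makes those templates stand out.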

Most commit_message_template users wrote either a title alone, or a title plus a body. Some configs added a Co-authored-by: loop on top.

Roughly what we kept seeing, in pseudo-form:

{{ title }}

{{ body }}
{% for c in commits %}
Co-authored-by: {{ c.author }}
{% endfor %}

That’s the shape, more or less. Shipping the flexible primitive paid off as a discovery tool. Give people a powerful tool, watch what they build with it, and you get free product research. You also have to keep that tool in production while you run the study, with its full attack surface intact the entire time.

What the replacement looks like

The new field replaces the single Jinja2 template per queue rule with a declarative block.

commit_message_format:
  title: inherit          # or pr-title
  body:  inherit          # or pr-body, or empty
  trailers: []            # any subset of co-authored-by, approved-by, merged-by

That covers the cases the analysis surfaced. The exotic uses we left out are the ones we never managed to map back to a coherent product surface. (Users are surprisingly good at using a feature for something completely orthogonal to its purpose. Some of those uses are clever. Most of them you cannot keep supporting without distorting the rest of the product.)
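Part of the appeal of a closed schema is that validating it takes a few set lookups and no sandbox. The sketch below shows one way that could look. It is illustrative, not Mergify’s implementation: the class and function names are hypothetical, and it assumes that `empty` is a literal value for `body` and that omitted keys default to `inherit`.

```python
from dataclasses import dataclass

# Allowed values, taken from the comments in the config block above.
ALLOWED_TITLES = frozenset({"inherit", "pr-title"})
ALLOWED_BODIES = frozenset({"inherit", "pr-body", "empty"})  # "empty" as a literal is an assumption
ALLOWED_TRAILERS = frozenset({"co-authored-by", "approved-by", "merged-by"})


@dataclass(frozen=True)
class CommitMessageFormat:
    # Defaulting to "inherit" is an assumption made for this sketch.
    title: str = "inherit"
    body: str = "inherit"
    trailers: tuple = ()


def parse_commit_message_format(raw: dict) -> CommitMessageFormat:
    """Validate a parsed commit_message_format mapping (hypothetical helper)."""
    unknown_keys = set(raw) - {"title", "body", "trailers"}
    if unknown_keys:
        raise ValueError(f"unknown keys: {sorted(unknown_keys)}")

    title = raw.get("title", "inherit")
    if title not in ALLOWED_TITLES:
        raise ValueError(f"invalid title: {title!r}")

    body = raw.get("body", "inherit")
    if body not in ALLOWED_BODIES:
        raise ValueError(f"invalid body: {body!r}")

    trailers = tuple(raw.get("trailers", ()))
    invalid = [t for t in trailers if t not in ALLOWED_TRAILERS]
    if invalid:
        raise ValueError(f"invalid trailers: {invalid}")

    return CommitMessageFormat(title=title, body=body, trailers=trailers)


if __name__ == "__main__":
    print(parse_commit_message_format(
        {"title": "pr-title", "body": "inherit", "trailers": ["co-authored-by"]}
    ))
```

Every accepted config is one of a small, enumerable set of combinations, so there is no user-supplied expression left to evaluate.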

The inherit value does the most work. When it’s set, Mergify omits the corresponding key from the GitHub merge endpoint (PUT /repos/{owner}/{repo}/pulls/{pull_number}/merge) entirely, which makes GitHub’s repo-level “default commit message” setting render that side. By delegating to the system that already renders these commits, the schema gets the expressiveness GitHub already offers without adding new code we have to defend. It’s also the cheapest possible defaults strategy: if the user changes their repo settings later, the merge commits update with no Mergify config change.
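To make the omission concrete, here is a rough sketch of how a request body for that endpoint could be assembled. `commit_title`, `commit_message`, and `merge_method` are the endpoint’s documented body parameters; the function name is hypothetical, and both the handling of `empty` (sending an empty string) and the decision to ignore trailers are assumptions made to keep the sketch short.

```python
def build_merge_payload(fmt: dict, pr_title: str, pr_body: str,
                        merge_method: str = "merge") -> dict:
    """Build the JSON body for PUT /repos/{owner}/{repo}/pulls/{pull_number}/merge.

    Hypothetical helper: `fmt` is a mapping with "title" and "body" keys.
    Trailers are left out of this sketch.
    """
    payload = {"merge_method": merge_method}

    title = fmt.get("title", "inherit")
    if title == "pr-title":
        payload["commit_title"] = pr_title
    elif title != "inherit":
        raise ValueError(f"invalid title: {title!r}")
    # title == "inherit": no commit_title key at all, so GitHub applies
    # the repository's own default.

    body = fmt.get("body", "inherit")
    if body == "pr-body":
        payload["commit_message"] = pr_body
    elif body == "empty":
        # Assumption: an empty string is how an empty body would be requested.
        payload["commit_message"] = ""
    elif body != "inherit":
        raise ValueError(f"invalid body: {body!r}")
    # body == "inherit": no commit_message key, same reasoning as above.

    return payload


if __name__ == "__main__":
    print(build_merge_payload({"title": "inherit", "body": "inherit"}, "Fix bug", "Details"))
    print(build_merge_payload({"title": "pr-title", "body": "empty"}, "Fix bug", "Details"))
```

The point of the design is visible in the first call: with both sides set to inherit, the payload carries no message fields, and whatever the repository is configured to produce is what ends up in the commit.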

The old field still works. The dashboard hides it from the editor. We’re tracking adoption with a counter so we know when the legacy path can be retired safely.

Why now and not earlier

The threat model moved while we weren’t looking. Supply-chain incidents are weekly news now (the recent bitwarden-cli poisoning shipped a malicious build to anyone who installed before maintainers caught it). Different attack model, same lesson: any user-controlled input you render server-side reads differently in 2026 than it did in 2019. The security work that felt proportionate then no longer feels proportionate now.

2019 us made a defensible call. 2026 us, starting from a blank schema, wouldn’t.

What’s next

Jinja2 is not gone from Mergify. We still use it in several places in the engine. Each surface is going through the same review, and not all of them will end with a removal. Some are doing work that a small declarative schema cannot cover, and those stay. For the rest, the playbook is the one we used here: pull the config dump, look at what’s there, design a narrow declarative replacement, ship both side by side, deprecate.

Flexible features as research

Flexible user-facing primitives are a great way to learn what users want. They are also a permanent liability while they live in production. Treat them as time-boxed research. When the answers are in, replace the flexible thing with a narrow safe schema that covers the shapes you actually saw. The exotic configs you cannot map back to a coherent product surface are a roadmap you have not written yet.
