Julien Danjou

Dec 17, 2025

5 min read

GitHub Merge Queue Was Step One. Real CI Orchestration Comes Next.

GitHub’s merge queue solved safety, not scale. As CI grows slower, costlier, and shared across teams, merging becomes a scheduling problem. Learn why large monorepos and Bazel users outgrow native queues and what real CI orchestration looks like.

When GitHub introduced its native merge queue, it solved a real and painful problem: how to stop teams from breaking main while multiple pull requests land in parallel. This was not new, as we wrote earlier; GitHub simply copied existing technology.

For many teams, especially small or medium-sized repos, that was enough. Predictive testing, automatic merges, and fewer rebase races: all real improvements.

But once CI stops being "fast and cheap" and becomes slow, expensive, and shared, the problem changes entirely.

And that's where most teams outgrow GitHub's merge queue.

When CI Becomes the Bottleneck

The merge queue problem is often explained as a correctness issue: semantic conflicts, green PRs breaking main, and accidental regressions.

In practice, for large teams, the fundamental constraint is different:

  • CI jobs take 30 minutes, 90 minutes, sometimes hours

  • Hardware, Bazel, E2E, or monorepo tests are finite resources

  • Hundreds of PRs compete for the same runners every day

  • A single flaky job can stall dozens of merges

At that scale, merging isn't just about safety anymore: it's about throughput, scheduling, and cost control.

A merge queue stops being a guardrail and becomes a traffic controller.

The Hidden Limits of GitHub's Merge Queue

GitHub's merge queue is intentionally simple. That's its strength, and its ceiling. Once CI gets heavy, teams start hitting the same walls:

  • Everything waits in one line: a documentation change, a frontend tweak, and a deep backend refactor all queue together, even if they touch completely independent parts of the codebase.

  • CI work is duplicated: each PR runs CI on its branch, then again inside the merge group. As CI time grows, this duplication becomes expensive fast.

  • No notion of "related" vs "unrelated" changes: the queue doesn't understand monorepos, scopes, Bazel targets, or dependency graphs. All changes are treated the same.

  • No control under pressure: hotfixes, freezes, incidents, and hardware scarcity are not first-class concepts, so humans end up coordinating in Slack instead.

GitHub's queue answers the question:

“Can this PR merge safely?”

Large teams need a system that answers:

“How do we move many changes through CI without wasting time, money, or attention?”

Merge Queues Are Not Enough. You Need Merge Orchestration.

Once CI becomes a constrained resource, successful teams converge on the same primitives:

  • Batching to reduce total CI runs

  • Two-step CI to separate fast validation from expensive full tests

  • Priorities to unblock hotfixes without chaos

  • Freezes to handle incidents safely

  • Observability to understand where time and compute are going

This isn't theoretical. It shows up clearly in hardware companies, large SaaS monorepos, and Bazel-based stacks. At that point, the problem is no longer "merging PRs".

It's orchestrating CI flow. That’s the layer Mergify was built for.
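
To make those primitives concrete, here is a minimal sketch of what batching and priorities look like in a Mergify configuration. The queue name, label, check name, and batch size are assumptions for illustration, and exact option names should be verified against the current Mergify documentation rather than taken from this sketch.

```yaml
# .mergify.yml (illustrative sketch, not a drop-in configuration)
queue_rules:
  - name: default
    # Batching: test several queued PRs together to amortize expensive CI runs.
    batch_size: 5
    # Priorities: let urgent changes jump ahead without ad-hoc coordination.
    priority_rules:
      - name: hotfixes first
        conditions:
          - label=hotfix
        priority: high
    merge_conditions:
      # Placeholder name for the repository's required CI check.
      - check-success=ci

pull_request_rules:
  - name: queue approved pull requests
    conditions:
      - "#approved-reviews-by>=1"
    actions:
      queue:
        name: default
```

Freezes and two-step CI build on the same queue: a freeze pauses it during an incident, and the expensive checks can be reserved for the queued, batched run instead of every push.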

Monorepos Change Everything

Monorepos make the CI problem sharper. In a monorepo:

  • Not every change affects the whole system

  • Running the full test matrix on every PR is wasteful

  • Independent changes should not block each other

Mergify introduces scopes as a shared language between your codebase, CI, and merge queue.

A scope represents a logical slice of your repository: a service, a package, a Bazel target set, a directory tree.

Once scopes exist:

  • CI jobs run only when their scope is touched

  • Merge queues batch compatible PRs together

  • Unrelated changes move in parallel instead of waiting in line

This applies equally to:

  • Bazel monorepos

  • Nx / Turborepo setups

  • Large multi-language GitHub Actions pipelines

GitHub Actions, But Smarter

To make this practical, we built Monorepo CI for GitHub Actions. Using a small helper action, your workflow can:

  1. Detect which scopes a pull request touches

  2. Conditionally run only the relevant jobs

  3. Aggregate results into a single CI gate suitable for branch protection
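
To illustrate that flow, here is a sketch built from stock GitHub Actions pieces. Mergify's helper action is not shown here; dorny/paths-filter stands in for scope detection, and the scope names, paths, and commands are placeholder assumptions.

```yaml
# .github/workflows/ci.yml (sketch; scope names, paths, and commands are placeholders)
name: ci

on:
  pull_request:

jobs:
  # 1. Detect which scopes the pull request touches.
  scopes:
    runs-on: ubuntu-latest
    outputs:
      backend: ${{ steps.filter.outputs.backend }}
      frontend: ${{ steps.filter.outputs.frontend }}
    steps:
      - uses: actions/checkout@v4
      - id: filter
        uses: dorny/paths-filter@v3
        with:
          filters: |
            backend:
              - 'services/**'
            frontend:
              - 'web/**'

  # 2. Run only the jobs whose scope changed.
  backend-tests:
    needs: scopes
    if: needs.scopes.outputs.backend == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test-backend    # placeholder for the real test command

  frontend-tests:
    needs: scopes
    if: needs.scopes.outputs.frontend == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test-frontend   # placeholder for the real test command

  # 3. Aggregate everything into one gate usable as the required branch-protection check.
  ci:
    if: always()
    needs: [backend-tests, frontend-tests]
    runs-on: ubuntu-latest
    steps:
      - name: Fail if any scoped job failed or was cancelled
        if: contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled')
        run: exit 1
```

Jobs for untouched scopes report "skipped" rather than "failure", so the gate still passes, and branch protection or a queue condition like check-success=ci only has one status to watch.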

The result is simple:

  • fewer CI jobs

  • clearer signals

  • and a merge queue that understands what is being tested, not just that something ran

The same scopes can then be reused by the merge queue itself to build smarter batches and reduce re-testing.
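
On the queue side, Mergify's documentation describes partition rules for splitting a monorepo's queue by path. The sketch below shows the shape of that idea with the same placeholder paths as above; it is not a claim about how the scopes feature is wired internally, and the exact keys should be checked against the docs.

```yaml
# .mergify.yml excerpt (sketch; paths and rule names are placeholders)
partition_rules:
  - name: backend
    conditions:
      - files~=^services/
  - name: frontend
    conditions:
      - files~=^web/
```

Routed this way, a backend refactor and a frontend tweak no longer share a single line; each partition batches and merges on its own.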

Bazel, Hardware, and Long-Running CI

Bazel and hardware CI make these constraints impossible to ignore. When tests take 1–3 hours:

  • FIFO queues collapse under load

  • Flaky retries become extremely costly

  • "Just rerun it" is no longer acceptable

Teams running Bazel or hardware CI don't need more tests. They need fewer, better-scheduled ones.

This is where batching and two-step CI matter most:

  • fast checks validate intent early

  • expensive jobs run only when a PR is truly ready

  • batches amortize cost across multiple changes

Mergify doesn't replace Bazel. It makes Bazel usable at scale in a multi-PR world.
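
To show the two-step split itself, here is a sketch using GitHub Actions event triggers: fast checks run on every pull request, and the expensive suite runs only once a change has been queued. It uses GitHub's native merge_group event for the second step; with a Mergify queue the exact trigger for queued runs may differ, but the split is the same idea. Job names and commands are placeholders.

```yaml
# .github/workflows/two-step-ci.yml (sketch; commands are placeholders)
name: two-step-ci

on:
  pull_request:   # step 1: cheap validation on every push to the PR
  merge_group:    # step 2: the expensive, batched run for queued changes

jobs:
  fast-checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make lint unit-tests       # minutes, not hours

  full-suite:
    # Reserve the long Bazel / E2E / hardware run for queued changes only.
    if: github.event_name == 'merge_group'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: bazel test //...           # the 1-3 hour suite described above
```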

CI Insights: Seeing What the Queue Hides

One thing most merge queues ignore entirely: what happens inside CI. If jobs are getting slower, flakier, or retrying endlessly, the queue just feels "slow". Mergify’s CI Insights layer exposes:

  • job duration trends

  • flaky behavior

  • skipped vs executed jobs

  • wasted compute over time

This turns merge delays from a mystery into a diagnosable system. You can’t optimize what you can’t see.

The Pattern We See Everywhere

Across monorepos, Bazel stacks, hardware teams, and fast-moving SaaS companies, the pattern repeats:

  • GitHub's merge queue works, until it doesn't

  • CI becomes the real bottleneck

  • Teams rebuild scheduling, batching, and control outside GitHub

  • Merge queues evolve into merge orchestration

Mergify exists at that layer. Not because GitHub's solution is "bad", but because once CI is expensive, simplicity alone is no longer enough.

Closing Thought

Merge queues were the first step toward safer collaboration.

The next step is treating CI as a shared, constrained system that must be orchestrated deliberately, especially in monorepos and Bazel-based environments.

That's the problem we've been building for over the last five years. And it’s the one that more teams are discovering every month.



Curious where your CI is slowing you down?

Try CI Insights — observability for CI teams.
