Test Insights

Flaky tests block your whole team.
Quarantine them.

Test Insights detects flaky tests at PR time and auto-quarantines the unreliable ones. Track test health across your default branch. Catch regressions before they merge.
Your CI stays green. Your team stays focused.

Test Insights dashboard showing test health scores and flaky test detection

Trusted by the best engineering teams

From fast-moving startups to well-known enterprises

Cerebras
Apex Fintech Solutions
Luminar
Back Market
Jane
Botify
AheadComputing
PayFit
Productboard
Zama

One flaky test. A full day wasted.

Flaky tests don't just fail. They cascade. Here's what a single unreliable test does to your team every day.

1 flaky test
Merge queue blocked
8 PRs stuck waiting
CI re-runs start
3 engineers stop to investigate
45 min wasted per engineer
"Just re-run it" becomes the norm
Repeats every single day

Most teams have 10, 20, or 40 of these tests. Nobody tracks them. Nobody owns them. The cost compounds silently until CI results mean nothing.

The fix

Three pillars of test reliability

Test Insights gives your team a complete system for managing flaky tests. Prevent them from landing. Detect them when they appear. Quarantine them until they are fixed.

Other tools detect flaky tests. We're the only ones that also prevent them at PR time, quarantine them automatically, and track their health over time.

Pillar 1

Prevention

Catch flaky tests at PR time, before they merge to main. Test Insights runs your tests multiple times on every pull request and flags the ones producing inconsistent results. The author gets a PR comment telling them exactly which test is unreliable and since when.

No more merging a "green" PR only to discover the test was flaky all along.
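
To make the rerun idea concrete, here is a minimal sketch of how repeated runs can expose an inconsistent test. It is an illustration only, not how Test Insights is implemented: the names `detect_flaky` and `make_intermittent_test` and the default of five runs are assumptions made for this example.

```python
from typing import Callable


def detect_flaky(test: Callable[[], None], runs: int = 5) -> str:
    """Run `test` several times and classify it by how consistent the outcomes are.

    Hypothetical helper for illustration; the run count is an assumed default.
    """
    outcomes = []
    for _ in range(runs):
        try:
            test()
            outcomes.append(True)
        except AssertionError:
            outcomes.append(False)

    if all(outcomes):
        return "pass"
    if not any(outcomes):
        return "fail"
    return "flaky"  # same code, different results across runs


def make_intermittent_test(fail_on: set[int]) -> Callable[[], None]:
    """Build a deterministic stand-in for a flaky test that fails on the given call numbers."""
    calls = {"n": 0}

    def test() -> None:
        calls["n"] += 1
        assert calls["n"] not in fail_on, "simulated intermittent failure"

    return test


if __name__ == "__main__":
    print(detect_flaky(make_intermittent_test({2, 4})))
    print(detect_flaky(make_intermittent_test(set())))
```

A single run of the first test would have looked green. Only the repeated runs reveal that it passes on some attempts and fails on others.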

Test Insights Prevention view showing caught flaky tests per PR
Test detail page showing confidence score and failure history for a quarantined test

Pillar 2

Detection

Every test in your suite gets a confidence score based on its pass/fail history on the default branch. Health statuses (Healthy, Flaky, Broken) make it obvious which tests need attention and which are safe to trust.

No more guessing whether a failure is real or just noise.
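
One way to picture a score derived from pass/fail history is sketched below. The scoring rule (the plain pass rate) and both thresholds are assumptions chosen for this example; this page does not state the formula Test Insights uses.

```python
from typing import Sequence


def confidence_score(history: Sequence[bool]) -> float:
    """Return the fraction of passing runs (an assumed scoring rule)."""
    if not history:
        raise ValueError("history must contain at least one run")
    return sum(history) / len(history)


def health_status(
    history: Sequence[bool],
    healthy_at: float = 0.98,  # assumed threshold, not a product default
    broken_at: float = 0.05,   # assumed threshold, not a product default
) -> str:
    """Map a pass/fail history to one of the three health statuses."""
    score = confidence_score(history)
    if score >= healthy_at:
        return "Healthy"
    if score <= broken_at:
        return "Broken"
    return "Flaky"


if __name__ == "__main__":
    print(health_status([True] * 50))
    print(health_status([True] * 45 + [False] * 5))
    print(health_status([False] * 50))
```

The distinction that matters is between a test that always fails (Broken, a real problem to fix) and one that fails only some of the time (Flaky, a candidate for quarantine).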

Pillar 3

Quarantine

When a test is confirmed flaky, Test Insights quarantines it. The test still runs in CI, but its result no longer blocks merges or marks the pipeline as failed. When the test stabilizes, quarantine lifts automatically.

No manual triage. The system handles it.

test_user_login        Healthy
test_checkout_flow     Healthy
test_payment_webhook   Quarantined
test_api_rate_limit    Healthy
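
The effect of quarantine on a pipeline verdict can be sketched in a few lines. This is a simplified model under stated assumptions: the function name `pipeline_passes` and the data shapes are hypothetical, and the real system decides quarantine status on its own.

```python
def pipeline_passes(results: dict[str, bool], quarantined: set[str]) -> bool:
    """Return True when every non-quarantined test passed.

    Quarantined tests still appear in `results` (they ran), but their
    failures are not counted as blocking.
    """
    blocking_failures = [
        name
        for name, passed in results.items()
        if not passed and name not in quarantined
    ]
    return not blocking_failures


if __name__ == "__main__":
    results = {
        "test_user_login": True,
        "test_checkout_flow": True,
        "test_payment_webhook": False,  # failed on this run
        "test_api_rate_limit": True,
    }
    print(pipeline_passes(results, quarantined={"test_payment_webhook"}))
    print(pipeline_passes(results, quarantined=set()))
```

With `test_payment_webhook` quarantined, its failure is recorded but the pipeline still passes. Without quarantine, the same run blocks the merge.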

Without these three pillars, flaky tests win

Most teams underestimate how much flaky tests cost them until the problem is measurable.

Developer trust breaks down

When tests fail randomly, engineers stop trusting CI. They re-run, skip, or ignore results. Quality slips.

Merge queues grind to a halt

One flaky test blocks a batch of PRs. The whole queue retries. Multiply that across a day and you lose hours of engineering time.

Debugging is a time sink

Is the test broken or flaky? Engineers waste hours answering a question the tool should answer for them.

The backlog grows silently

Without tracking, flaky tests accumulate. By the time it is painful, there are dozens to triage and no data to prioritize them.

We had 40 known-flaky tests and no system to deal with them. Engineers just re-ran the suite and hoped. Test Insights quarantined the worst offenders on day one, and within a month we actually fixed most of them because we could finally see which ones mattered.

Lucas Meier

Staff Software Engineer at Norde

Want to see Test Insights on your test suite?

Works with any test framework. Native plugins for zero-config setup, or upload JUnit XML from any runner.

Book a discovery call

Works with any CI platform

Test Insights works at the test framework level, not the CI platform level. GitHub Actions, Jenkins, CircleCI, Buildkite, or anything else. If your tests produce results, Test Insights can read them.

Works with any test framework

Native plugins are available for select frameworks, with more shipping regularly. Any other framework that outputs JUnit XML is also supported.

Jest
pytest
Cypress
RSpec
Go
Rust
JUnit
PHPUnit
Playwright
Vitest
NUnit
MSTest
minitest
Pest
TestNG

Don't see yours? Tell us and we'll probably build it.
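
For runners without a native plugin, the JUnit XML report is the common denominator. The sketch below shows the kind of information such a report carries, using only Python's standard library. The sample report and the helper name `read_outcomes` are made up for illustration, and the sketch covers only the widely used `testcase`, `failure`, `error`, and `skipped` elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical report in the common JUnit XML shape.
SAMPLE_REPORT = """<testsuite name="checkout" tests="3" failures="1" errors="0" skipped="1">
  <testcase classname="tests.test_checkout" name="test_checkout_flow" time="0.42"/>
  <testcase classname="tests.test_checkout" name="test_payment_webhook" time="1.30">
    <failure message="timeout waiting for webhook"/>
  </testcase>
  <testcase classname="tests.test_checkout" name="test_refund" time="0.00">
    <skipped/>
  </testcase>
</testsuite>
"""


def read_outcomes(xml_text: str) -> dict[str, str]:
    """Map each test case to "passed", "failed", or "skipped"."""
    root = ET.fromstring(xml_text)
    outcomes = {}
    # iter() finds test cases whether the root is <testsuite> or <testsuites>.
    for case in root.iter("testcase"):
        name = f'{case.get("classname", "")}::{case.get("name", "")}'
        if case.find("failure") is not None or case.find("error") is not None:
            outcomes[name] = "failed"
        elif case.find("skipped") is not None:
            outcomes[name] = "skipped"
        else:
            outcomes[name] = "passed"
    return outcomes


if __name__ == "__main__":
    for test_name, outcome in read_outcomes(SAMPLE_REPORT).items():
        print(f"{outcome:8} {test_name}")
```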

Bring your own agent

Your AI agent doesn't know which tests are flaky. Now it can.

AI coding tools write code and debug failures, but they waste time investigating failures in tests already known to be flaky. The Mergify CLI gives any agent access to your test health data. No integration needed. Just a CLI call.

Terminal

$ mergify ci flaky-tests --repo myorg/myapp

Found 7 quarantined tests in myorg/myapp:
  test_payment_webhook          flaky    quarantined 3d ago
  test_email_delivery           flaky    quarantined 1w ago
  test_rate_limiter_edge_case   flaky    quarantined 2d ago

Your AI agent can now skip these when debugging CI failures.

One CLI call. Any agent.

Cursor, Claude Code, GitHub Copilot, Windsurf, Codex, or your own custom agent. If it can run a shell command, it can query your test health data.

Stop debugging noise

When CI fails, your agent checks which tests are quarantined and skips them. It only investigates failures that are actually new.

No plugins, no setup

The CLI outputs structured data any tool can parse. Add it to your agent's instructions and it just works.
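
As a rough sketch of what an agent-side helper might do with that data, the example below filters a list of failing tests against the quarantined set. It assumes the CLI output resembles the terminal sample shown on this page; the exact output format is not specified here, and the helper names `parse_quarantined` and `new_failures` and the test `test_user_signup` are hypothetical.

```python
# Output copied from the terminal sample on this page (assumed format).
SAMPLE_OUTPUT = """\
Found 7 quarantined tests in myorg/myapp:
  test_payment_webhook         flaky    quarantined 3d ago
  test_email_delivery          flaky    quarantined 1w ago
  test_rate_limiter_edge_case   flaky    quarantined 2d ago
"""


def parse_quarantined(output: str) -> set[str]:
    """Extract test names from rows shaped like: <name> <status> quarantined <age> ago."""
    names = set()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[2] == "quarantined" and fields[4] == "ago":
            names.add(fields[0])
    return names


def new_failures(failed: list[str], quarantined: set[str]) -> list[str]:
    """Keep only the failures that are not already quarantined."""
    return [name for name in failed if name not in quarantined]


if __name__ == "__main__":
    quarantined = parse_quarantined(SAMPLE_OUTPUT)
    failed_in_ci = ["test_payment_webhook", "test_user_signup"]
    print(new_failures(failed_in_ci, quarantined))
```

Here the agent would ignore `test_payment_webhook` and spend its time on `test_user_signup`, the only failure that is not already known noise.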

Integrating Mergify transformed our development process. It gives us full control over merges and schedules. It streamlined our workflow, helped catch issues early, and improved team efficiency and software reliability.

Sean Davis

Senior CI/CD Engineer at Ava Solutions

Stop letting flaky tests slow your team down

Purpose-built for teams who take delivery speed and reliability seriously.