Flaky tests in Cypress.
Named, fixed, and quarantined.

Flaky Cypress suites are not random. They follow patterns: cy.wait races, intercept-too-late, detached DOM, broken retry-ability, session leakage. Name them, fix them, quarantine what is left.
Your CI stays green.

By Rémy Duthu, Software Engineer, CI Insights · Published

mergify[bot] commented · 2 minutes ago
Flaky test detected
checkout flow › settles the pending promise
src/checkout.test.ts:42
Last 3 runs on this commit: ✕ Failed, ✓ Passed, ✓ Passed
Confidence on main: 98% 71% over the last 7 days
Auto-quarantined by Test Insights
This test no longer blocks your merge. Quarantine lifts when stable.
Example PR comment from the Mergify bot detecting a flaky Cypress test and quarantining it automatically.

Why Cypress is uniquely flaky

Cypress's command queue is the source of its expressiveness and the source of its flakes. Every cy.something() call is a deferred command, not a synchronous action: Cypress builds a chain, then runs it, automatically retrying queries until the attached assertion passes or the timeout hits. The model is brilliant when you stay inside the chain. The moment you drop into a native .then() with synchronous logic, retry-ability vanishes and your test becomes timing-sensitive.
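A toy model of that retry loop, in plain TypeScript, may make the difference concrete. It is not Cypress's implementation: retryUntil, runOnce, makeSlowCounter, and expectText are hypothetical names, and an attempt budget stands in for the timeout.

```typescript
// Toy model of retry-ability. Not Cypress's real implementation.
type Query<T> = () => T;

// Re-run the query and the assertion together until the assertion passes
// or the attempt budget (a stand-in for the timeout) runs out.
function retryUntil<T>(
  query: Query<T>,
  assertion: (value: T) => void,
  maxAttempts: number,
): T {
  let lastError: unknown = new Error("no attempts were made");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const value = query(); // fresh read on every attempt
    try {
      assertion(value);
      return value;
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// One-shot version: read once, assert once, like logic inside .then().
function runOnce<T>(query: Query<T>, assertion: (value: T) => void): T {
  const value = query();
  assertion(value);
  return value;
}

// A fake page whose counter reads "0" twice, then "42" from the third read on.
function makeSlowCounter(): Query<string> {
  let reads = 0;
  return () => (++reads <= 2 ? "0" : "42");
}

// Builds an assertion that throws unless the value equals `expected`.
function expectText(expected: string): (actual: string) => void {
  return (actual) => {
    if (actual !== expected) {
      throw new Error(`expected "${expected}", got "${actual}"`);
    }
  };
}
```

With this model, retryUntil(makeSlowCounter(), expectText("42"), 10) succeeds on the third read, while runOnce(makeSlowCounter(), expectText("42")) throws because it only ever sees the first, stale value.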

Layer on the browser context: cookies, local storage, service workers, and an in-page network layer that cy.intercept only sees if the route is registered before the request fires. CSS animations make the click target move under your cursor. cy.origin guards keep tests from accidentally crossing domains, but only if you remember to declare them.

The patterns are finite. We've seen the same eight on Mergify Test Insights across hundreds of Cypress suites: cy.wait(ms) racing the page, cy.intercept registered after the request fires, detached DOM errors after re-render, synchronous .then() breaking command-chain retry-ability, session leakage without cy.session(), CSS animations stealing clicks, server state contamination from previous specs, and Cypress retries config hiding real bugs. Each has a clean fix once you can name it.

The 8 patterns behind most flaky suites

Pattern 1

cy.wait(ms) racing the page

Symptom. A test passes locally and fails on the slower CI runner, with `.click()` failing because the target was not in the DOM yet.

Root cause. cy.wait(ms) is a hardcoded sleep. It does nothing useful: Cypress already retries assertions and queries until the page settles or the timeout hits. A magic number works when the local machine renders in 200ms and breaks when CI takes 1100ms. Worse, it adds dead time to every passing run, slowing the suite down for the same reason it never actually waited for the right thing.

it("opens the modal", () => {
  cy.visit("/dashboard");
  cy.get("button.open-modal").click();
  cy.wait(500); // hope the modal is open by now
  cy.get("input[name=email]").type("user@example.com"); // fails if it isn't
});

Fix. Wait for what you actually need: an alias from cy.intercept, an assertion that an element exists, or a state change. Cypress retries the assertion on its own, with the configured timeout, so the test is fast when fast and patient when slow.

it("opens the modal", () => {
  cy.visit("/dashboard");
  cy.get("button.open-modal").click();
  // Cypress polls until the modal is in the DOM, with default timeout
  cy.get("[role=dialog]").should("be.visible");
  cy.get("input[name=email]").type("user@example.com");
});
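The "configured timeout" is Cypress's defaultCommandTimeout setting. A minimal sketch of raising it for a slow CI pool; the 8000 ms value is an arbitrary example, not a recommendation from the article.

```typescript
// cypress.config.ts
import { defineConfig } from "cypress";

export default defineConfig({
  e2e: {
    // How long queries and assertions keep retrying before they fail.
    // Cypress's default is 4000 ms; 8000 is an arbitrary example value.
    defaultCommandTimeout: 8000,
  },
});
```

For a single slow element, a per-command override such as cy.get("[role=dialog]", { timeout: 10000 }) keeps the rest of the suite on the default. Either way the test moves on as soon as the assertion passes, so a longer timeout costs nothing on fast runs.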

With Mergify. Test Insights notices that the same spec only fails on the slower CI runner pool and never on the laptop pool. The dashboard tags the resource sensitivity so the timing assumption is the obvious place to look.

Pattern 2

cy.intercept registered after the request fires

Symptom. A test that asserts on a stubbed network response fails with `cy.wait('@alias') timed out`, but only intermittently and never on rerun.

Root cause. cy.intercept only catches requests that fire after the interceptor is registered. Calling cy.visit() first and cy.intercept after means a fast browser dispatches the XHR before Cypress installs the route. The request goes through to the real server, the alias never resolves, and the test waits until the timeout.

it("shows the user", () => {
  cy.visit("/profile"); // fires GET /api/me immediately
  cy.intercept("GET", "/api/me", { id: "u-1", name: "Rémy" }).as("me");
  cy.wait("@me"); // times out: the request already happened
});

Fix. Always register the interceptor before the action that triggers the request. The standard pattern is intercept first, then visit or whatever event causes the fetch.

it("shows the user", () => {
  cy.intercept("GET", "/api/me", { id: "u-1", name: "Rémy" }).as("me");
  cy.visit("/profile");
  cy.wait("@me"); // resolves cleanly
});

With Mergify. Test Insights records the failure URL and surfaces specs whose only failure mode is `cy.wait timeout`. The dashboard groups them so the registration-order pattern is easy to spot across the suite.

Pattern 3

Detached DOM errors after re-render

Symptom. A spec fails with `CypressError: Element is detached from the DOM` on a click that worked the previous five runs.

Root cause. Cypress queries the DOM, finds an element, and queues an action against it. If the page re-renders between the query and the action (a React state change, an SSR hydration finishing, a v-if flipping), the queried node is still in memory but no longer attached to the document. Cypress refuses to act on a detached node.

it("submits the form", () => {
  cy.get("button.submit").then(($btn) => {
    // .then() runs once and hands over the node found at query time
    cy.wrap($btn).click();
    // if the form re-rendered in between, cy.wrap() cannot requery:
    // the stale node is no longer in the document
    // → "Element is detached from the DOM"
  });
});

Fix. Stay inside the Cypress command chain so each command requeries on retry. Avoid grabbing a jQuery handle and acting on it across an async gap. When the chain itself crosses a re-render, scope the query to a stable parent so the requery picks the new child.

it("submits the form", () => {
  cy.get("form#checkout")
    .find("button.submit")
    .click(); // requeries 'button.submit' inside the form on retry
});

With Mergify. Test Insights flags specs that fail only with detached-DOM errors as re-render races, distinct from logic failures. The dashboard groups them by the component name in the query selector so the offending re-render is easy to find.

Pattern 4

Synchronous .then() breaking retry-ability

Symptom. A spec works against a mock and fails against a real backend, with the assertion inside a `.then()` reading stale data.

Root cause. Cypress retries queries together with the assertions chained onto them, but .then() is not retried: its callback runs once, with whatever the previous command yielded at that moment. Asserting inside the callback freezes the value at that first execution. If the data was not ready yet, the expect() throws, the test fails on the spot, and Cypress never goes back to requery the element.

it("loads the user count", () => {
  cy.get("[data-test=count]")
    .then(($el) => {
      // $el captured once. If the count was '0' on first paint, that's what
      // we assert against, even though the real value lands 100ms later.
      expect($el.text()).to.eq("42");
    });
});

Fix. Move the assertion into a chained .should() so Cypress retries the whole query+assertion until it passes or times out. Reach for .then() only when you need to capture a value for use outside Cypress.

it("loads the user count", () => {
  cy.get("[data-test=count]").should("have.text", "42");
});
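When the check is more than a string match, .should() also accepts a callback, which Cypress re-runs against a fresh query result until it stops throwing. A minimal sketch, assuming the counter renders an integer; expectPositiveCount is a hypothetical helper, typed against the one jQuery method it uses so it can be exercised outside Cypress.

```typescript
// Hypothetical assertion helper for .should(): throws while the counter
// is empty, non-numeric, or not yet positive, and returns quietly otherwise.
function expectPositiveCount($el: { text(): string }): void {
  const raw = $el.text().trim();
  const count = Number(raw);
  if (raw === "" || !Number.isInteger(count) || count <= 0) {
    throw new Error(`expected a positive integer count, got "${raw}"`);
  }
}

// In a spec (requires the Cypress runner):
//   cy.get("[data-test=count]").should(expectPositiveCount);
```

Keep the callback free of side effects and of other cy commands, since it may run many times before it passes.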

With Mergify. Test Insights detects the mock-vs-real-backend signature: the same spec is reliable when the network is stubbed and unreliable when it hits a slower service. The dashboard surfaces the dependency so the retry-ability mistake is easy to locate.

Pattern 5

Session leakage without cy.session()

Symptom. A test fails on cold runs and passes on rerun, or a logout spec leaves the suite in a state the next spec did not expect because both share a hand-rolled login command.

Root cause. Cypress clears cookies, local storage, and session storage for the active origin between tests by default. A hand-rolled login command that calls cy.request and writes a token into localStorage through cy.window() therefore reruns the full login on every test, which is slow on large suites and brittle when login itself is racy. cy.session() exists to cache validated login state per key across tests; skipping it forces every test to redo the work and amplifies any flake in the auth path.

Cypress.Commands.add("login", (user) => {
  cy.request("POST", "/api/login", user).then((res) => {
    cy.visit("/");
    cy.window().then((win) => {
      win.localStorage.setItem("token", res.body.token);
    });
  });
});

beforeEach(() => {
  cy.login({ email: "user@example.com", password: "..." });
  // Login fires for every test. When the auth service is intermittently slow,
  // any spec can fail; rerun usually wins the race and the bug stays hidden.
});

Fix. Wrap the login work in cy.session(key, setup, options). Cypress runs the setup once per key, snapshots cookies + storage, and rehydrates them on every later test that requests the same key. Pass a validate callback so a stale cached session is rebuilt instead of silently reused.

Cypress.Commands.add("login", (user) => {
  cy.session(
    user.email,
    () => {
      cy.request("POST", "/api/login", user).then((res) => {
        cy.visit("/");
        cy.window().then((win) => {
          win.localStorage.setItem("token", res.body.token);
        });
      });
    },
    {
      validate: () => {
        cy.request("/api/me").its("status").should("eq", 200);
      },
    },
  );
});

With Mergify. Test Insights catches the order-dependent signature: a spec only fails after a specific other spec has run. The dashboard groups failures by the predecessor spec so the missing cy.session is easy to identify.

Pattern 6

CSS animations stealing clicks

Symptom. A `.click()` on a button fails with `actionability` errors mentioning `animation` or `position` changing between actionability checks.

Root cause. Cypress measures the element's position before clicking. If the element is mid-animation (a slide-in, a fade, a CSS transition triggered by the previous click), Cypress sees the position change between samples and refuses to act, fearing it would click the wrong target. Locally the animation finishes faster than Cypress measures; on CI the throttled CPU stretches the animation past the actionability threshold.

it("dismisses the toast", () => {
  cy.get("button.show-toast").click(); // toast slides in over 300ms
  cy.get(".toast button.dismiss").click();
  // CypressError: '.toast button.dismiss' is animating; aborting click
});

Fix. Disable animations in the test environment via a global CSS override or a config flag. For a one-off, force the click with the { force: true } option, but only after you have checked the animation is the actual cause and not a real layout bug.

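// cypress/support/e2e.ts
// Inject the override on every page load. A style appended in beforeEach,
// before cy.visit(), is discarded when the new document replaces the old one.
Cypress.on("window:load", (win) => {
  const style = win.document.createElement("style");
  style.textContent =
    "*, *::before, *::after { transition: none !important; animation: none !important; }";
  win.document.head.appendChild(style);
});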
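The config-flag route could look like the sketch below. waitForAnimations is a standard Cypress option; turning it off for the whole suite is a judgment call, since it also removes the check that would catch a genuinely moving target.

```typescript
// cypress.config.ts
import { defineConfig } from "cypress";

export default defineConfig({
  e2e: {
    // Skip the "is this element still moving?" check before actions such as click.
    waitForAnimations: false,
  },
});
```

A gentler alternative is to leave the check on and raise animationDistanceThreshold (5 pixels by default), so small movements no longer count as animating.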

With Mergify. Test Insights tags actionability-related failures distinctly from assertion failures. The dashboard groups specs whose failures all mention 'animating' so the global config fix is one commit instead of 30.

Pattern 7

Server state contamination from previous specs

Symptom. A spec asserts on a clean dashboard and fails because rows from the previous spec's seed data are still in the test database.

Root cause. Cypress is a browser tool: it does not own the backend. A spec that sends POST /api/users leaves the user in the database for whichever spec runs next. If the suite assumes a clean state at the top of every spec without resetting the backend, the order in which specs run silently changes which assertions pass.

// spec-a.cy.ts
it("creates a user", () => {
  cy.request("POST", "/api/users", { name: "Rémy" });
  cy.visit("/users");
  cy.get("table tr").should("have.length", 1);
});

// spec-b.cy.ts (runs after spec-a in alphabetical order)
it("starts with no users", () => {
  cy.visit("/users");
  cy.get("table tr").should("have.length", 0); // fails: 1 row from spec-a
});

Fix. Reset the backend before each test via a test-only endpoint or a database task. cy.task() runs Node code in the Cypress Node process, outside the browser, which is the right place for "truncate the test database" hooks.

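// cypress.config.ts
import { defineConfig } from "cypress";
import { truncateAllTables } from "./test-db"; // your own helper; placeholder path

export default defineConfig({
  e2e: {
    setupNodeEvents(on) {
      on("task", {
        async resetDb() {
          await truncateAllTables();
          return null; // tasks must resolve to a value or null, never undefined
        },
      });
    },
  },
});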

// cypress/support/e2e.ts
beforeEach(() => cy.task("resetDb"));

With Mergify. Test Insights groups failures whose only signature is row-count or unique-constraint violations. The dashboard surfaces them as state-leakage candidates so the missing reset hook is the obvious fix.

Pattern 8

Cypress retries config hiding real bugs

Symptom. Your suite is green. A user reports a bug your specs were supposed to catch.

Root cause. retries: { runMode: 3 } in cypress.config.ts reruns each failing test up to three more times during cypress run and marks it as passed as soon as one attempt passes. A real race that loses on attempt 1 and wins on attempt 2 gets reported as green. The bug is still there. The pipeline has decided not to look at it.

// cypress.config.ts (please don't)
import { defineConfig } from "cypress";

export default defineConfig({
  retries: { runMode: 3, openMode: 0 },
});
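The masking can be modeled in a few lines of plain TypeScript. This is a toy simulation, not Cypress internals: runWithRetries and the Report shape are hypothetical, and retries means extra attempts after the first, as with runMode.

```typescript
type Outcome = "pass" | "fail";

interface Report {
  status: Outcome; // what a pass/fail-only pipeline sees
  attemptsUsed: number; // what attempt-level tracking keeps
  flaky: boolean; // true when a failure was followed by a pass
}

// outcomes[i] is the result the test would produce on attempt i + 1.
// Stops at the first passing attempt or when the budget is spent.
function runWithRetries(outcomes: Outcome[], retries: number): Report {
  const budget = Math.min(outcomes.length, retries + 1);
  for (let i = 0; i < budget; i++) {
    if (outcomes[i] === "pass") {
      return { status: "pass", attemptsUsed: i + 1, flaky: i > 0 };
    }
  }
  return { status: "fail", attemptsUsed: budget, flaky: false };
}
```

For a test that fails once and then passes, runWithRetries(["fail", "pass"], 3) reports status "pass", while the same test with retries set to 0 reports "fail". A pipeline that only reads status cannot tell the first case from a healthy test; the attemptsUsed and flaky fields are the part that gets lost.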

Fix. Do not retry at the framework level. When a spec is genuinely flaky, fix it. When the fix takes longer than a session, quarantine it instead. That keeps the signal visible without blocking the merge queue.

With Mergify. Test Insights reruns at the CI level with attempt-level result tracking. You see that a spec passed on attempt 2 of 3, which is exactly the information Cypress retries throws away. Quarantine kicks in once the pattern is clear.

Detection

Catch every Cypress flake in CI

Add the cypress-junit-reporter package, point Cypress at it on every CI run, and upload the resulting XML to Mergify with a one-line CLI call. Test Insights builds a confidence score for every spec on your default branch. PR runs are compared against that baseline. Anything inconsistent gets flagged in a PR comment before the author merges.

# 1. Add the JUnit reporter
npm install --save-dev cypress-junit-reporter

# 2. Emit JUnit XML on every CI run
cypress run --reporter cypress-junit-reporter \
  --reporter-options "mochaFile=junit.xml,toConsole=false"

# 3. Upload the result (once, in CI)
curl -sSL https://get.mergify.com/ci | sh
mergify ci junit upload junit.xml
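If you would rather not repeat the reporter flags in every CI script, the same settings can live in cypress.config.ts. A sketch, assuming the reporter package reads the same options from reporterOptions that it reads from --reporter-options:

```typescript
// cypress.config.ts
import { defineConfig } from "cypress";

export default defineConfig({
  // Same reporter and options as the CLI flags above.
  reporter: "cypress-junit-reporter",
  reporterOptions: {
    mochaFile: "junit.xml",
    toConsole: false,
  },
});
```

If a run covers several spec files, check that each spec does not overwrite junit.xml. Cypress's documentation for its built-in junit reporter suggests a [hash] placeholder in mochaFile for that case; whether this package honors it is worth verifying before relying on it.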

Prevention

Block flaky Cypress tests at PR time

On every PR, Mergify reruns the tests whose confidence is below threshold, without relying on Cypress's retries setting or touching your config. The PR gets a comment naming the unreliable tests, their confidence history, and whether the failure on this PR is new or historical noise. Authors fix the real bugs before merge instead of re-running CI until it passes.

Mergify Test Insights Prevention view showing caught flaky Cypress tests per PR

Quarantine

Quarantine without skipping

Once a Cypress test is confirmed flaky, Test Insights quarantines it. The test still runs in the suite, no `it.skip()` rewrite required, but its result no longer blocks merges or marks the pipeline red. When the pass rate on main recovers, quarantine lifts automatically and the test goes back to being load-bearing.

renders the invoice line: Healthy
login dispatches the right action: Healthy
checkout flow settles the pending promise: Quarantined
rate limiter rejects after 3 requests: Healthy

Want to see which Cypress specs in your repo are already flaky?

Works with cypress-junit-reporter or any JUnit-compatible Cypress reporter. Setup takes under five minutes.

Book a discovery call

Frequently asked questions

Why are my Cypress tests flaky in CI but pass locally?
Your laptop and your CI runner differ in CPU, network latency, and animation throttling. Specs that depend on `cy.wait(ms)` magic numbers, register `cy.intercept` after the request fires, or click through CSS animations lose the race under CI's tighter resource budget. Reproduce locally with `cypress run --browser chrome --headless` (CI's actual mode), or throttle your CPU in DevTools to mirror the CI runner, then fix the underlying timing assumption before pushing.
How do I detect flaky Cypress tests?
Cypress alone cannot tell flaky from broken since each run gives one data point per test. You need to run the same commit multiple times and compare results. Mergify Test Insights does that on every PR and on the default branch, scores each test, and surfaces the tests whose pass rate drops below a confidence threshold.
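The "same commit, multiple runs" idea can be sketched as follows. This is an illustration only, not how Test Insights scores tests (the article does not describe that); Run, Verdict, and classify are hypothetical names.

```typescript
interface Run {
  commit: string;
  test: string;
  passed: boolean;
}

type Verdict = "stable" | "broken" | "flaky";

// Group runs by (commit, test) and compare outcomes on identical code:
// mixed results mean flaky, only failures mean broken, only passes mean stable.
function classify(runs: Run[]): Map<string, Verdict> {
  const seen = new Map<string, { pass: boolean; fail: boolean }>();
  for (const run of runs) {
    const key = `${run.commit}::${run.test}`;
    const entry = seen.get(key) ?? { pass: false, fail: false };
    if (run.passed) {
      entry.pass = true;
    } else {
      entry.fail = true;
    }
    seen.set(key, entry);
  }
  const verdicts = new Map<string, Verdict>();
  seen.forEach((entry, key) => {
    verdicts.set(key, entry.pass && entry.fail ? "flaky" : entry.fail ? "broken" : "stable");
  });
  return verdicts;
}
```

A single run per commit can only ever produce "stable" or "broken" here, which is the point of the answer above: the "flaky" verdict needs at least two runs of the same code.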
Does Cypress retries config fix flaky tests?
No, it hides them. A test that fails on attempt 1 and passes on attempt 2 is still broken; you have only decided not to look at the failure. Use Cypress retries config as a temporary bandage for a test you are actively fixing, never as a permanent policy. For visibility without blocking the merge queue, quarantine instead of retry.
What causes flaky tests in Cypress?
Eight patterns cover most of what we see: cy.wait(ms) racing the page, cy.intercept registered after the request fires, detached DOM errors after re-render, synchronous .then() breaking command-chain retry-ability, session leakage without cy.session(), CSS animations stealing clicks, server state contamination from previous specs, and Cypress retries config hiding real bugs. Each is covered above with a minimal reproducer.
How do I quarantine a flaky Cypress test without deleting it?
Mergify Test Insights quarantines the test automatically once its confidence score drops. The test still runs in the suite, but a failing result no longer blocks merges and its noise no longer drowns out real signal. When the test stabilizes on main, quarantine lifts automatically. No `it.skip()`, no commented-out tests, no orphaned files.
Why does my Cypress test pass with cy.wait but fail without it?
Because `cy.wait(ms)` masked a timing assumption that should have been made explicit. The fix is not to keep the wait but to wait for the actual signal: an alias from `cy.intercept`, a `.should()` assertion that an element exists, or a state change observable from the DOM. Cypress retries those automatically with the configured timeout, so the spec is fast when the page is fast and patient when CI is slow.

Ship your Cypress suite green.

2k+ organizations use Mergify to merge 75k+ pull requests a month without breaking main.