Playwright storageState is not just a setup file. It is a contract.
Why a single test that re-saves the auth state file poisons every later test that uses it, the per-test path pattern that prevents the leak, and when cy.session-style validation is the right answer.
A Playwright suite uses an auth setup project: one test logs in, saves browser storage to auth.json, every other test loads auth.json and runs as the logged-in user. The pattern works for months. Then a logout test gets added, and the next CI run fails on twelve unrelated tests with redirected to /login. Roll back the logout test, the suite is green again. The logout test is not broken — it followed the same pattern as setup, just in reverse. That is the bug.
We see this pattern often enough on Mergify Test Insights that it earned its own slot in our flaky Playwright catalog. The cause is treating the shared auth file as mutable. The fix is per-test snapshot paths plus a contract about who is allowed to write the shared one.
What you see
The setup project saves auth state once:
// auth.setup.ts
import { test as setup } from "@playwright/test";

setup("authenticate", async ({ page, context }) => {
  await page.goto("/login");
  await page.getByLabel("Email").fill("user@example.com");
  await page.getByLabel("Password").fill("secret");
  await page.getByRole("button", { name: "Sign in" }).click();
  // Persist cookies and localStorage for every other project to load.
  await context.storageState({ path: "auth.json" });
});
Other tests load it:
// account.spec.ts
import { test, expect } from "@playwright/test";

test.use({ storageState: "auth.json" });

test("user can sign out", async ({ page, context }) => {
  await page.goto("/account");
  await page.getByRole("button", { name: "Sign out" }).click();
  await expect(page).toHaveURL("/login");
  await context.storageState({ path: "auth.json" }); // BUG: overwrites the shared file
});
The logout test mirrors the setup pattern: save the state at the end. The author’s intent was to checkpoint the state for later inspection. The actual effect: rewrite auth.json with the post-logout state (no auth cookies, no token in localStorage). The next test that loads auth.json finds an empty state, hits the login wall, fails.
In CI, the auth state is often cached between runs so the login flow does not repeat on every pipeline, and a poisoned auth.json survives with the cache. The failures then show up in the next pipeline run, not next to the logout test in this one. The triage path is: someone looks at the failing test, sees nothing wrong, and retries. The retry loads the same poisoned file and fails again. Eventually someone deletes the cached auth.json and the suite goes green for a few days, until the next test author writes the same pattern.
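Before deleting a cached auth.json, it helps to look at what is actually in it. The sketch below is a hypothetical triage helper, not part of Playwright: summarizeState and its types are names made up for this example, and the interfaces cover only the fields of the saved JSON that the helper reads. It lists cookie names and localStorage keys without printing their values.

```typescript
// Shape of the JSON that context.storageState() writes (only the fields used here).
interface StoredCookie {
  name: string;
  value: string;
}

interface StoredOrigin {
  origin: string;
  localStorage: { name: string; value: string }[];
}

interface StorageStateFile {
  cookies: StoredCookie[];
  origins: StoredOrigin[];
}

interface StateSummary {
  cookieNames: string[];
  localStorageKeys: string[]; // formatted as "<origin> <key>"
  looksLoggedOut: boolean;
}

// Hypothetical helper: describe a saved state without exposing secret values.
function summarizeState(state: StorageStateFile): StateSummary {
  const cookieNames: string[] = [];
  for (const cookie of state.cookies) {
    cookieNames.push(cookie.name);
  }

  const localStorageKeys: string[] = [];
  for (const entry of state.origins) {
    for (const item of entry.localStorage) {
      localStorageKeys.push(`${entry.origin} ${item.name}`);
    }
  }

  return {
    cookieNames,
    localStorageKeys,
    // Assumption: no cookies and no localStorage entries means a logged-out state.
    looksLoggedOut: cookieNames.length === 0 && localStorageKeys.length === 0,
  };
}
```

Fed with JSON.parse(readFileSync("auth.json", "utf8")), an empty summary points at a test that overwrote the file rather than at the tests that failed. A logged-out state can still carry unrelated cookies (analytics, consent), so treat looksLoggedOut as a hint and check the names too.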
Why storageState is mutable
Playwright’s context.storageState() reads the current cookies and localStorage of a browser context. With no arguments, it returns the state object. With a { path } option, it serializes the state to disk. There is no protection against overwriting the file. The pattern is the same whether you are saving a fresh login or capturing the post-logout state — Playwright cannot tell the intent apart.
The setup-project pattern relies on auth.json being written once and read many times. The contract is “setup writes, everyone else reads.” Nothing enforces it. Any test can rewrite the file, and the framework will not warn you.
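Since the framework does not enforce "setup writes, everyone else reads," one option is to route every save through a thin wrapper that does. The following is a minimal sketch under stated assumptions: saveState, isSharedAuthPath, and the allowShared flag are hypothetical names, and the shared file is assumed to always be called auth.json. The StateWriter interface is a structural stand-in for a Playwright browser context, so the sketch has no dependencies.

```typescript
// Assumption: the shared file is always named auth.json, as in the examples above.
const SHARED_AUTH_FILE = "auth.json";

// Minimal structural type: anything with a storageState({ path }) method fits.
interface StateWriter {
  storageState(options: { path: string }): Promise<unknown>;
}

// True when the final path segment is the shared file, for "/" or "\" separators.
function isSharedAuthPath(path: string): boolean {
  const segments = path.split(/[\\/]/);
  return segments[segments.length - 1] === SHARED_AUTH_FILE;
}

// Hypothetical wrapper: tests call this instead of context.storageState({ path }).
// Only the setup project is expected to pass allowShared: true.
async function saveState(
  context: StateWriter,
  path: string,
  options: { allowShared?: boolean } = {},
): Promise<void> {
  if (isSharedAuthPath(path) && !options.allowShared) {
    throw new Error(
      `Refusing to overwrite ${SHARED_AUTH_FILE}: only the setup project may write it`,
    );
  }
  await context.storageState({ path });
}
```

With this in place, the logout test's stray save would throw inside the logout test itself, which is where the failure belongs. The wrapper only helps if tests actually use it, so it complements a mechanical check rather than replacing one.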
The naive fix and why it is incomplete
// account.spec.ts
test.afterEach(async ({ context }) => {
  await context.clearCookies();
});
Clear cookies in the test that mutates them. This only touches the live browser context, which Playwright already discards at the end of each test. It leaves localStorage alone, and more importantly it does nothing to a poisoned auth.json already on disk, so the next test still inherits state it should not.
import { readFileSync } from "node:fs";
test.use({ storageState: JSON.parse(readFileSync("auth.json", "utf8")) });
Read the auth file synchronously and pass the parsed state object instead of a path. The test gets an in-memory copy of the state, and mutations stay in the test’s context. But loading from a path never wrote back to disk either: the only thing that rewrites auth.json is an explicit context.storageState({ path: "auth.json" }), and as soon as someone adds that call again the same bug returns. The fix protects this test, not the next one.
The fix that holds
Three rules. Together they make the leak hard to reintroduce.
Rule 1: Tests that need to capture state write to per-test paths.
test("user can sign out", async ({ page, context }) => {
  await page.goto("/account");
  await page.getByRole("button", { name: "Sign out" }).click();
  await expect(page).toHaveURL("/login");
  // per-test path under the test's output directory
  await context.storageState({
    path: test.info().outputPath("after-logout.json"),
  });
});
test.info().outputPath() returns a path inside this test’s own output directory, which Playwright keeps separate for each test. The state goes there, never to the shared file. Later tests do not load it, and parallel tests cannot collide on it.
Rule 2: Tests that mutate session state run in their own context.
test("user can sign out", async ({ browser }) => {
  const context = await browser.newContext({ storageState: "auth.json" });
  const page = await context.newPage();
  await page.goto("/account");
  await page.getByRole("button", { name: "Sign out" }).click();
  await context.close();
});
Build a fresh context from the shared auth.json. Mutate it. Throw it away. Nothing on disk changes. The next test gets a fresh context from the same untouched auth.json.
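The "mutate it, throw it away" step is easy to skip when a test fails halfway and never reaches context.close(). A minimal sketch of one way to guarantee the cleanup, assuming a hypothetical helper named withThrowawayContext: the two interfaces are structural stand-ins for Playwright's browser and context objects, so a real browser fixture should be usable in place of the stub types.

```typescript
// Structural stand-ins for the browser and context objects (assumed shapes).
interface ClosableContext {
  close(): Promise<void>;
}

interface ContextFactory<C extends ClosableContext> {
  newContext(options: { storageState: string }): Promise<C>;
}

// Hypothetical helper: build a context from the shared state file, run the body,
// and always close the context, even when the body throws.
async function withThrowawayContext<C extends ClosableContext, T>(
  browser: ContextFactory<C>,
  statePath: string,
  body: (context: C) => Promise<T>,
): Promise<T> {
  const context = await browser.newContext({ storageState: statePath });
  try {
    return await body(context);
  } finally {
    await context.close();
  }
}
```

Inside a test this would read as withThrowawayContext(browser, "auth.json", async (context) => { ... }). The helper never writes to statePath; it only reads it through newContext, which is the whole point of the rule.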
Rule 3: Code-review the shared file’s writers.
Add a CI check that fails any commit touching context.storageState({ path: "auth.json" }) outside auth.setup.ts. One line of grep in a pre-commit hook catches it before the PR opens. For teams using lefthook or similar:
forbidden_storage_state_writes:
  glob: '*.spec.ts'
  run: |
    if grep -l 'storageState({ path: "auth.json"' {staged_files}; then
      echo "Only auth.setup.ts may write to auth.json"
      exit 1
    fi
The mechanical check is more reliable than discipline. Reviewers miss this; grep does not.
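The grep above matches one exact spelling: double quotes, single spaces, everything on one line. If the check runs as a script in CI instead, the same idea can tolerate other quote styles and line breaks. This is a sketch, not a drop-in tool: findForbiddenWrites is a hypothetical name, the allowed writer is assumed to be auth.setup.ts, and the function takes file contents as input so the caller decides how to collect them.

```typescript
// Matches storageState({ path: "auth.json" with any quote style and flexible whitespace.
const SHARED_WRITE = /storageState\(\s*\{\s*path:\s*["'`]auth\.json["'`]/;

// Assumption: only this file is allowed to write the shared state.
const ALLOWED_WRITER = "auth.setup.ts";

function fileName(path: string): string {
  const segments = path.split(/[\\/]/);
  return segments[segments.length - 1];
}

// Hypothetical CI helper: takes file paths mapped to their source text and returns
// "path:line" for every write to auth.json found outside the allowed file.
function findForbiddenWrites(sources: Record<string, string>): string[] {
  const violations: string[] = [];
  for (const path of Object.keys(sources)) {
    if (fileName(path) === ALLOWED_WRITER) continue;
    const source = sources[path];
    // Fresh global regex per file so lastIndex never carries over.
    const pattern = new RegExp(SHARED_WRITE.source, "g");
    let match = pattern.exec(source);
    while (match !== null) {
      // Line number = newlines before the match, plus one.
      const line = source.slice(0, match.index).split("\n").length;
      violations.push(`${path}:${line}`);
      match = pattern.exec(source);
    }
  }
  return violations;
}
```

Like the grep, this only sees literal paths. A test that builds the file name from a variable or constant will pass the check, so it narrows the gap rather than closing it.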
When you actually need cy.session-style validation
For long-running auth (OAuth tokens that expire, sessions that get invalidated server-side), the shared auth.json can go stale even without test mutation. Playwright does not have a built-in equivalent to Cypress’s cy.session validate callback, but you can implement it in the setup project:
import { existsSync, readFileSync } from "node:fs";
import { test as setup } from "@playwright/test";

setup("authenticate or restore", async ({ page, context }) => {
  if (existsSync("auth.json")) {
    // Restore only the cookies from the cached state, then probe an authenticated route.
    await context.addCookies(JSON.parse(readFileSync("auth.json", "utf8")).cookies);
    await page.goto("/account");
    if (page.url().endsWith("/account")) return; // session is still valid
    // session expired; fall through to fresh login
  }
  await page.goto("/login");
  // ... fill in form, sign in ...
  await context.storageState({ path: "auth.json" });
});
The setup checks whether the cached session still works. If yes, exit early. If no, log in fresh and save. The validation is a single page load against an authenticated route, cheap compared to a full login flow, and it catches a stale auth.json in setup instead of letting it fail inside an unrelated test. This version restores only cookies, so an app that keeps its token in localStorage will fail the check and fall through to a fresh login.
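Some stale states can be ruled out without any page load, because saved cookies carry their own expiry. The sketch below is a hypothetical pre-check, assuming the session lives in a single cookie whose name you know (the name is an input here, not something Playwright provides). Saved cookies store expires as Unix time in seconds, with -1 for session cookies.

```typescript
// Cookie fields as they appear in a saved storage state file (only those used here).
interface StoredCookie {
  name: string;
  expires: number; // Unix time in seconds; -1 marks a session cookie with no expiry date
}

// Hypothetical pre-check: true only when the named cookie is missing or already expired,
// which means the cached state cannot be valid and the probe page load can be skipped.
function isCertainlyStale(
  cookies: StoredCookie[],
  authCookieName: string,
  nowSeconds: number,
): boolean {
  for (const cookie of cookies) {
    if (cookie.name !== authCookieName) continue;
    if (cookie.expires === -1 || cookie.expires > nowSeconds) return false;
  }
  return true;
}
```

A false result proves nothing: the server may have invalidated the session long before the cookie expires. The check can only shortcut the "certainly stale" case, so the page load against an authenticated route stays in place for everything else.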
How Mergify catches this before you ship
Auth-state poisoning is the worst kind of test failure to triage. The failing tests have nothing visibly wrong with them. The test that caused the failure is the one that ran successfully and exited normally. Manual triage reaches for “delete auth.json and retry” without ever finding the culprit.
Test Insights tracks failures by the previous test that ran on the same worker. When a cluster of tests starts failing only after a specific other test has run, the dashboard surfaces the predecessor: “12 tests fail with redirected to /login after account.spec.ts:user can sign out ran.” You see the actual culprit on the same screen as the symptom.
Quarantine kicks in once the pattern is clear, so the merge queue keeps moving while you switch to per-test output paths.
Catch auth-state poisoning before it ships by pointing Mergify at your suite. Works with Playwright’s built-in JUnit reporter or any JUnit-compatible output.
More patterns like this
storageState leakage is one of the eight patterns in the flaky-tests-in-Playwright guide. The others are variants of the same theme: tests that share state in ways the framework does not enforce. Auto-wait racing element re-renders, route handlers registered after page.goto, networkidle that never settles in SPAs, test.use() scope confusion. Different trigger, same root.
Here’s the upside: each pattern has a clean fix once you can name it. Most are about scoping shared resources tighter, not rewriting tests.