pytest fixture teardown races: when yield-style cleanup eats itself
Why a session-scoped fixture's teardown can pull state out from under tests still using it, and the per-fixture lifecycle discipline that fixes the leak for good.
A pytest test fails in CI with psycopg.OperationalError: connection already closed. The test body reads a row, the row exists, the connection was alive when the fixture handed it over. Run the test alone and it passes. Run the suite and it fails on the same SHA, sometimes.
We see this pattern often enough on Mergify Test Insights that it earned its own slot in our flaky pytest catalog. The cause is fixture teardown order, the failure mode is invisible from the failing test’s perspective, and the fix is one decision about scope.
What you see
import pytest

@pytest.fixture(scope="module")
def db_conn():
    conn = connect_to_db()
    yield conn
    conn.close()  # teardown: runs after the last test in the module

@pytest.fixture(scope="module")
def seeded_users(db_conn):
    db_conn.bulk_insert("users", FIXTURES)
    yield
    # teardown: removes every row flagged as seeded, whoever inserted it
    db_conn.execute("DELETE FROM users WHERE seeded = TRUE")

@pytest.fixture(scope="module")
def admin(db_conn):
    u = db_conn.create_user(role="admin")
    yield u
    db_conn.delete_user(u.id)  # teardown: expects the row to still exist
seeded_users and admin are siblings: both module-scoped, both depend on db_conn, neither depends on the other. Pytest tears down fixtures in reverse order of setup, and setup order is determined by which test asks for which fixture first. Test 1 requests seeded_users, test 2 requests admin — setup order is db_conn → seeded_users → admin. At module end, teardown reverses: admin finalizer (delete_user), then seeded_users finalizer (the DELETE), then db_conn finalizer (close).
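The LIFO behaviour can be sketched without pytest at all. The snippet below is a plain-Python model, not pytest's implementation: it uses contextlib.ExitStack as a stand-in for the finalizer stack, and run_module is a hypothetical helper name introduced only for this illustration.

```python
from contextlib import ExitStack, contextmanager

def run_module(request_order):
    """Set up fixtures in first-request order; return the teardown order."""
    log = []

    @contextmanager
    def fixture(name):
        yield name        # setup is done, the value is handed to the tests
        log.append(name)  # code after the yield runs at teardown

    with ExitStack() as stack:
        stack.enter_context(fixture("db_conn"))  # both siblings depend on it
        for name in request_order:
            stack.enter_context(fixture(name))
    # Leaving the with-block unwinds the stack last-in, first-out.
    return log

print(run_module(["seeded_users", "admin"]))
print(run_module(["admin", "seeded_users"]))
```

The first call tears down admin before seeded_users; the second reverses the pair. db_conn closes last in both cases, because both siblings were set up after it.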
The bug is that seeded_users’s DELETE WHERE seeded = TRUE and admin’s delete_user(u.id) interact through the database, not through the fixture graph. Flip the request order so that admin is set up first, and the seeded_users finalizer now runs before the admin finalizer. If the admin row was inserted with seeded = TRUE (because the factory defaults it), the DELETE has already swept it by the time delete_user runs: the row was removed, just by the wrong fixture. delete_user then finds no live row, and the failure surfaces in teardown rather than in any test body. If the schema soft-deletes, the damage carries forward: the row is still there in the database under a soft-delete flag the test author didn’t know about, so when the next module starts, asks for db_conn, and gets a fresh module-scoped connection, the next admin fixture’s setup hits a unique-constraint violation.
Why it crosses tests
The leak is visible across tests because the fixtures share an external resource (the database) that lives outside the dependency graph pytest can see. Pytest tracks seeded_users → db_conn and admin → db_conn, but it has no idea that seeded_users’s DELETE touches rows admin’s teardown also expects to own. The reverse-of-setup teardown order is correct as far as pytest knows; it just doesn’t know enough about the rows.
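A minimal sketch of that row-level collision follows. It uses an in-memory SQLite database as a stand-in for the real one; the table layout and the seeded default are assumptions made for the illustration, not the article's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, role TEXT, seeded INTEGER DEFAULT 1)"
)

# admin setup runs first; the assumed factory leaves `seeded` at its default of 1.
admin_id = conn.execute("INSERT INTO users (role) VALUES ('admin')").lastrowid
# seeded_users setup: bulk insert of fixture rows.
conn.executemany(
    "INSERT INTO users (role, seeded) VALUES (?, 1)", [("member",), ("member",)]
)

# Teardown is reversed: the seeded_users finalizer runs first and sweeps the admin row.
swept = conn.execute("DELETE FROM users WHERE seeded = 1").rowcount
# The admin finalizer then finds nothing left to delete.
deleted = conn.execute("DELETE FROM users WHERE id = ?", (admin_id,)).rowcount

print(swept, deleted)
```

Here swept is 3 and deleted is 0. Nothing in the fixture graph records that the two statements compete for the same row.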
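The randomness comes from setup order. The first test to request a fixture pins its position in the LIFO. Pytest runs test functions in the order they are defined in the module, so reordering the tests, or deselecting one with -k, changes the setup sequence and flips which fixture of the pair tears down first. Under pytest-xdist, worker assignment shifts the order further, because each worker sets up its own copy of the fixtures for whichever tests it receives. You see the failure in CI but not locally because CI distributes the suite across workers with -n, while a plain local pytest run keeps every test in a single process.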
The naive fix and why it is incomplete
@pytest.fixture(scope="module")
def admin(db_conn):
    u = db_conn.create_user(role="admin")
    yield u
    try:
        db_conn.delete_user(u.id)
    except RowNotFoundError:
        pass
Defensive teardown. If the row is already gone, swallow the error. The build goes green more often. It still leaves orphan state in the database when the swallow path fires, which means the next module’s fixtures see leftovers they did not create. You traded a hard failure for silent data accumulation, and you lost the “this teardown should have worked” signal entirely.
The fix that holds
Make sibling fixtures explicit about their ordering by giving one a dependency on the other. If admin only makes sense after seeded_users ran, declare it:
@pytest.fixture(scope="module")
def admin(db_conn, seeded_users):
    u = db_conn.create_user(role="admin", seeded=False)
    yield u
    db_conn.delete_user(u.id)
Now pytest knows admin must tear down before seeded_users. The teardown order is admin → seeded_users → db_conn, regardless of which test requested which fixture first. The DELETE WHERE seeded = TRUE no longer collides with delete_user(u.id) because the admin row is gone first, and was inserted with seeded=False so it would not have been swept anyway.
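To see why the declared dependency pins the order, here is a deliberately simplified model of dependency-first setup. It is not pytest's resolver, which also accounts for scope and autouse fixtures; GRAPH and teardown_order are hypothetical names used only in this sketch.

```python
# Each fixture maps to the fixtures it requests.
GRAPH = {
    "db_conn": [],
    "seeded_users": ["db_conn"],
    "admin": ["db_conn", "seeded_users"],  # the explicit dependency
}

def teardown_order(request_order, graph=GRAPH):
    active = []  # fixtures in the order their setup completed

    def setup(name):
        if name in active:       # already set up for this module
            return
        for dep in graph[name]:  # dependencies are always set up first
            setup(dep)
        active.append(name)

    for name in request_order:
        setup(name)
    return active[::-1]          # teardown is the reverse of setup

print(teardown_order(["admin", "seeded_users"]))
print(teardown_order(["seeded_users", "admin"]))
```

Both calls give the same teardown order: admin, seeded_users, db_conn. Once the edge exists in the graph, which test asked first no longer matters.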
If you genuinely need fixtures that mutate the same external state at the same scope, isolate them. Each fixture gets its own connection or its own table partition:
@pytest.fixture(scope="module")
def admin():
    conn = connect_to_db()
    u = conn.create_user(role="admin")
    yield u
    try:
        conn.delete_user(u.id)
    finally:
        conn.close()  # close even if the delete raises
Slower, because every fixture opens and closes its own connection, but each unit owns its dependency.
Use a transactional fixture for shared external state
For Postgres specifically, pytest-flask-sqlalchemy offers transactional fixtures: the connection wraps every test in a savepoint and rolls back at the end. The connection lives at session scope, but every test sees a clean database without anyone running explicit deletes. Teardown order stops mattering because the rollback handles cleanup. pytest-postgresql aims at the same per-test clean slate, though it gets there by resetting the test database between tests rather than by rolling back a savepoint.
@pytest.fixture
def db(db_session):
    yield db_session  # autocommit off; rollback on exit
This is the Rails-style transactional-test pattern adapted to pytest. It does not help if your test starts a JS-driven browser (the app server talks to the database over a separate connection, which cannot see the test's uncommitted data), but for pure-Python suites it is the cleanest answer.
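The savepoint-and-rollback mechanics can be sketched with the standard library alone. The example below uses SQLite in place of Postgres and a context manager in place of a function-scoped yield fixture. transactional is a hypothetical helper, not an API of either plugin.

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def transactional(conn):
    """Wrap a block in a savepoint and roll it back on exit."""
    conn.execute("SAVEPOINT test_case")
    try:
        yield conn
    finally:
        # Undo everything done since the savepoint, then discard the savepoint.
        conn.execute("ROLLBACK TO SAVEPOINT test_case")
        conn.execute("RELEASE SAVEPOINT test_case")

# isolation_level=None stops sqlite3 from opening transactions implicitly,
# so the savepoint statements above are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, role TEXT)")

with transactional(conn) as db:
    db.execute("INSERT INTO users (role) VALUES ('admin')")
    inside = db.execute("SELECT COUNT(*) FROM users").fetchone()[0]

after = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(inside, after)
```

Inside the block the row is visible (inside is 1); after the block it is gone (after is 0), with no explicit DELETE and no dependence on the order in which other cleanup runs.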
How Mergify catches this before you ship
The signature of fixture-teardown races is consistent: the test that fails is not the test that caused the failure. Test Insights reruns the suspect test on a fresh worker. When the same SHA passes alone but fails inside the suite, the dashboard tags it as fixture-lifecycle-sensitive and shows the predecessor test that triggered the bad teardown. You see test 4 highlighted next to test 5’s symptom, with a one-line summary of the scope mismatch.
Quarantine kicks in automatically once the pattern is confirmed. The merge queue keeps moving while you fix the scope, instead of blocking on a test that fails one run in eight.
To detect this pattern automatically across an xdist suite, point Mergify at your repo. Native plugin: pytest-mergify. One pip install and you’re set.
More patterns like this
Fixture teardown races are one of the eight patterns in the flaky-tests-in-pytest guide. The others are mostly variants of the same theme: state that crosses tests because the cleanup did not run when expected. xdist ordering surprises, autouse fixtures touching globals, monkeypatch leakage when stubs are not reverted. Same shape, different surface.
The good news: the patterns are finite. Once you name what you are looking at, the fix is usually a scope decision.