Skip to content
Rémy Duthu Rémy Duthu
April 10, 2026 · 5 min read

Diving into pytest Finalizers

Diving into pytest Finalizers

While building pytest-mergify, we hit a wall with fixture teardown during test reruns. The fix was two helper functions and a trick borrowed from pytest-rerunfailures.

I thought rerunning a pytest test would be easy. We’re building pytest-mergify, a plugin that can rerun a single test multiple times in a row. pytest already has a clean test protocol with three steps: setup, call, and teardown, so why not just run the call phase multiple times?

Something like:

for _ in range(retries):
    _pytest.runner.call_and_report(item=item, when="call", log=True)

That actually works fine for simple tests.

But then fixtures show up, and everything breaks.

Say you have a fixture that creates a temporary database:

@pytest.fixture
def _create_test_db():
    conn = _connect_to_db()
    yield conn
    conn.drop_database()

During the first run, it works perfectly. On the second run, pytest reuses the setup state from the first run, so the fixture isn’t reinitialized. Now the test tries to recreate a DB that already exists.

At that point, I knew I had to dig into how pytest handles fixture scopes and teardowns.

Understanding how pytest actually tears things down

pytest keeps track of all fixtures that need cleanup in a private attribute. This attribute can be accessed from an item with: item.session._setupstate.stack.

It’s a dictionary that maps nodes (like the test item, module, or session) to their teardown callables (and also exception information).

It looks roughly like this:

{
    <Function test_example>: [(finalizer_fn, ...)],
    <Module test_file.py>: [(finalizer_fn, ...)],
    <Session>: [(finalizer_fn, ...)],
}

Each entry is a list of finalizers.

When pytest finishes a test, it calls teardown_exact() (you can see it in the source) to decide what to clean up. That function compares the current test item with the next one (nextitem). If there is no next item, pytest assumes the session is ending and tears down everything.

That behavior makes sense in normal test runs, but not when you’re trying to rerun the same test multiple times in the same session. Calling _pytest.runner.runtestprotocol(item=item, nextitem=nextitem, log=True) resets fixtures that should stay alive, like session- or module-scoped ones.

At that point, I needed to either replicate the teardown logic myself (bad idea) or find a way to temporarily stop pytest from tearing down the wrong fixtures.

The small trick hiding in pytest-rerunfailures

I started reading how other plugins deal with this. The most interesting one was pytest-rerunfailures.

It reruns failed tests, so it faces the same problem: how to isolate test retries without tearing down the world every time.

Their approach is clever. They use what they call suspended finalizers.

Here’s the idea: between retries, they temporarily remove higher-scoped finalizers (class, module, session) from the stack so only function-scoped fixtures get cleaned up and recreated. When the last retry finishes, they restore everything and let pytest do its normal teardown.

A minimal version looks like this:

_suspended_finalizers = {}

def suspend_item_finalizers(item):
    for stacked_item in list(item.session._setupstate.stack.keys()):
        if stacked_item == item:
            continue  # Keep function-scoped finalizers.

        _suspended_finalizers[stacked_item] = item.session._setupstate.stack.pop(stacked_item)

def restore_item_finalizers(item):
    item.session._setupstate.stack.update(_suspended_finalizers)
    _suspended_finalizers.clear()

In your plugin, you can call these inside a pytest_runtest_teardown hook, depending on whether this is the last retry or not:

def pytest_runtest_teardown(self, item):
    if is_last_execution(item):
        restore_item_finalizers(item)
    else:
        suspend_item_finalizers(item)

Two short helpers and one conditional hook.

During intermediate retries, pytest will tear down only the function-scoped fixtures. At the end, the full teardown sequence runs normally.

Why I liked this solution

It feels like working with pytest instead of against it. The internal setup stack stays intact, the teardown order is preserved, and other plugins continue to behave normally.

It’s also surprisingly minimal. You don’t need to copy any pytest logic, just adjust when certain finalizers are visible.

Once you get used to reading pytest’s source, you start understanding why the behavior is the way it is. That understanding is worth more than the fix itself.

If you’re interested in how CI failures affect developer focus or why CI feels broken in general, we’ve written about those too.


PS: I did try asking AI for help while figuring this out. It didn’t mention finalizers once. Even when I added specific context about the setup stack. Turns out, this is one of those cases where reading the source code is still the fastest path to clarity.

Test Insights

Tired of flaky tests blocking your pipeline?

Test Insights detects flaky tests, quarantines them automatically, and tracks test health across your suite.

Try Test Insights

Recommended posts

Testing

pytest fixture teardown races: when yield-style cleanup eats itself

May 1, 2026 · 6 min read

pytest fixture teardown races: when yield-style cleanup eats itself

Why a session-scoped fixture's teardown can pull state out from under tests still using it, and the per-fixture lifecycle discipline that fixes the leak for good.

Rémy Duthu Rémy Duthu
Fake-timer leakage in Jest: the flake nobody sees coming
April 25, 2026 · 6 min read

Fake-timer leakage in Jest: the flake nobody sees coming

Why a single jest.useFakeTimers() call can pollute the next test in the file, what the failure looks like, and the three-line afterEach that makes it go away for good.

Rémy Duthu Rémy Duthu
Jest snapshot drift: when toMatchSnapshot lies about your code
April 25, 2026 · 7 min read

Jest snapshot drift: when toMatchSnapshot lies about your code

Snapshots fail every few days with diffs you did not cause. Most teams blame their Date.now() mock. The real story involves three less-obvious sources of drift, and a one-line serializer that fixes them.

Rémy Duthu Rémy Duthu