Skip to content
Rémy Duthu Rémy Duthu
April 10, 2026 · 5 min read

Diving into pytest Finalizers

Diving into pytest Finalizers

While building pytest-mergify, we hit a wall with fixture teardown during test reruns. The fix was two helper functions and a trick borrowed from pytest-rerunfailures.

I thought rerunning a pytest test would be easy. We’re building pytest-mergify, a plugin that can rerun a single test multiple times in a row. pytest already has a clean test protocol with three steps: setup, call, and teardown, so why not just run the call phase multiple times?

Something like:

for _ in range(retries):
    _pytest.runner.call_and_report(item=item, when="call", log=True)

That actually works fine for simple tests.

But then fixtures show up, and everything breaks.

Say you have a fixture that creates a temporary database:

@pytest.fixture
def _create_test_db():
    conn = _connect_to_db()
    yield conn
    conn.drop_database()

During the first run, it works perfectly. On the second run, pytest reuses the setup state from the first run, so the fixture isn’t reinitialized. Now the test tries to recreate a DB that already exists.

At that point, I knew I had to dig into how pytest handles fixture scopes and teardowns.

Understanding how pytest actually tears things down

pytest keeps track of all fixtures that need cleanup in a private attribute. This attribute can be accessed from an item with: item.session._setupstate.stack.

It’s a dictionary that maps nodes (like the test item, module, or session) to their teardown callables (and also exception information).

It looks roughly like this:

{
    <Function test_example>: [(finalizer_fn, ...)],
    <Module test_file.py>: [(finalizer_fn, ...)],
    <Session>: [(finalizer_fn, ...)],
}

Each entry is a list of finalizers.

When pytest finishes a test, it calls teardown_exact() (you can see it in the source) to decide what to clean up. That function compares the current test item with the next one (nextitem). If there is no next item, pytest assumes the session is ending and tears down everything.

That behavior makes sense in normal test runs, but not when you’re trying to rerun the same test multiple times in the same session. Calling _pytest.runner.runtestprotocol(item=item, nextitem=nextitem, log=True) resets fixtures that should stay alive, like session- or module-scoped ones.

At that point, I needed to either replicate the teardown logic myself (bad idea) or find a way to temporarily stop pytest from tearing down the wrong fixtures.

The small trick hiding in pytest-rerunfailures

I started reading how other plugins deal with this. The most interesting one was pytest-rerunfailures.

It reruns failed tests, so it faces the same problem: how to isolate test retries without tearing down the world every time.

Their approach is clever. They use what they call suspended finalizers.

Here’s the idea: between retries, they temporarily remove higher-scoped finalizers (class, module, session) from the stack so only function-scoped fixtures get cleaned up and recreated. When the last retry finishes, they restore everything and let pytest do its normal teardown.

A minimal version looks like this:

_suspended_finalizers = {}

def suspend_item_finalizers(item):
    for stacked_item in list(item.session._setupstate.stack.keys()):
        if stacked_item == item:
            continue  # Keep function-scoped finalizers.

        _suspended_finalizers[stacked_item] = item.session._setupstate.stack.pop(stacked_item)

def restore_item_finalizers(item):
    item.session._setupstate.stack.update(_suspended_finalizers)
    _suspended_finalizers.clear()

In your plugin, you can call these inside a pytest_runtest_teardown hook, depending on whether this is the last retry or not:

def pytest_runtest_teardown(self, item):
    if is_last_execution(item):
        restore_item_finalizers(item)
    else:
        suspend_item_finalizers(item)

Two short helpers and one conditional hook.

During intermediate retries, pytest will tear down only the function-scoped fixtures. At the end, the full teardown sequence runs normally.

Why I liked this solution

It feels like working with pytest instead of against it. The internal setup stack stays intact, the teardown order is preserved, and other plugins continue to behave normally.

It’s also surprisingly minimal. You don’t need to copy any pytest logic, just adjust when certain finalizers are visible.

Once you get used to reading pytest’s source, you start understanding why the behavior is the way it is. That understanding is worth more than the fix itself.

If you’re interested in how CI failures affect developer focus or why CI feels broken in general, we’ve written about those too.


PS: I did try asking AI for help while figuring this out. It didn’t mention finalizers once. Even when I added specific context about the setup stack. Turns out, this is one of those cases where reading the source code is still the fastest path to clarity.

Test Insights

Tired of flaky tests blocking your pipeline?

Test Insights detects flaky tests, quarantines them automatically, and tracks test health across your suite.

Try Test Insights

Recommended posts

Testing

Hypothesis is not flaky. Your code under test is, and Hypothesis is the messenger.

May 17, 2026 · 5 min read

Hypothesis is not flaky. Your code under test is, and Hypothesis is the messenger.

Why a property test that passed for months suddenly fails on a CI run with an example you cannot reproduce locally, the @example pattern that pins the failing case, and why the right response is never to delete the test.

Rémy Duthu Rémy Duthu
Testing

RSpec system specs see an empty database. database_cleaner is why.

May 15, 2026 · 6 min read

RSpec system specs see an empty database. database_cleaner is why.

How Capybara's separate browser connection breaks transactional fixtures, the per-spec-type strategy that fixes the leak, and the modern Rails 7.1 alternative that makes database_cleaner optional.

Rémy Duthu Rémy Duthu
Testing

Playwright route handlers fire only on requests they were registered before

May 13, 2026 · 5 min read

Playwright route handlers fire only on requests they were registered before

Why a `page.route()` call placed after `page.goto()` silently misses the request you wanted to intercept, and the registration-order rule that makes network mocks deterministic.

Rémy Duthu Rémy Duthu