Rémy Duthu
April 10, 2026 · 5 min read

Diving into pytest Finalizers

While building pytest-mergify, we hit a wall with fixture teardown during test reruns. The fix was two helper functions and a trick borrowed from pytest-rerunfailures.

I thought rerunning a pytest test would be easy. We’re building pytest-mergify, a plugin that can rerun a single test multiple times in a row. pytest already has a clean test protocol with three steps: setup, call, and teardown, so why not just run the call phase multiple times?

Something like:

for _ in range(retries):
    _pytest.runner.call_and_report(item=item, when="call", log=True)

That actually works fine for simple tests.

But then fixtures show up, and everything breaks.

Say you have a fixture that creates a temporary database:

@pytest.fixture
def _create_test_db():
    conn = _connect_to_db()
    yield conn
    conn.drop_database()

During the first run, it works perfectly. On the second run, pytest reuses the setup state from the first run, so the fixture isn’t reinitialized. Now the test tries to recreate a DB that already exists.

At that point, I knew I had to dig into how pytest handles fixture scopes and teardowns.

Understanding how pytest actually tears things down

pytest keeps track of all fixtures that need cleanup in a private attribute. This attribute can be accessed from an item with: item.session._setupstate.stack.

It’s a dictionary that maps nodes (like the test item, module, or session) to their teardown callables (and also exception information).

It looks roughly like this:

{
    <Function test_example>: ([finalizer_fn, ...], None),
    <Module test_file.py>: ([finalizer_fn, ...], None),
    <Session>: ([finalizer_fn, ...], None),
}

Each entry pairs a list of finalizers with any exception captured during setup.
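If you want to see this for yourself, a throwaway hook in conftest.py can dump the stack. Keep in mind _setupstate is private pytest API, so this can break between versions; it's an inspection aid, nothing more.

```python
# Throwaway debugging hook for conftest.py. _setupstate is private pytest
# API and may change between versions -- for inspection only.
def pytest_runtest_teardown(item, nextitem):
    for node, entry in item.session._setupstate.stack.items():
        print(node, "->", entry)
```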

When pytest finishes a test, it calls teardown_exact() (you can see it in the source) to decide what to clean up. That function compares the current test item with the next one (nextitem). If there is no next item, pytest assumes the session is ending and tears down everything.
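Stripped down, with string stand-ins for nodes (a sketch of the idea, not pytest's actual code), the decision looks roughly like this: pop finalizers off the top of the stack until what remains is a prefix of the next item's collector chain; with no next item, unwind everything.

```python
# Simplified model of teardown_exact's decision (not pytest's actual code).
def teardown_exact(stack, nextitem_chain):
    """Pop and run finalizers until the stack is a prefix of the next
    item's collector chain. An empty chain means the session is ending."""
    torn_down = []
    while stack:
        keys = list(stack)
        if keys == nextitem_chain[: len(keys)]:
            break  # Everything left is still needed by the next item.
        node, finalizers = stack.popitem()  # LIFO: innermost scope first.
        for fin in reversed(finalizers):
            fin()
        torn_down.append(node)
    return torn_down

# Session and module stay alive when the next test is in the same module:
stack = {"session": [], "module": [], "test_a": []}
print(teardown_exact(dict(stack), ["session", "module", "test_b"]))
# With no next item, the whole stack unwinds:
print(teardown_exact(dict(stack), []))
```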

That behavior makes sense in normal test runs, but not when you’re trying to rerun the same test multiple times in the same session. Calling _pytest.runner.runtestprotocol(item=item, nextitem=nextitem, log=True) resets fixtures that should stay alive, like session- or module-scoped ones.

At that point, I needed to either replicate the teardown logic myself (bad idea) or find a way to temporarily stop pytest from tearing down the wrong fixtures.

The small trick hiding in pytest-rerunfailures

I started reading how other plugins deal with this. The most interesting one was pytest-rerunfailures.

It reruns failed tests, so it faces the same problem: how to isolate test retries without tearing down the world every time.

Their approach is clever. They use what they call suspended finalizers.

Here’s the idea: between retries, they temporarily remove higher-scoped finalizers (class, module, session) from the stack so only function-scoped fixtures get cleaned up and recreated. When the last retry finishes, they restore everything and let pytest do its normal teardown.

A minimal version looks like this:

_suspended_finalizers = {}

def suspend_item_finalizers(item):
    for stacked_item in list(item.session._setupstate.stack.keys()):
        if stacked_item == item:
            continue  # Keep function-scoped finalizers.

        _suspended_finalizers[stacked_item] = item.session._setupstate.stack.pop(stacked_item)

def restore_item_finalizers(item):
    item.session._setupstate.stack.update(_suspended_finalizers)
    _suspended_finalizers.clear()

In your plugin, you can call these from a pytest_runtest_teardown hook, choosing one or the other depending on whether this is the last retry:

def pytest_runtest_teardown(item):
    if is_last_execution(item):
        restore_item_finalizers(item)
    else:
        suspend_item_finalizers(item)

Two short helpers and one conditional hook.

During intermediate retries, pytest will tear down only the function-scoped fixtures. At the end, the full teardown sequence runs normally.
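The is_last_execution helper is whatever retry bookkeeping your plugin keeps; it isn't shown above. As a purely hypothetical sketch (the names and the counting scheme are made up, and a real plugin would key by item.nodeid), it could count executions per test:

```python
# Hypothetical bookkeeping behind is_last_execution (not part of pytest or
# pytest-rerunfailures); a real plugin would key this by item.nodeid.
MAX_EXECUTIONS = 3
_execution_counts: dict[str, int] = {}

def record_execution(test_id):
    _execution_counts[test_id] = _execution_counts.get(test_id, 0) + 1

def is_last_execution(test_id):
    return _execution_counts.get(test_id, 0) >= MAX_EXECUTIONS

for _ in range(MAX_EXECUTIONS):
    record_execution("tests/test_db.py::test_example")
print(is_last_execution("tests/test_db.py::test_example"))  # True
```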

Why I liked this solution

It feels like working with pytest instead of against it. The internal setup stack stays intact, the teardown order is preserved, and other plugins continue to behave normally.

It’s also surprisingly minimal. You don’t need to copy any pytest logic; you just adjust when certain finalizers are visible.

Once you get used to reading pytest’s source, you start understanding why the behavior is the way it is. That understanding is worth more than the fix itself.

If you’re interested in how CI failures affect developer focus or why CI feels broken in general, we’ve written about those too.


PS: I did try asking AI for help while figuring this out. It didn’t mention finalizers once. Even when I added specific context about the setup stack. Turns out, this is one of those cases where reading the source code is still the fastest path to clarity.
