Detecting Blocking Tasks in Asyncio by Measuring Event Loop Latency

Mehdi Abaakouk

Jan 7, 2026

Asyncio only works if every coroutine cooperates. One blocking call can freeze your entire app. This post shows a simple watchdog coroutine that measures event loop latency, detects blocking tasks early, and turns invisible stalls into actionable metrics.

Asyncio gives you lightweight concurrency, but only as long as every coroutine cooperates. One blocking call (just one) can freeze the entire application. When that happens, all your "concurrent" tasks stall at the same time: HTTP handlers slow down, background jobs drift, timeouts trigger late, and nothing looks obviously wrong.

There's an easy way to catch this: run a tiny coroutine that repeatedly sleeps and checks how late it wakes up. If the event loop is stuck, this coroutine wakes up late, and that delay becomes your signal. Add metrics and a graph, and you have a reliable early-warning system for any blocking task in your async code.

This post walks through why blocking the event loop is so disruptive, how to measure it, and how to turn that measurement into something actionable in production.

Asyncio in practice: cooperative multitasking

Under the hood, asyncio is cooperative. The event loop runs a set of tasks, and each task must explicitly yield control with await. When it does, the loop schedules other tasks or handles I/O events.
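To make the hand-off concrete, here is a minimal sketch in which two tasks take turns. The names worker, main, and trace are illustrative and not part of any library; asyncio.sleep(0) is used only as the simplest way to yield to the loop.

```python
import asyncio


async def worker(name: str, steps: int, trace: list[str]) -> None:
    for i in range(steps):
        trace.append(f"{name}:{i}")
        # Each await hands control back to the event loop,
        # which is what lets the other task make progress.
        await asyncio.sleep(0)


async def main() -> list[str]:
    trace: list[str] = []
    await asyncio.gather(worker("a", 2, trace), worker("b", 2, trace))
    return trace


if __name__ == "__main__":
    print(asyncio.run(main()))  # the two workers alternate step by step
```

If you remove the await inside the loop, worker "a" runs all of its steps before "b" gets a turn, which is the same mechanism as a stall on a small scale.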

Because everything shares a single event loop thread, concurrency works only if tasks frequently yield. If they don't, nothing else progresses.

This is the catch: a coroutine that performs blocking I/O or heavy CPU work without await prevents the loop from serving other coroutines. The entire application stalls, often in ways that are difficult to identify solely from logs.

What blocking looks like in reality

Blocking usually comes from:

time.sleep() inside async code
synchronous HTTP or database clients
CPU-heavy work (compression, JSON parsing, regex)
filesystem operations done synchronously

Timeline example: suppose one coroutine blocks the loop for 500 ms.

Every time-based operation slips by up to that amount. A task meant to run every 20 ms might fire 500 ms late. An HTTP request that usually takes 5 ms now takes 505 ms.
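The request slowdown is easy to reproduce. The sketch below is an illustration under stated assumptions: fake_request is a hypothetical stand-in for a handler that normally needs about 5 ms of I/O, and blocking_task simulates the offender with time.sleep.

```python
import asyncio
import time


async def fake_request() -> float:
    """Stand-in for an I/O-bound handler that normally takes about 5 ms."""
    start = time.perf_counter()
    await asyncio.sleep(0.005)
    return time.perf_counter() - start


async def blocking_task() -> None:
    time.sleep(0.5)  # blocking call: nothing else can run on the loop


async def main() -> tuple[float, float]:
    healthy = await fake_request()

    # Start a second request, let it reach its await, then block the loop.
    request = asyncio.create_task(fake_request())
    await asyncio.sleep(0)
    await blocking_task()
    stalled = await request
    return healthy, stalled


if __name__ == "__main__":
    healthy, stalled = asyncio.run(main())
    print(f"healthy: {healthy * 1000:.0f} ms, stalled: {stalled * 1000:.0f} ms")
```

The second request did nothing wrong: its own timer expired after 5 ms, but the loop could not resume it until the blocking call returned.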

Unless you're monitoring the event loop itself, this is almost invisible.

A simple trick: measure loop latency with a watchdog coroutine

The idea:

Record the current time.
await asyncio.sleep(dt)
Measure how late the coroutine wakes up.

If the event loop is healthy, the delay is small. If something blocks the loop, the watchdog reports it as soon as the loop resumes.

latency = actual_wakeup - expected_wakeup
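As a sketch, a single measurement could be taken like this, with expected_wakeup computed as the start time plus dt. The function name sample_latency is an assumption made for illustration; the clamp to zero is there because a timer can occasionally fire a hair early due to clock resolution.

```python
import asyncio


async def sample_latency(dt: float) -> float:
    """Sleep for dt seconds and return how late the wake-up was, in seconds."""
    loop = asyncio.get_running_loop()
    expected_wakeup = loop.time() + dt
    await asyncio.sleep(dt)
    actual_wakeup = loop.time()
    # Clamp tiny negative values caused by clock resolution.
    return max(0.0, actual_wakeup - expected_wakeup)


if __name__ == "__main__":
    print(f"{asyncio.run(sample_latency(0.02)) * 1000:.2f} ms")
```

loop.time() is the loop's own monotonic clock, so the measurement is not affected by wall-clock adjustments.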

Implementing a loop latency watchdog

import asyncio
import logging

log = logging.getLogger(__name__)
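Building on the logger set up above, a minimal watchdog might look like the sketch below. The function name loop_watchdog, its parameters, and the on_sample hook are assumptions chosen for illustration; adapt the defaults to your application.

```python
import asyncio
import logging
from typing import Callable, Optional

log = logging.getLogger(__name__)


async def loop_watchdog(
    interval: float = 0.05,
    warn_threshold: float = 0.1,
    on_sample: Optional[Callable[[float], None]] = None,
) -> None:
    """Sleep in a loop and report how late each wake-up is (in seconds)."""
    loop = asyncio.get_running_loop()
    while True:
        expected_wakeup = loop.time() + interval
        await asyncio.sleep(interval)
        latency = max(0.0, loop.time() - expected_wakeup)
        if on_sample is not None:
            on_sample(latency)  # e.g. forward the value to a metrics client
        if latency >= warn_threshold:
            log.warning("event loop blocked for %.0f ms", latency * 1000)
```

The on_sample callback keeps the coroutine independent of any particular metrics library: pass whatever function records a value in yours.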

Start it at application boot:
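One way to wire this up is sketched below. Both loop_watchdog (reduced here to a placeholder) and run_app are hypothetical names; substitute your real watchdog and application entry point.

```python
import asyncio
import contextlib


async def loop_watchdog(interval: float = 0.05) -> None:
    """Placeholder standing in for the real watchdog coroutine."""
    while True:
        await asyncio.sleep(interval)


async def run_app() -> None:
    """Hypothetical application body; replace with your server or workers."""
    await asyncio.sleep(0.1)


async def main() -> None:
    # Keep a reference to the task: the loop only holds weak references,
    # so an unreferenced task can be garbage-collected mid-flight.
    watchdog = asyncio.create_task(loop_watchdog(), name="loop-watchdog")
    try:
        await run_app()
    finally:
        # Stop the watchdog cleanly on shutdown.
        watchdog.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await watchdog


if __name__ == "__main__":
    asyncio.run(main())
```

Starting the task before anything else means stalls during startup are measured too.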

Seeing a spike: a blocking example

import asyncio
import time
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)
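A self-contained version of such a demo could look like the following sketch. The names bad_handler and spikes, the 50 ms interval, and the log messages are assumptions for illustration; the blocking time.sleep is deliberate.

```python
import asyncio
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)


async def loop_watchdog(interval: float, spikes: list[float]) -> None:
    loop = asyncio.get_running_loop()
    while True:
        expected_wakeup = loop.time() + interval
        await asyncio.sleep(interval)
        latency = loop.time() - expected_wakeup
        if latency > interval:
            spikes.append(latency)
            log.warning("loop latency spike: %.0f ms", latency * 1000)


async def bad_handler() -> None:
    log.info("blocking the loop with time.sleep(0.5)")
    time.sleep(0.5)  # deliberately wrong: should be `await asyncio.sleep(0.5)`


async def main() -> list[float]:
    spikes: list[float] = []
    watchdog = asyncio.create_task(loop_watchdog(0.05, spikes))
    await asyncio.sleep(0.2)  # healthy period
    await bad_handler()       # freezes the loop for about half a second
    await asyncio.sleep(0.2)  # give the watchdog time to wake up and report
    watchdog.cancel()
    return spikes


if __name__ == "__main__":
    asyncio.run(main())
```

Swapping time.sleep for await asyncio.sleep in bad_handler should make the spike disappear, which is a quick way to confirm the watchdog is measuring what you think it is.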

In the log, the watchdog reports a latency spike that lines up with the blocking call: a clear demonstration of loop freeze → watchdog spike.

Graphing and alerting

Once you export latency to metrics, you can graph:

p50/p95/p99 loop latency
correlation with HTTP latency
per-endpoint patterns during stalls
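In production these percentiles usually come from your metrics backend's histogram support. To sanity-check the numbers locally, a rough sketch using only the standard library might look like this; latency_percentiles is a hypothetical helper name.

```python
import statistics


def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Compute p50/p95/p99 from a batch of latency samples (milliseconds)."""
    # quantiles(n=100) returns the 99 cut points P1..P99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


if __name__ == "__main__":
    print(latency_percentiles([float(v) for v in range(101)]))
```

The tail percentiles matter most here: a single long stall barely moves p50 but shows up clearly in p99.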

Useful alerts:

loop latency > 100 ms for 30 seconds
p99 latency above 200 ms
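Rules like the first one normally live in your monitoring system, but the "for 30 seconds" logic is worth spelling out. The sketch below is one possible in-process version; the class SustainedLatencyAlert and its parameters are hypothetical.

```python
from typing import Optional


class SustainedLatencyAlert:
    """Fire once latency has stayed above a threshold for `hold` seconds."""

    def __init__(self, threshold: float = 0.1, hold: float = 30.0) -> None:
        self.threshold = threshold
        self.hold = hold
        self._breach_start: Optional[float] = None

    def observe(self, now: float, latency: float) -> bool:
        if latency <= self.threshold:
            # Any healthy sample resets the breach window.
            self._breach_start = None
            return False
        if self._breach_start is None:
            self._breach_start = now
        return now - self._breach_start >= self.hold
```

Requiring the condition to hold for a window avoids paging on a single isolated spike, while the p99 rule catches repeated short stalls.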

This gives you a clean separation between "system is slow" and "event loop is frozen."

Tuning the sampling interval

Shorter intervals catch smaller stalls but generate more metrics.

5–10 ms: low-latency apps, high resolution.
20–50 ms: good default.
100 ms+: lowest overhead, detects only large stalls.

A solid starting point:
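One possible way to encode these tiers is sketched below. The intervals follow the ranges listed above; the warn_threshold values and all names (WatchdogSettings, PRESETS) are assumptions for illustration, not recommendations from measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WatchdogSettings:
    interval: float        # seconds between samples
    warn_threshold: float  # report when latency exceeds this (assumed values)


PRESETS = {
    "low_latency": WatchdogSettings(interval=0.005, warn_threshold=0.05),
    "default": WatchdogSettings(interval=0.05, warn_threshold=0.1),
    "low_overhead": WatchdogSettings(interval=0.1, warn_threshold=0.2),
}
```

Starting from the "default" preset and tightening the interval only if you need to see smaller stalls keeps metric volume under control.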

After detection: finding the culprit

The watchdog tells you when the loop was blocked. To figure out why, you need to correlate with:

request logs
CPU spikes
GC activity
stack dumps
profiling
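For a first lead, asyncio's built-in debug mode can help: it logs a warning on the "asyncio" logger for any callback or task step that runs longer than loop.slow_callback_duration (0.1 seconds by default). The sketch below uses a hypothetical suspicious coroutine as the offender; debug mode adds overhead, so it is better suited to staging or short investigations.

```python
import asyncio
import logging
import time


async def suspicious() -> None:
    time.sleep(0.2)  # simulated blocking call


async def main() -> None:
    loop = asyncio.get_running_loop()
    # Report any step that holds the loop for 100 ms or more.
    loop.slow_callback_duration = 0.1
    await suspicious()


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    asyncio.run(main(), debug=True)
```

The warning names the task and coroutine that held the loop, which narrows the search considerably before you reach for a profiler.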

Typical fixes:

move CPU-heavy or blocking operations off the loop thread with run_in_executor
switch synchronous clients to async versions
isolate expensive tasks into worker services
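The first fix can be sketched as follows, using compression as the heavy operation. The compress and handle functions are illustrative names. Passing None selects the loop's default thread pool, which works well here because zlib releases the GIL while compressing; for pure-Python CPU work, a concurrent.futures.ProcessPoolExecutor is usually the better choice.

```python
import asyncio
import zlib


def compress(payload: bytes) -> bytes:
    """Synchronous, CPU-heavy work that should not run on the loop thread."""
    return zlib.compress(payload, level=9)


async def handle(payload: bytes) -> bytes:
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, compress, payload)


if __name__ == "__main__":
    result = asyncio.run(handle(b"x" * 100_000))
    print(len(result), "bytes after compression")
```

While the worker thread compresses, the loop stays free to serve other coroutines, so the watchdog latency should remain flat.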

Takeaways

Asyncio gives concurrency only if tasks cooperate.
A single blocking call can freeze the entire application.
A tiny watchdog coroutine reliably detects loop stalls.
With metrics and alerts, you catch blocking tasks the moment they happen.

This type of lightweight guardrail provides immediate feedback and helps surface bugs that would be nearly invisible otherwise.
