Fabien Martinet

Oct 29, 2025

6 min read

Why PostgreSQL Ignored Our Index (and What the Planner Was Thinking)


PostgreSQL doesn’t "ignore" your indexes; it just does the math differently. We dive into how the planner weighs cost, why it sometimes chooses sequential scans, and how we tuned our queries to make peace with it.

Every engineer has written a query that should be fast (indexed column, simple filter), and yet PostgreSQL still decided to walk the entire table.

At Mergify, we met this wall repeatedly on tables with tens or hundreds of millions of rows. Even though our organization_id column was indexed, PostgreSQL sometimes preferred a sequential scan, and our API latencies spiked. Only when we gently nudged the planner (for example, via ORDER BY) did it switch to index scans, and performance improved by orders of magnitude.

In this post, I'll walk you through how PostgreSQL's planner reasons about execution plans (especially on large tables), why it sometimes "ignores" indexes, and how we leveraged that understanding with SQLAlchemy to achieve consistent high performance.

How PostgreSQL's Planner "Thinks" (Cost-Based, Not Guess-Based)

PostgreSQL uses a cost-based optimizer. It doesn't just trust you to add indexes and expect them to be used: it models multiple plans, assigns each a cost, and picks the cheapest path.

When you run:

EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;

PostgreSQL prints its chosen plan along with estimated costs and runtime statistics. Each plan node shows its estimated startup and total cost, estimated versus actual row counts, and buffer usage.

Cost Formulas (Simplified)

Sequential scan:

cost ≈ seq_page_cost × pages + cpu_tuple_cost × rows (plus cpu_operator_cost per filter evaluation)

  • seq_page_cost is the cost per page read sequentially

  • cpu_tuple_cost and cpu_operator_cost estimate CPU work per row and per operator comparison

    Default values are 1.0, 0.01, and 0.0025 respectively.
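As a rough illustration, the sequential-scan formula can be computed directly. This is a simplified sketch, not the planner's actual code, and the page/row counts below are hypothetical:

```python
# Simplified sequential-scan cost using PostgreSQL's default constants.
SEQ_PAGE_COST = 1.0    # cost of one sequentially-read page
CPU_TUPLE_COST = 0.01  # CPU cost of processing one row

def seq_scan_cost(pages: int, rows: int) -> float:
    """A seq scan reads every page and processes every row."""
    return SEQ_PAGE_COST * pages + CPU_TUPLE_COST * rows

# A hypothetical 10M-row table stored on 500k pages:
print(seq_scan_cost(pages=500_000, rows=10_000_000))  # 600000.0
```

Note that the CPU term is small relative to the I/O term here; on big tables the page count dominates the estimate.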

An index scan's cost comes from the index method's amcostestimate function, which accounts for:

  • index startup cost

  • cost to traverse index pages

  • cost to fetch table rows

  • selectivity of matching rows

  • correlation between index ordering and physical table layout

So: the planner compares sequential vs. index (and other hybrid plans) and picks the smallest total cost.

Heuristics & Shortcuts

Of course, PostgreSQL can't exhaustively evaluate every possible execution plan: the search space grows exponentially as queries get more complex. To keep planning fast and practical, it uses a mix of heuristics and shortcuts that prune or approximate the search early.

  • For queries with many joins, PostgreSQL uses GEQO (the genetic query optimizer) instead of exhaustively searching join orders.

  • It prunes dominated paths early.

  • It relies heavily on statistics (histograms, distinct counts, correlations) to predict selectivity.

Thus, the planner's choice is only as good as its data: cost constants + statistic inputs.

Why Indexes "Lose" in Multi-Million-Row Tables

When your table contains millions of rows, the planner's mistakes are magnified. Some of the usual culprits:

  1. Stale/insufficient statistics

    The table may have changed (with new data, deletions, or skew), but histograms remain outdated. The planner overestimates the match size and picks sequential scan.

  2. Weak correlation

    If index ordering is poorly correlated with the physical row layout, the cost of fetching rows after an index lookup can be high. The planner penalizes that.

  3. Startup overhead and random I/O

    For queries where many rows match, the overhead of index navigation and scattered reads may outweigh the benefit of scanning fewer rows.

  4. Cost model mismatch

    Default assumptions (such as spinning disks) may not hold for SSD/cloud hardware. Sequential page reads can be significantly cheaper than assumed, making sequential scans appear artificially attractive.

  5. Combined operations (joins, sort, group by)

    Sometimes, an index scan forces additional operations (such as sorting or hash building), which can accumulate costs and tip the balance.

We observed this repeatedly: queries on multi-million-row tables with indexed filters would fall back to sequential scans until we intervened.
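To see how a selectivity misestimate flips the decision, here is a toy cost comparison. The numbers are illustrative only: it uses the default planner constants and a deliberately crude index model that charges one random heap fetch per matching row (the zero-correlation worst case):

```python
SEQ_PAGE_COST = 1.0
RANDOM_PAGE_COST = 4.0  # default: random reads assumed 4x sequential
CPU_TUPLE_COST = 0.01

PAGES, ROWS = 500_000, 10_000_000  # hypothetical table

def seq_cost() -> float:
    return SEQ_PAGE_COST * PAGES + CPU_TUPLE_COST * ROWS

def index_cost(selectivity: float) -> float:
    # Worst case: no correlation, one random heap fetch per match.
    matching = ROWS * selectivity
    return (RANDOM_PAGE_COST + CPU_TUPLE_COST) * matching

# One org among thousands (0.01% of rows): the index path wins easily.
print(index_cost(0.0001) < seq_cost())  # True
# Stale stats overestimating the match at 5%: the seq scan "wins".
print(index_cost(0.05) < seq_cost())    # False
```

The same query, the same index: only the estimated selectivity changed, and the plan flipped.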

Mergify Case Study: SQLAlchemy + Huge Tables

In our stack, if we execute:

query = (
    session.query(PullRequest)
    .filter(PullRequest.organization_id == org.id)
)

Even though pull_requests.organization_id has an index, PostgreSQL often chose a sequential scan over the entire table. The planner was implicitly assuming a significant fraction of rows matched, making the index path "too expensive."

We discovered a simple tweak:

query = (
    session.query(PullRequest)
    .filter(PullRequest.organization_id == org.id)
    .order_by(PullRequest.organization_id)
)

Now the plan switches to an index scan, and latency dropped dramatically (from hundreds of milliseconds to a few milliseconds). With the ORDER BY, the index scan returns rows already sorted, so the planner no longer has to charge for a separate sort step, and the index path wins the cost comparison.

We confirmed this pattern on multiple heavy tables (millions of rows). The difference was especially stark when the filter matched only a small subset of the rows (e.g., one organization among many).

What We Did (and You Should) When Working With Massive Tables

Given our experience, here's a checklist (with SQLAlchemy flavor) for when you work at scale:

1. Refresh Stats More Frequently

  • Run VACUUM ANALYZE or ANALYZE after large data imports or bulk deletes

  • Tune autovacuum thresholds (e.g. autovacuum_analyze_scale_factor) to trigger analyze earlier

  • For columns with skew, raise default_statistics_target or use ALTER TABLE … ALTER COLUMN … SET STATISTICS n

Keeping stats fresh ensures the planner's selectivity guesses stay accurate.

2. Use ORDER BY / LIMIT as Gentle Hints

  • Adding .order_by(MyModel.indexed_col) may nudge the planner to reuse the index

  • Limiting results (when you don't need the full set) favors plans with a lower startup cost.

Although this may feel like a side effect, on large tables the difference is enormous.

3. Adjust Cost Constants to Reflect Your Hardware

On modern SSDs or fast cloud storage, sequential access is cheaper than the default assumptions. You can override:

SET seq_page_cost = 0.5;
SET random_page_cost = 1.1;

In SQLAlchemy (raw SQL must be wrapped in text() as of SQLAlchemy 1.4):

from sqlalchemy import text

session.execute(text("SET seq_page_cost = 0.5"))

Then run EXPLAIN ANALYZE again and compare. If index plans now come out cheaper, consider making the tuned constants permanent (e.g. in postgresql.conf).
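A back-of-the-envelope check of why this matters, using a toy cost model (illustrative constants, a hypothetical table, and a crude one-random-fetch-per-match index model), shows SSD-tuned constants flipping the decision at a fixed selectivity:

```python
PAGES, ROWS = 500_000, 10_000_000  # hypothetical table
CPU_TUPLE_COST = 0.01

def seq_cost(seq_page_cost: float) -> float:
    return seq_page_cost * PAGES + CPU_TUPLE_COST * ROWS

def index_cost(random_page_cost: float, selectivity: float) -> float:
    matching = ROWS * selectivity  # crude: one random fetch per match
    return (random_page_cost + CPU_TUPLE_COST) * matching

sel = 0.02  # suppose 2% of rows match
# Default constants (seq=1.0, random=4.0): sequential scan looks cheaper.
print(index_cost(4.0, sel) < seq_cost(1.0))  # False
# SSD-tuned constants (seq=0.5, random=1.1): the index path now wins.
print(index_cost(1.1, sel) < seq_cost(0.5))  # True
```

Nothing about the data changed, only what the planner believes a random page read costs.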

4. Force Plan Exploration (Safely)

For debugging:

SET enable_seqscan = off;

This forces the planner to prefer index-based paths (if viable). Use it only temporarily while debugging; never leave enable_seqscan disabled in a production environment.

5. Batch Large Deletes & Updates

Single giant deletes or updates can be disruptive (long transactions, heavy locking, WAL bursts). Delete in batches instead. Note that SQLAlchemy's Query.delete() cannot be combined with limit(), so select a batch of primary keys first:

BATCH = 1000
while True:
    # Grab one batch of primary keys matching the filter
    ids = [
        id_
        for (id_,) in session.query(Event.id)
        .filter(Event.created_at < cutoff)
        .limit(BATCH)
    ]
    if not ids:
        break
    # Delete just that batch, then commit to keep the transaction short
    session.query(Event).filter(Event.id.in_(ids)).delete(
        synchronize_session=False
    )
    session.commit()

This keeps transactions small, avoids locking large swathes, and distributes I/O.

6. Consider Partitioning / Covering Indexes

For tables with truly massive scale:

  • Partition by date, organization, or shard key

  • Use covering indexes (so index-only scans suffice)

  • Use BRIN indexes for append-only time-ordered data

These structural optimizations reduce the search space for the planner.

7. Always Validate with EXPLAIN ANALYZE

When you make changes, run:

EXPLAIN (ANALYZE, BUFFERS)

Check:

  • Estimated vs actual row counts (significant gaps indicate bad stats)

  • Buffer hits/reads

  • Which operations dominate (sorts, nested loops, etc)

If the index plan you expect isn't chosen, look at the cost numbers to understand which path the planner favored and why. That's a conversation worth having with the planner.
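A simple way to operationalize the first check is to flag any plan node whose estimated and actual row counts differ by more than an order of magnitude. The 10x threshold here is our own illustrative choice, not a PostgreSQL constant:

```python
def misestimated(est_rows: int, actual_rows: int, factor: float = 10.0) -> bool:
    """True when estimated and actual row counts differ by more than `factor`x."""
    est, actual = max(est_rows, 1), max(actual_rows, 1)  # guard div-by-zero
    return max(est, actual) / min(est, actual) > factor

print(misestimated(1_000, 950))        # False: estimate is close enough
print(misestimated(1_000, 2_500_000))  # True: statistics are likely stale
```

When a node trips this check, ANALYZE the table (or raise its statistics target) before reaching for heavier fixes.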

Takeaways & Next Steps

We used to treat PostgreSQL's planner as a black box that "might or might not" use our indexes. But after debugging these large-table behaviors, we began to reason with the planner.

Key lessons:

  • Indexes are only as good as the planner's model (statistics + cost constants)

  • In large tables, small misestimates can flip plan decisions

  • Query form (ORDER BY, LIMIT) is a hint the planner uses

  • Structural changes (partitioning, covering indexes) shift the balance permanently

  • Always measure: never guess

“PostgreSQL isn’t ignoring your index: it’s just doing the math differently than you are.”

At Mergify, these insights reshaped how we build, tune, and monitor query performance. We moved from reactive optimization to conversational optimization, where we read the planner's cost output, understand its assumptions, and gently guide it.
