EXPLAIN vs EXPLAIN ANALYZE — which should I run?

EXPLAIN shows a cost estimate without executing the query (safe on prod). EXPLAIN ANALYZE executes the query and shows real numbers. Always run ANALYZE on read-only replicas only — never on production writes, since UPDATE/DELETE mutations execute fully.

Estimated rows way off from actual (10x divergence)?

Your statistics are stale. Run ANALYZE table_name to refresh. If divergence is recurrent, raise default_statistics_target from 100 to 500 and re-ANALYZE. For highly skewed data, this helps the planner choose better strategies.

Hash Join is spilling to disk (Batches > 1). How do I fix it?

Increase work_mem: SET work_mem = '256MB'; SELECT ...; to expand the in-memory hash table. Or reduce input size with an earlier WHERE or GROUP BY to pre-filter. On production, update postgresql.conf and reload, or increase for a specific session.

My Nested Loop is slow but I can't add an index. What do I do?

As a last resort, disable nested loop: SET enable_nestloop = off; to force Hash Join or Merge Join. But fix the root cause instead (add an index or refresh statistics). Disabled planner features cause cascading failures when data shape changes.

What does 'Rows Removed by Filter' mean?

Rows scanned from disk but discarded by a WHERE clause. High values signal a missing index on the filter column or a partial-index opportunity. If 90% of rows are removed, an index (or partial index) on that column would save I/O.

#postgresql #database-internals #performance #cheat-sheet

Postgres EXPLAIN Cheat Sheet: Reading Query Plans Like a Pro

BackendBytes Engineering Team

Apr 12, 2026

16 min read

Postgres EXPLAIN Cheat Sheet: Reading Query Plans Like a Pro

Part of Series: PostgreSQL Deep Dive

Lesson 1 of 4

→Compare estimated vs actual rows — divergence >10x signals stale statistics; run ANALYZE on affected tables immediately
→Nested Loop + large outer table = missing index on the join column; the planner thinks the outer table is small and sequentially probes the inner for each row
→Rows Removed by Filter >> output means the WHERE clause isn't indexed; the index returns candidates, but filters away most of them — use partial indexes for skewed predicates
→Hash Join with Batches > 1 means the hash table spilled to disk; either raise work_mem for the session or reduce the input size with earlier WHERE/GROUP BY aggregation

The planner estimated 200 rows. The actual count was 4.2 million. A dashboard query that loaded in 800 ms at launch now takes 12 seconds on 8 million rows. EXPLAIN ANALYZE shows a Nested Loop over a Seq Scan; a missing composite index on (tenant_id, created_at) turned a 2 ms Index Scan into a full table scan. We debugged this exact regression on multiple production teams.

How to read a Postgres EXPLAIN plan

The dashboard query that loaded in 800ms at launch now takes 12 seconds on 8M rows. EXPLAIN ANALYZE shows a Nested Loop over a Seq Scan — the planner estimated 200 rows but actual was 4.2 million. A missing composite index on (tenant_id, created_at) turned a 2ms Index Scan into a full table scan. This cheat sheet teaches you to read those numbers and spot the pattern in under a minute.

TL;DR

EXPLAIN shows a tree of operations (read bottom-up)^{[PostgreSQL Docs]}. Compare estimated rows vs actual rows — in our experience, divergence beyond ~10× is the threshold where the planner reliably picks the wrong join strategy and you should run ANALYZE. Use BUFFERS to spot I/O bottlenecks. Watch for Nested Loops with large outer tables, disk spills, and filter throwaway.

Add BUFFERS option to spot cache misses vs CPU bottlenecks
Compare estimated vs actual row counts; divergence beyond ~10× is a reliable signal of stale statistics — run ANALYZE and re-check
Nested Loop + large outer table = missing index on join column

The quick start — node types

Plan trees are read bottom-up — leaves are scans, internals are joins/aggregations, root is the final shape:

graph TB
    Root[Limit + Sort<br/>top operator<br/>final result] --> J1[Hash Join]
    J1 --> Build[Build side<br/>smaller relation]
    J1 --> Probe[Probe side<br/>larger relation]
    Build --> S1[Seq Scan customers<br/>1000 rows]
    Probe --> Idx[Index Scan orders<br/>using idx_customer_id]
    Idx --> NL[Nested Loop<br/>inside larger plan]
    NL --> Outer[Outer: small batch<br/>10 rows]
    NL --> Inner[Inner: index lookup<br/>per outer row]
    style Root fill:#dfd
    style J1 fill:#ffd
    style S1 fill:#fdd
    style Idx fill:#dfd
    style NL fill:#ffd

Postgres plans are trees. Read bottom-up. Each node takes input from children, transforms, emits to parent.

Node	What it does	Good sign	Bad sign
Seq Scan	Read every row	Small tables, bitmap input	Large table + WHERE clause (missing index)
Index Scan	Walk index, fetch rows	Selective predicate (< 5% of table)	Fetching >10% via index (seq scan faster)
Index Only Scan	Answer from index alone	Covering query, visibility map warm	Heap fetches (visibility map stale)
Bitmap Heap Scan	Combine multiple indexes	Multi-column AND/OR on indexes	High heap recheck (working set > shared_buffers)
Nested Loop	Outer loop: probe inner	Tiny outer, indexed inner side	Large outer table, unindexed inner
Hash Join	Build hash on small side	Both sides sized correctly	Spill to disk (`Batches: > 1`)
Sort	Sort input rows	Fits in work_mem (quicksort)	Disk sort (`external merge Disk: ...kB`)

Reading the numbers

^{[PostgreSQL Docs, Resource Usage]}

Seq Scan on orders (cost=0.00..18334.00 rows=1000000 width=8) (actual time=0.010..85.2 rows=998422 loops=1)

cost — startup .. total (units arbitrary; compare estimates only).
rows=1000000 — planner estimate. Compare with actual. >10x divergence = stale stats.
actual time — startup .. total milliseconds per loop.
loops — execution count. Multiply time by loops for total cost. ^{[PostgreSQL Docs, Resource Usage]}

Diagnose by symptom, not by node type

When EXPLAIN ANALYZE prints a 200-line plan tree, the question is "where does the time go?" Route by symptom:

graph TD
    Slow[Query is slow] --> Where{Where does<br/>the time go?}
    Where -->|Single node, big actual time| Hot[Find that node<br/>read its inputs and filter]
    Where -->|Many small nodes, sum is big| Loops[Look at loops<br/>multiply time per loop]
    Hot --> Type{Node type?}
    Type -->|Seq Scan on big table| SeqIdx[Add index on<br/>WHERE / JOIN column]
    Type -->|Nested Loop, big outer| Nested[Index inner table<br/>or rewrite as Hash Join]
    Type -->|Sort, Disk: ...kB| Sort[Raise work_mem<br/>or add covering index]
    Type -->|Hash Join, Batches over 1| Hash[Raise work_mem<br/>or pre-aggregate]
    Type -->|Index Scan, slow| IdxSlow{Estimate vs<br/>actual rows?}
    IdxSlow -->|Off over 10x| Stats[Run ANALYZE<br/>raise default_statistics_target]
    IdxSlow -->|Close| Selectivity[Index not selective enough<br/>add columns to make composite]
    Loops -->|Loops over 10000| LoopMany[Same answer as Nested Loop<br/>index inner table]
    style SeqIdx fill:#dfd
    style Nested fill:#dfd
    style Sort fill:#dfd
    style Hash fill:#dfd
    style Stats fill:#dfd
    style Selectivity fill:#ffd

The diagram is the cheat-sheet for performance triage^{[PostgreSQL Docs]}. The red flags table below is the same answers in tabular form — pick whichever shape matches your debugging style.

Red flags and fixes

^{[PostgreSQL Docs]}

Red flag	Cause	Fix
Nested Loop, large outer table	Per-row inner scan	Add index on join column
`Rows Removed by Filter` >> output	Throwaway reads	Index filter column, use partial index
`Buffers: read=X` (high)	Cache miss	Warm cache, raise shared_buffers, replica
`Batches: > 1` (Hash Join)	Spill to disk	Raise work_mem, pre-aggregate
`Sort ... Disk: ...kB`	work_mem overflow	Raise work_mem, index for sorted output
Estimate off >10x	Stale statistics	Run ANALYZE, raise default_statistics_target

EXPLAIN ANALYZE safely

^{[PostgreSQL Docs]}

-- Plan only (no execution, safe on prod).
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
 
-- Plan + actual numbers (runs query; use replica).
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
 
-- Full diagnostics: buffers + settings + JSON.
EXPLAIN (ANALYZE, BUFFERS, SETTINGS, FORMAT JSON)
SELECT * FROM orders WHERE customer_id = 42;

Always include BUFFERS on production investigations. It reports shared-buffer cache hits vs disk reads — the single best signal for I/O vs CPU bottleneck.

For writes, wrap in a rolled-back transaction:

BEGIN;
EXPLAIN ANALYZE UPDATE orders SET status='shipped' WHERE id = 42;
ROLLBACK;

Six anti-patterns

^{[Postgres B-tree]}

SELECT * from wide tables — defeats index-only scans. Project only needed columns.
WHERE lower(email) = $1 — function on column blocks index. Fix: expression index or stored lowercase.
WHERE created_at::date = $1 — cast kills index. Use range: WHERE created_at >= $1 AND created_at < $1::date + '1 day'.
OFFSET 10000 LIMIT 50 — scans 10k rows then discards. Use keyset: WHERE (created_at, id) < ($ts, $id) LIMIT 50.
IN (SELECT ...) with large subquery — rewrite as JOIN or EXISTS for better join strategy.
COUNT(col) instead of COUNT(*) — COUNT(*) counts rows; COUNT(col) counts non-nulls. Never confuse them.

EXPLAIN (FORMAT JSON) for Tooling

For machine-readable output, use EXPLAIN (FORMAT JSON, ANALYZE, BUFFERS). The JSON format feeds plan visualisers, regression-detection scripts, and slow-query review pipelines:

EXPLAIN (FORMAT JSON, ANALYZE, BUFFERS, SETTINGS, WAL)
SELECT order_id, total_cents
FROM orders
WHERE customer_id = 42 AND status = 'pending'
ORDER BY created_at DESC LIMIT 50;

Key JSON fields a regression-detection script should check:

Plan.Total Cost — planner's estimated total cost; spikes vs. baseline indicate stats drift or new index needed
Plan.Actual Total Time — wall-clock execution time; the SLO indicator
Plan.Plans[].Plan Rows vs Actual Rows — divergence over 10× signals stale stats; run ANALYZE
Plan.Shared Hit Blocks vs Shared Read Blocks — cache hit ratio; low hit ratio means warm cache or raise shared_buffers
Plan.Plans[].Node Type — change from Index Scan to Seq Scan between deploys is a smoke alarm

A simple regression check in CI:

# pseudo-code: load baseline plan + new plan, compare key fields
import json
 
def regression_score(baseline: dict, current: dict) -> dict:
    b_plan = baseline["Plan"]
    c_plan = current["Plan"]
    return {
        "cost_ratio": c_plan["Total Cost"] / b_plan["Total Cost"],
        "time_ratio": c_plan["Actual Total Time"] / b_plan["Actual Total Time"],
        "node_changed": b_plan["Node Type"] != c_plan["Node Type"],
        "estimate_drift": abs(c_plan.get("Plan Rows", 0) - c_plan.get("Actual Rows", 0))
                          / max(c_plan.get("Actual Rows", 1), 1),
    }

Wire this into your nightly slow-query report so plans that regress (cost > 2× baseline, node-type changed, estimate-drift > 10×) page the on-call DBA before they cause a customer incident.

For interactive review, paste JSON output into explain.depesz.com (free, no signup) — colour-codes hot nodes by time and rows-removed-by-filter.

Incident-time queries: paste these into psql at 3am

When the on-call page fires and the dashboard shows latency climbing, EXPLAIN on a single query is rarely the right starting point. You need cluster-wide signals: which queries are burning CPU, which sessions are blocking each other, which tables have rotted, and which indexes are paying their rent. The four queries below are the ones we paste into psql first, and they answer those questions in roughly that order.

Start with the slowest queries by cumulative execution time. total_exec_time (renamed from total_time in PG13+) is the right ranking — mean_exec_time hides queries that are individually fast but called millions of times per minute:

-- Top-10 slowest queries by cumulative execution time.
-- Requires: CREATE EXTENSION pg_stat_statements; in postgresql.conf.
SELECT
    substring(query, 1, 80) AS short_query,
    calls,
    round(total_exec_time::numeric, 1) AS total_ms,
    round(mean_exec_time::numeric, 2) AS mean_ms,
    round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 1) AS pct_of_total,
    rows
FROM pg_stat_statements
WHERE query NOT ILIKE '%pg_stat_statements%'
ORDER BY total_exec_time DESC
LIMIT 10;

If pct_of_total for a single query is over 30%, you have a clear culprit. If the top 10 are all under 5%, the load is broadly distributed and you should look at connection count or shared_buffers instead. Reset the counters after each investigation with SELECT pg_stat_statements_reset(); to get clean before/after numbers when you ship a fix. ^{[PostgreSQL Docs]}

Next, find blocking locks. When the 3am page reads "writes are stalled," it's almost always one long-running transaction holding RowExclusiveLock on a table that everything else needs. This query reports the blocking PID, the blocked PID, and the SQL of both so you can decide which to kill:

-- Live blocking-lock detector. Shows which session is blocking which.
SELECT
    blocked.pid          AS blocked_pid,
    blocked.usename      AS blocked_user,
    blocking.pid         AS blocking_pid,
    blocking.usename     AS blocking_user,
    blocking.state       AS blocking_state,
    age(now(), blocking.xact_start) AS blocking_xact_age,
    substring(blocked.query, 1, 60)  AS blocked_query,
    substring(blocking.query, 1, 60) AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking
    ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
WHERE NOT blocked.pid = blocking.pid
ORDER BY blocking_xact_age DESC;

pg_blocking_pids() (PG9.6+) is the canonical way to walk the lock graph — it correctly handles transitive blocking through multiple sessions, which a manual pg_locks join does not. If blocking_xact_age is over a minute and blocking_state is idle in transaction, the application forgot to commit; kill it with SELECT pg_terminate_backend(blocking_pid);.

Now check for table bloat. When autovacuum can't keep up with churn, dead tuples accumulate, the heap grows, and sequential scans get slower even though the live row count is unchanged. The pgstattuple extension gives you the truth without estimating:

-- Bloat estimate per table. Requires CREATE EXTENSION pgstattuple;
-- Note: scans the full table, so run on a replica for tables > 100GB.
SELECT
    schemaname || '.' || tablename AS table_name,
    pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) AS total_size,
    round(stat.dead_tuple_percent::numeric, 1) AS dead_pct,
    pg_size_pretty(stat.dead_tuple_len) AS dead_bytes,
    round(stat.free_percent::numeric, 1) AS free_pct
FROM pg_tables
CROSS JOIN LATERAL pgstattuple(schemaname || '.' || tablename) AS stat
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
  AND pg_total_relation_size(schemaname || '.' || tablename) > 100 * 1024 * 1024
ORDER BY stat.dead_tuple_len DESC
LIMIT 20;

Anything over 20% dead_pct warrants a VACUUM (VERBOSE, ANALYZE) tablename; and a look at autovacuum tuning (autovacuum_vacuum_scale_factor, autovacuum_naptime). For tables over 100GB, prefer pg_repack over VACUUM FULL — it rewrites the heap without taking an exclusive lock for the duration. ^[pg_repack]

Finally, find indexes that are pure write tax — indexes that exist, slow down every INSERT/UPDATE, and serve zero reads. Drop them. This is the single highest-ROI cleanup you can do on most aging databases:

-- Indexes that have never been scanned since stats were reset.
-- Excludes UNIQUE constraints (still needed for correctness) and primary keys.
SELECT
    schemaname || '.' || relname AS table_name,
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    idx_scan AS times_used,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes ui
JOIN pg_index i ON ui.indexrelid = i.indexrelid
WHERE idx_scan = 0
  AND NOT i.indisunique
  AND NOT i.indisprimary
  AND pg_relation_size(ui.indexrelid) > 1024 * 1024
ORDER BY pg_relation_size(ui.indexrelid) DESC
LIMIT 20;

Before you DROP INDEX, verify the stats window is wide enough — SELECT stats_reset FROM pg_stat_database WHERE datname = current_database(); should be at least one full business cycle ago (a week minimum, ideally a month). Indexes that look unused on Monday may be hammered during Friday batch jobs. When in doubt, mark them invalid first with UPDATE pg_index SET indisvalid = false WHERE indexrelid = ...::regclass; — this lets you roll back instantly if a query starts seqscanning instead.

auto_explain: capture slow plans before the user complains

Running EXPLAIN ANALYZE after the fact only works when you can reproduce the slow run, and production traffic rarely hands you the same row distribution twice. The auto_explain extension solves this by logging the actual plan of every query that exceeds a duration threshold — so the next time the dashboard query takes 12 seconds, you already have the plan in your log stream, with the real row counts that triggered the regression.

Enable it cluster-wide by editing postgresql.conf and reloading. The configuration below logs the full plan with buffer counters and trigger timing for any statement that exceeds 500 ms, samples 10% of fast queries to keep noise down, and skips nested statements to avoid blowing up the log volume on stored-procedure-heavy workloads: ^{[PostgreSQL Docs]}

-- Add to postgresql.conf and run SELECT pg_reload_conf();
-- shared_preload_libraries requires a full restart, the rest reload live.
ALTER SYSTEM SET shared_preload_libraries = 'auto_explain,pg_stat_statements';
ALTER SYSTEM SET auto_explain.log_min_duration = '500ms';
ALTER SYSTEM SET auto_explain.log_analyze = on;
ALTER SYSTEM SET auto_explain.log_buffers = on;
ALTER SYSTEM SET auto_explain.log_timing = on;
ALTER SYSTEM SET auto_explain.log_triggers = on;
ALTER SYSTEM SET auto_explain.log_verbose = off;
ALTER SYSTEM SET auto_explain.log_format = 'json';
ALTER SYSTEM SET auto_explain.log_nested_statements = off;
ALTER SYSTEM SET auto_explain.sample_rate = 1.0;
SELECT pg_reload_conf();

The log_format = 'json' setting matters: it lets your log shipper (Loki, Splunk, Datadog) parse the plan as structured fields rather than treating it as a multi-line text blob. With JSON, you can write alerts like "page on-call when any query has Plan.Total Cost over 100000 and Node Type = Seq Scan" without regex-scraping log lines. log_timing = on adds about 5% overhead on a CPU-bound workload — measure first, then decide whether the plan visibility is worth it. On most OLTP systems with auto_explain set to log only queries above 500 ms, the overhead is in the noise. ^{[PostgreSQL Docs]}

One caveat: auto_explain.log_analyze = on instruments every query at the executor level, even if the query finishes below the duration threshold and is never logged. The instrumentation cost is the price you pay for being able to see plans at all. If your p99 latency budget is tight, set auto_explain.sample_rate to 0.1 (sample 10% of statements) — you still catch chronic offenders without paying instrumentation cost on every fast query. ^{[PostgreSQL Docs]}

pg_hint_plan: emergency override for a planner gone wrong

Sometimes the planner picks the wrong plan and you cannot fix the root cause in the moment. Stats refresh is queued, the index build will take six hours, and the customer is complaining now. The pg_hint_plan extension lets you pin a specific join order, scan type, or join method per query via SQL comments — without changing application code or postgresql.conf. Treat this as a circuit breaker, not a long-term fix. Hints freeze the plan against the planner's future improvements and hide the underlying problem.

The hint syntax is a structured comment immediately before the SELECT. The example below forces a Hash Join between orders and customers, an index scan on orders using idx_orders_customer_status, and a leading-join order so customers is the build side:

-- Force Hash Join with explicit build side and index scan.
-- Remove this hint after stats are refreshed and the planner picks the right plan organically.
/*+
    HashJoin(o c)
    IndexScan(o idx_orders_customer_status)
    Leading((c o))
 */
SELECT o.id, o.total_cents, c.email
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
  AND o.created_at >= now() - interval '24 hours'
ORDER BY o.created_at DESC LIMIT 100;

After applying the hint, verify it took effect with EXPLAIN (ANALYZE, BUFFERS) — pg_hint_plan silently ignores invalid hints rather than throwing an error, so a typo gives you no signal. If the plan looks unchanged, check SHOW pg_hint_plan.message_level; (set to notice for visible feedback during testing) and SHOW pg_hint_plan.enable_hint; (must be on).

Track every hint you add in a hints_audit table with the issue ticket, expected removal date, and the underlying root cause. Stale hints rot fast — a hint pinned in 2024 against a 100GB table is actively wrong by the time the table is 800GB and the index has been rebuilt twice. Audit quarterly and remove any hint whose root cause has been fixed.

Plan stability tracking with pg_stat_statements over time

pg_stat_statements shows you the current top-N slow queries, but it does not show you whether a specific query has been getting slower over time — and that's the question that matters for capacity planning and slow-rollout regression detection. The pattern is a periodic snapshot job that copies the relevant counters into a history table, then resets the live stats so the next window starts clean.

The schema below stores per-query mean and total time, plus shared-buffer hit rate, with a snapshot timestamp. A nightly cron job inserts the current row, then resets:

-- One-time setup: history table for plan-stability snapshots.
CREATE TABLE IF NOT EXISTS pg_stat_statements_history (
    snapshot_at        timestamptz NOT NULL DEFAULT now(),
    queryid            bigint      NOT NULL,
    query_fingerprint  text        NOT NULL,
    calls              bigint      NOT NULL,
    total_exec_time_ms double precision NOT NULL,
    mean_exec_time_ms  double precision NOT NULL,
    rows_emitted       bigint      NOT NULL,
    cache_hit_ratio    double precision NOT NULL,
    PRIMARY KEY (snapshot_at, queryid)
);
CREATE INDEX ON pg_stat_statements_history (queryid, snapshot_at DESC);
 
-- Cron job (runs every hour or daily): snapshot then reset.
INSERT INTO pg_stat_statements_history (
    queryid, query_fingerprint, calls,
    total_exec_time_ms, mean_exec_time_ms, rows_emitted, cache_hit_ratio
)
SELECT
    queryid,
    substring(query, 1, 200),
    calls,
    total_exec_time,
    mean_exec_time,
    rows,
    CASE WHEN (shared_blks_hit + shared_blks_read) = 0 THEN 1.0
         ELSE shared_blks_hit::float / (shared_blks_hit + shared_blks_read)
    END
FROM pg_stat_statements
WHERE calls > 10;
SELECT pg_stat_statements_reset();

With a few weeks of history, regression questions become trivial. "Has this query gotten slower since the v2.4 deploy on April 19th?" is now a single comparison query against pg_stat_statements_history filtered by queryid and snapshot_at. Plot mean execution time over time and you'll see exactly when the cliff happened — usually the day someone added a column without updating an index, or the day a partition crossed the boundary where the planner stopped using a partial index. The snapshot window also gives you a safer baseline for the regression-detection script earlier in this article: instead of comparing against a single hand-picked baseline plan, you can compare today's mean execution time to the rolling 30-day p50 and alert when the divergence exceeds a threshold for at least three consecutive snapshots, which filters out noise from one-off slow runs.

Frequently Asked Questions

EXPLAIN vs EXPLAIN ANALYZE?

EXPLAIN returns a cost estimate without executing the query. EXPLAIN ANALYZE actually runs the query and returns real numbers (row counts, times, buffers). Run ANALYZE on replicas only — never on production writes, since it executes mutations like any other statement.

Estimate way off (e.g., 10k estimated, 100 actual)?

Underestimating is worse than overestimating. Underestimates push the planner toward Nested Loops that explode at runtime; overestimates are conservative. If skew is recurrent, raise default_statistics_target to 500 and re-ANALYZE the affected tables.

What's "Rows Removed by Filter"?

Rows read from the heap but discarded by a WHERE clause that wasn't pushed into the index. High values signal a missing index or a partial-index opportunity for the predicate.

Hash Batches > 1?

The hash table spilled to disk because it didn't fit in work_mem. Either raise work_mem for the session, or reduce the input size with an earlier WHERE / GROUP BY pre-aggregation.

Disable Nested Loop to force Hash Join?

Last resort. Prefer fixing the root cause: add an index on the join column or refresh statistics. Disabled planner features (set enable_nestloop = off) cause future outages when data shape changes and the planner is no longer free to pick the right strategy.

Keep Reading

Indexes and query planning strategies
Postgres Query Planner Internals
Zero-Downtime Database Migrations — when an EXPLAIN points to "this needs a different schema."
Caching Strategies at Scale — when "the query is fundamentally expensive" pushes you off the database.
Production Go API Design — context-aware DB calls (QueryContext) so EXPLAIN-discovered slow queries get cancelled.

Was this article helpful?

Your feedback directly shapes our editorial depth and technical accuracy.