Skip to content

Database Indexing Strategies: B-Trees, GIN, GiST, and Production Tuning

BackendBytes Engineering Team
BackendBytes Engineering Team
9 min read
Database Indexing Strategies: B-Trees, GIN, GiST, and Production Tuning

Key Takeaways

  • A 34-second sequential scan on 10M rows dropped to 0.2ms with proper indexing — the gap between `Rows Removed by Filter` 9.9M and 0 is where production spends its time
  • Composite index column order matters: put equality columns first, range columns last — wrong order forced PostgreSQL to scan thousands of entries to find 50 results
  • Partial indexes on hot subsets (WHERE status = 'pending') cost less to maintain because they only index matching rows, cutting both size and scan time
  • Monitor pg_stat_statements quarterly — unused indexes are invisible friction until you measure them; watch for indexes with zero scans draining your memory budget

The classic missing-index production slowdown. A query that ran in milliseconds at launch is now taking tens of seconds. We've debugged this exact incident on multiple Postgres-backed services: the table grew from hundreds of thousands of rows into the millions, and nobody noticed because the application just got slower — gradually, then suddenly. A developer finally runs EXPLAIN ANALYZE[PostgreSQL Docs]:

Seq Scan on orders  (cost=0.00..287431.00 rows=9843 width=124)
                    (actual time=1247.331..33842.109 rows=9821 loops=1)
  Filter: ((status = 'pending') AND (created_at > '2026-01-01'::date))
  Rows Removed by Filter: 9990179
  Buffers: shared hit=2104 read=185327
Planning Time: 0.089 ms
Execution Time: 33844.512 ms

PostgreSQL read 185,327 pages from disk, examined all 10 million rows, threw away 9,990,179 of them, and returned 9,821. In our investigations, that Rows Removed by Filter line is always the tell — the database is doing orders of magnitude more work than necessary.

Quick Take

Proper indexing reduces a multi-second sequential scan to sub-ms by mapping query patterns to the right index type and column order[PostgreSQL Docs]. B-tree for scalars with equality-first composite ordering, GIN for multi-valued data, partial indexes for hot paths, and monitoring pg_stat_statements to catch degradation before it hits production.

  • Use composite indexes with equality columns first, range columns last
  • Create partial indexes on frequently-queried subsets (e.g., status = 'pending')
  • Monitor pg_stat_user_indexes and pg_stat_user_tables quarterly to drop unused indexes
graph TD
    Slow[Slow query?] --> Plan[EXPLAIN ANALYZE]
    Plan --> Filter{Rows Removed<br/>by Filter?}
    Filter -->|"high"| Type{Filter type?}
    Filter -->|"~0"| Other[Look elsewhere:<br/>joins, sort, work_mem]
    Type -->|"equality + range<br/>on scalars"| BT[B-tree composite<br/>equality first, range last]
    Type -->|"array / JSONB /<br/>full-text"| GIN[GIN index]
    Type -->|"geometry / range types"| GiST[GiST index]
    Type -->|"hot subset only<br/>WHERE status='X'"| PI[Partial index]
    Type -->|"covers all<br/>SELECT cols"| IO[Covering index<br/>INCLUDE clause]
    style Filter fill:#fee
    style BT fill:#efe
    style GIN fill:#efe
    style GiST fill:#efe
    style PI fill:#efe
    style IO fill:#efe

The diagram is the index-type picker. The first signal is always Rows Removed by Filter in EXPLAIN — high values mean the database is throwing away most of what it read, which is the symptom that points to a missing index. Once you know that, the type of filter (equality vs range vs multi-valued vs hot-subset) routes you to the right index family.

How B-Trees Work and Why Column Order Matters

[Postgres B-tree]

A B-tree index is a balanced tree where every leaf page is the same distance from the root. PostgreSQL uses a B+ variant with three levels: root page (key boundaries), internal pages (route the search), and leaf pages (index entries + heap TID to the actual row).

When PostgreSQL looks up WHERE user_id = 4821, it performs 3–4 page reads (binary search at each level) instead of ~185,000 pages for a sequential scan.

The leftmost prefix rule determines which queries an index can serve. A composite index on (a, b, c) is sorted first by a, then within each a value by b, then by c. This means:

CREATE INDEX idx_orders_composite ON orders(customer_id, status, created_at);
Query WHERE clauseUses index?Why
customer_id = 5YesLeftmost column
customer_id = 5 AND status = 'pending'YesLeft prefix (a, b)
status = 'pending'NoSkips leftmost column
customer_id = 5 AND created_at > '2026-01-01'PartialUses a, skips b, cannot use c efficiently

The most impactful rule: place equality columns first, range columns last. A query filtering by status (exact match) and date (range):

SELECT * FROM orders
WHERE status = 'pending' AND created_at > '2026-01-01'
ORDER BY created_at DESC LIMIT 50;

Wrong index: (created_at, status) — PostgreSQL scans the entire date range, then filters by status. Reads thousands of index entries to find 50.

Right index: (status, created_at) — PostgreSQL jumps to status='pending', walks the created_at range in sorted order. Reads exactly 50 entries.

EXPLAIN confirms the difference:

-- Wrong order: 47,823 rows filtered, 1,204 buffer hits
Rows Removed by Filter: 47823
Buffers: shared hit=1204 read=892
 
-- Right order: 0 rows filtered, 5 buffer hits
Rows Removed by Filter: 0
Buffers: shared hit=5

Covering indexes eliminate heap fetches. Add INCLUDE for SELECT columns that aren't part of the lookup:

CREATE INDEX idx_orders_covering ON orders(customer_id, status)
    INCLUDE (order_id, total_amount);
 
EXPLAIN (ANALYZE, BUFFERS)
SELECT order_id, total_amount FROM orders
WHERE customer_id = 42 AND status = 'pending';
-- Index Only Scan ... Heap Fetches: 0

Specialized Index Types Beyond B-Tree

B-trees[Postgres B-tree] handle equality and range comparisons on scalar values. PostgreSQL has five other built-in index types[PostgreSQL Docs]; the four below cover the patterns you'll actually hit — the fifth, SP-GiST, is niche (space-partitioned trees for points, IP ranges, and text prefixes).

The decision tree for "which index type" — route by data shape and query pattern:

graph TD
    Q[Need to add an index] --> Shape{What is the<br/>column shape?}
    Shape -->|Scalar:<br/>=, <, >, ORDER BY| Btree[B-tree<br/>default; covers<br/>most cases]
    Shape -->|Multi-valued:<br/>JSONB containment,<br/>arrays, full-text| GIN[GIN<br/>inverted index]
    Shape -->|Geometric or range:<br/>tsrange, geo, ip| GiST[GiST<br/>generalized search tree]
    Shape -->|High cardinality<br/>insert-heavy + range| BRIN[BRIN<br/>block-range index;<br/>1000x smaller than B-tree]
    Shape -->|Equality only,<br/>cannot use B-tree| Hash[Hash<br/>WAL-logged in PG10+]
    Btree --> Subset{Need only<br/>some rows?}
    Subset -->|Yes — small subset| Partial[Partial index<br/>WHERE status='active']
    Subset -->|No — all rows| Full[Full index]
    Btree --> Compute{Query expression<br/>vs raw column?}
    Compute -->|Expression| Expr[Expression index<br/>ON UPPER email]
    Compute -->|Raw column| Full
    style Btree fill:#dfd
    style GIN fill:#dfd
    style Partial fill:#ffd
    style Expr fill:#ffd
    style BRIN fill:#ffd

The diagram is the entire indexing playbook in one picture: classify by column shape, then narrow with partial/expression variants when the query pattern justifies it.

GIN (Generalized Inverted Index) maps each element in a multi-valued column to containing rows. Use for full-text search, JSONB containment, and array operations:

-- Full-text search: tsvector index on articles
ALTER TABLE articles ADD COLUMN search_vector tsvector;
UPDATE articles SET search_vector =
    setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(body, '')), 'B');
 
CREATE INDEX idx_articles_search ON articles USING gin(search_vector);
 
-- Query: find articles about replication
SELECT title, ts_rank(search_vector, query) AS rank
FROM articles, to_tsquery('english', 'replication & consensus') AS query
WHERE search_vector @@ query
ORDER BY rank DESC LIMIT 20;

JSONB containment with GIN:

CREATE INDEX idx_products_attrs ON products USING gin(attributes);
 
-- Find all products with specific nested attributes
SELECT name FROM products
WHERE attributes @> '{"color": "red", "size": "large"}';

GiST (Generalized Search Trees) supports data without natural linear order: geometric types, ranges, nearest-neighbor queries. Example: find events overlapping with a time window:

CREATE TABLE events (
    id bigint PRIMARY KEY,
    name text,
    during tstzrange NOT NULL
);
 
CREATE INDEX idx_events_during ON events USING gist(during);
 
-- Find events overlapping 2026-03-01 to 2026-03-07
SELECT name FROM events
WHERE during && tstzrange('2026-03-01', '2026-03-07');

BRIN (Block Range Indexes) store min/max summary per table block range. Use for append-only data: 0.1% the size of B-trees: [PostgreSQL Docs]

-- Append-only event log with monotonically increasing created_at
CREATE INDEX idx_events_created_brin ON event_log
    USING brin(created_at) WITH (pages_per_range = 32);
 
-- On a 50 GB table: BRIN is 48 KB vs 2 GB for B-tree

Hash indexes support only equality (=) and are rarely preferable to B-trees (which support ranges, sorts, covering indexes).

Partial Indexes and Expression Indexes

A partial index includes a WHERE clause limiting indexed rows — a severely underused PostgreSQL feature.

Most applications query "hot" data frequently but touch "cold" data rarely. An e-commerce platform queries status = 'pending' orders constantly but touches completed orders only in nightly reports:

-- Full index: 10M rows, ~214 MB
CREATE INDEX idx_orders_full ON orders(customer_id, created_at);
 
-- Partial index: ~50K pending rows (0.5%), ~1.1 MB
CREATE INDEX idx_orders_pending ON orders(customer_id, created_at)
    WHERE status = 'pending';
``` <Cite id="postgres-btree-internals" />
 
The partial index is 200x smaller: faster to scan, cheaper to maintain. The query's WHERE clause must match or imply the index predicate: <Cite id="postgres-btree-internals" />
 
```sql
-- Uses idx_orders_pending efficiently
SELECT * FROM orders
WHERE status = 'pending'
  AND customer_id = 42
  AND created_at > '2026-01-01'
ORDER BY created_at DESC;

Unique partial indexes enforce uniqueness on a subset. Example: email must be unique only for active users (soft-deleted users should not block new registrations):

CREATE UNIQUE INDEX idx_users_email_active ON users(email)
    WHERE deleted_at IS NULL;
 
-- Active user: email='alice@example.com', deleted_at=NULL  ✓
-- Deleted user: email='alice@example.com', deleted_at='2026-01-15'  ✓
-- New active user: email='alice@example.com', deleted_at=NULL  ✗ (rejected)

Expression indexes index computed values. A query filtering by transformed email:

-- This query CANNOT use an index on email
SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';
 
-- Expression index: indexes the result of LOWER(email)
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
 
-- Now queries with LOWER(email) = 'x' use the index

Date truncation for dashboards that group by day:

CREATE INDEX idx_orders_day ON orders(date_trunc('day', created_at));
 
-- Dashboard query: orders per day
SELECT date_trunc('day', created_at) AS day, count(*)
FROM orders
WHERE created_at >= '2026-01-01' AND created_at < '2026-02-01'
GROUP BY date_trunc('day', created_at);

Index Maintenance: Bloat, Unused Indexes, and Autovacuum

[PostgreSQL Docs]

In MVCC, each UPDATE creates a new row version; old versions remain in indexes until VACUUM cleans them. Under write load, indexes accumulate dead entries — index bloat — increasing cache pages, scan I/O, and write latency.

Detect bloat with pgstattuple:

CREATE EXTENSION IF NOT EXISTS pgstattuple;
 
-- Check a specific index
SELECT * FROM pgstattuple('idx_orders_composite');
-- dead_tuple_percent > 20% = worth rebuilding
 
-- Quick estimate for all indexes (no locks)
SELECT
    indexrelname,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    idx_scan AS times_used
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;

Rebuilding indexes:

-- REINDEX CONCURRENTLY (PostgreSQL 12+)
-- Builds new index alongside old, then swaps
REINDEX INDEX CONCURRENTLY idx_orders_composite;
-- Requires ~2x disk space, brief lock at start/end

Find unused indexes — every index slows writes:

SELECT
    schemaname || '.' || relname AS table,
    indexrelname AS index,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size,
    idx_scan AS scans_since_reset
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelid NOT IN (
      SELECT conindid FROM pg_constraint
      WHERE contype IN ('p', 'u')  -- exclude PKs and unique constraints
  )
ORDER BY pg_relation_size(indexrelid) DESC;

Before dropping, verify the stats have been running long enough (30+ days minimum) to capture all query patterns. Safe approach: rename the index and monitor for one week before dropping.

Autovacuum tuning for high-write tables:

-- Per-table autovacuum tuning
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,      -- vacuum after 1% dead rows (default: 20%)
    autovacuum_vacuum_threshold = 1000,
    autovacuum_vacuum_cost_delay = 2,           -- aggressive (default: 20ms)
    autovacuum_vacuum_cost_limit = 1000         -- higher budget per cycle
);
 
-- Monitor autovacuum progress
SELECT relname, n_live_tup, n_dead_tup,
       round(100.0 * n_dead_tup / NULLIF(n_live_tup, 0), 1) AS dead_pct
FROM pg_stat_user_tables WHERE relname = 'orders';
-- dead_pct > 10% on high-traffic tables = autovacuum is behind
``` <Cite id="postgres-docs" />
 
## Finding Slow Queries and Missing Indexes
 
<Cite id="postgres-docs" />
Use pg_stat_statements to find performance issues:
 
```sql
-- Top 10 queries by total execution time (not average)
SELECT
    substring(query, 1, 80) AS query,
    calls,
    round(total_exec_time::numeric / 1000, 1) AS total_sec,
    round((total_exec_time / calls)::numeric, 1) AS avg_ms,
    round(100.0 * shared_blks_hit /
        NULLIF(shared_blks_hit + shared_blks_read, 0), 1) AS cache_hit_pct
FROM pg_stat_statements
WHERE calls > 50
ORDER BY total_exec_time DESC LIMIT 10;

Prioritize high total_sec (aggregate impact) over low avg_ms: a 5ms query × 2M calls/day outweighs a 30-second query × 2 calls.

Large tables with many sequential scans and few index scans signal missing indexes:

-- Tables with the most sequential scans
SELECT
    relname,
    seq_scan,
    idx_scan,
    pg_size_pretty(pg_relation_size(relid)) AS size
FROM pg_stat_user_tables
WHERE seq_scan > 100
  AND pg_relation_size(relid) > 10 * 1024 * 1024  -- > 10 MB
ORDER BY seq_tup_read DESC LIMIT 10;

Production Checklist

  • EXPLAIN ANALYZE every slow query before designing an index. Verify the planner uses it.
  • Test write amplification: measure query duration and HOT updates before and after adding indexes on write-heavy tables.
  • Size connection pools for post-index latency: connections_needed = target_tps * avg_write_duration_seconds.
  • Monitor quarterly with pg_stat_statements, pg_stat_user_indexes, and pg_stat_user_tables to catch growing sequential scans.
  • Drop unused indexes after 30+ days of stats capture to avoid index bloat and write overhead.
  • Use partial indexes for hot paths (e.g., pending orders, active users) to reduce index size and maintenance cost.
  • Check index-to-table size ratio: > 200% signals too many indexes; each index adds write amplification.
  • Set random_page_cost = 1.1 on SSD-backed databases (default 4.0); too high makes the planner avoid index scans.
  • Monitor HOT updates with n_tup_hot_upd: < 20% means frequent updates are writing to every index; consider dropping indexes on frequently-updated columns.
  • Rebuild indexes regularly: REINDEX INDEX CONCURRENTLY during low-traffic hours to reclaim space and restore performance. [Postgres B-tree]

Index Choice in Production: Three Case Studies

Three real production incidents where the diagnosis came from EXPLAIN and the fix was a single index change. The before/after plans show what to look for in your own systems.

Case 1: Notifications inbox — a query that ate 4 GB of RAM. A multi-tenant SaaS dashboard ran a notifications query on 80M rows. The original index was (user_id) and the query filtered unread items per user, ordered by recency:

SELECT id, body FROM notifications
WHERE user_id = 9182 AND read_at IS NULL
ORDER BY created_at DESC LIMIT 20;
-- Before: index on (user_id) only
Index Scan using idx_notif_user on notifications
  (actual time=0.4..1184.3 rows=20 loops=1)
  Index Cond: (user_id = 9182)
  Filter: (read_at IS NULL)
  Rows Removed by Filter: 41213
  Buffers: shared hit=4106 read=39214

41,213 read rows discarded to find 20 unread. The fix was a partial index that already encodes the IS NULL predicate and matches the sort order:

CREATE INDEX idx_notif_unread ON notifications(user_id, created_at DESC)
    WHERE read_at IS NULL;
-- After: partial index, sorted descending
Index Scan using idx_notif_unread on notifications
  (actual time=0.03..0.18 rows=20 loops=1)
  Index Cond: (user_id = 9182)
  Buffers: shared hit=6

p99 dropped from 1.2 s to 0.4 ms; the index shrank from 1.8 GB to 22 MB because only 1.2 % of rows are unread. [Postgres B-tree]

Case 2: Audit log search — LIKE 'prefix%' ignored the index. A compliance team searched audit events by request ID prefix:

SELECT * FROM audit_log WHERE request_id LIKE 'req_2026Q1_%';
-- Before: default text_ops index
Seq Scan on audit_log
  (actual time=0.1..8421.7 rows=14302 loops=1)
  Filter: (request_id ~~ 'req_2026Q1_%'::text)
  Rows Removed by Filter: 121884593

The default B-tree on text cannot serve LIKE patterns when the database collation is non-C. The fix was an opclass-aware index that uses byte-wise comparison:

CREATE INDEX idx_audit_reqid_pattern ON audit_log(request_id text_pattern_ops);
-- After: text_pattern_ops opclass enables prefix range scan
Index Scan using idx_audit_reqid_pattern on audit_log
  (actual time=0.07..6.2 rows=14302 loops=1)
  Index Cond: ((request_id ~>=~ 'req_2026Q1_') AND (request_id ~<~ 'req_2026Q2'))

Execution time fell from 8.4 s to 6 ms. The lesson: opclass selection (text_pattern_ops, varchar_pattern_ops, bpchar_pattern_ops) is mandatory whenever locale-aware text indexes meet LIKE or ~[PostgreSQL Docs].

Case 3: Analytics aggregate — sort spilled to disk. A reporting query grouped daily revenue by region:

SELECT region, sum(amount) FROM orders
WHERE created_at >= now() - interval '90 days'
GROUP BY region;

The plan showed an external merge sort writing 412 MB to pgsql_tmp. Raising work_mem was a temporary patch; the durable fix was a covering index that delivers rows already grouped by region:

CREATE INDEX idx_orders_region_amount ON orders(region, created_at)
    INCLUDE (amount);

The planner switched to an Index Only Scan plus HashAggregate; the temp spill disappeared and execution time dropped from 14 s to 380 ms.

BRIN Indexes for Time-Series and Append-Only Data

B-tree indexes are wasteful for append-only tables where the indexed column is correlated with physical row order. Time-series ingestion (events, metrics, logs) writes rows in roughly chronological order, so each 8 KB heap block contains a narrow timestamp range. BRIN exploits this correlation by storing one summary tuple per block range — typically 128 blocks (1 MB) — recording the min and max value:

-- 60 GB events table, 1.4B rows, monotonic created_at
CREATE INDEX idx_events_created_brin ON events
    USING brin(created_at) WITH (pages_per_range = 32);
-- Index size: 41 KB (B-tree equivalent: ~3.1 GB)
 
-- Range scan for the last 24 hours
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM events
WHERE created_at >= now() - interval '24 hours';
-- Bitmap Heap Scan on events
--   Recheck Cond: (created_at >= ...)
--   Heap Blocks: lossy=8412
--   ->  Bitmap Index Scan on idx_events_created_brin
--       Index Cond: (created_at >= ...)

The bitmap shows lossy=8412 because BRIN identifies candidate ranges, not exact tuples — the heap recheck filters non-matching rows. This is the right trade-off when correlation is high (pg_stats.correlation near 1.0) and selectivity is low (you scan 1 % of the table or more, not 0.001 %). BRIN is wrong for unordered insert patterns: random UUID v4 IDs make every range cover the entire keyspace, so every range matches every query. [PostgreSQL Docs]

Pair BRIN with BRIN_SUMMARIZE_NEW_VALUES after bulk loads, or set autosummarize = on on PostgreSQL 14+ so newly added ranges are summarized without a manual pass.

Multi-Column Column Ordering: When Reordering Broke Production

A payments service ran fine for 14 months on this index:

CREATE INDEX idx_charges_lookup ON charges(merchant_id, status, created_at);

A junior engineer "optimized" by reordering columns, believing the high-cardinality column belonged first:

DROP INDEX idx_charges_lookup;
CREATE INDEX idx_charges_lookup ON charges(created_at, merchant_id, status);

The next morning every merchant dashboard timed out. The hot query was:

SELECT id, amount FROM charges
WHERE merchant_id = 4821 AND status = 'captured'
ORDER BY created_at DESC LIMIT 100;

The reordered index made created_at the leftmost column — but the query has no equality or range predicate on created_at until after merchant_id and status. PostgreSQL fell back to a full index scan, reading 47M index entries to find 100 results. Reverting restored the millisecond plan.

The rule: leftmost columns must match the query's equality predicates first, then range predicates, with ORDER BY columns last and in the same direction. High cardinality is irrelevant for ordering — what matters is which columns the query filters on with =. If two indexes are otherwise equivalent, pick the one whose trailing columns match the ORDER BY so the planner can skip the sort step.

Frequently Asked Questions

What is the leftmost prefix rule for composite indexes?

A composite index on (a, b, c) can only be used efficiently when the query filters on a leftmost prefix: column a alone, columns a and b, or all three. Skipping the leftmost column forces a full table scan.

When should you use a partial index in PostgreSQL?

Use a partial index when queries consistently filter on a specific subset of rows, such as WHERE status = 'pending'. The index is smaller, faster to scan, and cheaper to maintain because it only includes rows matching the WHERE clause.

What is the difference between a B-tree and a GIN index?

B-tree indexes are optimal for equality and range queries on scalar values. GIN (Generalized Inverted Index) indexes are designed for multi-valued columns like arrays, JSONB fields, and full-text search tsvector columns.

How do you find missing indexes in PostgreSQL?

Check pg_stat_user_tables for tables with high seq_scan counts relative to idx_scan. Use EXPLAIN ANALYZE on slow queries and look for sequential scans with high Rows Removed by Filter — this indicates the query would benefit from an index.

Keep Reading

BackendBytes Engineering Team
BackendBytes Engineering Team

Engineering Team

A multidisciplinary team of backend engineers, architects, and DevOps practitioners shipping deep dives into distributed systems and production infrastructure.

Read Next