How to Read a Query Execution Plan Without Getting Lost
by Arif Ikhsanudin, Backend Developer
The output nobody reads until something breaks
EXPLAIN output gets pasted into Slack, misread, and argued about. Half the team doesn't know how to interpret it. The other half confuses estimated rows with actual rows and draws the wrong conclusion. The result is slow queries that stay slow because the root cause is never correctly identified.
Execution plans are not that hard to read once you have the mental model. Here's how to get useful signal from them in under five minutes.
The structure: a tree of operators
A query execution plan is a tree. Each node is an operator that takes input rows, processes them, and passes output rows to its parent. The root node is the last step — what gets returned to the client. The leaf nodes are where data is read from storage.
In PostgreSQL EXPLAIN output, the tree is represented by indentation — deeper nodes execute first:
Hash Join (cost=1234.00..5678.00 rows=4500 width=64)
Hash Cond: (o.user_id = u.id)
-> Seq Scan on orders o (cost=0.00..3200.00 rows=180000 width=32)
Filter: (status = 'pending')
-> Hash (cost=890.00..890.00 rows=27520 width=32)
-> Index Scan using users_pkey on users u (cost=0.43..890.00 rows=27520 width=32)
Reading this bottom-up: PostgreSQL will do an index scan on users, build a hash table from those rows, scan all rows in orders filtering for status = 'pending', then probe the hash table to join them.
The five numbers that matter
Each node shows: (cost=startup_cost..total_cost rows=estimated_rows width=avg_row_bytes)
startup_cost: estimated cost to return the first row (relevant for LIMIT queries where the optimizer can stop early)
total_cost: estimated cost to return all rows. This is the number the optimizer minimizes. Cost is in abstract units (not milliseconds), calibrated by seq_page_cost, random_page_cost, and cpu_tuple_cost parameters.
rows: estimated number of rows this node will output. This is the most important number for diagnosing plan problems.
width: average row width in bytes. Relevant for understanding data volume.
With EXPLAIN ANALYZE, you get actual values alongside estimates:
Seq Scan on orders (cost=0.00..45231.00 rows=950 width=64)
(actual time=0.043..1823.000 rows=847291 loops=1)
The discrepancy here is critical: estimated 950 rows, actual 847,291 rows. The optimizer was off by 3 orders of magnitude. Every plan decision downstream of this node was based on the wrong cardinality. This is almost certainly why the query is slow — the optimizer chose a plan optimized for 950 rows that's catastrophic for 847,291.
Key operators and what they mean
Seq Scan: reads every row in the table. The presence of a seq scan is not automatically bad — for small tables or low-selectivity queries it's correct. It's bad when you expected selectivity and didn't get it.
Index Scan: traverses the B-tree and fetches heap rows. Expect ~6 page reads to find an entry plus one heap page per result row. Bad when many result rows are scattered across many heap pages.
Index Only Scan: fetches data directly from the index without hitting the heap. Look for this on covering indexes — it's the fast path.
Bitmap Heap Scan (with Bitmap Index Scan): PostgreSQL's hybrid approach. Collects all matching row pointers from the index into a bitmap, sorts them by physical location, then reads heap pages in order. Better than Index Scan when many rows match (reduces random I/O). Worse than Seq Scan for very high row counts.
Hash Join: builds a hash table from the smaller input, probes it with the larger. Good for large unsorted inputs. Requires memory — if it spills to disk (Hash Batches > 1), performance degrades significantly.
Merge Join: requires both inputs to be sorted on the join key. Very efficient when sorted indexes are available. Expensive if a sort is required first.
Nested Loop: for each outer row, looks up matching inner rows. Fast when the inner side uses an index lookup. Catastrophically slow when the inner side does a seq scan — O(n²) complexity.
Sort: explicit sort node. Appears before ORDER BY, GROUP BY, DISTINCT. If you see Sort Method: external merge Disk: 54268kB, the sort spilled to disk — increase work_mem for this session or query.
What to look for in five minutes
-
Find the node with the highest actual time — that's your bottleneck. In
EXPLAIN ANALYZE, timing appears per node. -
Find the largest estimate vs actual row discrepancy — a 10x or greater gap means the optimizer made decisions based on wrong information. Run
ANALYZEon the table. -
Look for nested loops on large tables — a Nested Loop where the outer side produces thousands of rows and the inner side does anything other than a primary key lookup is suspect.
-
Check for hash batches > 1 —
Hash Batches: 8means the hash join spilled to disk 8 times. Increasework_mem. -
Verify index scans where you expect them — if you see a Seq Scan on a column you indexed, trace back to why (function on column, type mismatch, low selectivity).
The invocation to remember
-- PostgreSQL: full diagnostic output
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT ...;
-- BUFFERS shows shared hit/read counts — high 'read' vs 'hit' means cache misses
-- FORMAT TEXT is human-readable; FORMAT JSON is better for tools
Run this on every slow query before touching anything else. The plan tells you exactly what the database is doing. Guessing without it wastes time.