What Actually Happens When SQL Executes Your Query
by Arif Ikhsanudin, Backend Developer
Why you can't predict query performance from SQL alone
Two developers argue about whether to rewrite a query. One insists the JOIN version is faster. The other has benchmarks showing the subquery version is equally fast on their machine. Both are right in different contexts, which means neither can explain why. The root cause is the same in both cases: neither developer has a mental model of what the database actually does when it executes SQL.
SQL is declarative. You say "give me all users who placed an order this month," and the database decides the execution strategy — which tables to read first, which indexes to use, how to combine results. This decoupling is powerful, but it means you can't reason about performance from the SQL text alone. You have to understand the pipeline.
The five stages of SQL execution
1. Parsing and validation
The database first parses your SQL into an abstract syntax tree and validates it: are the table names real? Do the columns exist? Does the user have permission? This is fast and happens before any data is touched.
2. Query rewriting
Before the optimizer runs, many databases apply rule-based rewrites. PostgreSQL, for example, will rewrite a view reference into the underlying query, expand * into explicit column names, and sometimes transform IN (subquery) into a semi-join internally. These rewrites happen invisibly and can significantly change what the optimizer sees.
3. Query optimization — the hard part
The optimizer's job is to find the lowest-cost execution plan from the space of all possible plans. For a query joining five tables, the number of possible join orderings is 5! = 120, and that's before considering different join algorithms (hash join, merge join, nested loop join) and different access methods (sequential scan, index scan, index-only scan, bitmap heap scan).
The optimizer estimates the cost of each plan using statistics — specifically, the distribution of data in each column. In PostgreSQL, this is controlled by pg_statistic and updated by the ANALYZE command. In MySQL, it's maintained by InnoDB's internal statistics, updated via ANALYZE TABLE.
When statistics are stale or misleading, the optimizer makes bad choices. This is the source of a huge proportion of "mysterious" slow query reports.
4. Plan execution
The selected plan is a tree of operators. Each operator reads from one or more inputs and produces a stream of rows. Common operators:
- Seq Scan: reads every row in a table. O(n) in rows.
- Index Scan: follows index entries to fetch individual heap rows. Good for selective queries returning few rows.
- Index Only Scan: returns data directly from the index without hitting the heap. Requires the index to cover all needed columns.
- Hash Join: builds a hash table from the smaller relation, then probes it with each row from the larger. Good for large unsorted inputs.
- Merge Join: merges two sorted inputs. Requires sorted data, but is very efficient when that's available.
- Nested Loop Join: for each row in the outer relation, scan the inner relation. Extremely fast when the inner side is an index lookup; catastrophic when it's a full scan.
-- A simplified representation of what EXPLAIN shows you
Hash Join
Hash Cond: (o.user_id = u.id)
-> Seq Scan on orders o
Filter: (created_at > '2024-01-01')
-> Hash
-> Index Scan on users u
Index Cond: (status = 'active')
Reading this plan: PostgreSQL will scan all orders, filter by date, build a hash table from active users (using an index), then probe the hash table for each order row.
5. Result delivery
The database streams the result set back to the client. For large result sets with ORDER BY, this requires sorting before delivery, which may spill to disk if the work_mem (PostgreSQL) or sort_buffer_size (MySQL) is too small.
What this means in practice
Row estimates drive plan choice. If the optimizer thinks a filter returns 100 rows but it actually returns 100,000, it may choose a nested loop join that's catastrophic at the real cardinality. This is why EXPLAIN ANALYZE — which shows both estimated and actual row counts — is more useful than EXPLAIN alone.
-- PostgreSQL: see estimated vs actual rows
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id)
FROM users u
JOIN orders o ON u.id = o.user_id
GROUP BY u.name;
Look for large discrepancies between rows=X (estimate) and actual rows=Y. A 10x or greater difference is a signal that statistics need updating or that the optimizer needs help via query restructuring.
The optimizer is not magic. It makes locally greedy choices within a plan search space. Highly complex queries with many joins can exceed the optimizer's search budget (join_collapse_limit in PostgreSQL defaults to 8), causing it to evaluate only a subset of possible plans.
You can influence the plan. You can't force a specific plan in standard SQL (unlike Oracle's hints), but you can: create covering indexes, update statistics with ANALYZE, adjust work_mem per session for sort-heavy queries, or rewrite the query to expose a simpler structure to the optimizer.
Start here
Run EXPLAIN (ANALYZE, BUFFERS) on any slow query in PostgreSQL, or EXPLAIN FORMAT=JSON in MySQL. Before you tune anything, read the plan. Find the node with the highest actual time or the worst estimate-vs-actual row ratio. That node is your bottleneck. Everything else is secondary.