SQL vs NoSQL — How I Actually Make This Decision for a New Project
by Arif Ikhsanudin, Backend Developer
The scalability argument that misled a generation of engineers
The decade of NoSQL evangelism built on "relational databases don't scale" produced a generation of teams that chose MongoDB for applications with perfectly relational data, then spent the next year fighting the lack of joins, the write amplification from denormalization, and the consistency surprises from eventually-consistent reads. Meanwhile, PostgreSQL was scaling to billions of rows on commodity hardware with read replicas and partitioning, serving companies with millions of users.
SQL scales. NoSQL solves different problems. Here is how to think clearly about which problems you have.
When relational databases are clearly correct
Your data has relationships. Orders belong to customers. Line items belong to orders. Payments reference orders. Inventory records reference products. This is a relational model. Forcing it into a document database means either embedding documents (denormalization — now you have update anomalies when a customer email changes) or storing references and doing application-level joins (now you are implementing a worse version of the query planner in your service layer).
-- Relational model: the joins are the feature, not the problem
SELECT
    o.id AS order_id,
    o.created_at,
    c.email,
    SUM(li.qty * li.unit_price_cents) AS total_cents
FROM orders o
JOIN customers c ON c.id = o.customer_id
JOIN line_items li ON li.order_id = o.id
WHERE o.status = 'pending'
  AND o.created_at > NOW() - INTERVAL '24 hours'
GROUP BY o.id, o.created_at, c.email
ORDER BY total_cents DESC
LIMIT 50;
This query in MongoDB requires either a $lookup aggregation (a join, with different syntax and worse optimization) or pre-aggregated data stored redundantly. Neither is simpler than the SQL version.
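To make the "application-level join" cost concrete, here is a minimal Python sketch of what the service layer ends up doing when the database cannot join for you. The in-memory rows and the function name pending_order_totals are hypothetical, and the 24-hour filter is left out for brevity. This is not MongoDB driver code, only the shape of the work that moves into your application.

```python
from collections import defaultdict

# Hypothetical rows standing in for three separate collections.
customers = {1: {"email": "a@example.com"}, 2: {"email": "b@example.com"}}
orders = [
    {"id": 10, "customer_id": 1, "status": "pending"},
    {"id": 11, "customer_id": 2, "status": "shipped"},
    {"id": 12, "customer_id": 2, "status": "pending"},
]
line_items = [
    {"order_id": 10, "qty": 2, "unit_price_cents": 500},
    {"order_id": 10, "qty": 1, "unit_price_cents": 250},
    {"order_id": 11, "qty": 1, "unit_price_cents": 9900},
    {"order_id": 12, "qty": 3, "unit_price_cents": 1000},
]


def pending_order_totals(customers, orders, line_items, limit=50):
    """Hand-rolled join + aggregate + sort: the work a query planner does."""
    # Aggregate line items per order (the GROUP BY / SUM step).
    totals = defaultdict(int)
    for li in line_items:
        totals[li["order_id"]] += li["qty"] * li["unit_price_cents"]

    rows = []
    for o in orders:
        # Filter (WHERE) and mimic the inner join: skip orders with no items.
        if o["status"] != "pending" or o["id"] not in totals:
            continue
        rows.append(
            {
                "order_id": o["id"],
                "email": customers[o["customer_id"]]["email"],
                "total_cents": totals[o["id"]],
            }
        )

    # ORDER BY total_cents DESC LIMIT n
    rows.sort(key=lambda r: r["total_cents"], reverse=True)
    return rows[:limit]


result = pending_order_totals(customers, orders, line_items)
```

Every choice here (which side to hash, when to filter, how much to load into memory) is one the database would otherwise make for you with statistics you do not have in the service layer.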
PostgreSQL with JSONB gives you a practical escape valve for the cases where your relational model has semi-structured exceptions: a product table with a metadata JSONB column for category-specific attributes, an events table with a payload JSONB column for different event types. You get relational structure for the stable schema and document flexibility for the variable parts.
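A small sketch of that hybrid layout, with loud caveats: it uses Python's built-in sqlite3 module as a stand-in so it runs anywhere, the metadata column is TEXT where PostgreSQL would use JSONB, and SQLite's json_extract plays the role of PostgreSQL's JSON operators. It assumes a SQLite build with JSON functions available, which is the default in recent versions. The table and attribute names are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        price_cents INTEGER NOT NULL,
        metadata TEXT NOT NULL DEFAULT '{}'  -- would be JSONB in PostgreSQL
    )
    """
)

# Stable attributes live in typed columns; category-specific ones go in metadata.
conn.executemany(
    "INSERT INTO products (name, price_cents, metadata) VALUES (?, ?, ?)",
    [
        ("Trail shoe", 8900, json.dumps({"size_eu": 42, "waterproof": True})),
        ("USB-C cable", 1200, json.dumps({"length_m": 2})),
    ],
)

# One query can filter on a relational column and a document attribute together.
rows = conn.execute(
    """
    SELECT name, json_extract(metadata, '$.length_m')
    FROM products
    WHERE price_cents < 5000
      AND json_extract(metadata, '$.length_m') IS NOT NULL
    """
).fetchall()
```

The design point is that price, name, and foreign keys keep their constraints and indexes, while the variable attributes avoid a migration per product category.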
When NoSQL is actually the right tool
Document databases (MongoDB, CouchDB) earn their place when:
The entity is genuinely document-centric — it is always read and written as a whole unit, rarely joined with other entities, and has variable structure that changes per record. A content management system where each document type has different fields, a product catalog where different product categories have completely different attributes, a form builder where each form has a different schema. These are legitimate document storage use cases.
The schema evolves extremely rapidly — weekly structural changes — and the overhead of SQL migrations is a genuine bottleneck. MongoDB's schema-less model removes migration gates, at the cost of losing schema enforcement at the database layer (move schema validation to your application layer explicitly, or use MongoDB Schema Validation).
Key-value stores (Redis, DynamoDB) earn their place when:
The access pattern is pure key lookups: you always query by primary key and never need to scan or filter by non-key attributes. Session storage, cache entries, leaderboards, counters. DynamoDB sustaining 100,000+ writes per second at single-digit millisecond latency is genuinely impressive for this use case. But the moment you need to query by a non-key attribute, DynamoDB requires a Global Secondary Index (or a full table scan), and if you need multiple filter conditions on non-key attributes, you are paying for a query model that works against the database's design.
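The trade-off is easier to see in a toy model. The class below is a hypothetical in-memory key-value store, not the DynamoDB or Redis API: a lookup by key is a single dictionary access, a query on one pre-declared attribute needs a secondary index that every write must maintain, and anything else falls back to scanning every item.

```python
from collections import defaultdict


class KeyValueStore:
    """Toy key-value store with one secondary index (illustration only)."""

    def __init__(self, indexed_attribute):
        self._items = {}
        self._indexed_attribute = indexed_attribute
        self._index = defaultdict(set)  # attribute value -> set of keys

    def put(self, key, item):
        previous = self._items.get(key)
        if previous is not None:
            # Keep the index consistent when an item is overwritten.
            self._index[previous.get(self._indexed_attribute)].discard(key)
        self._items[key] = item
        self._index[item.get(self._indexed_attribute)].add(key)

    def get(self, key):
        # The access pattern the store is built for: one lookup, no scan.
        return self._items.get(key)

    def query_index(self, value):
        # Works only for the single attribute chosen at design time.
        return sorted(self._index.get(value, ()))

    def scan(self, predicate):
        # Any other filter touches every item in the store.
        return sorted(k for k, item in self._items.items() if predicate(item))


store = KeyValueStore(indexed_attribute="user_id")
store.put("sess-1", {"user_id": "u1", "device": "ios"})
store.put("sess-2", {"user_id": "u2", "device": "web"})
store.put("sess-3", {"user_id": "u1", "device": "web"})

by_key = store.get("sess-2")
by_index = store.query_index("u1")
by_scan = store.scan(lambda s: s["user_id"] == "u1" and s["device"] == "web")
```

Each additional attribute you want to filter on means another index to pay for on every write, or another scan on every read.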
Wide-column stores (Cassandra, ScyllaDB) earn their place when:
Write throughput is the primary requirement, the access patterns are well-defined and narrow, and you are operating at a scale where distributed writes genuinely require a leaderless replication model. Time-series data from IoT devices, event logs at millions of events per second, audit trails at extreme scale. ScyllaDB in particular (Cassandra-compatible, written in C++) is reported to deliver on the order of 1-2 million writes per second on a 3-node cluster with modern NVMe hardware. PostgreSQL with TimescaleDB can handle significant time-series loads, but at this scale, a purpose-built wide-column store wins.
The schema evolution argument, honestly
The claim that document databases are better for evolving schemas is true in a narrow sense and misleading in practice. Yes, you can add a field to MongoDB documents without a migration. But you now have documents in production with different shapes. Your application code must handle both the old shape (field missing) and the new shape (field present). That is not simpler than a SQL migration — it is the same complexity, moved from the database layer to the application layer, where it is less visible and harder to enforce.
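Here is what that moved complexity looks like in practice, as a minimal Python sketch. The locale field, the DEFAULT_LOCALE value, and the read_user function are hypothetical; the point is only that every read path needs this kind of normalization once documents with two shapes exist.

```python
DEFAULT_LOCALE = "en"  # hypothetical default for documents written before the field existed


def read_user(doc):
    """Normalize a stored user document to the current shape."""
    return {
        "id": doc["_id"],
        "email": doc["email"],
        # Old documents have no "locale" key, so the application supplies it.
        "locale": doc.get("locale", DEFAULT_LOCALE),
    }


old_doc = {"_id": 1, "email": "a@example.com"}                  # written before the change
new_doc = {"_id": 2, "email": "b@example.com", "locale": "id"}  # written after the change

users = [read_user(old_doc), read_user(new_doc)]
```

In SQL, the equivalent is one ALTER TABLE that adds the column with a default, after which every row has the same shape and no reader needs a fallback.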
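Rails migrations and Flyway (Java) make SQL schema evolution low-friction for the common cases. Adding a nullable column to a PostgreSQL table is a metadata-only change that needs only a brief lock, and since PostgreSQL 11 the same is true for a column with a constant default. CHECK and foreign key constraints can be added as NOT VALID and verified afterwards with VALIDATE CONSTRAINT, which does not block reads or writes. To make an existing column NOT NULL on a large table, add a CHECK (column IS NOT NULL) constraint as NOT VALID, validate it, then run ALTER TABLE ... SET NOT NULL; on PostgreSQL 12 and later this uses the validated constraint instead of scanning the table under an exclusive lock. Full table rewrites can be done online with pg_repack.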
The decision I actually run
Before picking a database, I write down the top five query patterns the application needs to support. Not hypothetically — the actual queries, with the WHERE clauses and JOINs. Then:
- If the queries join multiple entities with filters across them: relational database, full stop.
- If the queries are all primary key lookups and the data is document-like: document database is justified.
- If the write throughput exceeds 50,000 writes per second and the query patterns are narrow: wide-column or key-value store.
- If there is mixed data: relational database as the primary store, with Redis for caching and possibly a separate search index for full-text needs.
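The checklist above can be written down as a small function. This is a sketch of my own heuristic, not a general algorithm: the QueryPattern fields, the recommend_store name, and the choice to treat "narrow" as "every pattern is a primary key lookup" are assumptions made for illustration, and the rules are evaluated in the same order as the list.

```python
from dataclasses import dataclass

WRITE_THRESHOLD_PER_SECOND = 50_000  # the cutoff used in the list above


@dataclass(frozen=True)
class QueryPattern:
    description: str
    joins_entities: bool = False      # joins or filters across multiple entities
    primary_key_lookup: bool = False  # always fetched by primary key


def recommend_store(patterns, document_like=False, writes_per_second=0):
    """Map the written-down query patterns to a store family."""
    if any(p.joins_entities for p in patterns):
        return "relational"

    # Assumption: "narrow" means every pattern is a primary key lookup.
    all_key_lookups = bool(patterns) and all(p.primary_key_lookup for p in patterns)

    if all_key_lookups and document_like:
        return "document"
    if all_key_lookups and writes_per_second > WRITE_THRESHOLD_PER_SECOND:
        return "wide-column or key-value"

    # Mixed or unknown: relational primary store, add caches or search later.
    return "relational, plus cache or search index as needed"


shop = recommend_store(
    [QueryPattern("pending orders with customer email", joins_entities=True)]
)
cms = recommend_store(
    [QueryPattern("load page by id", primary_key_lookup=True)],
    document_like=True,
)
sensors = recommend_store(
    [QueryPattern("append reading by device id", primary_key_lookup=True)],
    writes_per_second=200_000,
)
```

Note that the fallthrough case is relational: an empty or ambiguous list of query patterns never produces a NoSQL recommendation.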
Start with PostgreSQL. Add purpose-built stores when you have data that does not fit the relational model and you have proven that it does not fit, not because you anticipated that it might not.