SQL vs NoSQL — How I Actually Make This Decision for a New Project

by Arif Ikhsanudin, Backend Developer

The scalability argument that misled a generation of engineers

The decade of NoSQL evangelism built on "relational databases don't scale" produced a generation of teams that chose MongoDB for applications that had perfectly relational data, then spent the next year fighting the lack of joins, the write amplification from denormalization, and the consistency surprises from eventually-consistent reads. Meanwhile, PostgreSQL was scaling to tens of millions of rows on commodity hardware with read replicas and partitioning, serving companies with millions of users without incident.

SQL scales. NoSQL solves different problems. Here is how to think clearly about which problems you have.

When relational databases are clearly correct

Your data has relationships. Orders belong to customers. Line items belong to orders. Payments reference orders. Inventory records reference products. This is a relational model. Forcing it into a document database means either embedding documents (denormalization — now you have update anomalies when a customer email changes) or storing references and doing application-level joins (now you are implementing a worse version of the query planner in your service layer).

-- Relational model — the joins are the feature, not the problem
SELECT
    o.id AS order_id,
    o.created_at,
    c.email,
    SUM(li.qty * li.unit_price_cents) AS total_cents
FROM orders o
JOIN customers c ON c.id = o.customer_id
JOIN line_items li ON li.order_id = o.id
WHERE o.status = 'pending'
  AND o.created_at > NOW() - INTERVAL '24 hours'
GROUP BY o.id, o.created_at, c.email
ORDER BY total_cents DESC
LIMIT 50;

This query in MongoDB requires either a $lookup aggregation (a join, with different syntax and worse optimization) or pre-aggregated data stored redundantly. Neither is simpler than the SQL version.
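
To make the cost of application-level joins concrete, here is a minimal sketch of the same report computed in the service layer. The in-memory lists stand in for the results of three separate document-store queries, the field names mirror the SQL above, and `pending_order_totals` is a hypothetical helper rather than an API from any driver.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def pending_order_totals(customers, orders, line_items, now, limit=50):
    """Hand-rolled equivalent of the SQL report: filter, join, aggregate, sort."""
    cutoff = now - timedelta(hours=24)

    # "JOIN customers": build a lookup table by primary key.
    email_by_customer = {c["id"]: c["email"] for c in customers}

    # "JOIN line_items ... GROUP BY": sum line totals per order.
    total_by_order = defaultdict(int)
    for li in line_items:
        total_by_order[li["order_id"]] += li["qty"] * li["unit_price_cents"]

    rows = []
    for o in orders:
        # "WHERE": status and recency filters.
        if o["status"] != "pending" or o["created_at"] <= cutoff:
            continue
        # Inner-join semantics: skip orders with no customer or no line items.
        if o["customer_id"] not in email_by_customer or o["id"] not in total_by_order:
            continue
        rows.append({
            "order_id": o["id"],
            "created_at": o["created_at"],
            "email": email_by_customer[o["customer_id"]],
            "total_cents": total_by_order[o["id"]],
        })

    # "ORDER BY total_cents DESC LIMIT 50"
    rows.sort(key=lambda r: r["total_cents"], reverse=True)
    return rows[:limit]

now = datetime(2024, 1, 2, 12, 0)  # fixed timestamp so the example is deterministic
customers = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": "b@example.com"}]
orders = [
    {"id": 10, "customer_id": 1, "status": "pending", "created_at": now - timedelta(hours=1)},
    {"id": 11, "customer_id": 2, "status": "pending", "created_at": now - timedelta(hours=2)},
    {"id": 12, "customer_id": 2, "status": "shipped", "created_at": now - timedelta(hours=3)},
]
line_items = [
    {"order_id": 10, "qty": 2, "unit_price_cents": 500},
    {"order_id": 11, "qty": 1, "unit_price_cents": 2500},
    {"order_id": 12, "qty": 1, "unit_price_cents": 9900},
]
report = pending_order_totals(customers, orders, line_items, now)
```

Note that this version pulls every line item into memory before filtering. Choosing a better order of operations is exactly the job a query planner does for you.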

PostgreSQL with JSONB gives you a practical escape valve for the cases where your relational model has semi-structured exceptions: a product table with a metadata JSONB column for category-specific attributes, an events table with a payload JSONB column for different event types. You get relational structure for the stable schema and document flexibility for the variable parts.
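
A minimal sketch of that hybrid shape follows. It uses Python's built-in sqlite3 module with a TEXT column holding JSON purely so the example runs anywhere; that is a stand-in, and in PostgreSQL the `metadata` column would be declared JSONB. The table, columns, and the `product` helper are all illustrative.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Stable attributes get real columns and constraints; the variable,
# category-specific attributes go into one JSON document per row.
conn.execute("""
    CREATE TABLE products (
        id          INTEGER PRIMARY KEY,
        sku         TEXT NOT NULL UNIQUE,
        price_cents INTEGER NOT NULL,
        metadata    TEXT NOT NULL DEFAULT '{}'
    )
""")

rows = [
    ("TSHIRT-1", 1900, {"size": "M", "color": "navy"}),
    ("LAPTOP-1", 129900, {"ram_gb": 16, "screen_in": 14}),
]
conn.executemany(
    "INSERT INTO products (sku, price_cents, metadata) VALUES (?, ?, ?)",
    [(sku, price, json.dumps(meta)) for sku, price, meta in rows],
)

def product(sku):
    """Fetch one product and decode its semi-structured attributes."""
    row = conn.execute(
        "SELECT sku, price_cents, metadata FROM products WHERE sku = ?", (sku,)
    ).fetchone()
    return {"sku": row[0], "price_cents": row[1], "metadata": json.loads(row[2])}

laptop = product("LAPTOP-1")
```

With a real JSONB column the variable part can also be filtered in SQL, for example `WHERE metadata->>'color' = 'navy'`, and indexed with a GIN index.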

When NoSQL is actually the right tool

Document databases (MongoDB, CouchDB) earn their place when:

The entity is genuinely document-centric — it is always read and written as a whole unit, rarely joined with other entities, and has variable structure that changes per record. A content management system where each document type has different fields, a product catalog where different product categories have completely different attributes, a form builder where each form has a different schema. These are legitimate document storage use cases.

The schema evolves extremely rapidly — weekly structural changes — and the overhead of SQL migrations is a genuine bottleneck. MongoDB's schema-less model removes migration gates, at the cost of losing schema enforcement at the database layer (move schema validation to your application layer explicitly, or use MongoDB Schema Validation).

Key-value stores (Redis, DynamoDB) earn their place when:

The access pattern is pure key lookups: you always query by primary key and never need to scan or filter by non-key attributes. Session storage, cache entries, counters, and (with Redis sorted sets) leaderboards. DynamoDB at 100,000+ writes per second with single-digit millisecond P99 latency is genuinely impressive for this use case. But the moment you need to query by a non-key attribute, DynamoDB requires a secondary index (or a full-table Scan), and if you need multiple filter conditions on non-key attributes, you are paying for a query model that works against the database's design.
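
The asymmetry between key lookups and everything else can be sketched with a toy in-memory store. `SessionStore` and its methods are hypothetical and model only the access pattern, not any real Redis or DynamoDB API.

```python
import time

class SessionStore:
    """Toy in-memory key-value store with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def put(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        # The access pattern the store is built for: one key, one hash lookup.
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]
            return None
        return value

    def find_by_user(self, user_id):
        # Querying by a non-key attribute: every entry must be inspected.
        now = self._clock()
        return [
            key for key, (value, expires_at) in self._data.items()
            if expires_at > now and value.get("user_id") == user_id
        ]

store = SessionStore()
store.put("sess:abc", {"user_id": 42}, ttl_seconds=1800)
store.put("sess:def", {"user_id": 42}, ttl_seconds=1800)
store.put("sess:xyz", {"user_id": 7}, ttl_seconds=1800)
```

`get` costs the same whether the store holds ten entries or ten million. `find_by_user` grows with the size of the store, which is the cost a secondary index exists to avoid.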

Wide-column stores (Cassandra, ScyllaDB) earn their place when:

Write throughput is the primary requirement, the access patterns are well-defined and narrow, and you are operating at a scale where distributed writes genuinely require a leaderless replication model. Time-series data from IoT devices, event logs at millions of events per second, audit trails at extreme scale. ScyllaDB in particular (Cassandra-compatible, written in C++) can deliver on the order of 1-2 million writes per second on a 3-node cluster on modern NVMe hardware, depending on payload size and consistency level. PostgreSQL with TimescaleDB can handle significant time-series loads, but at this scale, a purpose-built wide-column store wins.
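
What "well-defined and narrow" means in practice is that every query names a partition and reads a contiguous slice inside it. The sketch below models only that data layout in a single process; `SensorReadings` is a hypothetical class, and a real Cassandra or ScyllaDB cluster additionally hashes partitions across nodes and replicates them.

```python
import bisect
from collections import defaultdict

class SensorReadings:
    """Toy wide-column table: partition key (device_id, day), clustering key ts."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def write(self, device_id, day, ts, value):
        # Rows are kept sorted by clustering key within their partition.
        bisect.insort(self._partitions[(device_id, day)], (ts, value))

    def read_range(self, device_id, day, start_ts, end_ts):
        # Supported query: one partition, rows with start_ts <= ts < end_ts.
        rows = self._partitions.get((device_id, day), [])
        lo = bisect.bisect_left(rows, (start_ts,))
        hi = bisect.bisect_left(rows, (end_ts,))
        return rows[lo:hi]

table = SensorReadings()
table.write("dev-1", "2024-01-02", 30, 21.5)
table.write("dev-1", "2024-01-02", 10, 20.9)
table.write("dev-1", "2024-01-02", 20, 21.1)
table.write("dev-2", "2024-01-02", 15, 18.0)
window = table.read_range("dev-1", "2024-01-02", 10, 30)
```

A question such as "all readings above 21 degrees across every device" has no partition to start from, so it would have to visit every partition. That is the kind of query this model is not designed to serve.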

The schema evolution argument, honestly

The claim that document databases are better for evolving schemas is true in a narrow sense and misleading in practice. Yes, you can add a field to MongoDB documents without a migration. But you now have documents in production with different shapes. Your application code must handle both the old shape (field missing) and the new shape (field present). That is not simpler than a SQL migration — it is the same complexity, moved from the database layer to the application layer, where it is less visible and harder to enforce.
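
Here is a minimal sketch of what that moved complexity looks like on the read path. The document shapes, field names, and `normalize_customer` are hypothetical, invented for illustration.

```python
def normalize_customer(doc):
    """Return a copy of a customer document upgraded to the current shape (v2)."""
    upgraded = dict(doc)
    # Old documents predate "marketing_opt_in"; choose an explicit default.
    upgraded.setdefault("marketing_opt_in", False)
    # Old documents stored a single "name"; the new shape splits it.
    if "name" in upgraded and "first_name" not in upgraded:
        first, _, last = upgraded.pop("name").partition(" ")
        upgraded["first_name"], upgraded["last_name"] = first, last
    upgraded["schema_version"] = 2
    return upgraded

old_doc = {"_id": 1, "email": "a@example.com", "name": "Ada Lovelace"}
new_doc = {
    "_id": 2, "email": "b@example.com",
    "first_name": "Bo", "last_name": "Li",
    "marketing_opt_in": True, "schema_version": 2,
}
normalized_old = normalize_customer(old_doc)
```

Every reader of the collection needs this function, or an equivalent, until the old documents are rewritten. At that point you have run a migration anyway.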

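Rails migrations and Flyway (Java) make SQL schema evolution low-friction for the common cases. Adding a nullable column without a default to a PostgreSQL table is a catalog-only change: it takes a brief lock but does not rewrite the table. Check and foreign-key constraints on existing data can be added with NOT VALID and checked afterwards with VALIDATE CONSTRAINT, which does not block reads or writes. For a NOT NULL column on a large table, the usual pattern is to add a CHECK (col IS NOT NULL) constraint as NOT VALID, validate it, and then run ALTER TABLE ... ALTER COLUMN ... SET NOT NULL, which on PostgreSQL 12 and later can skip its full-table scan. Table rewrites that cannot be avoided can be done online with pg_repack.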
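
One commonly used sequence for adding a NOT NULL column to a large PostgreSQL table can be sketched as an ordered list of statements. The helper below only builds the SQL strings; `not_null_column_steps`, the table, and the column are hypothetical, and identifiers are interpolated unquoted, so it is suitable only for trusted migration code.

```python
def not_null_column_steps(table, column, column_type, backfill_expr):
    """Ordered SQL statements for adding a NOT NULL column without a long lock."""
    check = f"{table}_{column}_not_null"
    return [
        # 1. Nullable column with no default: a catalog-only change.
        f"ALTER TABLE {table} ADD COLUMN {column} {column_type};",
        # 2. Backfill; on a large table, run this in batches by primary-key range.
        f"UPDATE {table} SET {column} = {backfill_expr} WHERE {column} IS NULL;",
        # 3. NOT VALID skips the scan of existing rows; new writes are checked.
        f"ALTER TABLE {table} ADD CONSTRAINT {check} "
        f"CHECK ({column} IS NOT NULL) NOT VALID;",
        # 4. Validation scans the table without blocking reads or writes.
        f"ALTER TABLE {table} VALIDATE CONSTRAINT {check};",
        # 5. On PostgreSQL 12+, the validated check lets SET NOT NULL skip its scan.
        f"ALTER TABLE {table} ALTER COLUMN {column} SET NOT NULL;",
        # 6. The check constraint is now redundant.
        f"ALTER TABLE {table} DROP CONSTRAINT {check};",
    ]

steps = not_null_column_steps("customers", "locale", "text", "'en'")
```

Each statement would normally live in its own migration, so that the backfill and the validation do not run inside one long transaction.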

The decision I actually run

Before picking a database, I write down the top five query patterns the application needs to support. Not hypothetically — the actual queries, with the WHERE clauses and JOINs. Then:

  • If the queries join multiple entities with filters across them: relational database, full stop.
  • If the queries are all primary key lookups and the data is document-like: document database is justified.
  • If the write throughput exceeds 50,000 writes per second and the query patterns are narrow: wide-column or key-value store.
  • If there is mixed data: relational database as the primary store, with Redis for caching and possibly a separate search index for full-text needs.
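
The checklist above can be written down as a small function, if only to force the inputs to be stated explicitly. This is a sketch: the `Workload` fields are my own summary of the query patterns, and evaluating the rules in the order they are listed is an assumption about how to break ties.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    joins_across_entities: bool    # queries filter and join multiple entities
    all_primary_key_lookups: bool  # every query is a lookup by primary key
    document_like: bool            # records are read and written as whole units
    writes_per_second: int         # sustained peak write rate
    narrow_query_patterns: bool    # few, well-defined access paths

def recommend(w: Workload) -> str:
    """Apply the four rules in the order they are listed."""
    if w.joins_across_entities:
        return "relational"
    if w.all_primary_key_lookups and w.document_like:
        return "document"
    if w.writes_per_second > 50_000 and w.narrow_query_patterns:
        return "wide-column or key-value"
    # Mixed data: relational primary store plus purpose-built helpers.
    return "relational + Redis cache (+ search index if needed)"

shop = Workload(
    joins_across_entities=True,
    all_primary_key_lookups=False,
    document_like=False,
    writes_per_second=200,
    narrow_query_patterns=False,
)
choice = recommend(shop)
```

The useful part is less the function than the act of filling in the fields from real queries instead of from expectations.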

Start with PostgreSQL. Add purpose-built stores only once you have proven that some of your data does not fit the relational model, not because you anticipate that it might not.
