Sharding Your Database Sounds Exciting Until You Actually Have to Do It

by Arif Ikhsanudin, Backend Developer

What Sharding Actually Is

Sharding means horizontally partitioning your data across multiple database instances. Instead of one database holding all your data, you have N databases (shards) each holding 1/N of the data. Queries that access data on one shard execute locally. Queries that need data across shards require cross-shard coordination.

The performance benefit is real: write throughput scales with shard count, each shard's dataset is smaller (better index performance, less I/O), and hardware costs can be distributed. The operational cost is also real, and it is large.

Most teams that say they "need sharding" actually need read replicas, better query optimization, or a larger instance. Sharding is the right answer for a small class of problems. It is usually the wrong first move.

What You Give Up

Cross-shard queries become expensive or impossible. If your data is sharded by user ID, a query that aggregates data across users requires querying every shard and merging results in the application. A report that took one SQL query now requires N queries plus merge logic. If the merge is complex — sorting, grouping, deduplication — the application code bears that burden.

Transactions across shards require distributed coordination. ACID transactions within a single shard work normally. Transactions that span shards — moving a record from shard A to shard B, or updating records on two shards atomically — require two-phase commit (2PC) or saga patterns. 2PC is slow and failure-prone. Sagas are complex to implement correctly. Most applications simply avoid cross-shard transactions, which constrains the data model.

Joins across shards are gone. SQL joins work within a shard. Cross-shard joins require fetching data from multiple shards and joining in the application layer, which is typically slower and harder to optimize than a database join.

Rebalancing is painful. When you add a new shard, you need to redistribute some data to the new shard. This requires a data migration while the system is live — reading records from existing shards, writing them to the new shard, updating routing, deleting the migrated records. Done wrong, this is a data loss event.

# What "simple" sharding adds to your operational footprint:
- Shard routing logic in the application or a middleware layer
- Shard key design (wrong key = hot shards, uneven distribution)
- Cross-shard query handling (application-level aggregation)
- Distributed transaction handling for any cross-shard writes
- Rebalancing procedures when adding shards
- Per-shard monitoring and alerting
- Schema migrations across all shards simultaneously
- Backup and restore per shard

When It Is Actually Warranted

Sharding is warranted when you have exhausted these options in order:

  1. Query optimization and indexing
  2. Connection pooling (PgBouncer)
  3. Read replicas for read offload
  4. Vertical scaling the primary
  5. Caching for hot-path reads
  6. Application-level data archiving (moving old data to cold storage, reducing working set)

If you have done all of these and write throughput or primary storage is still the bottleneck, sharding may be necessary. The typical threshold where sharding becomes worth the cost is write volumes in the millions per second sustained, or datasets in the tens of terabytes where the working set no longer fits in memory.

Alternatives That Are Often Sufficient

Partitioning within a single database. PostgreSQL's table partitioning (declarative partitioning by range or list) splits a large table into physical partitions within the same database instance. Queries that filter on the partition key only scan the relevant partition. This gives some of the query performance benefit of sharding with none of the cross-shard coordination complexity.

Functional decomposition. Split different features into different databases before sharding any single database. An analytics database separate from the transactional database, each sized and scaled for its workload, is often more effective than sharding the combined database.

Managed distributed databases. CockroachDB and Google Spanner handle sharding and replication internally. You use them like a single PostgreSQL-compatible database; they handle shard management. You still pay in consistency trade-offs (CockroachDB uses serializable isolation with some distributed transaction latency; Spanner is expensive) but you offload the operational complexity.

Try everything else first. Sharding is a last resort, not a scaling strategy.

Scale Your Backend - Need an Experienced Backend Developer?

We provide backend engineers who join your team as contractors to help build, improve, and scale your backend systems.

We focus on clean backend design, clear documentation, and systems that remain reliable as products grow. Our goal is to strengthen your team and deliver backend systems that are easy to operate and maintain.

We work from our own development environments and support teams across US, EU, and APAC timezones. Our workflow emphasizes documentation and asynchronous collaboration to keep development efficient and focused.

  • Production Backend Experience. Experience building and maintaining backend systems, APIs, and databases used in production.
  • Scalable Architecture. Design backend systems that stay reliable as your product and traffic grow.
  • Contractor Friendly. Flexible engagement for short projects, long-term support, or extra help during releases.
  • Focus on Backend Reliability. Improve API performance, database stability, and overall backend reliability.
  • Documentation-Driven Development. Development guided by clear documentation so teams stay aligned and work efficiently.
  • Domain-Driven Design. Design backend systems around real business processes and product needs.

Tell us about your project

Our offices

  • Copenhagen
    1 Carlsberg Gate
    1260, København, Denmark
  • Magelang
    12 Jalan Bligo
    56485, Magelang, Indonesia

More articles

How to Laugh at Yourself After a Huge Mistake

We’ve all been there: the code breaks, the email goes to the wrong person, or the deployment wipes out production. Learning to laugh at these moments can save your sanity and even make you a better professional.

Read more

Stop Skipping Integration Tests in Spring Boot

Unit tests give you confidence your classes work in isolation. Integration tests tell you whether your application actually works. Most Spring Boot projects have too few of the latter — and pay for it in production.

Read more

The Contractor Who Treats Every Project Like It Is Their Own Business Always Stands Out

The difference between a contractor who is easy to replace and one who becomes indispensable is not technical depth — it is the degree to which they genuinely care about the outcome.

Read more

Blue Green Deployment: The Strategy That Makes Rollbacks Painless

Blue-green deployment eliminates most of the risk in production releases by keeping the previous version fully operational until the new one is validated. The infrastructure cost is real — but it's often far less than the cost of a difficult rollback.

Read more