Caching Is Not a Silver Bullet. It Is a Trade-off.

by Arif Ikhsanudin, Backend Developer

The Instinct Is Right but Incomplete

When a database query is slow, the instinct to cache the result is correct. Read the data once, store it, serve subsequent reads from memory. Response time drops. Database load drops. Everything looks better.

What the dashboard does not show: you now have two copies of the data in two different systems with no automatic mechanism to keep them in sync. Every write to the database creates a potential inconsistency with the cache. Every cache entry has a lifetime after which it may be stale. Every cache miss falls through to the database you were trying to protect.

Caching trades consistency for performance. That trade is often worth making. The mistake is not seeing it as a trade at all.

What You Are Actually Buying

Caching buys reduced read latency and reduced load on the origin (database, external service). The performance benefit is real and often dramatic. A Redis lookup that returns in about 1 ms, in place of a result that takes 200 ms to compute, is roughly a 200x improvement on every cache hit.

What you are paying: a staleness window, added consistency complexity, an additional failure mode (cache unavailability), and the memory cost of the cached data.

The staleness window is the most underappreciated cost. A cache entry with a 60-second TTL means any client can see data that is up to 60 seconds old. For a product catalog, that is usually acceptable: prices change infrequently, and a 60-second lag rarely has business impact. For an account balance, it is not: a user who just transferred money expects to see the updated balance immediately.

Cache what changes infrequently and is expensive to compute. Do not cache what changes frequently and must be current.
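The staleness window can be made concrete with a tiny in-memory cache. This is only a sketch: `TTLCache`, `FakeClock`, and the `balance:42` key are hypothetical names invented for illustration, and the fake clock stands in for real time so the 60-second window can be stepped through by hand.

```python
import time


class TTLCache:
    """Minimal in-memory cache with per-entry expiry (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self._entries[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value


class FakeClock:
    """Hand-controlled clock so the example does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
cache = TTLCache(clock=clock)
database = {"balance:42": 100}


def read_balance():
    cached = cache.get("balance:42")
    if cached is not None:
        return cached
    value = database["balance:42"]
    cache.set("balance:42", value, ttl=60)
    return value


first = read_balance()        # miss: loads 100 from the database
database["balance:42"] = 50   # a transfer updates the database only
clock.now = 30
stale = read_balance()        # hit: still 100, 30 seconds after the write
clock.now = 61
fresh = read_balance()        # entry expired: reloads 50
```

Nothing in this sketch is broken. The stale read at the 30-second mark is the cache doing what a 60-second TTL asks of it, which is why the TTL has to be chosen against the data's tolerance for being out of date.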

The Failure Modes

Cache stampede (thundering herd). A popular cache entry expires. At the moment of expiration, 500 concurrent requests all miss the cache, all hit the database simultaneously, and all attempt to populate the cache simultaneously. The database receives 500 queries for data it was previously being shielded from.

Mitigation: probabilistic early expiration (refresh the entry before it expires, with a probability that increases as expiration approaches) or a distributed lock on cache population, where the first miss acquires the lock and populates the entry while the others wait for the new value.

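# Cache stampede prevention with a simple lock.
# Assumes `cache` is a redis-py style client (get, delete, set with nx/ex).
import time

def get_with_lock(key, ttl, compute_fn, retries=20, wait=0.1):
    value = cache.get(key)
    if value is not None:
        return value

    lock_key = f"lock:{key}"
    # nx=True: only one caller wins; ex=5: the lock expires if its holder crashes
    acquired = cache.set(lock_key, "1", nx=True, ex=5)

    if acquired:
        try:
            value = compute_fn()
            cache.set(key, value, ex=ttl)
            return value
        finally:
            # Release even if compute_fn raises. If compute_fn can run longer
            # than the 5s timeout, this may delete another holder's lock.
            cache.delete(lock_key)

    # Another process is computing -- poll briefly for its result
    for _ in range(retries):
        time.sleep(wait)
        value = cache.get(key)
        if value is not None:
            return value

    # The lock holder was too slow or failed: compute directly as a fallback
    return compute_fn()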
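The other mitigation, probabilistic early expiration, can be sketched as follows. This follows one common formulation (sometimes called XFetch), not necessarily the one the article has in mind. The function names, the `beta` tuning parameter, and the plain dict used as the store are assumptions made for illustration.

```python
import math
import random
import time


def should_refresh_early(expires_at, compute_seconds, beta=1.0, now=None, rand=None):
    """Decide whether to recompute before the entry actually expires.

    The chance of returning True rises as `now` approaches `expires_at`.
    `rand` must be in (0, 1]; it is injectable so the logic can be tested.
    """
    now = time.time() if now is None else now
    # random.random() is in [0, 1), so 1 - random.random() is in (0, 1]
    rand = 1.0 - random.random() if rand is None else rand
    # log(rand) <= 0, so this adds a random non-negative head start to `now`
    return now - compute_seconds * beta * math.log(rand) >= expires_at


def get_with_early_refresh(store, key, ttl, compute_fn, beta=1.0):
    """`store` maps key -> (value, expires_at, compute_seconds)."""
    entry = store.get(key)
    if entry is not None:
        value, expires_at, compute_seconds = entry
        if not should_refresh_early(expires_at, compute_seconds, beta):
            return value

    started = time.time()
    value = compute_fn()
    compute_seconds = time.time() - started
    store[key] = (value, time.time() + ttl, compute_seconds)
    return value
```

Because each request rolls its own random number, a few requests refresh the entry shortly before expiry, so the concurrent requests no longer all miss in the same instant. A larger `beta` refreshes earlier, and slower computations get a proportionally larger head start.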

Cache as a crutch for a slow query. If the query behind the cache is slow, cache misses are expensive. If cache hit rates drop (new users, cache restarts, invalidation events), the database is exposed. A 100 ms query behind a 99% hit rate reaches the database for only 1 request in 100. During a cache restart every request misses, so the same query runs for all of them: 100 times the usual load, with every caller paying the full 100 ms. Fix slow queries; do not hide them.
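The arithmetic behind that warning is easy to check. The sketch below assumes a hypothetical 1,000 requests per second (the article gives no traffic figure) and uses Little's law to estimate how many queries are in flight at once. The function name is invented for illustration.

```python
def database_load(requests_per_second, hit_rate, query_seconds):
    """Return (queries per second reaching the database, average queries in flight)."""
    misses_per_second = requests_per_second * (1.0 - hit_rate)
    # Little's law: average in-flight work = arrival rate * time per item
    in_flight = misses_per_second * query_seconds
    return misses_per_second, in_flight


# Hypothetical traffic: 1,000 requests/s against a 100 ms query.
warm_qps, warm_in_flight = database_load(1000, 0.99, 0.100)
cold_qps, cold_in_flight = database_load(1000, 0.00, 0.100)

print(f"warm cache: {warm_qps:.0f} queries/s, {warm_in_flight:.0f} in flight")
print(f"cold cache: {cold_qps:.0f} queries/s, {cold_in_flight:.0f} in flight")
```

Under these assumptions the database goes from about one slow query in flight to about a hundred. The estimate is optimistic, since it assumes the query still takes 100 ms under the heavier load.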

Over-caching writes. Write-through caching (update the cache on every write) seems safe but adds latency to every write. Write-behind caching (update the database asynchronously from the cache) risks data loss if the cache fails before the write is persisted. Neither is a good default for write-heavy data.
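The difference between the two write strategies can be seen in a toy model. This is a sketch with plain dicts standing in for the cache and the database. The class names and the `crash` method are hypothetical and exist only to simulate a cache failure.

```python
from collections import deque


class WriteThroughCache:
    """Every write goes to the database and the cache before returning."""

    def __init__(self, database):
        self.database = database
        self.cache = {}

    def write(self, key, value):
        self.database[key] = value  # caller waits for the database write
        self.cache[key] = value


class WriteBehindCache:
    """Writes land in the cache first and reach the database on a later flush."""

    def __init__(self, database):
        self.database = database
        self.cache = {}
        self.pending = deque()

    def write(self, key, value):
        self.cache[key] = value
        self.pending.append((key, value))  # not yet durable

    def flush(self):
        while self.pending:
            key, value = self.pending.popleft()
            self.database[key] = value

    def crash(self):
        # Simulated cache failure: cached values and unflushed writes are gone
        self.cache.clear()
        self.pending.clear()


through_db, behind_db = {}, {}

through = WriteThroughCache(through_db)
through.write("order:1", "paid")    # durable as soon as write() returns

behind = WriteBehindCache(behind_db)
behind.write("order:1", "paid")     # fast, but only in the cache so far
behind.crash()                      # fails before flush() ever runs
behind.flush()                      # nothing left to persist
```

After this run `through_db` holds the order and `behind_db` is empty. That is the data-loss risk of write-behind, while the synchronous database write in `WriteThroughCache.write` is where write-through pays its latency.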

When Not to Cache

Do not cache user-specific data at a shared cache layer without scoping it carefully by user ID: serving one user's cached data to another user is a serious security incident. Do not cache data with regulatory requirements for freshness (financial balances, health records). Do not cache computed results from operations with side effects, because a cache hit skips the operation and the side effect silently does not happen.
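Scoping by user ID usually comes down to how the cache key is built. The sketch below uses a plain dict as the shared cache. The `scoped_key` helper and the `user:{id}:{resource}` key layout are assumptions chosen for illustration, not a standard.

```python
def scoped_key(user_id, resource):
    """Build a cache key that cannot collide across users."""
    if user_id is None or str(user_id) == "":
        # Refuse to fall back to an unscoped key when the user is unknown
        raise ValueError("user_id is required for user-specific cache entries")
    return f"user:{user_id}:{resource}"


shared_cache = {}

# Unscoped: the key carries no user, so whoever reads next gets Alice's data.
shared_cache["profile"] = {"name": "Alice"}
leaked_to_bob = shared_cache.get("profile")

# Scoped: Bob's lookup uses his own key and correctly misses.
shared_cache[scoped_key(1, "profile")] = {"name": "Alice"}
bob_profile = shared_cache.get(scoped_key(2, "profile"))
alice_profile = shared_cache.get(scoped_key(1, "profile"))
```

Raising on a missing user ID is a deliberate choice in this sketch: a loud failure is easier to catch than a key that silently degrades to a shared one.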

Cache reads. Be careful about caching writes. Be explicit about staleness tolerance before you choose a TTL. These decisions belong in the design, not as an afterthought when performance problems surface.
