Why Your Services Can't Stop Talking to Each Other

by Arif Ikhsanudin, Backend Developer

What chatty services are telling you

Your order service calls the user service for profile data, the credit service for limit checks, the inventory service for availability, and the shipping service for rate calculations, all within a single request. You've added aggressive caching, tightened timeouts, and deployed an Envoy-based service mesh, and the latency is still unacceptable. The problem is not the network. The problem is that you've drawn your service boundaries in the wrong places and are now compensating with infrastructure.

Chatty services (services that can't serve a request without making multiple synchronous calls to other services) are a consistent indicator of one or more underlying issues: bounded contexts that were cut along technical layers rather than business domains, data that is owned by one service but needed constantly by another, or orchestration logic that should be event-driven but is synchronous by design.

The layered architecture trap

The most common cause of chatty services is drawing service boundaries along technical layers rather than business capabilities. Teams coming from layered monolith architecture (presentation, business logic, data access) replicate that structure as services: a "data service," a "business logic service," an "API gateway service." This is backwards.

A "data service" that just wraps database access for other services is not a microservice. It's a remote repository layer. Every business operation requires calling it, which means every service is permanently coupled to it. Adding caching doesn't fix this. It buys lower latency at the cost of stale reads and leaves the fundamental coupling intact.

Services should own their data and expose business capabilities, not raw data access:

❌ Layered (creates chatty services)
Request → Order Logic Service
           → GET /data/users/{id}       (User Data Service)
           → GET /data/inventory/{id}   (Inventory Data Service)
           → GET /data/prices/{id}      (Price Data Service)
           → do logic locally
           → POST /data/orders          (Order Data Service)

✅ Domain-oriented (services own their data)
Request → Order Service (POST /orders/initiate)
           → GET /users/{id}/order-context   (User Service; returns only what ordering needs)
           → writes the order to its own database
         Async: publishes OrderInitiated event

Domain data replication as a coupling solution

When a service legitimately needs data from another domain for its own operations, the answer is often not a synchronous call. It's a local copy of the relevant data, kept current via events.

Checking whether a user is in good standing (active account, no fraud flags) does not require the Order Service to call the User Service synchronously on every order request. The User Service can publish UserStatusChanged events to a Kafka topic, and the Order Service can consume those events to maintain a local user_order_eligibility table:

CREATE TABLE user_order_eligibility (
  user_id       UUID PRIMARY KEY,
  is_eligible   BOOLEAN NOT NULL DEFAULT TRUE,
  reason        VARCHAR(255),
  updated_at    TIMESTAMP NOT NULL
);

Now the Order Service checks eligibility locally with a single DB read. No network call. No dependency on User Service uptime. The data is eventually consistent: if a user is flagged for fraud, there's a short window where they could still place orders. For most systems, that window (milliseconds to seconds, depending on event processing lag) is acceptable. If it's not acceptable, you have a synchronous query requirement, and you should model it that way explicitly.
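A minimal sketch of the consuming side, under stated assumptions: the UserStatusChanged payload fields, the EligibilityProjection class, and the in-memory map standing in for the user_order_eligibility table are all hypothetical. A real consumer would read from the Kafka topic and write to the database. The point it illustrates is that the handler should tolerate redelivered and out-of-order events by keeping only the newest state per user.

```java
import java.time.Instant;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical event payload; field names are assumptions, not a real schema.
final class UserStatusChanged {
    final UUID userId;
    final boolean eligible;
    final String reason;
    final Instant occurredAt;

    UserStatusChanged(UUID userId, boolean eligible, String reason, Instant occurredAt) {
        this.userId = userId;
        this.eligible = eligible;
        this.reason = reason;
        this.occurredAt = occurredAt;
    }
}

// In-memory stand-in for the user_order_eligibility table.
final class EligibilityProjection {
    private final Map<UUID, UserStatusChanged> rows = new ConcurrentHashMap<>();

    // Called once per consumed event. Keeps the newest event per user, so a
    // redelivered or late event never overwrites fresher state.
    void apply(UserStatusChanged event) {
        rows.merge(event.userId, event, (current, incoming) ->
            incoming.occurredAt.isAfter(current.occurredAt) ? incoming : current);
    }

    // Local read on the order path: no network call involved.
    // Treating unknown users as eligible is an assumed policy.
    boolean isEligible(UUID userId) {
        UserStatusChanged row = rows.get(userId);
        return row == null || row.eligible;
    }
}
```

Comparing timestamps is one way to make the handler idempotent. A per-user version number in the event would serve the same purpose without relying on clocks.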

Orchestration versus choreography

Another source of chatty services is orchestration-heavy design: one service calling a sequence of other services to drive a workflow. The Order Service calls Inventory Service to reserve stock, calls Payment Service to charge the card, and calls Fulfillment Service to schedule delivery. Every step is a synchronous dependency, and every failure cascades.

Choreography (event-driven coordination) reduces this coupling. Each service reacts to the event published by the previous step without being called:

Order Service publishes: OrderConfirmed
  → Inventory Service consumes OrderConfirmed: reserves stock, publishes StockReserved
  → Payment Service consumes StockReserved: charges card, publishes PaymentCollected
  → Fulfillment Service consumes PaymentCollected: schedules delivery, publishes ShipmentScheduled

No service calls another directly. The workflow emerges from event subscriptions. Adding a new step (a fraud check between order confirmation and inventory reservation) means adding a new consumer and repointing the next step's subscription at the event it publishes, not changing Order Service. Removing a step means removing a consumer. The coupling is to the event schema, not to other services' APIs.

The downside: workflow state is distributed. Debugging a failed workflow requires correlating events across multiple topics and services. You need distributed tracing and event correlation IDs from the start, not as an afterthought.
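The chain above can be sketched with a tiny in-memory bus, which also shows why the correlation ID has to travel with every event. Everything here is a hypothetical stand-in: Event, EventBus, and ChoreographyDemo are illustration names, the bus dispatches synchronously in one process, and the handlers only publish the next event instead of doing real work. A broker such as Kafka would replace the bus in practice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical event envelope: every event carries the workflow's correlation ID.
final class Event {
    final String type;
    final String correlationId;

    Event(String type, String correlationId) {
        this.type = type;
        this.correlationId = correlationId;
    }
}

// Synchronous in-memory bus standing in for a message broker.
final class EventBus {
    private final Map<String, List<Consumer<Event>>> subscribers = new HashMap<>();
    final List<String> log = new ArrayList<>();

    void subscribe(String type, Consumer<Event> handler) {
        subscribers.computeIfAbsent(type, k -> new ArrayList<>()).add(handler);
    }

    void publish(Event event) {
        // One log line per event, keyed by correlation ID, so a whole
        // workflow can be reconstructed by filtering on that ID.
        log.add(event.correlationId + " " + event.type);
        for (Consumer<Event> handler : subscribers.getOrDefault(event.type, Collections.emptyList())) {
            handler.accept(event);
        }
    }
}

final class ChoreographyDemo {
    // Each "service" subscribes to the previous step's event and publishes
    // its own, copying the correlation ID forward. Nothing calls anything directly.
    static EventBus wire() {
        EventBus bus = new EventBus();
        bus.subscribe("OrderConfirmed", e -> bus.publish(new Event("StockReserved", e.correlationId)));
        bus.subscribe("StockReserved", e -> bus.publish(new Event("PaymentCollected", e.correlationId)));
        bus.subscribe("PaymentCollected", e -> bus.publish(new Event("ShipmentScheduled", e.correlationId)));
        return bus;
    }

    public static void main(String[] args) {
        EventBus bus = wire();
        bus.publish(new Event("OrderConfirmed", "order-42"));
        bus.log.forEach(System.out::println);
    }
}
```

If any handler dropped the correlation ID, the later events could no longer be tied back to the order that started them, which is the debugging problem described above.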

When synchronous calls are unavoidable

Some inter-service calls are genuinely synchronous requirements: real-time credit decisions, inventory availability at checkout, pricing at point of sale. These should be the exception, not the default, and they should be designed with the assumption that the downstream service will sometimes be slow or unavailable.
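One way to encode that assumption is to give every synchronous call a deadline and an explicit fallback. The sketch below uses CompletableFuture.completeOnTimeout, which requires Java 9 or later. The GuardedCall class and callWithFallback method are hypothetical names, and whether a fallback value is acceptable at all (for example, routing a credit decision to manual review) is a business decision this code does not settle.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of any library.
final class GuardedCall {
    // Runs the call asynchronously and waits for the result. If the call
    // throws, or does not finish within timeoutMillis, the fallback is
    // returned instead. The underlying task is not cancelled on timeout.
    static <T> T callWithFallback(Supplier<T> call, long timeoutMillis, T fallback) {
        return CompletableFuture.supplyAsync(call)
            .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS)
            .exceptionally(error -> fallback)
            .join();
    }
}
```

A call site might look like `GuardedCall.callWithFallback(() -> creditClient.decide(orderId), 200, CreditDecision.MANUAL_REVIEW)`, where creditClient and CreditDecision are likewise assumed names. A circuit breaker would be the next step if the downstream service fails often enough that even the timeout budget is too expensive.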

If, after restructuring your domain model, you still have five synchronous calls per request, look at whether those calls can be parallelized. If they're independent, fan them out concurrently:

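import java.util.concurrent.CompletableFuture;

// Start both independent calls at once (supplyAsync uses the common ForkJoinPool by default).
CompletableFuture<UserContext> userFuture =
    CompletableFuture.supplyAsync(() -> userClient.getOrderContext(userId));
CompletableFuture<InventoryStatus> inventoryFuture =
    CompletableFuture.supplyAsync(() -> inventoryClient.getStatus(itemIds));

// Block until both complete; join() throws CompletionException if either call fails.
CompletableFuture.allOf(userFuture, inventoryFuture).join();
UserContext user = userFuture.join();              // already complete, returns immediately
InventoryStatus inventory = inventoryFuture.join();
// Total latency is roughly max(user latency, inventory latency), not the sum.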

But if you find yourself doing this routinely, it's still a signal that the domain model is wrong — you're compensating for a boundary problem with concurrency tricks.

The right question when services won't stop talking: which of these calls could be eliminated by moving data ownership to the service that needs it? Answer that first. Then optimize the calls that remain.
