Your API Gateway Should Be Doing More Than Just Routing

by Arif Ikhsanudin, Backend Developer

What most teams use their gateway for

Route /api/orders/* to the Order Service. Route /api/users/* to the User Service. Forward the request. Return the response. That is reverse proxying, not an API gateway. If that's all your gateway does, you've added a network hop without capturing the value that a gateway layer can provide.

The API gateway is the single entry point for all external traffic. That position in your architecture makes it the right place to enforce policies that apply to every request, regardless of which downstream service handles it. Moving those concerns into the gateway means they're enforced consistently and service teams don't re-implement them.

Authentication and authorization at the gateway

JWT validation at the gateway (verifying the signature, checking expiry, validating issuer and audience) means each token is validated once, at the edge. Downstream services receive verified identity in request headers and don't need to re-validate tokens, as long as they are reachable only through the gateway.
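To make those checks concrete, here is a minimal Python sketch using only the standard library. It assumes an HS256 (shared-secret) token purely for brevity; a real gateway would more likely verify asymmetric signatures with a maintained JWT library. The function names, issuer, and audience values are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url_decode(segment):
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims, secret):
    # Example-only helper: builds a token so the validator has something to check.
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(signature)}"


def validate_hs256(token, secret, issuer, audience, now=None):
    """Return the claims if every check passes, otherwise raise ValueError."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        signature = b64url_decode(signature_b64)
    except ValueError as exc:
        raise ValueError("malformed token") from exc

    # 1. Signature: pin the algorithm instead of trusting the token header.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")

    claims = json.loads(b64url_decode(payload_b64))
    if not isinstance(claims, dict):
        raise ValueError("claims must be a JSON object")

    # 2. Expiry: a missing or non-numeric exp is treated as invalid.
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        raise ValueError("token expired")

    # 3. Issuer must match exactly.
    if claims.get("iss") != issuer:
        raise ValueError("wrong issuer")

    # 4. Audience may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        raise ValueError("wrong audience")

    return claims
```

The order matters: nothing in the payload is trusted until the signature has been verified.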

Kong (a widely used open-source gateway) makes this declarative:

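# Kong plugin: JWT validation on all routes
plugins:
- name: jwt
  config:
    secret_is_base64: false
    claims_to_verify:
    - exp
    - nbf
    # Claim used to look up the matching consumer credential
    key_claim_name: kid
    # Note: the open-source jwt plugin has no jwks_uri setting. Public keys are
    # registered as JWT credentials on Kong consumers. Fetching keys from a JWKS
    # endpoint such as https://auth.internal/.well-known/jwks.json needs a
    # different plugin (for example, the Enterprise openid-connect plugin).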

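After validation, Kong can forward verified identity as headers. Out of the box the jwt plugin adds consumer headers such as X-Consumer-Username; mapping individual claims to custom headers takes an additional transformation step or a custom plugin. Either way, services receive X-User-Id, X-User-Roles, X-Tenant-Id without having touched the JWT. When your token format changes, you update the gateway plugin configuration, not eight individual service auth implementations.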
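The claim-to-header step might look like the following Python sketch. The claim names (sub, roles, tenant_id) and the function name are assumptions for illustration; only the header names come from the text above. Note that it first drops any identity headers the client sent, since otherwise a caller could spoof X-User-Id directly.

```python
# Hypothetical mapping from verified JWT claims to upstream headers.
CLAIM_TO_HEADER = {
    "sub": "X-User-Id",
    "roles": "X-User-Roles",
    "tenant_id": "X-Tenant-Id",
}


def build_upstream_headers(client_headers, claims):
    """Return the headers to send upstream, given already-verified claims."""
    reserved = {header.lower() for header in CLAIM_TO_HEADER.values()}

    # Strip client-supplied identity headers: only the gateway may set them.
    upstream = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in reserved
    }

    for claim, header in CLAIM_TO_HEADER.items():
        if claim not in claims:
            continue
        value = claims[claim]
        if isinstance(value, (list, tuple)):
            # Multi-valued claims are flattened to a comma-separated string.
            value = ",".join(str(item) for item in value)
        upstream[header] = str(value)
    return upstream
```

Services can then trust these headers only if they are unreachable except through the gateway.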

Coarse-grained authorization (is this route accessible to unauthenticated users? does this endpoint require admin role?) also belongs at the gateway. Fine-grained authorization (can this user modify this specific order?) belongs in the service.
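As a rough sketch of that split, the Python below pairs a gateway-side route policy check with a service-side ownership check. The policy table, the route prefixes, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

FORWARD = 0  # sentinel: request passes the gateway check and is forwarded


@dataclass(frozen=True)
class RoutePolicy:
    prefix: str
    public: bool = False
    required_role: Optional[str] = None


# Hypothetical policy table; the most specific (longest) prefix wins.
POLICIES = [
    RoutePolicy("/api/health", public=True),
    RoutePolicy("/api/admin/", required_role="admin"),
    RoutePolicy("/api/"),
]


def gateway_decision(path: str, roles: Optional[Iterable[str]]) -> int:
    """Coarse-grained check. roles is None for unauthenticated requests.

    Returns FORWARD, or the HTTP status the gateway would answer with.
    """
    matches = [policy for policy in POLICIES if path.startswith(policy.prefix)]
    if not matches:
        return 404
    policy = max(matches, key=lambda p: len(p.prefix))
    if policy.public:
        return FORWARD
    if roles is None:
        return 401
    if policy.required_role and policy.required_role not in set(roles):
        return 403
    return FORWARD


def can_modify_order(user_id: str, order_owner_id: str) -> bool:
    """Fine-grained check: lives in the Order Service, next to the order data."""
    return user_id == order_owner_id
```

The gateway check needs only the path and the caller's roles; the service check needs the order itself, which is why it cannot move to the edge.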

Rate limiting

Without rate limiting at the gateway, a single misbehaving client (or a DDoS) can exhaust your downstream services. Rate limiting in each service is wasteful — each service re-implements the same concern, and limits are applied per-service rather than per-client across your API.

Gateway-level rate limiting is applied before requests reach your services:

# Kong rate limiting: 1000 requests per hour per authenticated user
plugins:
- name: rate-limiting
  config:
    hour: 1000
    policy: local          # or redis for distributed rate limiting
    limit_by: consumer     # per authenticated user
    error_code: 429
    error_message: "Rate limit exceeded"

For authenticated APIs, rate limit by user identity. For public endpoints, rate limit by IP with more generous limits. For internal service-to-service calls that bypass the gateway, rate limiting is handled at the service mesh layer or not at all (internal services are trusted not to abuse each other).
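To illustrate the keying choice, here is a small Python sketch of a fixed-window counter, one simple way to implement such a limit and not necessarily how any particular gateway does it. The class and function names are hypothetical, and the state lives in one process's memory, so it is comparable to a local rather than a distributed policy.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each fixed time window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self._state = {}  # key -> (window index, count in that window)

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        window = int(now // self.window_seconds)
        stored_window, count = self._state.get(key, (window, 0))
        if stored_window != window:
            count = 0  # a new window started, so the counter resets
        if count >= self.limit:
            return False  # the caller would answer 429 here
        self._state[key] = (window, count + 1)
        return True


def rate_limit_key(user_id, client_ip):
    # Authenticated requests are keyed by user, anonymous ones by IP.
    return f"user:{user_id}" if user_id else f"ip:{client_ip}"
```

A fixed window is easy to reason about but lets a client send up to twice the limit across a window boundary; sliding-window or token-bucket variants smooth that out.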

Request and response transformation

Gateways can adapt request and response shapes without service changes. Common patterns:

Header injection: add headers downstream services need (request ID for tracing, user context, timestamp).
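A minimal Python sketch of header injection follows. It reuses the X-Correlation-Id name that appears later in this article; the X-Request-Received-At header and the function name are hypothetical.

```python
import uuid
from datetime import datetime, timezone


def inject_headers(headers, now=None):
    """Return a copy of the request headers with tracing context added."""
    enriched = dict(headers)

    # Header names are case-insensitive, so compare in lowercase.
    existing = {name.lower() for name in headers}
    if "x-correlation-id" not in existing:
        # Only generate an ID when the caller did not supply one.
        enriched["X-Correlation-Id"] = str(uuid.uuid4())

    received = now or datetime.now(timezone.utc)
    enriched["X-Request-Received-At"] = received.isoformat()
    return enriched
```

Keeping an incoming ID rather than overwriting it lets a trace continue across systems that already tagged the request.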

API versioning routing: route /v1/orders and /v2/orders to different backend service versions or versions of the same service, without exposing version implementation to clients.
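In Python, version routing could be sketched as a lookup on the first path segment. The upstream URLs and the function name are made up for the example.

```python
from typing import Optional, Tuple

# Hypothetical upstreams, one per public API version.
VERSION_UPSTREAMS = {
    "v1": "http://orders-v1.internal:8080",
    "v2": "http://orders-v2.internal:8080",
}


def resolve_upstream(path: str) -> Optional[Tuple[str, str]]:
    """Map a public path to (upstream base URL, path with the version removed).

    Returns None when the version prefix is unknown.
    """
    parts = path.lstrip("/").split("/", 1)
    upstream = VERSION_UPSTREAMS.get(parts[0])
    if upstream is None:
        return None
    rest = "/" + parts[1] if len(parts) > 1 else "/"
    return upstream, rest
```

Because the version prefix is removed before forwarding, the backends do not need to know which public version they serve.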

Response filtering: strip internal fields from responses before they reach external clients (internal database IDs, implementation details, debugging fields that should not be in the public API).
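A response filter can be as small as a recursive walk over the JSON body, as in this Python sketch. The field names in the denylist, the leading-underscore convention, and the function names are assumptions for the example.

```python
import json

# Hypothetical denylist of fields that must never leave the platform.
INTERNAL_FIELDS = {"internal_id", "debug", "db_shard"}


def strip_internal(value):
    """Recursively drop internal fields from decoded JSON data."""
    if isinstance(value, dict):
        return {
            key: strip_internal(item)
            for key, item in value.items()
            if key not in INTERNAL_FIELDS and not key.startswith("_")
        }
    if isinstance(value, list):
        return [strip_internal(item) for item in value]
    return value


def filter_response_body(body: str) -> str:
    """Decode a JSON response body, strip internal fields, and re-encode it."""
    return json.dumps(strip_internal(json.loads(body)))
```

A denylist is the simplest form; an allowlist per route is stricter, since a newly added internal field is then hidden by default.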

Protocol translation: accept REST externally, translate to gRPC for internal service calls. Kong's grpc-gateway plugin, or a custom transformer, handles this translation at the gateway layer rather than in every client.

Observability: where gateway instrumentation pays off

The gateway has visibility into every external request your system receives. That position makes it uniquely valuable for observability:

  • Access logs with client identity, route, response code, latency, and response size — the baseline for API usage analysis and abuse detection
  • Correlation ID injection: generate an X-Correlation-Id header on every incoming request if not already present, and propagate it downstream. All downstream services log with this ID. Tracing a user complaint becomes a log search rather than archaeology.
  • Latency histograms by route: gateway-level latency metrics show the client-perceived response time for every API endpoint, which is the number your SLA actually cares about (not internal service latency, which excludes gateway processing and network time)

# Kong Prometheus plugin: gateway-level metrics
plugins:
- name: prometheus
  config:
    status_code_metrics: true
    latency_metrics: true
    bandwidth_metrics: true
    upstream_health_metrics: true
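For intuition about what a per-route latency histogram records, here is a small Python sketch with cumulative buckets in the style Prometheus uses. The bucket boundaries and the class name are assumptions, not what any specific plugin exports.

```python
import bisect
from collections import defaultdict

# Hypothetical bucket upper bounds in milliseconds; a final +Inf bucket is implied.
BUCKETS_MS = [50, 100, 250, 500, 1000]


class RouteLatencyHistogram:
    def __init__(self, buckets=BUCKETS_MS):
        self.buckets = list(buckets)
        # One counter per bucket, plus one for observations above the last bound.
        self._counts = defaultdict(lambda: [0] * (len(self.buckets) + 1))

    def observe(self, route, latency_ms):
        # bisect_left finds the first bucket whose upper bound is >= latency_ms.
        index = bisect.bisect_left(self.buckets, latency_ms)
        self._counts[route][index] += 1

    def cumulative(self, route):
        """Return {upper bound: observations at or below that bound}."""
        labels = [str(bound) for bound in self.buckets] + ["+Inf"]
        running = 0
        result = {}
        for label, count in zip(labels, self._counts[route]):
            running += count
            result[label] = running
        return result
```

Because the gateway measures from request arrival to response, these buckets include gateway processing and upstream network time.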

What the gateway should not do

Business logic: the gateway is infrastructure. If you find yourself writing business rules in gateway plugins (validate that this field has this value, apply this discount logic), that logic belongs in a service.

Service orchestration: the gateway should route to one upstream per request. If you're building a Backend for Frontend (BFF) pattern that aggregates multiple service calls, that belongs in a dedicated BFF service, not in gateway plugin code.

All authorization: coarse-grained role checks belong at the gateway. Whether user 123 can delete order 456 — that requires domain knowledge and belongs in the Order Service.

The gateway is most valuable when it's thin, consistent, and limited to genuinely cross-cutting concerns. When it accumulates business logic, it becomes a deployment bottleneck for concerns that belong to individual service teams.
