API Gateway — Deep Dive

Level: Intermediate
Pre-reading: 05 · API & Communication

What is an API Gateway?

An API Gateway is a single entry point for all client requests. It handles cross-cutting concerns such as authentication, rate limiting, and logging, and routes requests to backend services.

graph TD
    M[Mobile] --> GW[API Gateway]
    W[Web] --> GW
    P[Partner] --> GW
    GW --> O[Order Service]
    GW --> U[User Service]
    GW --> PAY[Payment Service]

Core Responsibilities

Responsibility	Description
Routing	Direct requests to appropriate backend services
Authentication	Validate JWT/API keys before forwarding
Authorization	Check permissions for routes
Rate Limiting	Throttle per client/IP
TLS Termination	Terminate TLS at the edge; internal traffic can be plain HTTP on a trusted network
Load Balancing	Distribute across service instances
Caching	Cache GET responses
Request/Response Transform	Modify headers, body shape
Aggregation	Combine multiple service responses
Observability	Centralized logging and metrics

Routing Patterns

Path-Based Routing

routes:
  - path: /api/orders/**
    service: order-service
  - path: /api/users/**
    service: user-service
  - path: /api/products/**
    service: product-service
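The matching behind a route table like this can be sketched in a few lines of Python. This is a minimal illustration, not any particular gateway's algorithm: `ROUTES` and `resolve_service` are hypothetical names, and the sketch assumes that `/api/orders/**` also matches `/api/orders` itself and that the longest matching prefix wins.

```python
from typing import Optional

# Hypothetical route table mirroring the config above: path prefix -> service.
ROUTES = {
    "/api/orders/": "order-service",
    "/api/users/": "user-service",
    "/api/products/": "product-service",
}


def resolve_service(path: str, routes: dict[str, str] = ROUTES) -> Optional[str]:
    """Return the service for the longest matching prefix, or None if no route matches."""
    # Treat "/api/orders" like "/api/orders/" so the bare collection path matches too.
    normalized = path if path.endswith("/") else path + "/"
    matches = [prefix for prefix in routes if normalized.startswith(prefix)]
    if not matches:
        return None
    return routes[max(matches, key=len)]
```

Choosing the longest prefix keeps the result independent of the order in which routes are declared, which matters once nested prefixes such as `/api/orders/` and `/api/orders/export/` coexist.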

Header-Based Routing

routes:
  - headers:
      X-API-Version: v2
    service: order-service-v2
  - headers:
      X-API-Version: v1
    service: order-service-v1
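As a rough sketch of the same decision in Python: the function below reads the version header case-insensitively and picks a target. The name `route_by_version` is hypothetical, and falling back to v1 when the header is missing or unknown is an assumption; the config above does not say what happens in that case.

```python
def route_by_version(headers: dict[str, str], default: str = "order-service-v1") -> str:
    """Pick a target service from the X-API-Version header."""
    targets = {"v1": "order-service-v1", "v2": "order-service-v2"}
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    version = normalized.get("x-api-version", "").strip().lower()
    # Assumption: unknown or missing versions fall back to the default target.
    return targets.get(version, default)
```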

Canary Routing

routes:
  - path: /api/orders/**
    targets:
      - service: order-service-v1
        weight: 90
      - service: order-service-v2
        weight: 10
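One way to honor weights like these is sketched below. Many gateways pick a target at random per request; this sketch instead hashes a client identifier into one of 100 buckets so that a given client consistently sees the same version. That stickiness is a design choice of the example, not something the config above implies, and `TARGETS` and `pick_target` are hypothetical names.

```python
import hashlib

# Weights mirror the config above and are assumed to sum to 100.
TARGETS = [("order-service-v1", 90), ("order-service-v2", 10)]


def pick_target(client_id: str, targets=TARGETS) -> str:
    """Map a client to a target by hashing its ID into a bucket in [0, 100)."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    cumulative = 0
    for service, weight in targets:
        cumulative += weight
        if bucket < cumulative:
            return service
    return targets[-1][0]  # Only reached if the weights sum to less than 100.
```

Raising the v2 weight moves more buckets to the new version without reshuffling clients that were already on it, as long as v1 stays first in the list.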

Authentication at the Gateway

Validate tokens centrally; forward validated claims to services.

sequenceDiagram
    participant C as Client
    participant GW as API Gateway
    participant AS as Auth Server
    participant S as Service

    C->>GW: Request + JWT
    GW->>GW: Validate JWT signature
    GW->>GW: Check expiry, issuer, audience
    alt Invalid
        GW->>C: 401 Unauthorized
    else Valid
        GW->>S: Request + X-User-Id header
        S->>GW: Response
        GW->>C: Response
    end

Gateway JWT Configuration (Kong)

plugins:
  - name: jwt
    config:
      claims_to_verify:
        - exp
      key_claim_name: kid
      secret_is_base64: false
  - name: request-transformer
    config:
      add:
        headers:
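          # Illustrative placeholder: Kong's documented template variables are headers,
          # query_params, uri_captures and shared, so forwarding a JWT claim typically
          # needs extra plugin logic.
          - X-User-Id:$(jwt.claims.sub)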
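The checks in the sequence diagram (signature, expiry, issuer, audience) can be sketched with the Python standard library alone. To stay dependency-free this example assumes HS256 with a shared secret; production gateways more commonly verify RS256 or ES256 signatures against keys published by the auth server. `validate_jwt` and `upstream_headers` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def validate_jwt(token: str, secret: bytes, issuer: str, audience: str,
                 now: Optional[float] = None) -> dict:
    """Return the claims of a valid HS256 token; raise ValueError otherwise."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError as exc:  # Wrong segment count, bad base64, or bad JSON.
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    # Pin the algorithm instead of trusting the header (rejects "alg": "none").
    if header.get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ValueError("token expired or missing exp")
    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")
    aud = claims.get("aud")
    if audience not in (aud if isinstance(aud, list) else [aud]):
        raise ValueError("unexpected audience")
    return claims


def upstream_headers(claims: dict) -> dict:
    """Headers the gateway adds before forwarding, as in the diagram above."""
    return {"X-User-Id": str(claims["sub"])}
```

A gateway would map any `ValueError` here to a 401 response and, on success, forward the request with the headers from `upstream_headers`. Services should only trust `X-User-Id` if the gateway also strips that header from incoming client requests.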

Rate Limiting

Protect backend services from overload.

Rate Limiting Strategies

Strategy	Description
Per client	Limit by API key or user ID
Per IP	Limit by source IP
Per route	Different limits per endpoint
Global	Total gateway throughput

Rate Limiting Response

HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1699900000

Configuration (Kong)

plugins:
  - name: rate-limiting
    config:
      minute: 100
      hour: 1000
      policy: redis
      redis_host: redis
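To make the per-client strategy and the 429 response concrete, here is a small in-memory fixed-window limiter. It is a sketch under stated assumptions: `FixedWindowLimiter` is a hypothetical class, state lives in a local dict (a shared store such as Redis would be needed across gateway instances), and a fixed window is only one of several possible algorithms.

```python
import math
import time
from typing import Optional


class FixedWindowLimiter:
    """Count requests per client in fixed windows; reject once the limit is hit."""

    def __init__(self, limit: int = 100, window_seconds: int = 60):
        self.limit = limit
        self.window_seconds = window_seconds
        self._counters: dict = {}  # client_id -> (window_start, count)

    def check(self, client_id: str, now: Optional[float] = None) -> tuple:
        """Return (status_code, headers) for one incoming request."""
        now = time.time() if now is None else now
        window_start = int(now // self.window_seconds) * self.window_seconds
        start, count = self._counters.get(client_id, (window_start, 0))
        if start != window_start:  # A new window began, so reset the count.
            count = 0
        reset_at = window_start + self.window_seconds
        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Reset": str(reset_at),
        }
        if count >= self.limit:
            headers["X-RateLimit-Remaining"] = "0"
            headers["Retry-After"] = str(max(1, math.ceil(reset_at - now)))
            return 429, headers
        self._counters[client_id] = (window_start, count + 1)
        headers["X-RateLimit-Remaining"] = str(self.limit - count - 1)
        return 200, headers
```

Fixed windows are easy to reason about but allow a burst of up to twice the limit around a window boundary; sliding-window or token-bucket limiters smooth that out at the cost of more state.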

Request/Response Transformation

Modify requests before forwarding or responses before returning.

Use Cases

Transformation	Example
Add headers	Add correlation ID, user context
Remove headers	Strip internal headers from response
Rename fields	Map external field names to internal
Filter response	Remove sensitive fields

Example (Kong)

plugins:
  - name: request-transformer
    config:
      add:
        headers:
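          # $(uuid) and $(host) are illustrative placeholders, not documented template
          # variables; Kong's correlation-id plugin is the usual way to add a request ID.
          - X-Request-ID:$(uuid)
          - X-Forwarded-Host:$(host)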
      remove:
        headers:
          - X-Internal-Header
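The use cases in the table can be sketched as plain functions over header and body dictionaries. The names `transform_request` and `filter_response_body` are hypothetical, and reusing an incoming `X-Request-ID` rather than always generating a new one is an assumption made here so that traces started upstream stay connected.

```python
import uuid


def transform_request(headers: dict[str, str], host: str) -> dict[str, str]:
    """Add correlation and forwarding headers; drop headers clients must not set."""
    # Header names are case-insensitive, so work on lower-cased names throughout.
    out = {name.lower(): value for name, value in headers.items()}
    # Keep an existing correlation ID; otherwise mint a new one.
    out.setdefault("x-request-id", str(uuid.uuid4()))
    # Always overwrite, so a client cannot spoof the forwarded host.
    out["x-forwarded-host"] = host
    out.pop("x-internal-header", None)
    return out


def filter_response_body(body: dict, sensitive_fields: set[str]) -> dict:
    """Return a copy of a JSON object without the listed top-level fields."""
    return {key: value for key, value in body.items() if key not in sensitive_fields}
```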

Response Caching

Cache GET responses to reduce backend load.

plugins:
  - name: proxy-cache
    config:
      strategy: memory
      content_type:
        - application/json
      cache_ttl: 300
      cache_control: true

Cache Considerations

Consideration	Recommendation
TTL	Short for dynamic data; longer for static
Invalidation	Use cache headers or explicit purge
Vary headers	Include the varying header (for example Authorization) in the cache key when responses differ per user
Storage	Redis for distributed; memory for single instance
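The considerations above (TTL, explicit purge, keying on values that vary) fit in a small in-memory cache. This is a single-instance sketch only; `TTLCache` and `make_key` are hypothetical names, and nothing here evicts entries under memory pressure.

```python
import time
from typing import Any, Optional


class TTLCache:
    """Cache responses for a fixed time-to-live, keyed by request attributes."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl_seconds = ttl_seconds
        self._entries: dict = {}  # key -> (expires_at, response)

    @staticmethod
    def make_key(method: str, path: str, vary_values: tuple = ()) -> tuple:
        # vary_values holds whatever makes responses differ, e.g. a user ID.
        return (method.upper(), path, tuple(vary_values))

    def get(self, key: tuple, now: Optional[float] = None) -> Optional[Any]:
        now = time.monotonic() if now is None else now
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if now >= expires_at:  # Expired: drop the entry and report a miss.
            del self._entries[key]
            return None
        return response

    def put(self, key: tuple, response: Any, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._entries[key] = (now + self.ttl_seconds, response)

    def purge(self, key: tuple) -> None:
        """Explicit invalidation, e.g. after a write to the same resource."""
        self._entries.pop(key, None)
```

Leaving the user out of `vary_values` for a per-user endpoint would serve one user's response to another, which is why the key matters more than the TTL.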

Gateway Aggregation

Combine responses from multiple services into one response.

sequenceDiagram
    participant C as Client
    participant GW as API Gateway
    participant OS as Order Service
    participant US as User Service
    participant PS as Product Service

    C->>GW: GET /dashboard
    par Orders
        GW->>OS: GET /orders
    and Profile
        GW->>US: GET /profile
    and Recommendations
        GW->>PS: GET /recommendations
    end
    OS->>GW: Orders
    US->>GW: Profile
    PS->>GW: Recommendations
    GW->>GW: Aggregate
    GW->>C: Combined dashboard

Aggregation Trade-offs

Benefit	Cost
Single client request	Gateway complexity
Reduced latency (parallel calls)	Partial failure handling
Tailored response	Business logic at gateway edge
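The parallel fan-out and the partial-failure cost from the table can be sketched with `asyncio`. This example assumes each backend call is a zero-argument async function supplied by the caller; `aggregate` is a hypothetical name, and returning `None` for a failed section is one possible policy among several.

```python
import asyncio


async def aggregate(calls: dict, timeout: float = 1.0) -> dict:
    """Run all calls in parallel; failed or slow sections become None.

    `calls` maps a section name to a zero-argument async function.
    The names of failed sections are collected under the "errors" key.
    """
    names = list(calls)
    results = await asyncio.gather(
        *(asyncio.wait_for(calls[name](), timeout) for name in names),
        return_exceptions=True,  # One failing backend must not sink the rest.
    )
    combined: dict = {"errors": []}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            combined[name] = None
            combined["errors"].append(name)
        else:
            combined[name] = result
    return combined
```

With this policy the dashboard still renders when recommendations time out, and the client can decide how to present the missing section. Total latency is bounded by the slowest call or the timeout, not by the sum of the calls.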

API Gateway Tools

Tool	Type	Key Features
Kong	OSS + Enterprise	Plugin ecosystem, declarative config
AWS API Gateway	Managed	Lambda integration, pay-per-request
Spring Cloud Gateway	OSS	Java-native, reactive, WebFlux
Envoy	OSS	High-performance, service mesh ready
NGINX	OSS + Commercial	Proven, lightweight
Traefik	OSS	K8s-native, auto-discovery
Apigee	Commercial	Full API management platform

Selection Criteria

Factor	Candidate tools
Managed cloud service	AWS API Gateway, GCP Cloud Endpoints
Kubernetes	Kong, Traefik, Ambassador
Java ecosystem	Spring Cloud Gateway
Performance critical	Envoy, NGINX
Full API management	Kong Enterprise, Apigee

Gateway Deployment Patterns

Single Gateway

graph TD
    C[Clients] --> GW[API Gateway]
    GW --> S1[Service 1]
    GW --> S2[Service 2]
    GW --> S3[Service 3]

Simple to run, but a single point of failure unless deployed as multiple instances.

Regional Gateways

graph TD
    subgraph US Region
        C1[Clients] --> GW1[Gateway US]
    end
    subgraph EU Region
        C2[Clients] --> GW2[Gateway EU]
    end
    GW1 --> Services
    GW2 --> Services

Lower latency for nearby clients; easier compliance with regional data rules.

Gateway per Domain

graph TD
    C[Clients] --> R[Router/LB]
    R --> GW1[Orders Gateway]
    R --> GW2[Users Gateway]
    R --> GW3[Products Gateway]
    GW1 --> OS[Order Services]
    GW2 --> US[User Services]
    GW3 --> PS[Product Services]

Each domain team owns its gateway, which avoids a single monolithic gateway.

Anti-Patterns

Anti-Pattern	Problem	Fix
Business logic in gateway	Gateway becomes monolith	Keep gateway thin
Coupling to implementation	Changes require gateway changes	Route by contract, not implementation
No rate limiting	DoS vulnerability	Always configure limits
No timeout	Stuck connections	Set upstream timeouts
Monolithic gateway config	Hard to manage	Modular configs per team

Gateway vs Service Mesh

Aspect	API Gateway	Service Mesh
Traffic type	North-south (external → cluster)	East-west (service → service)
Deployment	Edge of cluster	Sidecar per pod
Scope	External API concerns	Internal communication
Examples	Kong, AWS API Gateway	Istio, Linkerd

The two are often used together: a gateway at the edge and a mesh for internal traffic.

When should you use an API Gateway vs direct service calls?

Use a gateway when (1) you need centralized auth, rate limiting, or logging; (2) multiple client types have different needs; or (3) you want to decouple clients from the backend topology. Direct calls are fine for internal service-to-service traffic, where a service mesh, rather than a gateway, handles cross-cutting concerns.

How do you prevent the API Gateway from becoming a monolith?

(1) Keep business logic in services, not in the gateway. (2) Use the gateway only for cross-cutting concerns. (3) Consider the BFF (Backend for Frontend) pattern: one gateway per client type. (4) Keep config modular so each team manages its own routes. (5) Avoid aggregation that requires domain knowledge.

What happens when the API Gateway fails?

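A single gateway instance is a single point of failure: if it goes down, clients cannot reach any service behind it. Mitigations: (1) High availability: run multiple instances behind a load balancer. (2) Health checks: remove unhealthy instances from rotation. (3) Circuit breakers: fail fast when a backend is down, so stalled upstream calls do not exhaust the gateway. (4) Regional failover: route to another region. (5) Client retries with backoff.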
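Two of the mitigations, the circuit breaker and client retry with backoff, can be sketched as follows. `CircuitBreaker` and `backoff_delays` are hypothetical names, the thresholds are arbitrary defaults, and the backoff omits jitter to keep the example deterministic; real clients usually add random jitter so retries do not synchronize.

```python
import time
from typing import Optional


class CircuitBreaker:
    """Open after consecutive failures; allow a trial call after a cool-down."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        if self.opened_at is None:
            return True  # Closed: traffic flows normally.
        # Open: fail fast until the cool-down passes, then allow a trial call.
        return now - self.opened_at >= self.reset_timeout

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # Open, or re-open after a failed trial.


def backoff_delays(base: float, cap: float, attempts: int) -> list[float]:
    """Exponential backoff delays in seconds, capped at `cap` (no jitter)."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

The caller checks `allow()` before each upstream call and reports the outcome with `record_success()` or `record_failure()`; when `allow()` returns False the gateway can answer immediately with a 503 instead of waiting on a backend that is known to be down.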
