Service-to-Service Authentication — Deep Dive
Level: Intermediate
Pre-reading: 08 · Security Summary · 08.02 · JWT Deep Dive
The Problem: Trust Between Services
In a microservices system, Service A calling Service B must prove its identity. There is no human user in the loop. The patterns below address machine identity — how a service proves who it is.
graph LR
GW[API Gateway] -->|mTLS or Bearer| OrdS[Order Service]
OrdS -->|mTLS or Bearer| PayS[Payment Service]
OrdS -->|mTLS or Bearer| InvS[Inventory Service]
PayS -->|mTLS or Bearer| ExtBank[External Bank API]
Pattern 1: mTLS — Mutual TLS
Standard TLS only authenticates the server to the client. mTLS requires both sides to present X.509 certificates — bidirectional authentication at the transport layer.
sequenceDiagram
participant A as Service A
participant B as Service B
participant CA as Certificate Authority
A->>B: ClientHello
B->>A: ServerCertificate (B's cert)
A->>A: Verify B's cert against CA
A->>B: ClientCertificate (A's cert)
B->>B: Verify A's cert against CA
A->>B: Encrypted request
B->>A: Encrypted response
| Aspect | Details |
|---|---|
| Auth level | Transport security layer (TLS runs on top of TCP, below HTTP) |
| Identity proof | X.509 certificate signed by trusted CA |
| Code changes needed | None if using a service mesh (Istio/Linkerd) |
| Key management | Certificates must be rotated before expiry |
| Overhead | Slight TLS handshake overhead; negligible in practice |
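For services that run without a mesh, the two-sided verification described above can be sketched with Python's standard `ssl` module. This is a minimal sketch, not a production setup: the function names and the file-path parameters are hypothetical, and the certificate files are assumed to come from your own CA.

```python
import ssl
from typing import Optional


def server_context(ca_file: Optional[str] = None,
                   cert_file: Optional[str] = None,
                   key_file: Optional[str] = None) -> ssl.SSLContext:
    """TLS context for the called service; rejects clients without a valid cert."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED is what turns ordinary TLS into mutual TLS on the server side.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        # CA bundle used to verify the certificates that callers present.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # The service's own certificate and private key.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def client_context(ca_file: Optional[str] = None,
                   cert_file: Optional[str] = None,
                   key_file: Optional[str] = None) -> ssl.SSLContext:
    """TLS context for the calling service; verifies the server and presents a cert."""
    # PROTOCOL_TLS_CLIENT enables certificate and hostname verification by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # The client certificate sent when the server asks for one.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

The server context would be passed to whatever HTTP server the service uses, and the client context to its HTTP client. Certificate rotation is not handled here, which is the main operational cost of doing this by hand.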
Let the service mesh handle mTLS
With Istio or Linkerd, mTLS is automatic and transparent: injected sidecar proxies perform the handshake, so no application code changes are needed. In Istio, enable STRICT mode so that plaintext traffic is rejected.
Istio mTLS Modes
| Mode | Behavior |
|---|---|
| PERMISSIVE | Accepts both mTLS and plaintext (migration mode) |
| STRICT | Rejects all plaintext; only mTLS allowed (production target) |
# Enforce STRICT mTLS for an entire namespace
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT
Pattern 2: SPIFFE / SPIRE — Workload Identity
SPIFFE (Secure Production Identity Framework for Everyone) is an open standard for workload identity in dynamic infrastructure — the identity of a running process, not a human.
SPIRE is the reference implementation of SPIFFE.
| Term | Meaning |
|---|---|
| SPIFFE ID | URI identity: spiffe://trust-domain/path e.g. spiffe://example.com/order-service |
| SVID | SPIFFE Verifiable Identity Document — an X.509 cert or JWT carrying the SPIFFE ID |
| Trust Domain | Administrative boundary (e.g. your company or cluster) |
| SPIRE Agent | Runs on each node; attests workloads and delivers SVIDs |
| SPIRE Server | Signs SVIDs; manages trust bundles; validates attestation |
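To make the SPIFFE ID format concrete, here is a small parser that splits an ID into trust domain and path. It is a simplified sketch: the character rules follow my reading of the SPIFFE ID specification, and `SpiffeId` and `parse_spiffe_id` are hypothetical names, not part of any SPIFFE library.

```python
import re
from typing import NamedTuple


class SpiffeId(NamedTuple):
    trust_domain: str
    path: str


# Simplified character sets (an assumption; check the SPIFFE ID spec for the full rules).
_TRUST_DOMAIN = re.compile(r"[a-z0-9._-]+")
_SEGMENT = re.compile(r"[A-Za-z0-9._-]+")


def parse_spiffe_id(value: str) -> SpiffeId:
    """Split a SPIFFE ID into (trust_domain, path), raising ValueError if malformed."""
    prefix = "spiffe://"
    if not value.startswith(prefix):
        raise ValueError("SPIFFE ID must use the spiffe:// scheme")
    trust_domain, sep, path = value[len(prefix):].partition("/")
    if not _TRUST_DOMAIN.fullmatch(trust_domain):
        raise ValueError("invalid trust domain: %r" % trust_domain)
    if not sep:
        return SpiffeId(trust_domain, "")
    for segment in path.split("/"):
        # Rejects empty segments (trailing or double slash), dot segments,
        # and anything carrying a query string or fragment.
        if segment in (".", "..") or not _SEGMENT.fullmatch(segment):
            raise ValueError("invalid path segment: %r" % segment)
    return SpiffeId(trust_domain, "/" + path)
```

A receiving service would typically compare the parsed trust domain and path against an allow-list after the SVID itself has been verified; parsing alone proves nothing about who sent it.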
graph TD
SS[SPIRE Server] -->|Signs SVIDs| SA[SPIRE Agent on Node]
SA -->|Delivers X.509 SVID| OS[Order Service Pod]
SA -->|Delivers X.509 SVID| PS[Payment Service Pod]
OS -->|mTLS using SVID| PS
Why SPIFFE over plain mTLS certs?
- Automatic rotation (SVIDs are short-lived, often 1 hour)
- Cryptographic workload attestation (proves what the workload is, not just where)
- Works across clouds and clusters
Pattern 3: Kubernetes Service Accounts
Every pod in Kubernetes is assigned a Service Account (SA). The SA token is a JWT mounted at /var/run/secrets/kubernetes.io/serviceaccount/token.
graph LR
K8sAPI[K8s API Server] -->|Issues SA Token| Pod[Pod]
Pod -->|Bearer SA Token| TargetSvc[Target Service]
TargetSvc -->|TokenReview API| K8sAPI
| Aspect | Details |
|---|---|
| Token format | JWT (since K8s 1.21: bound, audience-limited, time-limited) |
| Mounted at | /var/run/secrets/kubernetes.io/serviceaccount/token |
| Audience | Bound to specific audience since K8s 1.21 (prevents token reuse across services) |
| Rotation | Automatic since K8s 1.21 (kubelet rotates before expiry) |
| Validation | Via K8s TokenReview API or OIDC discovery endpoint |
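The TokenReview validation path can be sketched as two helpers: one builds the request body that the target service posts to the API server, the other interprets the response. The helper names are hypothetical, and the HTTP call itself is omitted because it depends on the client library in use.

```python
from typing import Any, Dict, Mapping, Optional


def build_token_review(token: str, audience: str) -> Dict[str, Any]:
    """Body for a TokenReview create request (authentication.k8s.io/v1)."""
    return {
        "apiVersion": "authentication.k8s.io/v1",
        "kind": "TokenReview",
        "spec": {"token": token, "audiences": [audience]},
    }


def caller_identity(review: Mapping[str, Any], audience: str) -> Optional[str]:
    """Return the authenticated username, or None if the token must be rejected."""
    status = review.get("status", {})
    if not status.get("authenticated"):
        return None
    # The API server echoes back the audiences the token is valid for;
    # require ours to be among them so tokens for other services are refused.
    if audience not in status.get("audiences", []):
        return None
    return status.get("user", {}).get("username")
```

For a service account, the returned username has the form `system:serviceaccount:<namespace>:<name>`, which the target service can match against the callers it expects.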
Legacy SA tokens are long-lived and unbound
Before Kubernetes 1.21, service account tokens never expired and carried no audience restriction. On older clusters, set the API server's --service-account-issuer flag and request bound tokens through projected service account token volumes. For pods that don't need a token, disable auto-mounting with automountServiceAccountToken: false.
Pattern 4: OAuth2 Client Credentials (Machine-to-Machine)
For services calling external or third-party APIs, the OAuth2 Client Credentials grant is the standard choice.
sequenceDiagram
participant S as Service A
participant AS as Auth Server
participant API as External API
S->>AS: POST /token · client_id · client_secret · grant_type=client_credentials
AS->>S: Access Token (JWT)
S->>API: GET /resource · Bearer access_token
API->>S: Protected resource
| Aspect | Details |
|---|---|
| No user involved | Machine-to-machine only |
| Secret storage | Store client_secret in Vault / Secrets Manager — never in code |
| Token caching | Cache access token until exp - buffer; avoid requesting new token every call |
| Scopes | Use narrow scopes — minimum permissions principle |
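The "cache until exp - buffer" rule from the table can be sketched as a small wrapper around whatever function performs the token request. `TokenCache` and its `fetch` callable are hypothetical; `fetch` is assumed to return the access token together with its absolute expiry time in epoch seconds.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Reuses an access token until shortly before it expires."""

    def __init__(self,
                 fetch: Callable[[], Tuple[str, float]],
                 buffer_seconds: float = 30.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch            # performs the client_credentials request
        self._buffer = buffer_seconds  # refresh this long before exp
        self._clock = clock            # injectable so tests can control time
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Refresh when nothing is cached yet or the token is inside the buffer window.
        if self._token is None or self._clock() >= self._expires_at - self._buffer:
            self._token, self._expires_at = self._fetch()
        return self._token
```

The buffer keeps a token from expiring while a request is in flight. This sketch is not thread-safe; a service with concurrent callers would guard `get` with a lock so that only one refresh runs at a time.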
Comparison: Which Pattern to Use?
| Scenario | Recommended Pattern |
|---|---|
| Microservices in same K8s cluster with Istio | mTLS via Istio (automatic, zero-code) |
| Microservices across clusters or clouds | SPIFFE/SPIRE (cross-boundary workload identity) |
| Pod calling K8s API directly | Kubernetes Service Account token |
| Service calling external third-party API | OAuth2 Client Credentials |
| On-prem services without service mesh | mTLS with manually managed certs |
Short-Lived Credentials — Why They Matter
| Credential Type | Recommended Lifetime | If Compromised |
|---|---|---|
| Access Token (JWT) | 5–15 minutes | Useless after expiry |
| SPIFFE SVID (X.509) | 1 hour | Useless after expiry |
| K8s SA token (bound) | 1 hour | Useless after expiry |
| Client Secret (OAuth2) | Rotate every 90 days | Must rotate immediately |
| Long-lived API key | ❌ Avoid | Attacker has persistent access |
Design for automatic rotation
The longer a credential lives, the larger the blast radius if it leaks. If your credentials rotate automatically every hour, a stolen credential expires within the hour, often before the leak is even detected.
What is the difference between mTLS and a JWT Bearer token for service-to-service auth?
mTLS authenticates at the transport layer using X.509 certificates, so authentication happens before any HTTP is sent. JWT Bearer authenticates at the application layer: the service validates the token in the Authorization header. mTLS is stronger (it cannot be stripped once enforced) but requires a PKI. JWTs are easier to implement but must be validated correctly in each service.
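As an illustration of what "validated correctly" involves on the JWT side, the sketch below checks the registered claims of a token whose signature is assumed to have already been verified by a JWT library. The function name `check_claims` and its parameters are hypothetical.

```python
import time
from typing import Any, Mapping, Optional


def check_claims(claims: Mapping[str, Any],
                 issuer: str,
                 audience: str,
                 now: Optional[float] = None,
                 leeway: float = 0.0) -> None:
    """Raise ValueError unless iss, aud, exp and nbf are acceptable.

    Assumes the signature was verified beforehand; this covers claims only.
    """
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")
    aud = claims.get("aud")
    # "aud" may be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        raise ValueError("token was not issued for this service")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp + leeway:
        raise ValueError("token expired or missing exp")
    nbf = claims.get("nbf")
    if nbf is not None and now + leeway < nbf:
        raise ValueError("token not yet valid")
```

Skipping the audience check is a common mistake: a token minted for one service would then be accepted by every other service that trusts the same issuer.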
What is SPIFFE and when would you use it over plain mTLS?
SPIFFE is a workload identity standard that assigns a cryptographic identity (spiffe://trust-domain/workload) to running processes. Use it when you need: cross-cluster identity, automatic cert rotation, or multi-cloud environments. Plain mTLS with manually managed certs works for single-cluster setups but becomes operationally painful at scale — SPIRE solves that with automated attestation and rotation.