Timeout Pattern — Deep Dive
Level: Intermediate
Pre-reading: 06 · Resilience & Reliability
Why Timeouts Matter
Without timeouts, a slow or unresponsive service can:
- Block threads indefinitely
- Exhaust connection pools
- Cascade failures upstream
- Leave users waiting forever
Rule: Every remote call needs a timeout.
Types of Timeouts
| Type | Description | Typical Value |
|---|---|---|
| Connect timeout | Time to establish TCP connection | 1-5 seconds |
| Read timeout | Time to receive response after request sent | 5-30 seconds |
| Write timeout | Time to send request body | 5-10 seconds |
| Request timeout | Total time for entire operation | 10-60 seconds |
sequenceDiagram
participant C as Client
participant S as Server
Note over C,S: Connect timeout window
C->>S: TCP SYN
S->>C: TCP SYN-ACK
C->>S: TCP ACK
Note over C,S: Connection established
Note over C,S: Write timeout window
C->>S: HTTP Request
Note over C,S: Read timeout window
S->>C: HTTP Response
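As a rough illustration of how these timeout types map onto a real client, here is a minimal sketch using the JDK's built-in `java.net.http.HttpClient`. The class name `TimeoutTypes` and the specific durations are assumptions chosen for illustration. The connect timeout is set on the client, a per-request timeout on the request, and an overall budget is layered on top with `CompletableFuture.orTimeout`; no write timeout is configured in this sketch.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; names and values are illustrative only.
final class TimeoutTypes {
    static final Duration CONNECT_TIMEOUT = Duration.ofSeconds(2);
    static final Duration RESPONSE_TIMEOUT = Duration.ofSeconds(10);
    static final Duration TOTAL_BUDGET = Duration.ofSeconds(15);

    private TimeoutTypes() {}

    // Connect timeout: bounds the time spent establishing the connection.
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .connectTimeout(CONNECT_TIMEOUT)
                .build();
    }

    // Per-request timeout: bounds the wait for a response to this request.
    static HttpRequest newRequest(URI uri) {
        return HttpRequest.newBuilder(uri)
                .timeout(RESPONSE_TIMEOUT)
                .GET()
                .build();
    }

    // Overall budget: the returned future fails with TimeoutException once
    // TOTAL_BUDGET elapses. This bounds the caller's wait; it does not by
    // itself abort the underlying HTTP exchange.
    static CompletableFuture<String> fetch(HttpClient client, HttpRequest request) {
        return client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body)
                .orTimeout(TOTAL_BUDGET.toMillis(), TimeUnit.MILLISECONDS);
    }
}
```

Keeping the three limits as separate constants makes it easier to tune them independently per downstream service.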
Configuring Timeouts
RestTemplate
@Bean
public RestTemplate restTemplate() {
    HttpComponentsClientHttpRequestFactory factory =
            new HttpComponentsClientHttpRequestFactory();
    factory.setConnectTimeout(5000); // 5 seconds (value in milliseconds)
    factory.setReadTimeout(10000);   // 10 seconds (value in milliseconds)
    return new RestTemplate(factory);
}
WebClient
@Bean
public WebClient webClient() {
    HttpClient httpClient = HttpClient.create()
            .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 5000) // 5 seconds
            .responseTimeout(Duration.ofSeconds(10));
    return WebClient.builder()
            .clientConnector(new ReactorClientHttpConnector(httpClient))
            .build();
}
Feign
feign:
  client:
    config:
      default:
        connectTimeout: 5000
        readTimeout: 10000
      payment-service:
        readTimeout: 30000 # Longer for payment
HTTP Client (Apache)
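// Apache HttpClient 4.x API; all values are in milliseconds
RequestConfig config = RequestConfig.custom()
        .setConnectTimeout(5000)            // Time to establish the connection
        .setSocketTimeout(10000)            // Maximum inactivity between data packets (read timeout)
        .setConnectionRequestTimeout(3000)  // Time to get connection from pool
        .build();
CloseableHttpClient client = HttpClients.custom()
        .setDefaultRequestConfig(config)
        .build();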
Deadline Propagation
Pass the remaining time budget through the service call chain. If the client allows 10s in total and Service A spends 2s on its own processing, only 8s remain for every hop after it.
sequenceDiagram
participant C as Client
participant A as Service A
participant B as Service B
participant D as Service C
Note over C: Deadline: T+10s
C->>A: Request (X-Request-Deadline: T+10s)
Note over A: Remaining: 10s
A->>A: Process (2s)
Note over A: Remaining: 8s
A->>B: Request (X-Request-Deadline: T+10s)
Note over B: Remaining: 8s
B->>B: Process (3s)
Note over B: Remaining: 5s
B->>D: Request (X-Request-Deadline: T+10s)
Note over D: Remaining: 5s
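The arithmetic in the diagram can be sketched as a small helper: each hop takes the smaller of its own configured timeout and whatever remains of the propagated deadline. `HopBudget` and `effectiveTimeout` are hypothetical names introduced for illustration. The current time is passed in explicitly so the calculation is easy to test.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper illustrating the remaining-budget calculation.
final class HopBudget {
    private HopBudget() {}

    /**
     * Returns the timeout to use for the next downstream call:
     * min(remaining time until the deadline, this hop's own maximum).
     * Fails fast if the deadline has already passed.
     */
    static Duration effectiveTimeout(Instant deadline, Instant now, Duration localMax) {
        Duration remaining = Duration.between(now, deadline);
        if (remaining.isZero() || remaining.isNegative()) {
            throw new IllegalStateException("Deadline already exceeded");
        }
        return remaining.compareTo(localMax) < 0 ? remaining : localMax;
    }
}
```

With a deadline of T+10s, a call made at T+2s with a local maximum of 15s gets 8s, and a call made at T+5s gets 5s, which matches the "Remaining" notes in the diagram.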
Implementation
@Component
public class DeadlineInterceptor implements ClientHttpRequestInterceptor {
    @Override
    public ClientHttpResponse intercept(HttpRequest request, byte[] body,
            ClientHttpRequestExecution execution) throws IOException {
        // Get deadline from context
        Instant deadline = DeadlineContext.getDeadline();
        if (deadline != null) {
            // Instant.toString() produces an ISO-8601 UTC timestamp
            request.getHeaders().set("X-Request-Deadline", deadline.toString());
        }
        return execution.execute(request, body);
    }
}

// Check deadline before expensive operations
public Order processOrder(OrderRequest request) {
    if (DeadlineContext.isExpired()) {
        throw new DeadlineExceededException();
    }
    // Continue processing...
}
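The snippets above call `DeadlineContext` without showing it. One possible shape, offered only as a sketch, is a `ThreadLocal` holder that an inbound filter populates from the `X-Request-Deadline` header. The methods `setFromHeader`, `remaining`, and `clear` are assumptions added for illustration and are not part of the original code.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

// Hypothetical sketch of the DeadlineContext used above (thread-per-request model assumed).
final class DeadlineContext {
    private static final ThreadLocal<Instant> DEADLINE = new ThreadLocal<>();

    private DeadlineContext() {}

    static void setDeadline(Instant deadline) {
        DEADLINE.set(deadline);
    }

    // Returns null when no deadline was propagated.
    static Instant getDeadline() {
        return DEADLINE.get();
    }

    // Parses an ISO-8601 instant such as the one written by Instant.toString().
    static void setFromHeader(String headerValue) {
        if (headerValue == null || headerValue.isBlank()) {
            clear();
        } else {
            setDeadline(Instant.parse(headerValue));
        }
    }

    // A missing deadline is treated as "not expired".
    static boolean isExpired() {
        Instant deadline = DEADLINE.get();
        return deadline != null && !Instant.now().isBefore(deadline);
    }

    // Remaining budget, never negative; empty when no deadline is set.
    static Optional<Duration> remaining() {
        Instant deadline = DEADLINE.get();
        if (deadline == null) {
            return Optional.empty();
        }
        Duration left = Duration.between(Instant.now(), deadline);
        return Optional.of(left.isNegative() ? Duration.ZERO : left);
    }

    // Call at the end of each request so pooled threads do not leak deadlines.
    static void clear() {
        DEADLINE.remove();
    }
}
```

An absolute timestamp in the header depends on reasonably synchronized clocks between services; sending the remaining duration instead is a common alternative. A `ThreadLocal` also does not follow work handed to other threads, so asynchronous code would need to pass the deadline explicitly.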
gRPC Deadlines
gRPC has built-in deadline support:
// Client sets deadline
OrderServiceGrpc.OrderServiceBlockingStub stub = OrderServiceGrpc
        .newBlockingStub(channel)
        .withDeadlineAfter(10, TimeUnit.SECONDS);

// Server checks remaining time; getDeadline() returns null if the caller set no deadline
Deadline deadline = Context.current().getDeadline();
if (deadline != null && deadline.timeRemaining(TimeUnit.MILLISECONDS) <= 0) {
    throw Status.DEADLINE_EXCEEDED.asRuntimeException();
}
Timeout vs Circuit Breaker
| Aspect | Timeout | Circuit Breaker |
|---|---|---|
| Purpose | Limit wait time for single call | Prevent calls to failing service |
| Scope | Per request | Across requests |
| When triggers | Call takes too long | Failure rate exceeds threshold |
| Result | Request fails | Request rejected immediately |
Use together: Timeout causes individual calls to fail; circuit breaker counts timeouts as failures.
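To make the interaction concrete, here is a deliberately simplified, hand-rolled sketch (not Resilience4j) in which every timed-out call counts as a failure and the breaker opens after a configurable number of consecutive failures. The class `TimeoutCountingBreaker` is hypothetical, and it omits the half-open and reset behavior that a production circuit breaker would need.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical, simplified breaker: opens after N consecutive failures and never resets.
final class TimeoutCountingBreaker {
    private final int failureThreshold;
    private final long timeoutMillis;
    private int consecutiveFailures = 0;

    TimeoutCountingBreaker(int failureThreshold, long timeoutMillis) {
        this.failureThreshold = failureThreshold;
        this.timeoutMillis = timeoutMillis;
    }

    synchronized boolean isOpen() {
        return consecutiveFailures >= failureThreshold;
    }

    private synchronized void recordSuccess() {
        consecutiveFailures = 0;
    }

    private synchronized void recordFailure() {
        consecutiveFailures++;
    }

    <T> T call(Supplier<T> remoteCall) throws Exception {
        if (isOpen()) {
            // Circuit breaker: reject immediately without invoking the remote call.
            throw new IllegalStateException("Circuit open: call rejected");
        }
        CompletableFuture<T> future = CompletableFuture.supplyAsync(remoteCall);
        try {
            // Timeout: limit how long this single call may take.
            T result = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            recordSuccess();
            return result;
        } catch (TimeoutException e) {
            future.cancel(true); // marks the future cancelled; the running task is not interrupted
            recordFailure();     // the timeout counts as a failure for the breaker
            throw e;
        } catch (ExecutionException e) {
            recordFailure();
            throw e;
        }
    }
}
```

The timeout bounds each individual call, while the failure counter turns repeated timeouts into immediate rejections once the threshold is reached.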
Timeout Strategy by Service
| Service Type | Timeout | Rationale |
|---|---|---|
| Database | 5-10s | Queries should be fast |
| Cache (Redis) | 100-500ms | Cache should be very fast |
| Internal service | 5-15s | Within cluster, should be fast |
| External API | 15-30s | Third-party may be slow |
| Payment gateway | 30-60s | May involve bank processing |
| File upload | 60-120s | Large files take time |
Handling Timeout Errors
try {
    return restTemplate.getForObject(url, Order.class);
} catch (ResourceAccessException e) {
    // Check the connect timeout first: in some client versions (e.g. Apache HttpClient 5)
    // ConnectTimeoutException extends SocketTimeoutException and would otherwise be misclassified.
    if (e.getCause() instanceof ConnectTimeoutException) {
        // Connect timeout
        log.error("Cannot connect to order service: {}", url);
        throw new ServiceUnavailableException("Order service unreachable", e);
    }
    if (e.getCause() instanceof SocketTimeoutException) {
        // Read timeout
        log.warn("Timeout calling order service: {}", url);
        throw new ServiceTimeoutException("Order service timeout", e);
    }
    throw e;
}
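The same connect-versus-read distinction applies to other clients. As a sketch under the assumption that the JDK's `java.net.http.HttpClient` is in use, the classifier below walks the cause chain and checks the more specific exception first, since `HttpConnectTimeoutException` is a subclass of `HttpTimeoutException`. `TimeoutClassifier` and its `Kind` enum are hypothetical names.

```java
import java.net.SocketTimeoutException;
import java.net.http.HttpConnectTimeoutException;
import java.net.http.HttpTimeoutException;

// Hypothetical helper that maps an exception chain to a timeout category.
final class TimeoutClassifier {
    enum Kind { CONNECT_TIMEOUT, RESPONSE_TIMEOUT, OTHER }

    private TimeoutClassifier() {}

    static Kind classify(Throwable error) {
        // Bound the walk so an unusual cause cycle cannot loop forever.
        int hops = 0;
        for (Throwable t = error; t != null && hops < 20; t = t.getCause(), hops++) {
            // Most specific type first: a connect timeout is also an HttpTimeoutException.
            if (t instanceof HttpConnectTimeoutException) {
                return Kind.CONNECT_TIMEOUT;
            }
            if (t instanceof HttpTimeoutException || t instanceof SocketTimeoutException) {
                return Kind.RESPONSE_TIMEOUT;
            }
        }
        return Kind.OTHER;
    }
}
```

Walking the cause chain matters because asynchronous APIs usually wrap the original exception, for example in a `CompletionException`.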
Resilience4j TimeLimiter
TimeLimiterConfig config = TimeLimiterConfig.custom()
        .timeoutDuration(Duration.ofSeconds(10))
        .cancelRunningFuture(true) // Cancels the future on timeout; a CompletableFuture's task is not interrupted
        .build();
TimeLimiter timeLimiter = TimeLimiter.of("orderService", config);
CompletableFuture<Order> future = CompletableFuture.supplyAsync(
        () -> orderClient.getOrder(orderId)
);
// Throws TimeoutException if the future does not complete within 10 seconds
Order order = timeLimiter.executeFutureSupplier(() -> future);
With Annotation
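@TimeLimiter(name = "orderService", fallbackMethod = "orderFallback")
public CompletableFuture<Order> getOrder(String orderId) {
    return CompletableFuture.supplyAsync(() ->
            orderClient.getOrder(orderId)
    );
}
// Illustrative fallback (an assumption): same parameters plus the exception, same return type
private CompletableFuture<Order> orderFallback(String orderId, Throwable cause) {
    return CompletableFuture.failedFuture(
            new ServiceTimeoutException("Order service timeout", cause));
}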
Common Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| No timeout configured | Threads blocked forever | Always set timeouts |
| Timeout too long | Poor user experience | Match SLA requirements |
| Timeout too short | False failures | Allow for normal variance |
| Same timeout everywhere | Suboptimal behavior | Tune per service |
| Ignoring deadline propagation | Chain exceeds total budget | Pass remaining time |
Timeout Anti-Pattern: The Timeout Cliff
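When every service in a chain uses the same timeout (for example, 10s at each hop), a caller can give up just before a successful response arrives.
If Service C takes 9s and Service B spends more than 1s on its own work and network overhead, Service A's 10s timeout on B fires before B can return the response, even though C was "successful."
Fix: Decrease timeouts down the chain, so that each caller's timeout exceeds its callee's timeout plus the caller's own processing time.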
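One way to derive such a staggered plan, sketched here under the assumption that each hop can estimate its own processing time, is to start from the outermost timeout and subtract each hop's own work before setting the timeout for the next call. `StaggeredTimeouts.plan` is a hypothetical helper, not an existing library function.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that derives decreasing timeouts for a call chain.
final class StaggeredTimeouts {
    private StaggeredTimeouts() {}

    /**
     * @param outermost     timeout the first caller applies to the first service
     * @param ownWorkPerHop time each service reserves for its own processing,
     *                      in call order (first service, second service, ...)
     * @return timeouts in call order: the outermost one first, then the timeout
     *         each service should use when calling the next service
     */
    static List<Duration> plan(Duration outermost, List<Duration> ownWorkPerHop) {
        List<Duration> timeouts = new ArrayList<>();
        Duration current = outermost;
        timeouts.add(current);
        for (Duration ownWork : ownWorkPerHop) {
            current = current.minus(ownWork);
            if (current.isZero() || current.isNegative()) {
                throw new IllegalArgumentException("No time budget left for downstream call");
            }
            timeouts.add(current);
        }
        return timeouts;
    }
}
```

For an outermost timeout of 10s with Service A reserving 2s and Service B reserving 3s, the plan is 10s for Client to A, 8s for A to B, and 5s for B to C, so a downstream call always times out before its caller does.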
How do you choose the right timeout value?
Consider: (1) p99 latency of the downstream service — set timeout 2-3x higher. (2) User experience — how long can users wait? (3) SLA requirements — what's the total latency budget? (4) Service type — cache should be fast (500ms), payment can be slow (30s). Start conservative; tune based on production data.
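The p99-based rule of thumb can be sketched as a small calculation: take the 99th percentile of observed latencies, multiply it by a safety factor, and cap the result at the overall latency budget. `TimeoutChooser` is a hypothetical helper, and the nearest-rank percentile method is just one reasonable choice.

```java
import java.time.Duration;
import java.util.Arrays;

// Hypothetical helper: derive a timeout from observed latencies.
final class TimeoutChooser {
    private TimeoutChooser() {}

    // 99th percentile using the nearest-rank method on a sorted copy of the samples.
    static long p99Millis(long[] latenciesMillis) {
        if (latenciesMillis.length == 0) {
            throw new IllegalArgumentException("Need at least one latency sample");
        }
        long[] sorted = latenciesMillis.clone();
        Arrays.sort(sorted);
        int rank = (99 * sorted.length + 99) / 100; // ceil(0.99 * n), 1-based
        return sorted[rank - 1];
    }

    // timeout = min(p99 * multiplier, budget)
    static Duration suggest(long[] latenciesMillis, int multiplier, Duration budget) {
        Duration fromP99 = Duration.ofMillis(p99Millis(latenciesMillis) * multiplier);
        return fromP99.compareTo(budget) < 0 ? fromP99 : budget;
    }
}
```

With samples of 1 to 100 ms, the p99 is 99 ms; a multiplier of 3 suggests 297 ms, unless the budget is smaller, in which case the budget wins.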
What is deadline propagation and why is it important?
Deadline propagation passes the remaining time budget through a call chain. Without it, each service has its own timeout, and the total time can far exceed what the client expects. With it, each service knows how much time remains and can fail early if the deadline is already exceeded.
What happens to the request on the server when the client times out?
The server usually doesn't know the client gave up, so it may continue processing. This wastes resources and can cause problems if the operation has side effects, for example when the client retries a request that actually succeeded. Mitigations: (1) Use deadline propagation so the server checks the deadline and aborts. (2) Design for idempotency. (3) Use request cancellation (gRPC supports this).
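As an illustration of the idempotency mitigation, the sketch below deduplicates requests by a client-supplied idempotency key, so a retry after a client-side timeout returns the original result instead of repeating the side effect. `IdempotentPayments` is hypothetical and keeps results in memory; a real service would need a durable, shared store.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory sketch of idempotent request handling.
final class IdempotentPayments {
    private final ConcurrentMap<String, String> receiptsByKey = new ConcurrentHashMap<>();
    private final AtomicInteger charges = new AtomicInteger();

    // Performs the charge at most once per idempotency key; repeats return the stored receipt.
    String charge(String idempotencyKey, long amountCents) {
        return receiptsByKey.computeIfAbsent(idempotencyKey, key -> {
            int chargeNumber = charges.incrementAndGet(); // stands in for the real side effect
            return "receipt-" + chargeNumber + ":" + amountCents;
        });
    }

    int chargeCount() {
        return charges.get();
    }
}
```

If the client times out and retries with the same key, the second call returns the first receipt and no second charge is made.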