Bulkhead Pattern — Deep Dive
Level: Intermediate
Pre-reading: 06 · Resilience & Reliability
What is a Bulkhead?
The bulkhead pattern partitions a system's resources so that a failure in one area can't exhaust them for everything else. It's named after the watertight compartments of a ship's hull: if one compartment floods, the others stay dry.
graph TD
subgraph Without Bulkhead
A1[All requests] --> P1[Shared Thread Pool]
P1 --> S1[Service A - Slow]
P1 --> S2[Service B - Fast]
P1 --> S3[Service C - Fast]
end
graph TD
subgraph With Bulkhead
A2[Requests] --> PA[Pool A - 10 threads]
A2 --> PB[Pool B - 10 threads]
A2 --> PC[Pool C - 10 threads]
PA --> SA[Service A - Slow]
PB --> SB[Service B - Fast]
PC --> SC[Service C - Fast]
end
Bulkhead Types
Thread Pool Bulkhead
Each service call has a dedicated thread pool.
| Service | Thread Pool | Max Concurrent |
|---|---|---|
| Payment Service | payment-pool | 20 threads |
| Inventory Service | inventory-pool | 10 threads |
| Notification Service | notification-pool | 5 threads |
Benefit: a slow service can exhaust only its own pool; the other services keep their threads.
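The same isolation can be sketched with plain `java.util.concurrent`, no framework required. The pool and queue sizes and the `paymentPool` name are illustrative:

```java
import java.util.concurrent.*;

public class ThreadPoolBulkheadSketch {
    // A dedicated, bounded pool acting as a bulkhead for one downstream service.
    // Capacity = 2 running threads + 2 queued tasks; anything beyond is rejected.
    public static int rejectedUnderBurst() {
        ThreadPoolExecutor paymentPool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2),            // small queue: saturates quickly
                new ThreadPoolExecutor.AbortPolicy());  // reject instead of blocking the caller
        CountDownLatch blocker = new CountDownLatch(1); // simulates a slow downstream call
        int rejected = 0;
        for (int i = 0; i < 6; i++) {                   // burst of 6 against capacity 4
            try {
                paymentPool.execute(() -> {
                    try { blocker.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                });
            } catch (RejectedExecutionException full) {
                rejected++;                             // bulkhead full: caller fails fast
            }
        }
        blocker.countDown();                            // let the "slow" calls finish
        paymentPool.shutdown();
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected=" + rejectedUnderBurst()); // rejected=2
    }
}
```

The two rejected submissions fail in microseconds instead of tying up caller threads, which is exactly the behavior the pattern is after.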
Semaphore Bulkhead
Limit concurrent calls without dedicated threads. Lighter weight.
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

private final Semaphore semaphore = new Semaphore(10); // max 10 concurrent calls

public Result call() throws InterruptedException {
    // Wait up to 100 ms for a permit, then fail fast if the bulkhead is full
    if (semaphore.tryAcquire(100, TimeUnit.MILLISECONDS)) {
        try {
            return doCall();
        } finally {
            semaphore.release(); // always return the permit, even on failure
        }
    }
    throw new BulkheadFullException();
}
Comparison
| Aspect | Thread Pool | Semaphore |
|---|---|---|
| Isolation | Full (separate threads) | Partial (shared threads) |
| Overhead | Higher (context switching) | Lower |
| Timeout | Per-thread timeout | Relies on call timeout |
| Use case | Blocking I/O | Non-blocking / reactive |
Resilience4j Bulkhead
Thread Pool Bulkhead
ThreadPoolBulkheadConfig config = ThreadPoolBulkheadConfig.custom()
.maxThreadPoolSize(10)
.coreThreadPoolSize(5)
.queueCapacity(20)
.keepAliveDuration(Duration.ofMillis(100))
.build();
ThreadPoolBulkhead bulkhead = ThreadPoolBulkhead.of("paymentBulkhead", config);
CompletionStage<PaymentResult> result = bulkhead.executeSupplier(
() -> paymentClient.charge(order)
);
Semaphore Bulkhead
BulkheadConfig config = BulkheadConfig.custom()
.maxConcurrentCalls(10)
.maxWaitDuration(Duration.ofMillis(500))
.build();
Bulkhead bulkhead = Bulkhead.of("paymentBulkhead", config);
Supplier<PaymentResult> decorated = Bulkhead.decorateSupplier(
bulkhead, () -> paymentClient.charge(order)
);
Annotation-Based
@Service
public class PaymentService {
@Bulkhead(name = "paymentBulkhead", type = Bulkhead.Type.SEMAPHORE)
public PaymentResult processPayment(Order order) {
return paymentClient.charge(order);
}
}
Bulkhead Configuration
resilience4j:
bulkhead:
instances:
paymentBulkhead:
maxConcurrentCalls: 10
maxWaitDuration: 500ms
inventoryBulkhead:
maxConcurrentCalls: 20
maxWaitDuration: 0 # Fail immediately
thread-pool-bulkhead:
instances:
paymentBulkhead:
maxThreadPoolSize: 10
coreThreadPoolSize: 5
queueCapacity: 20
Connection Pool Bulkhead
Separate database connection pools for different operations.
@Configuration
public class DataSourceConfig {
    @Bean("ordersDataSource")
    public DataSource ordersDataSource() {
        HikariConfig config = new HikariConfig();
        // jdbcUrl and credentials omitted for brevity; required in practice
        config.setMaximumPoolSize(20);
        config.setPoolName("orders-pool");
        return new HikariDataSource(config);
    }

    @Bean("analyticsDataSource")
    public DataSource analyticsDataSource() {
        HikariConfig config = new HikariConfig();
        // jdbcUrl and credentials omitted for brevity; required in practice
        config.setMaximumPoolSize(5); // smaller pool caps analytics load
        config.setPoolName("analytics-pool");
        return new HikariDataSource(config);
    }
}
Benefit: Slow analytics queries don't exhaust connections for orders.
Process Bulkhead
Separate critical and non-critical workloads at the deployment level.
graph TD
subgraph Critical Path
O[Order Service - 10 replicas]
P[Payment Service - 10 replicas]
end
subgraph Non-Critical
R[Recommendations - 3 replicas]
A[Analytics - 2 replicas]
end
- Deploy critical services with more resources
- Use separate node pools in Kubernetes
- Different scaling policies
Kubernetes Resource Isolation
apiVersion: v1
kind: Pod
metadata:
name: order-service
spec:
containers:
- name: order
resources:
requests:
memory: "512Mi"
cpu: "500m"
limits:
memory: "1Gi"
cpu: "1000m"
nodeSelector:
workload-type: critical # Dedicated nodes
Bulkhead with Other Patterns
Bulkhead + Circuit Breaker
@CircuitBreaker(name = "paymentService")
@Bulkhead(name = "paymentBulkhead")
@Retry(name = "paymentService")
public PaymentResult processPayment(Order order) {
return paymentClient.charge(order);
}
Execution order (Resilience4j's default aspect order): Retry → Circuit Breaker → Bulkhead → Call. The bulkhead is the innermost decorator, so each retry attempt must acquire a bulkhead permit.
Bulkhead + Rate Limiter
@Bulkhead(name = "searchBulkhead")
@RateLimiter(name = "searchRateLimiter")
public SearchResults search(String query) {
return searchClient.search(query);
}
Monitoring
Metrics
# Semaphore bulkhead
resilience4j_bulkhead_available_concurrent_calls{name="paymentBulkhead"} 8
resilience4j_bulkhead_max_allowed_concurrent_calls{name="paymentBulkhead"} 10
# Thread pool bulkhead
resilience4j_thread_pool_bulkhead_queue_depth{name="paymentBulkhead"} 5
resilience4j_thread_pool_bulkhead_current_thread_count{name="paymentBulkhead"} 10
resilience4j_thread_pool_bulkhead_available_queue_capacity{name="paymentBulkhead"} 15
Alerting
| Metric | Alert Condition |
|---|---|
| Available concurrent calls | < 20% of max sustained |
| Queue depth | > 80% capacity |
| Rejected calls | > 0 sustained |
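The first two alert conditions can be expressed as Prometheus alerting rules against the metrics above. This is a sketch: the group name, durations, and thresholds are illustrative.

```yaml
groups:
  - name: bulkhead-alerts
    rules:
      - alert: BulkheadNearExhaustion
        expr: |
          resilience4j_bulkhead_available_concurrent_calls
            / resilience4j_bulkhead_max_allowed_concurrent_calls < 0.2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Bulkhead {{ $labels.name }} has < 20% capacity available"
      - alert: BulkheadQueueBackingUp
        expr: |
          resilience4j_thread_pool_bulkhead_queue_depth
            / (resilience4j_thread_pool_bulkhead_queue_depth
               + resilience4j_thread_pool_bulkhead_available_queue_capacity) > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Thread-pool bulkhead {{ $labels.name }} queue > 80% full"
```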
Sizing Bulkheads
Little's Law
Required concurrency = arrival rate × average latency (L = λ × W).
Example:
- 100 requests/second
- 200 ms average latency
- Needed: 100 × 0.2 = 20 concurrent threads minimum
Add headroom for variance: with 50% headroom, 20 × 1.5 = 30.
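The arithmetic is trivial, but encoding it once keeps the sizing rule explicit. A minimal helper sketch (names are illustrative):

```java
public class BulkheadSizing {
    // Little's Law: required concurrency = arrival rate (req/s) x latency (s),
    // multiplied by a headroom factor to absorb variance.
    public static int size(double reqPerSec, double latencySec, double headroom) {
        return (int) Math.ceil(reqPerSec * latencySec * headroom);
    }

    public static void main(String[] args) {
        // 100 req/s at 200 ms with 50% headroom: 20 x 1.5 = 30 concurrent slots
        System.out.println(size(100, 0.2, 1.5)); // 30
    }
}
```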
Start Conservative
- Begin with low limits
- Monitor queue depth and rejections
- Increase gradually
Common Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| Bulkhead too small | Many rejections | Size based on Little's Law |
| Bulkhead too large | No isolation | Smaller pools per service |
| Single shared pool | No isolation | Separate pools |
| No timeout | Threads blocked forever | Always set timeouts |
| Not monitoring | Don't know when exhausted | Add metrics and alerts |
When should you use thread pool vs semaphore bulkhead?
Use thread pool for blocking I/O operations — it provides full isolation with separate threads. Use semaphore for non-blocking/reactive operations — it's lighter weight and limits concurrency without thread overhead. Most Spring MVC apps use thread pool; WebFlux apps use semaphore.
How do you size a bulkhead?
Use Little's Law: Concurrent = Rate × Latency. For 100 req/s and 200ms latency, you need ~20 concurrent slots. Add 50% headroom = 30. Monitor in production and adjust: if queue is always full, increase; if always empty, decrease.
How does bulkhead differ from rate limiting?
Bulkhead limits concurrent calls — how many requests are in-flight simultaneously. Rate limiter limits throughput — how many requests per time window. Bulkhead prevents resource exhaustion; rate limiter prevents overload. Use both: bulkhead for concurrency, rate limiter for throughput.
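The distinction can be made concrete with a small sketch: a semaphore bulkhead returns its permits when calls finish, while a rate limiter (here a minimal fixed-window counter, not any particular library's implementation) never gives counts back within a window:

```java
import java.util.concurrent.Semaphore;

public class ConcurrencyVsThroughput {
    // Bulkhead: permits are released when a call finishes, so sequential
    // calls never exhaust it; only truly concurrent calls can.
    public static int allowedSequential(int calls, int concurrencyLimit) {
        Semaphore bulkhead = new Semaphore(concurrencyLimit);
        int allowed = 0;
        for (int i = 0; i < calls; i++) {
            if (bulkhead.tryAcquire()) {
                allowed++;          // the call runs and completes...
                bulkhead.release(); // ...returning its permit immediately
            }
        }
        return allowed;
    }

    // Minimal fixed-window rate limiter sketch: within one window, used
    // counts are never returned, so throughput is capped regardless of concurrency.
    public static int allowedInWindow(int calls, int windowLimit) {
        int used = 0, allowed = 0;
        for (int i = 0; i < calls; i++) {
            if (used < windowLimit) {
                used++;
                allowed++;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        System.out.println(allowedSequential(5, 2)); // 5: all sequential calls pass
        System.out.println(allowedInWindow(5, 2));   // 2: window cap rejects the rest
    }
}
```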