06 · Read Replicas — Scale Read-Heavy Workloads Horizontally
The Problem
A single primary database becomes a bottleneck when read QPS grows. Vertical scaling (bigger machines) has limits and is expensive.
Read Replicas
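A read replica is a copy of the primary database that serves read-only traffic. All writes still go to the primary, which ships its changes to each replica. Replication is usually asynchronous, so a replica can lag slightly behind the primary.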
graph LR
App -->|Writes| Primary[(Primary)]
App -->|Reads| R1[(Read Replica 1)]
App -->|Reads| R2[(Read Replica 2)]
Primary -->|Async replication| R1
Primary -->|Async replication| R2
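To make the effect of asynchronous replication concrete, here is a toy in-memory sketch. The `Primary` and `Replica` classes and the `catch_up` method are hypothetical names invented for illustration; a real database ships WAL or binlog records over the network rather than sharing a Python list.

```python
class Primary:
    """Toy primary: applies writes immediately and appends them to a change log."""

    def __init__(self):
        self.data = {}
        self.log = []  # ordered (key, value) changes, standing in for a WAL

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    """Toy read-only replica that replays the primary's log when asked to."""

    def __init__(self, primary):
        self.primary = primary
        self.data = {}
        self.applied = 0  # number of log entries replayed so far

    def catch_up(self):
        # Replay every log entry not yet applied (this happens in the
        # background, some time after the write, in a real system).
        for key, value in self.primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.log)

    def read(self, key):
        return self.data.get(key)


primary = Primary()
replica = Replica(primary)

primary.write("user:1", "alice")
stale = replica.read("user:1")   # replica has not replayed the write yet
replica.catch_up()
fresh = replica.read("user:1")   # the write is now visible on the replica
print(stale, fresh)
```

The gap between the write and `catch_up()` is the replication lag; any read served inside that gap sees the old state.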
When to Use
- Reports and analytics queries that are expensive but don't need fresh data
- Search / autocomplete hitting the same tables as transactional writes
- Multi-region reads — serve users from the nearest replica
When NOT to Use
- When the application requires read-your-own-writes consistency, since a lagging replica may not yet have the write it is asked to read back
- Real-time dashboards that cannot tolerate any lag
- Low-latency OLTP reads — a well-indexed primary handles this better
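When only some flows need read-your-own-writes, one common workaround is to pin a user's reads to the primary for a short window after that user writes. The sketch below assumes a fixed pin window and a hypothetical `StickyRouter` class; the 5-second window is an arbitrary illustration, not a recommended value.

```python
import time


class StickyRouter:
    """Send a user's reads to the primary for a short window after they write."""

    def __init__(self, pin_seconds=5.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds
        self.clock = clock
        self.last_write = {}  # user_id -> time of that user's most recent write

    def record_write(self, user_id):
        self.last_write[user_id] = self.clock()

    def target_for_read(self, user_id):
        wrote_at = self.last_write.get(user_id)
        if wrote_at is not None and self.clock() - wrote_at < self.pin_seconds:
            return "primary"  # a replica may not have this user's write yet
        return "replica"


# Fake clock so the example is deterministic.
now = [100.0]
router = StickyRouter(pin_seconds=5.0, clock=lambda: now[0])

before_write = router.target_for_read("u1")
router.record_write("u1")
right_after = router.target_for_read("u1")
now[0] += 6.0  # advance past the pin window
after_window = router.target_for_read("u1")
print(before_write, right_after, after_window)
```

The pin window has to exceed the worst replication lag you expect, which is why workloads that cannot tolerate any lag are better served by the primary alone.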
Cloud Implementations
PostgreSQL (AWS RDS):
- Up to 15 read replicas
- Streaming WAL-based replication
- Route via a separate read endpoint or a connection pooler (PgBouncer)
Google Cloud Spanner:
- Read-only replicas placed in any region
- Strong reads vs. stale reads (bounded staleness for lower latency)
- max_staleness hint: READS WITH max_staleness = INTERVAL '15' SECOND
Amazon DynamoDB:
- Global Tables = multi-region replicas
- ConsistentRead: false for eventually consistent (cheaper) reads from any replica
Apache Cassandra:
- Every node can serve reads; the replication factor (RF) determines how many copies exist
- ConsistencyLevel.ONE reads from the nearest replica
MongoDB:
- readPreference: secondary routes reads to secondaries
- maxStalenessSeconds bounds acceptable lag
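The bounded-staleness idea that recurs in the list above can be sketched independently of any one database. The example below is a minimal illustration, assuming the application already has a lag measurement per replica; the function name `choose_read_target` and the replica names are hypothetical.

```python
def choose_read_target(replica_lag_seconds, max_staleness_seconds):
    """Return the least-lagged replica within the bound, else "primary".

    replica_lag_seconds: dict of replica name -> measured lag in seconds.
    """
    eligible = {
        name: lag
        for name, lag in replica_lag_seconds.items()
        if lag <= max_staleness_seconds
    }
    if not eligible:
        return "primary"  # no replica is fresh enough, so pay for a strong read
    return min(eligible, key=eligible.get)


lags = {"replica-1": 2.0, "replica-2": 40.0}
loose = choose_read_target(lags, max_staleness_seconds=15)
strict = choose_read_target(lags, max_staleness_seconds=1)
print(loose, strict)
```

A looser bound lets more reads stay on replicas; a tighter bound pushes them back to the primary, which is the latency and cost trade-off behind stale reads.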
Connection Routing Patterns
| Pattern | How |
|---|---|
| Application-level | App chooses primary or replica connection string per operation |
| Proxy-level | ProxySQL routes by query rules; PgBouncer and RDS Proxy pool connections, with reads sent through a separate read-only endpoint |
| ORM-level | Django database routers (DATABASE_ROUTERS), Spring @Transactional(readOnly=true) with a routing DataSource |
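As an illustration of the ORM-level row, here is a minimal sketch of a Django-style database router. The four method names are the ones Django's router protocol calls; the class name and the aliases `default`, `replica1`, and `replica2` are assumptions and would need to match the keys of the project's `DATABASES` setting.

```python
import random


class PrimaryReplicaRouter:
    """Django-style database router: reads go to a replica, writes to the primary."""

    # Hypothetical aliases; they must match keys in the project's DATABASES setting.
    primary_alias = "default"
    replica_aliases = ["replica1", "replica2"]

    def db_for_read(self, model, **hints):
        # Spread reads across replicas at random.
        return random.choice(self.replica_aliases)

    def db_for_write(self, model, **hints):
        return self.primary_alias

    def allow_relation(self, obj1, obj2, **hints):
        # All aliases hold the same data, so relations between them are fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Migrate only the primary; replicas receive schema changes via replication.
        return db == self.primary_alias


router = PrimaryReplicaRouter()
print(router.db_for_write(model=None))
print(router.db_for_read(model=None))
```

In a Django project the class would be registered through the `DATABASE_ROUTERS` setting using its dotted import path. The class itself needs no Django import, so it can be unit-tested in isolation.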