07 · Database Caching — Serve Hot Data from Memory Layer
Scaling Reads · Topic 7 of 7
Why Cache?
Disk reads: ~1ms. Memory reads: ~100ns. A cache sits between your application and database and serves frequently accessed data at memory speed.
Cache Strategies
Cache-Aside (Lazy Loading)
Application checks cache first. On miss, reads from DB and populates cache.
```python
def get_user(user_id):
    # `cache` and `db` are assumed client handles (e.g. a Redis client and a SQL driver)
    user = cache.get(f"user:{user_id}")
    if user is None:  # cache miss: fall back to the database
        user = db.query("SELECT * FROM users WHERE id = %s", user_id)
        cache.set(f"user:{user_id}", user, ttl=300)  # populate with a 5-minute TTL
    return user
```
Write-Through
Every write goes to both the cache and the database synchronously, in the same operation (see the sketch below).
- ✅ Cache always consistent
- ❌ Write latency doubles
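A minimal sketch of write-through in application code, using the same assumed `cache` and `db` handles as the cache-aside example; `save_user` and the `users` columns are illustrative:

```python
def save_user(user):
    # Persist to the database first...
    db.execute(
        "UPDATE users SET name = %s, email = %s WHERE id = %s",
        user["name"], user["email"], user["id"],
    )
    # ...then update the cache in the same request path, so readers never
    # see a stale entry. The cost: every write pays both round trips.
    cache.set(f"user:{user['id']}", user, ttl=300)
```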
Write-Behind (Write-Back)
Write to cache first; the database write happens asynchronously (see the sketch after this list).
- ✅ Fast writes
- ❌ Risk of data loss if cache fails before DB write
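A rough sketch of write-behind using an in-process queue and a background thread (production systems typically rely on a durable queue or the cache's own replication log instead); `cache`, `db`, and the helper names are again illustrative:

```python
import queue
import threading

write_queue = queue.Queue()

def save_user_write_behind(user):
    # Acknowledge after the cache write; the DB write is deferred.
    cache.set(f"user:{user['id']}", user, ttl=300)
    write_queue.put(user)  # queued for asynchronous persistence

def db_writer():
    # Background thread drains the queue and flushes to the database.
    # If the process dies before a queued write is flushed, that write is lost.
    while True:
        user = write_queue.get()
        db.execute(
            "UPDATE users SET name = %s, email = %s WHERE id = %s",
            user["name"], user["email"], user["id"],
        )
        write_queue.task_done()

threading.Thread(target=db_writer, daemon=True).start()
```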
Read-Through
The cache itself loads data from the database on a miss; effectively cache-aside performed by the cache layer rather than by the application.
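One way a read-through layer can be sketched, assuming the same `cache` and `db` handles; `ReadThroughCache` is a hypothetical wrapper, not a particular library's API:

```python
class ReadThroughCache:
    # Illustrative wrapper: callers only talk to the cache object;
    # the cache itself knows how to load a missing key from the database.
    def __init__(self, cache, loader, ttl=300):
        self.cache = cache    # underlying key-value store
        self.loader = loader  # function invoked on a miss to fetch from the DB
        self.ttl = ttl

    def get(self, key):
        value = self.cache.get(key)
        if value is None:
            value = self.loader(key)  # DB read performed by the cache layer
            self.cache.set(key, value, ttl=self.ttl)
        return value

# Usage: the application never issues the SELECT itself.
users = ReadThroughCache(cache, lambda key: db.query(
    "SELECT * FROM users WHERE id = %s", key.split(":")[1]))
user = users.get("user:42")
```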
Cache Invalidation Strategies
| Strategy | How |
|---|---|
| TTL expiry | Cache entry expires after N seconds |
| Event-driven invalidation | DB change triggers cache delete (via CDC) |
| Cache-aside + versioning | Cache key includes version number |
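A sketch of the versioned-key approach from the table above, assuming a cache client with `get`/`set`/`incr` (a Redis-style INCR); all names are illustrative:

```python
def get_user_versioned(user_id):
    # The current version lives in the cache too; readers use it to build the key.
    version = cache.get(f"user:{user_id}:version") or 1
    key = f"user:{user_id}:v{version}"
    user = cache.get(key)
    if user is None:
        user = db.query("SELECT * FROM users WHERE id = %s", user_id)
        cache.set(key, user, ttl=300)
    return user

def invalidate_user(user_id):
    # Bumping the version makes old keys unreachable; they age out via TTL
    # instead of being deleted explicitly.
    cache.incr(f"user:{user_id}:version")
```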
Cache invalidation is hard
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
Cloud Implementations
Redis
- In-memory key-value store
- Data structures: strings, hashes, sorted sets, lists
- Persistence: RDB snapshots + AOF (Append-Only File)
- Clustering: Redis Cluster (horizontal sharding)
DynamoDB Accelerator (DAX)
- In-memory cache for DynamoDB, fully managed by AWS
- Microsecond latency for cached reads
- Write-through cache; no cache invalidation needed
Cloud Spanner
- No native caching layer
- Application-managed Redis/Memcached for hot reads
BigQuery BI Engine
- In-memory analysis layer for sub-second dashboard queries
- Automatically caches frequently queried data
MongoDB
- WiredTiger cache (50% of RAM minus 1 GB by default)
- Working set must fit in cache for optimal performance
Cache Stampede (Thundering Herd)
When a popular cache key expires, thousands of requests hit the DB simultaneously.
Solutions:
- Probabilistic early expiration: refresh the entry before its TTL expires
- Mutex lock: only one request refreshes; the others wait (sketched below)
- Stale-while-revalidate: serve the stale entry and refresh in the background
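A sketch of the mutex-lock variant, assuming the same generic `cache`/`db` handles plus a `cache.add` that sets a key only if it is absent (SET NX in Redis terms) and a `cache.delete`:

```python
import time

def get_user_with_lock(user_id, lock_ttl=10):
    # On a miss, only the request that wins the lock queries the database;
    # the rest briefly wait and retry the cache.
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is not None:
        return user

    if cache.add(f"{key}:lock", 1, ttl=lock_ttl):  # acquired the refresh lock
        try:
            user = db.query("SELECT * FROM users WHERE id = %s", user_id)
            cache.set(key, user, ttl=300)
            return user
        finally:
            cache.delete(f"{key}:lock")

    # Lost the race: wait briefly for the winner to repopulate the cache.
    for _ in range(50):
        time.sleep(0.1)
        user = cache.get(key)
        if user is not None:
            return user
    # Fallback: query the DB directly rather than fail the request.
    return db.query("SELECT * FROM users WHERE id = %s", user_id)
```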