Caches
Temporary storage that holds a part of the DB
- reduces the response time
- reduces the load on the database
Global Cache vs Local Cache
Local Cache
- in-memory
- no need for any network calls to reach the cache if it's in RAM
Global Cache
- Redis or Memcached, e.g. Redis as a global data cache (see the sketch below)
- Can be scaled independently
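A global cache is reached over the network through a client library. As a minimal sketch (not a recommendation of any particular setup), here is how an app server might talk to Redis via the redis-py client; the host, port, key name, and TTL are illustrative assumptions:

```python
import redis

# Connect to a Redis instance acting as the shared global cache
# (localhost:6379 is an assumption for a local setup).
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Cache a DB row under a key with a 60-second TTL (illustrative values).
r.set("user:42", '{"name": "Ada"}', ex=60)

# Any app server can later read the same key over the network.
value = r.get("user:42")  # None if the key expired or was evicted
print(value)
```

Because the cache runs as a separate process, it can be scaled independently of the app servers, at the cost of a network round trip per access.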
Cache Eviction and Loading Policy
https://github.com/ByteByteGoHq/system-design-101?tab=readme-ov-file#top-caching-strategies
Write Policies in a cache
A write request means an entry is added, updated, or deleted in the cache.
The write policy defines what happens when there is a write operation on the cache, usually a user operation:
- Create
- Delete
- Update
Write back
- timeout-based persistence
- Cache is always updated
- DB is eventually consistent
https://codeahoy.com/2017/08/11/caching-strategies-and-how-to-choose-the-right-one/
Write-Back Policy
- If the key-value pair to be updated is present in the cache, it is updated there.
- The key-value pair is not immediately updated in the database.
Timeout-Based Persistence. In this mechanism, each cache entry has a TTL (Time to Live). When an entry's age exceeds its TTL, the entry is evicted and its data is persisted to the database.
This makes the database eventually consistent
Event-Based Write Back. In this mechanism, we keep an event-based trigger along with each key-value pair, e.g. the number of updates; if it exceeds a threshold (say 5), the cached value is persisted to the database.
Replacement Write Back. Each cache entry has an attribute (a dirty flag) that tells us whether the entry was updated. When we evict an entry, we write it back to the database if it was updated in the cache.
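A minimal sketch of the replacement write-back idea, assuming a plain dict stands in for the database and an LRU-ordered cache that flushes dirty entries only on eviction (all class and variable names are illustrative):

```python
from collections import OrderedDict

class WriteBackCache:
    # LRU cache that defers DB writes until eviction (replacement write-back).
    def __init__(self, db, capacity=2):
        self.db = db               # a dict standing in for the real database
        self.capacity = capacity
        self.data = OrderedDict()  # key -> value, least recently used first
        self.dirty = set()         # keys updated in cache but not yet in DB

    def write(self, key, value):
        # Writes touch only the cache; the entry is marked dirty.
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        self.dirty.add(key)
        self._evict_if_needed()

    def read(self, key):
        if key in self.data:
            self.data.move_to_end(key)
            return self.data[key]
        value = self.db[key]       # cache miss: load from the DB
        self.data[key] = value
        self._evict_if_needed()
        return value

    def _evict_if_needed(self):
        while len(self.data) > self.capacity:
            old_key, old_value = self.data.popitem(last=False)
            if old_key in self.dirty:   # persist only what changed in cache
                self.db[old_key] = old_value
                self.dirty.discard(old_key)

db = {"a": 1}
cache = WriteBackCache(db, capacity=2)
cache.write("b", 2)
cache.write("c", 3)
cache.write("d", 4)   # evicts "b" and only now flushes it to the DB
print(db)             # {'a': 1, 'b': 2}
```

The same dirty-flag structure extends to timeout-based persistence (flush when the TTL fires) or event-based write back (flush after N updates).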
Write-Through Policy
When there is a write request, we evict the key that is being updated while simultaneously updating the database (sketched below).
The next time there is a read request, the cache polls the database and receives the response.
- Lock the data being written and unlock it only after the update operation completes.
- provides a high level of consistency
- and a high level of persistence
- Less efficient compared to the write-back policy
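A minimal sketch of the evict-and-update variant described above, with a dict standing in for the database; the per-key locking mentioned in the notes is reduced to a comment (names are illustrative):

```python
class WriteThroughCache:
    def __init__(self, db):
        self.db = db
        self.data = {}

    def write(self, key, value):
        # In a real system, `key` would be locked for the duration of this
        # method so readers never observe a half-finished update.
        self.data.pop(key, None)   # evict the cached copy...
        self.db[key] = value       # ...while simultaneously updating the DB

    def read(self, key):
        if key not in self.data:
            self.data[key] = self.db[key]  # next read polls the DB
        return self.data[key]
```

Every write pays a DB round trip, which is why this is less efficient than write-back but never serves stale data.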
Write-Around Policy
Instead of updating the entry in the cache, we update the entry in the database.
On a read request, we serve the stale value from the cache,
and we keep getting the stale value until the entry is evicted from the cache.
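A minimal sketch showing the staleness window, again with a dict as the database (names are illustrative):

```python
class WriteAroundCache:
    def __init__(self, db):
        self.db = db
        self.data = {}

    def write(self, key, value):
        self.db[key] = value   # write goes around the cache, straight to the DB

    def read(self, key):
        if key not in self.data:
            self.data[key] = self.db[key]  # only misses reach the DB
        return self.data[key]

cache = WriteAroundCache(db={})
cache.write("x", 1)
print(cache.read("x"))  # 1, loaded into the cache on the miss
cache.write("x", 2)
print(cache.read("x"))  # still 1: stale until "x" is evicted or expires
```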
Cache Eviction Policies
Policies to evict data from the cache.
A replacement (eviction) policy is triggered when there is no space for a new key: an existing key is evicted from the cache so that the new entry copied from the DB can be stored.
LRU - Least Recently Used
evict the entry that has not been used for the longest time
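A minimal LRU sketch, assuming an OrderedDict where the front holds the least recently used entry (a common Python idiom, not the only way to build it):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()   # least recently used entries sit at the front

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # a hit makes the entry most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry
```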
LFU Policy (Least Frequently Used)
evict the entry that is used the least number of times (frequency)
Maintain a frequency count per entry and use it to determine which pair is to be evicted.
In most cases, the LRU policy works better than LFU.
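A minimal LFU sketch that adds the frequency data point described above; eviction scans for the smallest counter, which is O(n) and fine only as an illustration:

```python
class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}   # key -> value
        self.freq = {}   # key -> access count (the frequency data point)

    def get(self, key):
        if key not in self.data:
            return None
        self.freq[key] += 1
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            victim = min(self.freq, key=self.freq.get)  # least frequently used
            del self.data[victim], self.freq[victim]
        self.data[key] = value
        self.freq[key] = self.freq.get(key, 0) + 1
```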
Thrashing
- When the miss ratio is high and there are a lot of loads and evictions
- Cache never has the required data and the DB is asked every time
Replacement Policy in Memcached
Memcached has three regions: cold, warm, and hot. The hot region does not keep a last-used timestamp.
As new entries come in, entries in the hot region are pushed to the warm or cold region.
One data store holds entries that are requested less often (cold region); another holds entries that are requested more often (warm region).
Entries get promoted from the cold to the warm region based on usage.
Whether an entry lives in the hot or cold region is decided by its frequency of usage, and whether it needs to be evicted is decided by its last-used timestamp.
This process is known as Segmented LRU.
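This is not Memcached's actual implementation, just a two-segment sketch of the promotion idea: new entries start cold, a repeat access promotes them to warm, and eviction drains the cold segment first by last-used order (names are illustrative):

```python
from collections import OrderedDict

class SegmentedLRU:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cold = OrderedDict()   # less requested entries
        self.warm = OrderedDict()   # more requested entries

    def get(self, key):
        if key in self.warm:
            self.warm.move_to_end(key)
            return self.warm[key]
        if key in self.cold:
            value = self.cold.pop(key)  # reuse promotes the entry: cold -> warm
            self.warm[key] = value
            return value
        return None

    def put(self, key, value):
        self.warm.pop(key, None)        # avoid a duplicate if it was warm
        self.cold[key] = value          # new entries always enter the cold region
        self._evict_if_needed()

    def _evict_if_needed(self):
        while len(self.cold) + len(self.warm) > self.capacity:
            segment = self.cold if self.cold else self.warm
            segment.popitem(last=False) # evict by last-used order, cold first
```

One-off reads churn only the cold segment, so a burst of cold traffic cannot flush the frequently used warm entries.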
https://redis.com/blog/count-min-sketch-the-art-and-science-of-estimating-stuff/
Caching Pitfalls
Cache Eviction, Cache Policies, and Cache Expiry Policy
Cache Eviction
Cache eviction refers to the process of removing items from a cache to make space for new data. It is necessary when the cache reaches its capacity limit and needs to discard older or less frequently accessed items to accommodate new entries.
Reasons for Cache Eviction
- Capacity Limit: When the cache is full and cannot store additional data.
- Policy Enforcement: Following cache replacement policies to manage data efficiently.
- Performance Optimization: Ensuring that cache access times remain fast by removing less useful or outdated entries.
Cache Policies
Cache policies determine how items are selected for eviction when the cache reaches its capacity limit. Various strategies are employed to maximize cache hit rates and minimize cache misses.
Common Cache Replacement Policies (and related write/read strategies)
- LRU (Least Recently Used): Evicts the least recently accessed items first.
- LFU (Least Frequently Used): Evicts the least frequently accessed items first.
- FIFO (First-In-First-Out): Evicts the oldest items first, regardless of access patterns.
- MRU (Most Recently Used): Evicts the most recently accessed items first.
- Random Replacement: Selects a random item for eviction.
- Write-Around: Data is written directly to the storage but not cached until read requests for that data occur.
- Write-Through: Data is written to both the cache and the underlying storage on write operations.
- Read-Through: Data is initially read from the cache. If not present, it is fetched from the underlying storage and then cached (sketched below).
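A minimal read-through sketch: the cache owns a loader callback (an assumption; e.g. a DB query function) and fills itself on a miss, so callers only ever talk to the cache:

```python
class ReadThroughCache:
    def __init__(self, loader):
        self.loader = loader   # e.g. lambda key: db_query(key)  (assumed)
        self.data = {}

    def get(self, key):
        if key not in self.data:
            self.data[key] = self.loader(key)  # miss: fetch from storage, cache
        return self.data[key]

backing = {"a": 1}
cache = ReadThroughCache(lambda k: backing[k])
print(cache.get("a"))  # 1, fetched through the cache on first access
print(cache.get("a"))  # 1, now served from the cache
```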
Choosing a Cache Policy
- Performance Considerations: Different policies perform better depending on access patterns (e.g., LRU for temporal locality).
- Implementation Complexity: Some policies are simpler to implement but may not perform optimally in all scenarios.
- Application Requirements: Align the policy with application-specific requirements for cache behavior.
Cache Expiry Policy
Cache expiry policy determines when cached items become invalid and should be removed from the cache. It helps maintain data freshness and consistency between the cached data and the source of truth.
Strategies for Cache Expiry
- Time-Based Expiry: Set an expiration time for each cached item. Items are automatically evicted when they expire.
- Event-Based Expiry: Invalidate items based on external events or changes in the source data.
- Lazy Expiry: Remove items from the cache only when they are accessed and found to be expired (on-demand expiration; combined with TTL in the sketch below).
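A minimal sketch combining time-based expiry (a TTL per item) with lazy expiry (the check happens only on access); the names and TTL value are illustrative:

```python
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.data = {}   # key -> (value, absolute expiry time)

    def put(self, key, value):
        self.data[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.data[key]   # lazy expiry: removed only when touched
            return None
        return value
```

A production cache would pair this with a periodic sweep or event-based invalidation so expired-but-untouched entries do not linger.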
Implementation Considerations
- TTL (Time-to-Live): Specifies how long an item can remain in the cache before it expires.
- Synchronization: Ensure that cache expiry is synchronized with updates to the source data to prevent stale cache entries.
- Notification Mechanisms: Use callbacks or event listeners to handle cache expiry events efficiently.
Cache eviction, cache policies, and cache expiry policies are crucial for optimizing cache performance, managing memory efficiently, and ensuring that cached data remains accurate and up-to-date with minimal overhead.