LSM Tree - Log-Structured Merge Tree

1 minute read

data structure designed for high-write-throughput database systems, Optimized for fast writes

It optimizes write operations by writing all updates to disk sequentially, thereby minimizing random I/O operations.

well-suited for write-heavy workloads and are used in many modern NoSQL databases like Cassandra, HBase, and LevelDB.

Key Concepts of LSM Trees

Write Optimization:

Writes are first recorded in an in-memory structure (called a memtable).

Once the memtable is full, its contents are flushed to disk in the form of a new immutable file called an SSTable (Sorted String Table).

This sequential writing minimizes disk seek time and improves write performance.

Read Optimization:

Reads require checking multiple SSTables, which can be optimized using Bloom filters, compaction, and indexing.

Compaction merges multiple SSTables into fewer, larger ones, discarding obsolete data and reducing the number of SSTables to check during reads.

Compaction:

This process merges SSTables to optimize space and performance. Compaction eliminates deleted entries and combines multiple SSTables, reducing the number of files on disk.

How LSM Trees Work

In-Memory Structure (Memtable)

All writes are initially written to a memtable.

The memtable is an in-memory, mutable data structure that stores key-value pairs in a sorted manner.

Immutable SSTables

When the memtable reaches a certain size, it is flushed to disk as an SSTable.

SSTables are immutable, meaning once they are written, they are never modified.

Bloom Filters and Indexes:

Bloom filters are used to quickly check if a key is present in an SSTable, reducing the need for expensive disk reads. Indexes help locate the exact position of a key within an SSTable.

Compaction:

Periodically, the database will compact SSTables, merging them to maintain read performance and manage storage efficiently. Compaction reduces the number of SSTables by merging and reorganizing them, ensuring that older, stale data is removed.