System Design Fundamentals
Executive summaries and core concepts for system design interviews.
Core Metrics
Latency
Executive Summary: Time for a single request to travel from source to destination. Reported in units from nanoseconds to milliseconds depending on the operation (see table below). Lower is better.
| Type | Typical Values |
|---|---|
| L1 Cache Reference | 0.5 ns |
| L2 Cache Reference | 7 ns |
| Main Memory Reference | 100 ns |
| SSD Random Read | 150 μs |
| HDD Seek | 10 ms |
| Round trip within same datacenter | 500 μs |
| Round trip CA to Netherlands | 150 ms |
Throughput
Executive Summary: Rate of data transfer or requests processed per unit time. Measured in requests/second (RPS) or transactions/second (TPS). Higher is better.
- Network: Measured in bits per second (bps), Mbps, Gbps
- Application: Requests per second (RPS), Queries per second (QPS)
- Database: Transactions per second (TPS)
Availability
Executive Summary: Percentage of time a system is operational. Expressed as “nines” (99.9% = three nines).
| Availability | Downtime/Year | Downtime/Month | Downtime/Week |
|---|---|---|---|
| 99% (two 9s) | 3.65 days | 7.31 hours | 1.68 hours |
| 99.9% (three 9s) | 8.77 hours | 43.83 min | 10.08 min |
| 99.99% (four 9s) | 52.60 min | 4.38 min | 1.01 min |
| 99.999% (five 9s) | 5.26 min | 26.30 sec | 6.05 sec |
Formula: Availability = Uptime / (Uptime + Downtime)
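As a quick sanity check on the table above, here is a minimal sketch that converts an availability percentage into allowed downtime; the helper name and period constants are illustrative:

```python
# Convert an availability percentage into allowed downtime per period.
SECONDS = {
    "year": 365.25 * 24 * 3600,
    "month": 30.44 * 24 * 3600,
    "week": 7 * 24 * 3600,
}

def allowed_downtime(availability_pct: float, period: str = "year") -> float:
    """Return allowed downtime in seconds for the given period."""
    return (1 - availability_pct / 100) * SECONDS[period]

print(allowed_downtime(99.99) / 60)  # ~52.6 minutes/year, matching the table
```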
Reliability
Executive Summary: Probability that a system will perform its intended function without failure for a specified period. MTBF (Mean Time Between Failures) is the key metric.
- MTBF: Mean Time Between Failures
- MTTR: Mean Time To Recovery
- Availability = MTBF / (MTBF + MTTR). Example: MTBF = 1,000 hours and MTTR = 1 hour gives 1000 / 1001 ≈ 99.9%
Durability
Executive Summary: Guarantee that data, once written, will not be lost. Expressed as probability (e.g., 99.999999999% = 11 nines).
- S3 Standard: 99.999999999% (11 nines) durability
- Achieved through: Replication, Checksums, Error correction codes
CAP Theorem
Executive Summary: In a distributed system, you can only guarantee 2 of 3 properties: Consistency, Availability, Partition Tolerance. Since network partitions are inevitable, choose between CP (consistent but may be unavailable) or AP (available but may be inconsistent).
Consistency: Every read receives the most recent write or an error.
Availability: Every request receives a response (without guarantee it’s the most recent).
Partition Tolerance: System continues to operate despite network partitions.
| Choice | Behavior | Examples |
|---|---|---|
| CP | Consistent, may reject requests during partition | MongoDB, HBase, Redis |
| AP | Available, may return stale data | Cassandra, DynamoDB, CouchDB |
PACELC Extension
Executive Summary: Extends CAP. If a network Partition occurs, trade Availability against Consistency; Else (no partition), trade Latency against Consistency.
- PA/EL: Sacrifice consistency for availability and lower latency (Cassandra, DynamoDB)
- PC/EC: Prefer consistency over availability and latency (VoltDB, H-Store)
- PA/EC: Availability during partition, consistency otherwise (MongoDB)
Consistency Models
Strong Consistency
Executive Summary: All nodes see the same data at the same time. Every read returns the most recent write. Higher latency, simpler programming model.
- Implemented via: Synchronous replication, Consensus protocols (Paxos, Raft)
- Use when: Financial transactions, inventory management
Eventual Consistency
Executive Summary: Given enough time without new updates, all replicas will converge to the same value. Lower latency, higher availability.
- Implemented via: Asynchronous replication, Conflict resolution
- Use when: Social media feeds, DNS, Shopping cart
Causal Consistency
Executive Summary: Operations that are causally related are seen by all nodes in the same order. Concurrent operations may be seen in different orders.
- If A causes B, everyone sees A before B
- Stronger than eventual, weaker than strong
- Use when: Collaborative editing, Comment threads
Read-Your-Writes Consistency
Executive Summary: A user always sees their own writes immediately. Other users may see stale data.
- Session consistency
- Implemented via: Sticky sessions, Version vectors
Linearizability
Executive Summary: Strongest consistency model. Operations appear to happen instantaneously at some point between invocation and response.
- Real-time ordering guaranteed
- Expensive to implement at scale
- Use when: Distributed locks, Leader election
Replication Strategies
Single Leader (Master-Slave)
Executive Summary: One node handles all writes; replicas handle reads. Simple, but leader is a bottleneck and single point of failure.
Client → Leader (Write) → Replicas (Async/Sync)
Client → Replicas (Read)
Pros: Simple, consistent writes, easy conflict resolution
Cons: Leader bottleneck, failover complexity, write latency
Multi-Leader (Master-Master)
Executive Summary: Multiple nodes accept writes. Better write performance and availability, but complex conflict resolution required.
- Use cases: Multi-datacenter, Offline-capable apps
- Conflict resolution: Last-write-wins, Custom merge logic, CRDTs
Leaderless (Dynamo-style)
Executive Summary: Any node can accept reads/writes. Uses quorum to ensure consistency. Highly available, no single point of failure.
- Quorum: W + R > N (write acknowledgments + read acknowledgments > total replicas, so read and write sets always overlap; see the sketch after this list)
- Read Repair: Fix inconsistencies on read
- Anti-entropy: Background process to sync replicas
- Examples: Cassandra, DynamoDB, Riak
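A minimal sketch of the quorum rule, assuming replica objects that expose versioned `read`/`write` calls; those APIs and all names here are illustrative:

```python
# W + R > N guarantees that any read quorum overlaps every write quorum,
# so at least one replica in the read set holds the latest write.
N, W, R = 3, 2, 2
assert W + R > N  # overlap guarantee

def quorum_write(replicas, key, value, version):
    acks = sum(bool(replica.write(key, value, version)) for replica in replicas)
    return acks >= W  # success once W replicas acknowledge

def quorum_read(replicas, key):
    responses = [replica.read(key) for replica in replicas[:R]]
    # Return the value with the highest version; stale replicas can be
    # fixed up afterwards (read repair).
    return max(responses, key=lambda r: r.version)
```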
Partitioning / Sharding
Executive Summary: Split data across multiple nodes to scale horizontally. Each shard contains a subset of data. Key challenge: choosing the right partition key.
Range Partitioning
Executive Summary: Data divided by key ranges (A-M, N-Z). Good for range queries, but can lead to hot spots.
- Pros: Efficient range queries, sorted data
- Cons: Hot spots, uneven distribution
- Use when: Time-series data, alphabetical lookups
Hash Partitioning
Executive Summary: Hash function determines partition. Even distribution, but loses ordering and range query efficiency.
- Pros: Even distribution, no hot spots
- Cons: Inefficient range queries
- Use when: Random access patterns, user data by ID
Consistent Hashing
Executive Summary: Nodes and keys mapped to a ring. Data moves only between adjacent nodes when topology changes, minimizing data movement during scaling (see the ring sketch after the list below).
- Virtual nodes improve load distribution
- Use when: Distributed caches, DHTs
- Examples: Cassandra, DynamoDB, Memcached
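A minimal consistent-hash ring with virtual nodes, assuming MD5 as the ring hash; class and node names are illustrative:

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes=100):
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):  # virtual nodes smooth the distribution
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        # First virtual node clockwise from the key's position on the ring.
        idx = bisect.bisect(self._ring, (self._hash(key),)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.node_for("user:42"))
```

Adding or removing a node only remaps the keys between it and its ring neighbors, which is the property the summary describes.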
Directory-Based Partitioning
Executive Summary: Lookup service maintains mapping of keys to partitions. Flexible but lookup service is potential bottleneck.
Load Balancing
Executive Summary: Distribute traffic across multiple servers to improve performance, reliability, and scalability. Can operate at L4 (transport) or L7 (application).
Algorithms
| Algorithm | Description | Use Case |
|---|---|---|
| Round Robin | Requests distributed sequentially | Equal capacity servers |
| Weighted Round Robin | Servers assigned weights | Different capacity servers |
| Least Connections | Route to server with fewest connections | Long-lived connections |
| IP Hash | Hash client IP to determine server | Session affinity |
| Least Response Time | Route to fastest responding server | Performance optimization |
| Random | Random server selection | Simple, stateless |
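Two of the algorithms above, sketched briefly; server names and weights are illustrative:

```python
import itertools

servers = ["s1", "s2", "s3"]

# Round robin: cycle through servers in order.
rr = itertools.cycle(servers)
next(rr)  # 's1', then 's2', 's3', 's1', ...

# Weighted round robin: repeat each server proportionally to its weight.
weights = {"s1": 3, "s2": 1}  # s1 has 3x the capacity of s2
wrr = itertools.cycle([s for s, w in weights.items() for _ in range(w)])
```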
Layer 4 vs Layer 7
| Layer 4 (Transport) | Layer 7 (Application) |
|---|---|
| Faster (no content inspection) | Content-based routing |
| TCP/UDP level | HTTP headers, cookies |
| Limited routing options | URL, method-based routing |
| Examples: HAProxy, AWS NLB | Examples: NGINX, AWS ALB |
Health Checks
- Active: Load balancer periodically checks endpoints
- Passive: Monitor response from real traffic
- Criteria: HTTP status, response time, custom checks
Caching
Executive Summary: Store frequently accessed data in fast storage (memory) to reduce latency and database load. Key decisions: what to cache, where to cache, when to invalidate.
Cache Strategies
Cache-Aside (Lazy Loading)
Executive Summary: Application checks cache first, fetches from DB on miss, then populates cache. Simple but can have cache stampede issues.
1. Check cache
2. If miss → Read from DB → Write to cache
3. Return data
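The three steps above as a sketch; `cache` and `db` stand in for a real cache client (e.g. Redis) and a database layer, and their APIs here are assumptions:

```python
def get_user(user_id, cache, db, ttl=300):
    key = f"user:{user_id}"
    user = cache.get(key)             # 1. check cache
    if user is None:                  # 2. miss: read from DB, populate cache
        user = db.query("SELECT * FROM users WHERE id = %s", user_id)
        cache.set(key, user, ttl)     # TTL bounds how stale an entry can get
    return user                       # 3. return data
```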
Read-Through
Executive Summary: Cache sits between app and DB. Cache handles misses by fetching from DB automatically.
Write-Through
Executive Summary: Data written to cache and DB synchronously. Consistent but higher write latency.
Write-Behind (Write-Back)
Executive Summary: Data written to cache immediately, DB updated asynchronously. Lower latency but risk of data loss.
Write-Around
Executive Summary: Data written directly to DB, bypassing cache. Good for write-heavy workloads where data isn’t immediately read.
Cache Invalidation
- TTL (Time-To-Live): Expire after fixed duration
- Event-based: Invalidate on data change
- Version-based: Cache key includes version number
Eviction Policies
| Policy | Description |
|---|---|
| LRU (Least Recently Used) | Evict least recently accessed |
| LFU (Least Frequently Used) | Evict least frequently accessed |
| FIFO | Evict oldest entries first |
| Random | Random eviction |
| TTL | Evict expired entries |
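A minimal LRU sketch using Python's `OrderedDict` (illustrative, not production code):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict least recently used
```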
Cache Levels
- Browser Cache: Client-side, HTTP headers (Cache-Control, ETag)
- CDN Cache: Edge servers close to users
- Application Cache: In-memory (Redis, Memcached)
- Database Cache: Query cache, buffer pool
Message Queues
Executive Summary: Asynchronous communication between services. Decouple producers and consumers. Provides buffering, load leveling, and guaranteed delivery.
Point-to-Point vs Pub/Sub
| Point-to-Point | Pub/Sub |
|---|---|
| One consumer per message | Multiple subscribers |
| Message deleted after consumption | Message delivered to all subscribers |
| Work distribution | Event notification |
| Examples: SQS, RabbitMQ | Examples: Kafka, SNS, Redis Pub/Sub |
Delivery Guarantees
| Guarantee | Description | Trade-off |
|---|---|---|
| At-most-once | Fire and forget | May lose messages |
| At-least-once | Retry until acked | May have duplicates |
| Exactly-once | Deduplication + at-least-once | Complex, expensive |
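At-least-once delivery pushes deduplication to the consumer. A common pattern is an idempotent consumer that tracks processed message IDs and skips redeliveries; the `queue` client, message shape, and business-logic stub below are assumptions:

```python
processed_ids = set()  # production systems persist this, typically with a TTL

def apply_side_effects(message):
    ...  # hypothetical business logic

def handle(queue, message):
    if message["id"] in processed_ids:
        queue.ack(message)               # redelivered duplicate: just ack
        return
    apply_side_effects(message)
    processed_ids.add(message["id"])
    queue.ack(message)                   # ack only after successful processing
```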
Message Ordering
- FIFO: Strict ordering (higher latency)
- Best effort: No ordering guarantee (higher throughput)
- Partition ordering: Order within partition only (Kafka)
Dead Letter Queue (DLQ)
Executive Summary: Queue for messages that failed processing after max retries. Enables debugging and manual intervention.
Proxies
Forward Proxy
Executive Summary: Sits in front of clients. Hides client identity, caches responses, filters content.
- Use cases: Corporate firewalls, Anonymity, Content filtering
Reverse Proxy
Executive Summary: Sits in front of servers. Load balancing, SSL termination, caching, compression.
- Examples: NGINX, HAProxy, Cloudflare
- Use cases: Load balancing, Security, Caching
API Gateway
Executive Summary: Single entry point for microservices. Handles routing, authentication, rate limiting, protocol translation.
- Examples: Kong, AWS API Gateway, Apigee
- Features: Auth, Rate limiting, Logging, Transformation
Database Indexing
Executive Summary: Data structures that improve query speed at the cost of write performance and storage. Choose indexes based on query patterns.
B-Tree Index
Executive Summary: Balanced tree structure. Good for range queries and equality. Most common index type.
- O(log n) lookups, inserts, deletes
- Supports range queries
- Use for: Primary keys, Foreign keys, Range queries
Hash Index
Executive Summary: Hash table for O(1) lookups. Only supports equality queries.
- No range query support
- Use for: Exact match queries
Composite Index
Executive Summary: Index on multiple columns. Order matters (leftmost prefix rule).
```sql
INDEX (a, b, c)
-- Supports: WHERE a, WHERE a AND b, WHERE a AND b AND c
-- Does NOT support: WHERE b, WHERE c, WHERE b AND c
```
Full-Text Index
Executive Summary: Inverted index for text search. Supports natural language queries.
- Examples: Elasticsearch, PostgreSQL tsvector
- Use for: Search functionality
Covering Index
Executive Summary: Index contains all columns needed for query. No table lookup required.
Unique ID Generation
Executive Summary: Generate globally unique identifiers in distributed systems. Trade-offs between sortability, size, and coordination requirements.
UUID
Executive Summary: 128-bit universally unique identifier. No coordination required, but not sortable and large.
- V1: Timestamp + MAC address
- V4: Random (most common)
- Pros: No coordination, globally unique
- Cons: Not sortable, 128 bits, no locality
Snowflake ID
Executive Summary: 64-bit ID composed of a timestamp, machine ID, and sequence number. Time-sortable; needs only a one-time machine ID assignment, no per-ID coordination.
| Sign (1 bit) | Timestamp (41 bits) | Machine ID (10 bits) | Sequence (12 bits) |
- ~69 years of timestamps (2^41 ms ≈ 69.7 years)
- 4096 IDs per millisecond per machine
- Used by: Twitter, Discord
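A sketch of the bit packing above; the custom epoch shown is Twitter's published value, everything else is illustrative:

```python
import time

EPOCH_MS = 1288834974657  # Twitter's custom epoch; any fixed epoch works

def snowflake(machine_id: int, sequence: int) -> int:
    assert 0 <= machine_id < 2**10 and 0 <= sequence < 2**12
    timestamp = int(time.time() * 1000) - EPOCH_MS
    # 41-bit timestamp | 10-bit machine ID | 12-bit sequence
    return (timestamp << 22) | (machine_id << 12) | sequence

print(snowflake(machine_id=1, sequence=0))
```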
ULID
Executive Summary: 128-bit Universally Unique Lexicographically Sortable Identifier. Sortable, URL-safe.
- First 48 bits: timestamp
- Last 80 bits: randomness
- Lexicographically sortable
Database Auto-Increment
Executive Summary: Simple sequential IDs. Requires coordination, single point of failure.
- Pros: Simple, small, sortable
- Cons: Coordination required, reveals information
API Design
REST
Executive Summary: Stateless, resource-oriented API using HTTP methods. Simple, cacheable, widely adopted.
| Method | Action | Idempotent |
|---|---|---|
| GET | Read | Yes |
| POST | Create | No |
| PUT | Update (full) | Yes |
| PATCH | Update (partial) | No |
| DELETE | Delete | Yes |
GraphQL
Executive Summary: Query language for APIs. Client specifies exact data needed. Reduces over/under-fetching.
- Pros: Flexible queries, strong typing, single endpoint
- Cons: Complexity, caching challenges, N+1 queries
gRPC
Executive Summary: High-performance RPC using Protocol Buffers. Strongly typed, supports streaming.
- Uses HTTP/2 (multiplexing, header compression)
- Binary format (smaller, faster)
- Use for: Microservices, low-latency requirements
Serialization Formats
| Format | Type | Human Readable | Size | Speed |
|---|---|---|---|---|
| JSON | Text | Yes | Large | Slow |
| XML | Text | Yes | Very Large | Slow |
| Protocol Buffers | Binary | No | Small | Fast |
| Avro | Binary | No | Small | Fast |
| Thrift | Binary | No | Small | Fast |
| MessagePack | Binary | No | Medium | Fast |
Time Synchronization
NTP (Network Time Protocol)
Executive Summary: Synchronizes clocks over network. Accuracy: 1-50ms over internet, <1ms on LAN.
- Hierarchical system of time sources (strata)
- Stratum 0: Atomic clocks, GPS
- Stratum 1: Directly connected to Stratum 0
Logical Clocks
Lamport Timestamps
Executive Summary: Partial ordering of events. If A happened-before B, then L(A) < L(B). Converse not guaranteed.
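A minimal Lamport clock sketch (illustrative class): increment on local events and sends, and on receive take max(local, received) + 1.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        """Local event or message send."""
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        """Message receive: jump past the sender's timestamp."""
        self.time = max(self.time, remote_time) + 1
        return self.time
```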
Vector Clocks
Executive Summary: Track causality between events. Can determine if events are concurrent or causally related.
- Each node maintains vector of counters
- Expensive in large systems
Hybrid Logical Clocks (HLC)
Executive Summary: Combines physical and logical time. Provides causality while staying close to real time.
Distributed Tracing
Executive Summary: Track requests across microservices. Essential for debugging and performance analysis in distributed systems.
- Trace: End-to-end journey of a request
- Span: Individual operation within a trace
- Context Propagation: Pass trace ID between services
Tools: Jaeger, Zipkin, AWS X-Ray, OpenTelemetry
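A minimal sketch of context propagation over HTTP using the W3C `traceparent` header format; real systems typically rely on an OpenTelemetry SDK rather than hand-rolled headers, and the helper below is illustrative:

```python
import uuid
import requests

def call_downstream(url, trace_id=None):
    trace_id = trace_id or uuid.uuid4().hex        # start a new trace if none
    span_id = uuid.uuid4().hex[:16]                # this service's span
    # traceparent: version - 32-hex trace ID - 16-hex parent span ID - flags
    headers = {"traceparent": f"00-{trace_id}-{span_id}-01"}
    return requests.get(url, headers=headers)
```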