01 · SQL vs NoSQL — Choose the Right Data Model

Foundations · Topic 1 of 4

Why This Matters

Choosing the wrong data model is the most expensive early architectural decision. SQL and NoSQL are not competitors — they are tools optimized for different access patterns.

Core Concepts

Relational (SQL)

Data stored in tables with fixed schema
Relationships via foreign keys and joins
Strong ACID guarantees
Query via SQL (declarative)

Non-Relational (NoSQL)

Schema-flexible: documents, key-value, wide-column, graph
Optimized for horizontal scale and specific access patterns
Typically BASE semantics (not ACID by default)

Data Model Types

Type	Example DBs	Best For
Relational	PostgreSQL, Spanner	Structured data, complex queries, joins
Document	MongoDB	Semi-structured, nested objects
Key-Value	DynamoDB, Redis	Simple lookups, caching, sessions
Wide-Column	Cassandra, Bigtable	Write-heavy, time-series, IoT
Analytical	BigQuery	OLAP, large scans, aggregations
Graph	Neo4j	Relationship-heavy traversals

Decision Framework

graph TD
    A[Need ACID transactions?] -->|Yes| B[SQL: PostgreSQL / Spanner]
    A -->|No| C[What's your access pattern?]
    C -->|Simple key lookups| D[Key-Value: DynamoDB / Redis]
    C -->|Flexible nested docs| E[Document: MongoDB]
    C -->|High write throughput| F[Wide-Column: Cassandra]
    C -->|Large analytical scans| G[BigQuery]

Cloud Implementations

PostgreSQLGoogle SpannerDynamoDBMongoDBBigQueryCassandra

Self-managed or Cloud SQL (GCP), RDS (AWS), Azure Database
Full SQL, ACID, extensions (PostGIS, pg_trgm)
Best for: transactional workloads with complex queries

Globally distributed relational database
SQL + strong external consistency at planetary scale
Uses TrueTime API for global ordering
Best for: global applications needing SQL + horizontal scale

AWS managed key-value + document store
Single-digit millisecond latency at any scale
Partition key determines data distribution
Best for: high-throughput, predictable access patterns

Document model with flexible BSON schema
Atlas cloud offering (multi-cloud)
Best for: catalogs, content, user profiles

Serverless columnar analytical engine
No indexes — full scans optimized via columnar storage
Best for: analytics, reporting, ML features

Wide-column, masterless, peer-to-peer
Write path: commit log → memtable → SSTable
Best for: write-heavy IoT, time-series, event logs

Key Tradeoffs

Dimension	SQL	NoSQL
Schema	Fixed	Flexible
Joins	Native	Application-side
Transactions	Full ACID	Limited / eventual
Scale	Vertical (+ read replicas)	Horizontal
Query flexibility	High (SQL)	Limited (by design)

Common Pitfalls

Don't pick NoSQL just for scale

Many teams adopt NoSQL prematurely. PostgreSQL handles millions of rows and thousands of QPS with proper indexing.

Don't embed everything in a document

MongoDB's document model tempts over-embedding. Deep nesting makes updates and queries painful.