01 · SQL vs NoSQL — Choose the Right Data Model
Foundations · Topic 1 of 4
Why This Matters
Choosing the wrong data model is the most expensive early architectural decision. SQL and NoSQL are not competitors — they are tools optimized for different access patterns.
Core Concepts
Relational (SQL)
- Data stored in tables with fixed schema
- Relationships via foreign keys and joins
- Strong ACID guarantees
- Query via SQL (declarative)
Non-Relational (NoSQL)
- Schema-flexible: documents, key-value, wide-column, graph
- Optimized for horizontal scale and specific access patterns
- Typically BASE semantics (not ACID by default)
Data Model Types
| Type | Example DBs | Best For |
|---|---|---|
| Relational | PostgreSQL, Spanner | Structured data, complex queries, joins |
| Document | MongoDB | Semi-structured, nested objects |
| Key-Value | DynamoDB, Redis | Simple lookups, caching, sessions |
| Wide-Column | Cassandra, Bigtable | Write-heavy, time-series, IoT |
| Analytical | BigQuery | OLAP, large scans, aggregations |
| Graph | Neo4j | Relationship-heavy traversals |
Decision Framework
graph TD
A[Need ACID transactions?] -->|Yes| B[SQL: PostgreSQL / Spanner]
A -->|No| C[What's your access pattern?]
C -->|Simple key lookups| D[Key-Value: DynamoDB / Redis]
C -->|Flexible nested docs| E[Document: MongoDB]
C -->|High write throughput| F[Wide-Column: Cassandra]
C -->|Large analytical scans| G[BigQuery]
Cloud Implementations
- Self-managed or Cloud SQL (GCP), RDS (AWS), Azure Database
- Full SQL, ACID, extensions (PostGIS, pg_trgm)
- Best for: transactional workloads with complex queries
- Globally distributed relational database
- SQL + strong external consistency at planetary scale
- Uses TrueTime API for global ordering
- Best for: global applications needing SQL + horizontal scale
- AWS managed key-value + document store
- Single-digit millisecond latency at any scale
- Partition key determines data distribution
- Best for: high-throughput, predictable access patterns
- Document model with flexible BSON schema
- Atlas cloud offering (multi-cloud)
- Best for: catalogs, content, user profiles
- Serverless columnar analytical engine
- No indexes — full scans optimized via columnar storage
- Best for: analytics, reporting, ML features
- Wide-column, masterless, peer-to-peer
- Write path: commit log → memtable → SSTable
- Best for: write-heavy IoT, time-series, event logs
Key Tradeoffs
| Dimension | SQL | NoSQL |
|---|---|---|
| Schema | Fixed | Flexible |
| Joins | Native | Application-side |
| Transactions | Full ACID | Limited / eventual |
| Scale | Vertical (+ read replicas) | Horizontal |
| Query flexibility | High (SQL) | Limited (by design) |
Common Pitfalls
Don't pick NoSQL just for scale
Many teams adopt NoSQL prematurely. PostgreSQL handles millions of rows and thousands of QPS with proper indexing.
Don't embed everything in a document
MongoDB's document model tempts over-embedding. Deep nesting makes updates and queries painful.