Azure Databricks & Apache Spark
Lakehouse Medallion Architecture
After data ingestion/integration, build the system in layers, with a minimum of three layers (Bronze, Silver, Gold).
Data Collection/Ingestion
- Batch Mode - Collect data at periodic intervals
- Stream Mode - Collect data as and when it is generated
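The difference between the two modes can be sketched in plain Python. This is only an illustration and not Spark or Databricks API: the function names `batch_ingest` and `stream_ingest` are hypothetical, and `batch_size` stands in for the periodic interval (a real batch job would usually be triggered by a schedule rather than a record count).

```python
from typing import Callable, Iterable, Iterator, List


def batch_ingest(source: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Batch mode: accumulate records and hand them over in groups.

    batch_size is an assumed stand-in for "one periodic interval".
    """
    batch: List[dict] = []
    for record in source:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit whatever is left over as a final, smaller batch.
        yield batch


def stream_ingest(source: Iterable[dict], handle: Callable[[dict], None]) -> int:
    """Stream mode: pass each record on as soon as it is produced.

    Returns the number of records handled.
    """
    count = 0
    for record in source:
        handle(record)
        count += 1
    return count
```

In batch mode the consumer sees data only when a group is complete; in stream mode each record is handled individually, which lowers latency at the cost of more frequent processing.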
Data Processing
- Bronze Layer - Raw data, stored as collected from the source systems
- Silver Layer - Read data from the Bronze layer and apply the required processing (for example cleaning, deduplication, validation)
- Gold Layer - Prepare data for consumption by loading results into the desired data models
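The flow through the three layers can be sketched as below. This is a minimal plain-Python illustration, assuming a hypothetical order dataset with `order_id`, `customer`, and `amount` fields; in Databricks each layer would typically be a Spark DataFrame persisted as a table, and the specific cleaning rules shown here are assumptions chosen for the example.

```python
from collections import defaultdict
from typing import Dict, List


def to_bronze(raw_rows: List[dict]) -> List[dict]:
    """Bronze: keep the raw records unchanged (copied, not modified)."""
    return [dict(row) for row in raw_rows]


def to_silver(bronze: List[dict]) -> List[dict]:
    """Silver: clean the Bronze data.

    Assumed rules: drop rows with a missing amount, drop duplicate
    order_ids, normalise customer names, cast amount to float.
    """
    seen_ids = set()
    silver = []
    for row in bronze:
        if row.get("amount") in (None, ""):
            continue
        if row["order_id"] in seen_ids:
            continue
        seen_ids.add(row["order_id"])
        silver.append({
            "order_id": row["order_id"],
            "customer": row["customer"].strip().lower(),
            "amount": float(row["amount"]),
        })
    return silver


def to_gold(silver: List[dict]) -> Dict[str, float]:
    """Gold: shape the data for consumption, here total amount per customer."""
    totals: Dict[str, float] = defaultdict(float)
    for row in silver:
        totals[row["customer"]] += row["amount"]
    return dict(totals)
```

Each function reads only from the layer before it, so the raw Bronze data stays available for reprocessing if the Silver or Gold logic changes.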