Azure Databricks & Apache Spark


DataEngineer_RefArchtecture.png dataEngineeringPlatform-refArchi.png

Lakehouse Medallion Architecture

After data ingestion/integration, build the system in layers, with a minimum of three layers.

lakehouse-medallion-architecture.png

Data Collection/Ingestion

  • Batch mode - Collect data at periodic intervals
  • Stream mode - Collect data as and when it is generated
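
The difference between the two collection modes can be sketched in plain Python. This is only an illustration, not Spark or Databricks code: the function names `batch_ingest` and `stream_ingest`, the `ts` field (seconds since start), and the 60-second interval are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, int]


def batch_ingest(events: Iterable[Event], interval_s: int) -> List[List[Event]]:
    """Batch mode: group events into one batch per collection interval."""
    windows: Dict[int, List[Event]] = {}
    for event in events:
        # Hypothetical "ts" field decides which interval the event falls into.
        window = event["ts"] // interval_s
        windows.setdefault(window, []).append(event)
    # Hand the batches over in interval order.
    return [windows[w] for w in sorted(windows)]


def stream_ingest(events: Iterable[Event], sink: Callable[[Event], None]) -> int:
    """Stream mode: pass each event to the sink as soon as it is generated."""
    count = 0
    for event in events:
        sink(event)
        count += 1
    return count


# Made-up sample events for illustration only.
events = [{"id": i, "ts": ts} for i, ts in enumerate([0, 20, 65, 70, 130])]

batches = batch_ingest(events, interval_s=60)

received: List[Event] = []
streamed = stream_ingest(events, received.append)
```

In batch mode the consumer sees three groups of events (one per 60-second interval), while in stream mode the sink is called once per event, in arrival order.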

Data Processing

  • Bronze layer - Raw data as collected from the source systems
  • Silver layer - Reads data from the Bronze layer and applies the required processing
  • Gold layer - Prepares data for consumption by loading the results into the desired data models
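
As a rough sketch of how the three layers hand data to each other, the plain-Python example below stands in for what would normally be Spark jobs over lakehouse tables. The order records, the field names, the specific cleaning rules (drop rows with a missing amount, de-duplicate by `order_id`, normalise customer names), and the "total per customer" data model are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def to_bronze(raw_rows: List[dict]) -> List[dict]:
    """Bronze: keep rows exactly as collected from the source system."""
    return [dict(row) for row in raw_rows]


def to_silver(bronze_rows: List[dict]) -> List[dict]:
    """Silver: read from bronze and apply the required processing."""
    seen_ids = set()
    silver = []
    for row in bronze_rows:
        if row.get("amount") in (None, ""):
            continue  # assumed rule: drop rows with a missing amount
        if row["order_id"] in seen_ids:
            continue  # assumed rule: drop duplicate orders
        seen_ids.add(row["order_id"])
        silver.append({
            "order_id": row["order_id"],
            "customer": row["customer"].strip().lower(),
            "amount": float(row["amount"]),
        })
    return silver


def to_gold(silver_rows: List[dict]) -> Dict[str, float]:
    """Gold: shape the data for consumption (here, total amount per customer)."""
    totals: Dict[str, float] = defaultdict(float)
    for row in silver_rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


# Made-up raw records for illustration only.
raw = [
    {"order_id": 1, "customer": " Alice ", "amount": "10.5"},
    {"order_id": 2, "customer": "bob", "amount": "4"},
    {"order_id": 2, "customer": "bob", "amount": "4"},   # duplicate
    {"order_id": 3, "customer": "alice", "amount": ""},  # missing amount
    {"order_id": 4, "customer": "Alice", "amount": "2.5"},
]

bronze = to_bronze(raw)
silver = to_silver(bronze)
gold = to_gold(silver)
```

Keeping bronze untouched means the silver and gold steps can be re-run with different rules without going back to the source systems.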

Azure Databricks Platform Architecture

azure-databrics-platform-architecture.png