Cloud Data Platform Engineering with Azure Databricks

Description
The Data Platform Engineering with Azure Databricks course is designed to provide participants with a comprehensive understanding of how to leverage Azure Databricks to build scalable and efficient data platforms. Throughout this course, participants will gain hands-on experience in designing, implementing, and managing data engineering workflows using Azure Databricks. From data ingestion and transformation to advanced analytics and machine learning, participants will explore the key features and best practices of Azure Databricks for building robust data platforms.
The course follows a hands-on approach: participants engage in live coding sessions and keep access to all course materials after completing the course. The course is taught on the Microsoft Azure cloud platform with Databricks, but the engineering principles apply just as well to other cloud platforms and tooling.
Target audience
This course is designed for professionals who work with data daily and want to understand what it takes to build a data platform in the cloud. It is a perfect fit for:
- data engineers and data architects working on on-premises systems who want to learn about cloud platforms
- software developers working in the data domain
- BI developers and data scientists who want to learn more about the full stack of data systems.
Prerequisites:
- Proficiency in Python and SQL
- Basic understanding of data engineering principles and technologies
- Basic understanding of DevOps principles

Topics covered
Topic 1: Introduction to Azure Databricks
- Overview of Azure Databricks and its role in data platform engineering
- Understanding the architecture and components of Azure Databricks
- The Databricks data platform (medallion) architecture: Bronze, Silver and Gold layers (see the sketch after this list)
- Exploring the Azure Databricks workspace and notebooks
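As a first taste of the Bronze/Silver/Gold idea, the minimal sketch below shows how raw data could land in a Bronze Delta table and be cleaned into a Silver table from a Databricks notebook. The storage path and table names are invented for illustration, and spark is the SparkSession a Databricks notebook provides.

    from pyspark.sql import functions as F

    # Hypothetical landing location and table names, for illustration only.
    landing_path = "abfss://landing@examplestorage.dfs.core.windows.net/orders/"

    # Bronze: keep the raw records as-is, plus some ingestion metadata.
    raw = (spark.read.format("json").load(landing_path)
           .withColumn("_ingested_at", F.current_timestamp())
           .withColumn("_source_file", F.input_file_name()))
    raw.write.format("delta").mode("append").saveAsTable("bronze.orders")

    # Silver: a cleaned, deduplicated version of the same data.
    silver = (spark.table("bronze.orders")
              .dropDuplicates(["order_id"])
              .filter(F.col("order_id").isNotNull()))
    silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")
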
Topic 2: Data Ingestion
- Orchestration with Azure Data Factory or Databricks
- Overview of data ingestion techniques with Azure Databricks
- Ingestion with Databricks or Azure Data Factory
- Integrating Azure Databricks with data sources like Azure Storage, Azure Data Lake, and more
- Configuring incremental (delta) loads (see the sketch after this list)
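One common way to configure incremental (delta) loads on Databricks is Auto Loader, which only picks up files that arrived since the previous run. The sketch below is a minimal example; the paths and table name are assumptions made for illustration.

    # Minimal Auto Loader sketch; paths and table name are assumptions.
    source_path = "abfss://landing@examplestorage.dfs.core.windows.net/events/"
    checkpoint_path = "abfss://checkpoints@examplestorage.dfs.core.windows.net/events_bronze/"

    stream = (spark.readStream
              .format("cloudFiles")                      # Databricks Auto Loader
              .option("cloudFiles.format", "json")
              .option("cloudFiles.schemaLocation", checkpoint_path)
              .load(source_path))

    (stream.writeStream
     .option("checkpointLocation", checkpoint_path)
     .trigger(availableNow=True)                         # process new files, then stop
     .toTable("bronze.events"))
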
Topic 3: Data Engineering Workflows
- Designing and implementing end-to-end data engineering workflows with Azure Databricks
- Building scalable ETL (Extract, Transform, Load) processes
- Discussing the characteristics of different data sources
- Distinguishing data cleaning from applying business logic
- Performing data preparation and transformation using the Spark DataFrame API (see the sketch after this list)
- Handling slowly changing dimensions
- ACID transactions and the Delta transaction log
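To make the distinction between data cleaning and business logic concrete, the sketch below keeps the two steps separate using the Spark DataFrame API. All table and column names are assumptions for illustration.

    from pyspark.sql import functions as F

    orders = spark.table("bronze.orders")   # hypothetical Bronze table

    # Data cleaning: fix types, trim strings, drop duplicates and obvious junk.
    cleaned = (orders
               .withColumn("order_ts", F.to_timestamp("order_ts"))
               .withColumn("customer_id", F.trim("customer_id"))
               .dropDuplicates(["order_id"])
               .filter(F.col("order_id").isNotNull()))

    # Business logic: rules defined by the business, kept as a separate step.
    enriched = (cleaned
                .withColumn("net_amount", F.col("gross_amount") - F.col("discount"))
                .withColumn("is_large_order", F.col("net_amount") > 1000))

    enriched.write.format("delta").mode("overwrite").saveAsTable("silver.orders_enriched")
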
Topic 4: Scheduling and monitoring
- Managing data pipelines and scheduling jobs in Azure Databricks
- Monitoring data pipelines using Azure Log Analytics and Logic Apps
- Strategies for handling schema drift (see the sketch after this list)
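As one concrete strategy for schema drift, a Delta table can evolve its schema when new columns appear in the incoming data. The sketch below appends a hypothetical staging batch with schema merging enabled; the table names are assumptions.

    # Hypothetical incoming batch that may contain columns the target has not seen yet.
    incoming = spark.table("bronze.events_staging")

    # mergeSchema lets Delta add the new columns instead of failing the write.
    (incoming.write
     .format("delta")
     .mode("append")
     .option("mergeSchema", "true")
     .saveAsTable("bronze.events"))

    # Alternatively, allow automatic schema evolution for MERGE statements in this session.
    spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")
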
Topic 5: Packaging your pipelines, deployment and integration
- Deployment strategies for Azure Databricks workspaces
- Integrating Azure Databricks with other Azure services, such as Azure Data Factory and Azure Synapse Analytics
- Continuous integration and deployment (CI/CD) workflows with Azure DevOps and Azure Databricks
- Writing unit tests for your solutions (see the sketch after this list)
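As a preview of the unit-testing topic, the sketch below tests a small transformation function with pytest and a local SparkSession. The function add_net_amount is invented for this example.

    # test_transformations.py -- minimal pytest sketch (function name is made up).
    import pytest
    from pyspark.sql import SparkSession, functions as F


    def add_net_amount(df):
        """Transformation under test: gross amount minus discount."""
        return df.withColumn("net_amount", F.col("gross_amount") - F.col("discount"))


    @pytest.fixture(scope="session")
    def spark():
        return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()


    def test_add_net_amount(spark):
        df = spark.createDataFrame([(100.0, 10.0)], ["gross_amount", "discount"])
        result = add_net_amount(df).collect()[0]
        assert result["net_amount"] == 90.0
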
Topic 6: Delivering analytics to the business
- Dimensional modelling in Spark SQL
- Leveraging type 2 slowly changing dimensions in your gold layer (see the sketch after this list)
- Presenting your data using Power BI
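To illustrate the type 2 slowly changing dimension pattern in the gold layer, the sketch below first expires changed dimension rows with a Delta MERGE in Spark SQL and then inserts new current versions. Table and column names (gold.dim_customer, silver.customers) are assumptions, and a real implementation would also handle surrogate keys.

    # Step 1: close out current dimension rows whose tracked attributes changed.
    spark.sql("""
        MERGE INTO gold.dim_customer AS dim
        USING silver.customers AS src
          ON dim.customer_id = src.customer_id AND dim.is_current = true
        WHEN MATCHED AND dim.customer_name <> src.customer_name THEN
          UPDATE SET dim.is_current = false,
                     dim.end_date = current_date()
    """)

    # Step 2: insert a new current version for new and changed customers.
    spark.sql("""
        INSERT INTO gold.dim_customer
        SELECT src.customer_id,
               src.customer_name,
               current_date() AS start_date,
               CAST(NULL AS DATE) AS end_date,
               true AS is_current
        FROM silver.customers AS src
        LEFT JOIN gold.dim_customer AS dim
          ON dim.customer_id = src.customer_id AND dim.is_current = true
        WHERE dim.customer_id IS NULL
    """)
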
Note: The course outline provided above is a general guideline and can be customized or expanded based on specific requirements or audience needs.
No (suitable) date available? Or do you want to schedule this training as an in-company training? Contact us!
About the trainer

Bram is a seasoned data professional and open-source software enthusiast with over 15 years of experience as a cloud platform engineer, systems architect, data scientist, data engineer and data systems specialist, working on platforms such as the Microsoft ecosystem and Oracle SQL servers. Bram is driven to build high-performance ETL pipelines and to deliver business value from data.
In his role as lead engineer, Bram has educated and guided junior data engineers and data scientists on their journey to becoming well-respected data professionals.
FAQ
The training can be given in Dutch or English, depending on the language of the participants.
You will need to bring your own laptop with the necessary development environment set up to participate in the coding exercises and projects.