
Apache Airflow on AWS EKS: The Hands-On Guide
Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. If you have many ETL pipelines to manage, Airflow is a must-have.
In the Apache Airflow on AWS EKS: The Hands-On Guide course, you will learn everything you need to set up a production-ready architecture on AWS EKS with Airflow and the Kubernetes Executor. Discover how to execute tasks at scale, as you would in your company.
Enroll in the course
Materials (required for the course)
You will find the materials in one of the course videos.
Curriculum
Section 1: Introduction
- Important Prerequisites
- Who I am
- Your Airflow Journey
- Overview of the architecture
- The Checklist
Section 2: Configuring AWS
- Defining a budget
- [Practice] Creating the IAM admin group
- [Practice] Creating the IAM admin user
Section 3: Exploring the DevOps world
- Why is knowing DevOps concepts important?
- Reminder about Kubernetes
- Kubernetes Quiz
- What is IaC or Infrastructure as code?
- IaC Quiz
- Deployments with GitOps
- GitOps made simple with Flux
- GitOps Quiz
Section 4: Creating the EKS cluster with GitOps
- [Practice] Creating the Cloud9 environment for the workstation
- [Practice] Configuring the workstation
- [Practice] Configuring Cloud9 with the Admin account
- [Practice] Creating the IAM role to interact with the EKS cluster
- AZs, VPCs and Subnets in AWS
- What is AWS EKS?
- [Practice] Creating and configuring the Git repository for GitOps
- [Practice] Creating a multi-node EKS cluster with eksctl and GitOps
- [Practice] Configuring the EKS cluster with Flux
- Namespaces in Kubernetes
- [Practice] Creating dev, staging and prod namespaces
- Clean Up
Section 5: Deploying Airflow with DAGs
- Set Up
- Deployments with Helm
- [Practice] Overview of the Airflow Helm chart
- Scaling with the Kubernetes Executor
- [Practice] Creating your first release of Airflow
- [Practice] Deploying Airflow with Flux
- Troubleshooting deployments with Flux
- Synchronizing DAGs in Kubernetes
- [Practice] Fetching DAGs with Git-Sync
- [Practice] Running DAGs with Git-Sync
- Secrets in Kubernetes
- [Practice] Fetching DAGs with Git-Sync from a private repository
- [Practice] Adding the secret in the repo
- Volumes in Kubernetes
- Introduction to AWS EFS
- [Practice] Configuring AWS EFS
- [Practice] Sharing DAGs between pods with AWS EFS
- Clean Up
Section 6: Building CI/CD pipelines to deploy Airflow
- Set Up
- What is AWS CodePipeline?
- [Practice] Building a CI/CD pipeline with CodePipeline and ECR
- [Practice] Deploying Airflow in EKS with CodePipeline and Flux
- Unit testing in Airflow
- [Practice] Unit testing your DAGs
- [Practice] Building the CI/CD pipeline in dev with unit tests
- [Practice] Integration tests for testing tasks in DAGs
- [Practice] Building the CI/CD pipeline in staging with integration tests
- [Practice] Clean up
Section 7: Exposing the Airflow UI
- [Practice] Set up
- Services in Kubernetes
- Architecture with the Elastic Load Balancer
- [Practice] Exposing the Airflow UI with AWS Elastic Load Balancer
- What is an Ingress?
- Architecture with the AWS ALB Ingress controller
- [Practice] Exposing the Airflow UI with AWS ALB Ingress
- [Practice] Exposing the staging environment with AWS ALB
- Quick reminder about SSL
- [Practice] Creating a Domain for Airflow with ExternalDNS and AWS Route53
- [Practice] Activating SSL on the Airflow UI
- [Practice] Fixing the AWS ALB’s health checks
- [Practice] Exporting the SSL secret object
- [Practice] Upgrading the staging environment
- [Exercise] Enabling DNS and SSL for staging
- [Practice] Creating subdomains to access the UIs of Airflow
- Clean Up
Section 8: Logging with Airflow in AWS EKS
- Set Up
- RBAC in Kubernetes
- Permission issues when accessing pod logs
- [Practice] Storing logs in AWS EFS
- [Practice] Remote logging with AWS S3
- Limitations of remote logging in AWS S3
- Remote logging with AWS CloudWatch
- Sensitive data with Secret Backends
- [Practice] Managing connections with AWS Secrets Manager
- [Practice] Storing the AWS Secrets Manager secret object for Flux
- Clean Up
Section 9: Configuring the production environment
- Set Up
- [Practice] Creating the production environment
- Identifying single points of failure
- [Practice] Making the Airflow UI highly available
- AWS Relational Database Service
- [Practice] Airflow with AWS RDS
- DAG Serialization
- [Practice] Making the web server stateless with DAG Serialization
- Clean Up
- Congratulations!