When you start using Apache Airflow in production, one of your top priorities is to make sure that not everyone can access it. Indeed, since Airflow orchestrates data pipelines, it is a centerpiece of your data platform and can potentially deal with sensitive data. In a company, different teams may want to access Airflow, so you have to know who is allowed in and who is not. In this tutorial, you are going to discover how to set up password authentication in Apache Airflow, which will be your first major security measure. Then, you will see how to filter DAGs by owner so that each team only accesses the DAGs it owns. By the way, if you are interested in mastering Apache Airflow, check out my course right here. Alright, without further ado, let’s get started.
Authentication backends in Apache Airflow
Apache Airflow ships with several authentication backends that you can use to restrict access to the user interface. Here is the list:
- Password
- LDAP
- Kerberos
- GitHub Enterprise
- Google Auth
If you want to learn more about their implementation, you can take a look at the following link. For simplicity, we are going to start with password authentication in Apache Airflow. Don’t worry, I will show you how to set up Kerberos and LDAP authentication as well in future tutorials. If you want to stay in touch, don’t forget to share your email address so that I can reach you.
Password Authentication in Apache Airflow
Alright, in the video below, taken from my course The Ultimate Hands-On Course to Master Apache Airflow, I show you step by step how to set up password authentication in Apache Airflow. You will also discover how to filter your DAGs by owner so that each user only sees the DAGs they own.
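If you prefer text to video, here is a minimal sketch of the configuration side. It assumes Airflow 1.10 with the classic (non-RBAC) web UI, where the options authenticate, auth_backend and filter_by_owner live in the [webserver] section of airflow.cfg. Airflow also reads any option from an environment variable named AIRFLOW__{SECTION}__{KEY}, which is what the helper below builds. The helper name airflow_env_name and the dictionary layout are my own, for illustration only.

```python
# Sketch: express the [webserver] settings needed for password
# authentication as Airflow environment variables.
# Assumption: Airflow 1.10, classic (non-RBAC) web UI.

WEBSERVER_SETTINGS = {
    # Turn on authentication for the web UI.
    "authenticate": "True",
    # Use the password authentication backend.
    "auth_backend": "airflow.contrib.auth.backends.password_auth",
    # Show each user only the DAGs whose owner matches their username.
    "filter_by_owner": "True",
}


def airflow_env_name(section, key):
    """Build the AIRFLOW__{SECTION}__{KEY} variable name (hypothetical helper)."""
    return "AIRFLOW__{}__{}".format(section.upper(), key.upper())


AIRFLOW_ENV = {
    airflow_env_name("webserver", key): value
    for key, value in WEBSERVER_SETTINGS.items()
}

if __name__ == "__main__":
    # Print export lines that can be pasted into a shell or an env file.
    for name, value in sorted(AIRFLOW_ENV.items()):
        print("export {}={}".format(name, value))
```

The same three keys can be written directly under [webserver] in airflow.cfg instead. Either way, the webserver has to be restarted for the change to take effect.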
Password authentication is the simplest way to force users to enter a password before logging in. Notice that the Python module bcrypt must be installed, along with the “password” extra, when you install Apache Airflow (for example, pip install 'apache-airflow[password]'). There is no way to create a user from the user interface of Airflow, so you have to use the code snippet shown in the video to generate one. In my opinion, this mechanism is fine at the very beginning, but if you start using Airflow at scale in your company, you should think about setting up RBAC (Role-Based Access Control) with Airflow. It is well integrated and lets you narrow the permissions of each user to exactly what they need. If you want to learn more about it, you can check my course where I show you how to do it.
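The exact snippet from the video is not reproduced in this post, so here is a sketch of how a user can be created, following the approach described in the Airflow 1.10 documentation. It assumes Airflow 1.10 installed with the password extra and the password_auth backend enabled. The function name create_password_user and the example values are mine, not part of Airflow.

```python
def create_password_user(username, email, password):
    """Create a user for the password_auth backend (Airflow 1.10 sketch).

    Airflow is imported inside the function so that this file can still be
    loaded on a machine where Airflow is not installed.
    """
    from airflow import models, settings
    from airflow.contrib.auth.backends.password_auth import PasswordUser

    # PasswordUser wraps a regular Airflow user and hashes the password
    # with bcrypt when the password attribute is assigned.
    user = PasswordUser(models.User())
    user.username = username
    user.email = email
    user.password = password

    # Persist the new user in the Airflow metadata database.
    session = settings.Session()
    try:
        session.add(user)
        session.commit()
    finally:
        session.close()


if __name__ == "__main__":
    # Example values only: replace them before running.
    create_password_user("new_user_name", "new_user_email@example.com", "set_the_password")
```

Run it once from a Python shell on the machine hosting Airflow, then log in with the new credentials. With filter_by_owner enabled, use a username that matches the owner set on the DAGs this user should see.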
I hope you enjoyed this tutorial and see you for the next one 🙂