First Time Setup

- Create a Python virtual environment in the .venv folder:
python3 -m venv .venv
- Create a .env file from the provided example:
cp env.example .env

Activating Virtual Environment

- Activate the virtual environment so that python3 and pip resolve to the packages installed inside .venv:
source .venv/bin/activate
- Check that Python is using the .venv folder (the printed path should end in .venv/bin/python3):
which python3
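- When you are done, you can leave the virtual environment with:
deactivate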

- Install the project's packages:
pip install -r requirements.txt
- Run the test suite to verify the setup:
pytest
- Note that all test files must end in _test.py; see the sketch below.
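As an illustration of the naming convention, a minimal test file might look like this (the file name example_test.py and its contents are hypothetical):
# example_test.py -- the _test.py suffix is what lets pytest discover the file
def test_addition():
    # pytest collects functions prefixed with test_ and runs their asserts
    assert 1 + 1 == 2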
- After adding or upgrading packages, update the requirements file:
pip freeze > requirements.txt
- Install Apache Airflow with the official constraints file:
pip install 'apache-airflow==2.9.1' \
    --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.9.1/constraints-3.8.txt"
- Change constraints-3.8.txt to match your Python version (for example, constraints-3.11.txt for Python 3.11).
- More details in the Airflow documentation.

- Create an airflow folder under your project, and a dags folder under airflow (see the example below).
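For example, from the project root:
mkdir -p airflow/dags
To have something to run later, you can drop a minimal DAG file into airflow/dags. The sketch below is hypothetical, not one of this project's DAGs; the file name hello_dag.py and the dag_id hello_world are made up:
# airflow/dags/hello_dag.py -- hypothetical example DAG
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="hello_world",            # appears in `airflow dags list`
    start_date=datetime(2024, 1, 1),
    schedule=None,                   # runs only when triggered manually
    catchup=False,
) as dag:
    BashOperator(task_id="say_hello", bash_command="echo hello")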

- To configure Airflow to recognize your DAGs directory, set the AIRFLOW_HOME environment variable, replacing /path/to/dags/folder/parent/folder with the actual path to the airflow directory you just created:
export AIRFLOW_HOME=/path/to/dags/folder/parent/folder
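For example, if the airflow folder sits at the project root, you can run this from that root:
export AIRFLOW_HOME=$(pwd)/airflow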
- Run Airflow standalone, and note the admin username and password it generates:
airflow standalone
The command initializes the database, creates a user, and starts all components.
- If you prefer to run the individual components of Airflow manually, or if you need custom user information, you can run the following instead of the all-in-one standalone command. The webserver, scheduler, and triggerer are long-running, so give each its own terminal; airflow users create prompts for a password unless one is passed with --password:
airflow db init
airflow users create \
--username admin \
--firstname Peter \
--lastname Parker \
--role Admin \
--email [email protected]
airflow webserver --port 8080
airflow scheduler
airflow triggerer
- More details in the Airflow documentation.

- Access the Airflow UI: visit localhost:8080 in your browser.
- Connect to AWS S3: under Admin, choose Connections and create a new connection. Enter oncokb_s3 as the Connection Id, choose Amazon Web Services as the Connection Type, and fill in the AWS Access Key ID and AWS Secret Access Key.
- Connect to MySQL: under Admin, choose Connections and create a new connection. Enter oncokb_mysql as the Connection Id, choose MySQL as the Connection Type, and fill in the Host, Schema, Login, Password, and Port.
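As an alternative to the UI, both connections can also be created from the Airflow CLI; the angle-bracketed values below are placeholders, not real credentials:
airflow connections add oncokb_s3 \
    --conn-type aws \
    --conn-login <AWS_ACCESS_KEY_ID> \
    --conn-password <AWS_SECRET_ACCESS_KEY>
airflow connections add oncokb_mysql \
    --conn-type mysql \
    --conn-host <host> \
    --conn-schema <schema> \
    --conn-login <login> \
    --conn-password <password> \
    --conn-port <port>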
- If you want to hide the example DAGs or the default connections in the Airflow webserver, open the airflow.cfg file generated under AIRFLOW_HOME and set load_examples = False or load_default_connections = False.
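These options can also be set without editing the file, using Airflow's AIRFLOW__<SECTION>__<KEY> environment-variable convention; for example, for load_examples, which lives in the [core] section:
export AIRFLOW__CORE__LOAD_EXAMPLES=False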
- With the Airflow CLI, you can run a test of your DAG and then check the result and logs in the Airflow UI:
airflow dags test <dag_id>
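For example, with the hypothetical hello_world DAG from the sketch above:
airflow dags test hello_world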
- You can use the CLI to list all the DAGs you have:
airflow dags list