Define a function called `run_pipeline()` that runs the ETL pipeline. This function can then be called from an `if __name__ == "__main__":` block or by a Lambda function.
Extract functions should fetch the needed data and return a pandas DataFrame.
Transform functions should be pure functions that take a pandas DataFrame as an argument and return a DataFrame. Ideally, transform functions are composable with pandas `DataFrame.pipe` (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pipe.html).
Load functions should take DataFrames and write them to their intended destinations.
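A minimal sketch of this layout is shown below. The source path, destination path, column names, and transform names are all hypothetical placeholders, not part of this library:

```python
import pandas as pd


def extract() -> pd.DataFrame:
    # Illustrative only: pull the needed data from its source into a DataFrame.
    return pd.read_csv("s3://example-bucket/raw/orders.csv")


def add_totals(df: pd.DataFrame) -> pd.DataFrame:
    # Pure transform: takes a DataFrame, returns a new DataFrame.
    return df.assign(total=df["quantity"] * df["unit_price"])


def drop_cancelled(df: pd.DataFrame) -> pd.DataFrame:
    # Another pure transform, chainable via DataFrame.pipe.
    return df[df["status"] != "cancelled"]


def load(df: pd.DataFrame) -> None:
    # Illustrative only: write the result to its intended destination.
    df.to_parquet("s3://example-bucket/processed/orders.parquet")


def run_pipeline() -> None:
    df = extract()
    df = df.pipe(add_totals).pipe(drop_cancelled)
    load(df)


def handler(event, context):
    # Example AWS Lambda entry point that simply delegates to run_pipeline().
    run_pipeline()


if __name__ == "__main__":
    run_pipeline()
```

Keeping transforms pure makes them easy to chain with `DataFrame.pipe` and to unit test in isolation.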
We support two ways of providing secrets to ETL pipelines: through environment variables, or through AWS Secrets Manager.
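For example (the environment variable name, secret name, and JSON layout below are placeholders, not part of this library):

```python
import json
import os

import boto3


def get_database_password() -> str:
    # Option 1: read the secret from an environment variable.
    password = os.environ.get("DB_PASSWORD")
    if password:
        return password

    # Option 2: fall back to AWS Secrets Manager.
    # The secret name and the assumption that it stores JSON are illustrative.
    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId="my-etl/db-credentials")
    return json.loads(response["SecretString"])["password"]
```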
It is important to log the execution of our data pipelines. To log directly to CloudWatch, get a logger from the `get_logger` function in `etl.logging`; it supports standard Python logging.
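For example, assuming `get_logger` accepts a logger name (check `etl.logging` for the exact signature):

```python
from etl.logging import get_logger

# Assumption: get_logger takes a name; adjust to the actual signature in etl.logging.
logger = get_logger("orders_pipeline")


def run_pipeline() -> None:
    logger.info("Starting pipeline run")
    # ... extract, transform, load ...
    logger.info("Pipeline run finished")
```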
Scheduling, whether by crontab or other means, should be clearly documented in the ETL pipeline.
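One way to do this (purely illustrative, with a hypothetical module name and schedule) is to record the schedule in the pipeline's module docstring:

```python
"""Daily orders ETL pipeline.

Schedule: runs every day at 02:00 UTC via crontab:
    0 2 * * * /usr/bin/python3 -m my_etl.orders_pipeline
"""
```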
- Update the requirements in the `setup.py` file.
- Update the version in `setup.py` in a separate commit.
- Build the dist tar.gz file and then publish the artifact to PyPI:

  ```
  python setup.py sdist
  python3 -m twine upload dist/*
  ```

- Tag the new release on GitHub:
  - Visit releases.
  - Draft a new release (keep the format the same as in `setup.py`, e.g. v0.0.4).
  - Submit the new release.