Remove unused library imports, update dbt task #539

sydneynotthecity · 2024-11-11T17:31:23Z

PR Checklist

PR Structure

This PR has reasonably narrow scope (if not, break it down into smaller PRs).
This PR avoids mixing refactoring changes with feature changes (split into two PRs
otherwise).
This PR's title starts with the jira ticket associated with the PR.

Thoroughness

This PR adds tests for the most critical parts of the new functionality or fixes.
I've updated the README with the added features, breaking changes, new instructions on how to use the repository.

What

Updates the orchestration flows for the dbt marts. Historically, every staging and intermediate table had to be manually tagged in order to run as part of a domain pipeline (e.g., fee_stats, enriched_history). Developers had to remember to tag every model appropriately, which led to mistagged and forgotten models. The flow now uses the + operator where appropriate to orchestrate pipelines.

I also cleaned up library imports.

Also disabled the scd_snapshot state and soroban workflows as the tables are broken and should not be used.

Why

Staging and intermediate models do not need tagging if node selection is inclusive of upstream models.
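
To illustrate the point above, here is a minimal sketch of how a tag, an operator, and an exclusion could translate into dbt node selection arguments. The helper name `build_selection_args` and the tag `some_mart` are hypothetical and are not taken from build_dbt_task.py; only the `tag:` selector method, the `+` graph operator, and the `--select` / `--exclude` flags are standard dbt.

```python
from typing import List, Optional


def build_selection_args(
    tag: str, operator: str = "", excluded: Optional[str] = None
) -> List[str]:
    """Hypothetical helper: map dbt_task-style arguments to dbt selection args."""
    # A leading "+" asks dbt to also select every upstream (parent) node of the
    # models matched by the tag, so staging and intermediate models need no tag.
    args = ["--select", f"{operator}tag:{tag}"]
    if excluded:
        # Drop anything matched by the excluded selector, e.g. a package name.
        args += ["--exclude", excluded]
    return args


# Default operator: only models carrying the tag are selected.
print(build_selection_args("fee_stats"))
# "+" operator: tagged models plus all of their upstream models.
print(build_selection_args("trade_agg", operator="+"))
# "+" operator, minus everything that lives in the stellar_dbt_public package.
print(build_selection_args("some_mart", operator="+", excluded="stellar_dbt_public"))
```

With the default (empty) operator the selection stays limited to tagged models, which is the behavior the old manual tagging relied on.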

Known limitations

[TODO or N/A]

sydneynotthecity · 2024-11-11T17:43:52Z

TODO: update dbt images once built for testing

dags/stellar_etl_airflow/build_dbt_task.py

sydneynotthecity · 2024-11-14T19:42:07Z

airflow_variables_dev.json

@@ -124,7 +124,7 @@
    "partnership_assets__account_holders_activity_fact": false,
    "partnership_assets__asset_activity_fact": false
  },
-  "dbt_image_name": "stellar/stellar-dbt:96cd862b1",
+  "dbt_image_name": "stellar/stellar-dbt-dev:4b8a2ecc4",


Will swap once testing in the test environment is finished.

sydneynotthecity · 2024-11-14T19:42:28Z

dags/dbt_enriched_base_tables_dag.py

@@ -35,9 +35,9 @@

 # DBT models to run
 enriched_history_operations_task = dbt_task(
-    dag, tag="enriched_history_operations", excluded="singular_test"


singular_test is set through an env var.

sydneynotthecity · 2024-11-14T19:42:57Z

dags/dbt_stellar_marts_dag.py

+    operator="+",
+    excluded="stellar_dbt_public",
+)
+trade_agg_task = dbt_task(dag, tag="trade_agg", operator="+")


trade_agg is defined in the public project.

amishas157 · 2024-11-18T16:56:18Z

dags/dbt_stellar_marts_dag.py

+    operator="+",
+    excluded="stellar_dbt_public",
+)
+trade_agg_task = dbt_task(dag, tag="trade_agg", operator="+")
 fee_stats_agg_task = dbt_task(dag, tag="fee_stats")


We do not exclude stellar_dbt_public when the mart is part of the public repo.
When would we not want to set operator='+'?

IMO, only when the mart uses only tables built in the enriched_history or current_state data pipelines, i.e., if your model only uses enriched_history_operations as a source and has no intermediate tables. This is why I left the default operator settings for fee_stats_agg and network_stats_agg.

The models cannot have custom intermediate or staging models for this to work.

Got it. So, we check lineage. If the mart does not depend on intermediate tables or custom sources, we use the default operator to avoid race conditions. Otherwise, we use the + operator.
That should work. But let's also continue our discussion in office hours about other ways of selecting models.

I agree. This is still imperfect, and I think we can come up with a better alternative.
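
The rule of thumb agreed on in this thread can be sketched as a small lineage check. This is only an illustration under stated assumptions: `choose_operator`, the model names `example_simple_mart`, `example_custom_mart` and `int_example_model`, and the contents of `BASE_MODELS` are all hypothetical, and the lineage is supplied by hand rather than read from the dbt manifest.

```python
from typing import Iterable, Set

# Assumption: models already built by the enriched_history / current_state
# pipelines. Only enriched_history_operations is named in this thread.
BASE_MODELS: Set[str] = {"enriched_history_operations"}


def choose_operator(direct_parents: Iterable[str], base_models: Set[str]) -> str:
    """Return "+" if any direct parent is not built by a base pipeline."""
    # A parent outside the base pipelines (an intermediate or staging model, or
    # a custom source) is built by nobody else, so the mart task must select
    # its upstream models itself.
    needs_upstream = any(parent not in base_models for parent in direct_parents)
    # Otherwise keep the default operator so this task does not rebuild tables
    # that the base pipelines also build.
    return "+" if needs_upstream else ""


# Hypothetical mart that reads only from a base table: default operator.
example_simple_mart = ["enriched_history_operations"]
# Hypothetical mart with its own intermediate model: "+" operator.
example_custom_mart = ["int_example_model", "enriched_history_operations"]

print(repr(choose_operator(example_simple_mart, BASE_MODELS)))
print(repr(choose_operator(example_custom_mart, BASE_MODELS)))
```

Checking only direct parents is enough for this rule, since any parent outside the base pipelines already forces the + operator.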

amishas157

One question on the usage of operator. Otherwise, looks good to me.

Remove unused library imports, update dbt task default

8e889ee

sydneynotthecity requested a review from a team as a code owner November 11, 2024 17:31

Format isort

6eaac5c

amishas157 reviewed Nov 11, 2024

dags/stellar_etl_airflow/build_dbt_task.py (outdated, resolved)

sydneynotthecity added 2 commits November 11, 2024 12:47

Revert operator default and pass on caller

289dc17

Lint errors

e58a775

amishas157 approved these changes Nov 11, 2024

sydneynotthecity added 6 commits November 11, 2024 13:41

Bump images

a5c150c

Merge branch 'master' into patch/update-dbt-node-selection

8cdd5ec

Use exclude package name on dbt tasks

825d896

Merge branch 'master' into patch/update-dbt-node-selection

8b15c6c

Reformat code

c6c9b5d

Merge branch 'patch/update-dbt-node-selection' of https://github.com/…

044e2bf

…stellar/stellar-etl-airflow into patch/update-dbt-node-selection

sydneynotthecity commented Nov 14, 2024

sydneynotthecity added 5 commits November 14, 2024 13:53

Update task_sla key

9cc3213

Update all task names

28d5306

Lint

6894987

Remove logic to append exclude to task name

adbc7d1

Bump stellar-dbt image

aa31203

sydneynotthecity requested a review from amishas157 November 18, 2024 16:01

amishas157 reviewed Nov 18, 2024

Update dbt image

3b785ac

sydneynotthecity changed the title from "Remove unused library imports, update dbt task default" to "Remove unused library imports, update dbt task" Nov 18, 2024

sydneynotthecity merged commit 33c0ff8 into master Nov 18, 2024
3 checks passed

sydneynotthecity deleted the patch/update-dbt-node-selection branch November 18, 2024 18:56
