.. docs/development.rst

Poetry
------

airflow-dbt-python uses `Poetry <https://python-poetry.org/>`_ for project management. Ensure it's installed before running: see `Poetry's installation documentation <https://python-poetry.org/docs/#installation>`_.
Additionally, we recommend running the following commands in a virtual environment.
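For example, a throwaway virtual environment can be created with the standard library's ``venv`` module. This is a minimal sketch: any virtual environment tool works, and the ``.venv`` directory name is only a convention, not something airflow-dbt-python requires.

```shell
# Create a virtual environment in .venv and activate it.
python3 -m venv .venv
. .venv/bin/activate

# Confirm the interpreter now resolves inside the virtual environment.
python -c "import sys; print(sys.prefix)"
```

All ``pip`` and ``poetry`` commands in this section can then be run inside the activated environment.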

Installing Airflow
------------------

Development requires a local installation of Airflow, as airflow-dbt-python doesn't come bundled with one. We can install a specific version using ``pip``:
.. code-block:: shell

   pip install apache-airflow==2.2

.. note::

   Installing any 1.X version of Airflow will raise warnings due to dependency conflicts with ``dbt-core``. However, these conflicts should not impact airflow-dbt-python.

Installing the ``airflow`` extra will fetch the latest version of Airflow with major version 2:

.. code-block:: shell

   # ...

Clone the main repo and install it:

.. code-block:: shell

   poetry install

Pre-commit hooks
----------------

A handful of `pre-commit <https://pre-commit.com/>`_ hooks are provided, including:

.. docs/example_dags.rst

Example DAGs
============

This section contains a few DAGs showing off some dbt pipelines to get you going.

Basic DAG
^^^^^^^^^

This basic DAG shows off a single ``DbtRunOperator`` that executes hourly:

.. code-block:: python
   :linenos:
   :caption: basic_dag.py

   """Sample basic DAG which dbt runs a project."""
   import datetime as dt

   from airflow import DAG
   from airflow.utils.dates import days_ago
   from airflow_dbt_python.dbt.operators import DbtRunOperator

   with DAG(
       dag_id="example_basic_dbt_run",
       schedule_interval="0 * * * *",
       start_date=days_ago(1),
       catchup=False,
       dagrun_timeout=dt.timedelta(minutes=60),
   ) as dag:
       dbt_run = DbtRunOperator(
           task_id="dbt_run_hourly",
           project_dir="/path/to/my/dbt/project/",
           profiles_dir="~/.dbt/",
           select=["+tag:hourly"],
           exclude=["tag:deprecated"],
           target="production",
           profile="my-project",
           full_refresh=False,
       )

Run and Docs from S3
^^^^^^^^^^^^^^^^^^^^

This DAG shows off a ``DbtRunOperator`` followed by a ``DbtDocsGenerateOperator``. Both execute hourly, and run from dbt project files available at an S3 URL:

.. code-block:: python
   :linenos:
   :caption: dbt_project_in_s3_dag.py

   """Sample basic DAG which showcases a dbt project being pulled from S3."""
   import datetime as dt

   from airflow import DAG
   from airflow.utils.dates import days_ago
   from airflow_dbt_python.dbt.operators import DbtDocsGenerateOperator, DbtRunOperator

   with DAG(
       dag_id="example_basic_dbt_run_with_s3",
       schedule_interval="0 * * * *",
       start_date=days_ago(1),
       catchup=False,
       dagrun_timeout=dt.timedelta(minutes=60),
   ) as dag:
       # Project files will be pulled from "s3://my-bucket/dbt/profiles/key/prefix/"
       # ...

Complete dbt workflow
^^^^^^^^^^^^^^^^^^^^^

This DAG shows off an (almost) complete dbt workflow as it would be run from the CLI: we begin by running ``DbtSourceOperator`` to test the freshness of our source tables, and ``DbtSeedOperator`` follows to load up any static data. Then, two instances of ``DbtRunOperator`` are created: one to handle incremental models, and another to run any non-incremental models. Finally, we run our tests to ensure our models remain correct.

.. code-block:: python
   :linenos:
   :caption: complete_dbt_workflow_dag.py

   """Sample DAG showcasing a complete dbt workflow.

   The complete workflow includes a sequence of source, seed, and several run commands.
   """
   # ...


Using dbt artifacts
^^^^^^^^^^^^^^^^^^^

The following DAG showcases how to use `dbt artifacts <https://docs.getdbt.com/reference/artifacts/dbt-artifacts/>`_ that are made available via XCom by airflow-dbt-python. A sample function calculates the longest-running dbt model by pulling the artifacts that were generated after ``DbtRunOperator`` executes. We specify which dbt artifacts to push via the ``do_xcom_push_artifacts`` parameter.
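Independent of Airflow, the artifact-processing logic itself can be sketched in plain Python. This is a minimal sketch assuming a simplified payload shaped like dbt's ``run_results`` artifact; the ``longest_running_model`` helper and the sample values below are hypothetical illustrations, not part of airflow-dbt-python:

```python
def longest_running_model(run_results):
    """Return (unique_id, execution_time) for the slowest result
    in a dbt run_results-style payload."""
    slowest = max(run_results["results"], key=lambda r: r["execution_time"])
    return slowest["unique_id"], slowest["execution_time"]


# Hypothetical, simplified payload shaped like dbt's run_results artifact.
sample_run_results = {
    "results": [
        {"unique_id": "model.my_project.orders", "execution_time": 12.5},
        {"unique_id": "model.my_project.customers", "execution_time": 3.2},
    ]
}

print(longest_running_model(sample_run_results))
```

In the DAG that follows, the same idea runs inside a ``PythonOperator``, except the artifact is pulled from XCom rather than passed in as an argument.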

.. code-block:: python
   :linenos:
   :caption: use_dbt_artifacts_dag.py

   """Sample DAG to showcase pulling dbt artifacts from XCOM."""
   import datetime as dt

   from airflow import DAG
   from airflow.operators.python_operator import PythonOperator
   from airflow.utils.dates import days_ago
   from airflow_dbt_python.dbt.operators import DbtRunOperator


   def process_dbt_artifacts(**context):
       """Report which model or models took the longest to compile and execute."""
       # ...