2 changes: 1 addition & 1 deletion dbt_sql/resources/dbt_sql.job.yml
@@ -29,6 +29,6 @@ resources:
environments:
- environment_key: default
spec:
environment_version: "2"
environment_version: "4"
dependencies:
- dbt-databricks>=1.8.0,<2.0.0
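For context: `environment_version` pins the serverless environment that the job's tasks run in, so bumping it from "2" to "4" moves the dbt task onto a newer serverless environment while leaving the dependency pins unchanged. A sketch of how the block reads after this change (the job fields above it are elided in the diff):

```
environments:
  - environment_key: default
    spec:
      # Serverless environment version used by the job's tasks.
      environment_version: "4"
      dependencies:
        - dbt-databricks>=1.8.0,<2.0.0
```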
10 changes: 10 additions & 0 deletions default_minimal/.gitignore
@@ -0,0 +1,10 @@
.databricks/
build/
dist/
__pycache__/
*.egg-info
.venv/
scratch/**
!scratch/README.md
**/explorations/**
!**/explorations/README.md
3 changes: 3 additions & 0 deletions default_minimal/.vscode/__builtins__.pyi
@@ -0,0 +1,3 @@
# Typings for Pylance in Visual Studio Code
# see https://github.com/microsoft/pyright/blob/main/docs/builtins.md
from databricks.sdk.runtime import *
7 changes: 7 additions & 0 deletions default_minimal/.vscode/extensions.json
@@ -0,0 +1,7 @@
{
"recommendations": [
"databricks.databricks",
"redhat.vscode-yaml",
"ms-python.black-formatter"
]
}
39 changes: 39 additions & 0 deletions default_minimal/.vscode/settings.json
@@ -0,0 +1,39 @@
{
"jupyter.interactiveWindow.cellMarker.codeRegex": "^# COMMAND ----------|^# Databricks notebook source|^(#\\s*%%|#\\s*\\<codecell\\>|#\\s*In\\[\\d*?\\]|#\\s*In\\[ \\])",
"jupyter.interactiveWindow.cellMarker.default": "# COMMAND ----------",
"python.testing.pytestArgs": [
"."
],
"files.exclude": {
"**/*.egg-info": true,
"**/__pycache__": true,
".pytest_cache": true,
"dist": true,
},
"files.associations": {
"**/.gitkeep": "markdown"
},

// Pylance settings (VS Code)
// Set typeCheckingMode to "basic" to enable type checking!
"python.analysis.typeCheckingMode": "off",
"python.analysis.extraPaths": ["src", "lib", "resources"],
"python.analysis.diagnosticMode": "workspace",
"python.analysis.stubPath": ".vscode",

// Pyright settings (Cursor)
// Set typeCheckingMode to "basic" to enable type checking!
"cursorpyright.analysis.typeCheckingMode": "off",
"cursorpyright.analysis.extraPaths": ["src", "lib", "resources"],
"cursorpyright.analysis.diagnosticMode": "workspace",
"cursorpyright.analysis.stubPath": ".vscode",

// General Python settings
"python.defaultInterpreterPath": "./.venv/bin/python",
"python.testing.unittestEnabled": false,
"python.testing.pytestEnabled": true,
"[python]": {
"editor.defaultFormatter": "ms-python.black-formatter",
"editor.formatOnSave": true,
},
}
54 changes: 54 additions & 0 deletions default_minimal/README.md
@@ -0,0 +1,54 @@
# default_minimal

The 'default_minimal' project was generated using the default-minimal template.

* `src/`: SQL source code for this project.
* `resources/`: Resource configurations (jobs, pipelines, etc.)

## Getting started

Choose how you want to work on this project:

(a) Directly in your Databricks workspace, see
https://docs.databricks.com/dev-tools/bundles/workspace.

(b) Locally with an IDE like Cursor or VS Code, see
https://docs.databricks.com/dev-tools/vscode-ext.html.

(c) With command line tools, see https://docs.databricks.com/dev-tools/cli/databricks-cli.html

If you're developing with an IDE, dependencies for this project should be installed using uv:

* Make sure you have the uv package manager installed.
It's an alternative to tools like pip: https://docs.astral.sh/uv/getting-started/installation/.
* Run `uv sync --dev` to install the project's dependencies.


# Using this project from the CLI

The Databricks workspace and IDE extensions provide a graphical interface for working
with this project. It's also possible to interact with it directly using the CLI:

1. Authenticate to your Databricks workspace, if you have not done so already:
```
$ databricks configure
```

2. To deploy a development copy of this project, type:
```
$ databricks bundle deploy --target dev
```
(Note that "dev" is the default target, so the `--target` parameter
is optional here.)

This deploys everything that's defined for this project.

3. Similarly, to deploy a production copy, type:
```
$ databricks bundle deploy --target prod
```

4. To run a job or pipeline, use the "run" command:
```
$ databricks bundle run
```
42 changes: 42 additions & 0 deletions default_minimal/databricks.yml
@@ -0,0 +1,42 @@
# This is a Databricks asset bundle definition for default_minimal.
# See https://docs.databricks.com/dev-tools/bundles/index.html for documentation.
bundle:
name: default_minimal
uuid: 8127e9c1-adac-4c9c-b006-d3450874f663

include:
- resources/*.yml
- resources/*/*.yml

# Variable declarations. These variables are assigned in the dev/prod targets below.
variables:
catalog:
description: The catalog to use
schema:
description: The schema to use

targets:
dev:
# The default target uses 'mode: development' to create a development copy.
# - Deployed resources get prefixed with '[dev my_user_name]'
# - Any job schedules and triggers are paused by default.
# See also https://docs.databricks.com/dev-tools/bundles/deployment-modes.html.
mode: development
default: true
workspace:
host: https://company.databricks.com
variables:
catalog: catalog
schema: ${workspace.current_user.short_name}
prod:
mode: production
workspace:
host: https://company.databricks.com
# We explicitly deploy to /Workspace/Users/user@company.com to make sure we only have a single copy.
root_path: /Workspace/Users/user@company.com/.bundle/${bundle.name}/${bundle.target}
variables:
catalog: catalog
schema: prod
permissions:
- user_name: user@company.com
level: CAN_MANAGE
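`resources/` starts out empty in this template (just a `.gitkeep`), but anything matching the `include` globs above gets picked up on deploy. As a rough sketch of what could go there — the file name, job name, and notebook path below are hypothetical, not part of this PR:

```
# resources/sample_job.job.yml (hypothetical example; not included in this PR)
resources:
  jobs:
    sample_job:
      name: sample_job
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: ../src/sample_notebook.ipynb
      parameters:
        # The catalog/schema variables declared in databricks.yml flow in here.
        - name: catalog
          default: ${var.catalog}
        - name: schema
          default: ${var.schema}
```

With `mode: development`, deploying to the dev target would create this as `[dev <your name>] sample_job`, matching the prefixing behavior noted in the dev target comments above.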
1 change: 1 addition & 0 deletions default_minimal/resources/.gitkeep
@@ -0,0 +1 @@

1 change: 1 addition & 0 deletions default_minimal/src/.gitkeep
@@ -0,0 +1 @@

2 changes: 2 additions & 0 deletions default_python/.gitignore
@@ -6,3 +6,5 @@ __pycache__/
.venv/
scratch/**
!scratch/README.md
**/explorations/**
!**/explorations/README.md
4 changes: 2 additions & 2 deletions default_python/.vscode/extensions.json
@@ -1,7 +1,7 @@
{
"recommendations": [
"databricks.databricks",
"ms-python.vscode-pylance",
"redhat.vscode-yaml"
"redhat.vscode-yaml",
"ms-python.black-formatter"
]
}
31 changes: 27 additions & 4 deletions default_python/.vscode/settings.json
@@ -1,16 +1,39 @@
{
"python.analysis.stubPath": ".vscode",
"jupyter.interactiveWindow.cellMarker.codeRegex": "^# COMMAND ----------|^# Databricks notebook source|^(#\\s*%%|#\\s*\\<codecell\\>|#\\s*In\\[\\d*?\\]|#\\s*In\\[ \\])",
"jupyter.interactiveWindow.cellMarker.default": "# COMMAND ----------",
"python.testing.pytestArgs": [
"."
],
"python.testing.unittestEnabled": false,
"python.testing.pytestEnabled": true,
"python.analysis.extraPaths": ["src"],
"files.exclude": {
"**/*.egg-info": true,
"**/__pycache__": true,
".pytest_cache": true,
"dist": true,
},
"files.associations": {
"**/.gitkeep": "markdown"
},

// Pylance settings (VS Code)
// Set typeCheckingMode to "basic" to enable type checking!
"python.analysis.typeCheckingMode": "off",
"python.analysis.extraPaths": ["src", "lib", "resources"],
"python.analysis.diagnosticMode": "workspace",
"python.analysis.stubPath": ".vscode",

// Pyright settings (Cursor)
// Set typeCheckingMode to "basic" to enable type checking!
"cursorpyright.analysis.typeCheckingMode": "off",
"cursorpyright.analysis.extraPaths": ["src", "lib", "resources"],
"cursorpyright.analysis.diagnosticMode": "workspace",
"cursorpyright.analysis.stubPath": ".vscode",

// General Python settings
"python.defaultInterpreterPath": "./.venv/bin/python",
"python.testing.unittestEnabled": false,
"python.testing.pytestEnabled": true,
"[python]": {
"editor.defaultFormatter": "ms-python.black-formatter",
"editor.formatOnSave": true,
},
}
25 changes: 14 additions & 11 deletions default_python/README.md
@@ -2,8 +2,12 @@

The 'default_python' project was generated using the default-python template.

For documentation on the Databricks Asset Bundles format used for this project,
and for CI/CD configuration, see https://docs.databricks.com/aws/en/dev-tools/bundles.
* `src/`: Python source code for this project.
* `src/default_python/`: Shared Python code that can be used by jobs and pipelines.
* `resources/`: Resource configurations (jobs, pipelines, etc.)
* `tests/`: Unit tests for the shared Python code.
* `fixtures/`: Fixtures for data sets (primarily used for testing).


## Getting started

@@ -13,17 +13,17 @@ Choose how you want to work on this project:
https://docs.databricks.com/dev-tools/bundles/workspace.

(b) Locally with an IDE like Cursor or VS Code, see
https://docs.databricks.com/vscode-ext.
https://docs.databricks.com/dev-tools/vscode-ext.html.

(c) With command line tools, see https://docs.databricks.com/dev-tools/cli/databricks-cli.html


Dependencies for this project should be installed using uv:
If you're developing with an IDE, dependencies for this project should be installed using uv:

* Make sure you have the uv package manager installed.
It's an alternative to tools like pip: https://docs.astral.sh/uv/getting-started/installation/.
* Run `uv sync --dev` to install the project's dependencies.


# Using this project from the CLI

The Databricks workspace and IDE extensions provide a graphical interface for working
@@ -42,17 +42,16 @@ with this project. It's also possible to interact with it directly using the CLI
is optional here.)

This deploys everything that's defined for this project.
For example, the default template would deploy a job called
`[dev yourname] default_python_job` to your workspace.
You can find that job by opening your workspace and clicking on **Jobs & Pipelines**.
For example, the default template would deploy a pipeline called
`[dev yourname] default_python_etl` to your workspace.
You can find that resource by opening your workspace and clicking on **Jobs & Pipelines**.

3. Similarly, to deploy a production copy, type:
```
$ databricks bundle deploy --target prod
```

Note that the default job from the template has a schedule that runs every day
(defined in resources/default_python.job.yml). The schedule
Note that the default template includes a job that runs the pipeline every day
(defined in resources/sample_job.job.yml). The schedule
is paused when deploying in development mode (see
https://docs.databricks.com/dev-tools/bundles/deployment-modes.html).
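
That `sample_job.job.yml` file isn't shown in this diff; as a hedged sketch (assuming the usual periodic-trigger shape, with illustrative names), the daily trigger could look like:

```
# resources/sample_job.job.yml (sketch; actual contents not in this diff)
resources:
  jobs:
    sample_job:
      name: sample_job
      trigger:
        # Paused automatically when deployed with 'mode: development'.
        periodic:
          interval: 1
          unit: DAYS
```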

20 changes: 16 additions & 4 deletions default_python/databricks.yml
@@ -4,14 +4,21 @@ bundle:
name: default_python
uuid: 87d5a23e-7bc7-4f52-98ee-e374b67d5681

include:
- resources/*.yml
- resources/*/*.yml

artifacts:
python_artifact:
type: whl
build: uv build --wheel

include:
- resources/*.yml
- resources/*/*.yml
# Variable declarations. These variables are assigned in the dev/prod targets below.
variables:
catalog:
description: The catalog to use
schema:
description: The schema to use

targets:
dev:
@@ -23,13 +30,18 @@ targets:
default: true
workspace:
host: https://company.databricks.com

variables:
catalog: catalog
schema: ${workspace.current_user.short_name}
prod:
mode: production
workspace:
host: https://company.databricks.com
# We explicitly deploy to /Workspace/Users/user@company.com to make sure we only have a single copy.
root_path: /Workspace/Users/user@company.com/.bundle/${bundle.name}/${bundle.target}
variables:
catalog: catalog
schema: prod
permissions:
- user_name: user@company.com
level: CAN_MANAGE
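The `artifacts` block above builds the project wheel with `uv build --wheel`; a job defined under `resources/` would then reference the built wheel so its tasks can import the shared code. A hedged sketch — the task, entry point, and paths are assumptions, not shown in this PR:

```
# Hypothetical task wiring for the wheel built by 'python_artifact' above.
resources:
  jobs:
    sample_job:
      name: sample_job
      tasks:
        - task_key: main
          python_wheel_task:
            package_name: default_python
            entry_point: main   # assumes the wheel declares this entry point
          environment_key: default
      environments:
        - environment_key: default
          spec:
            environment_version: "4"
            dependencies:
              # Local wheel paths are uploaded on deploy.
              - ../dist/*.whl
```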
23 changes: 5 additions & 18 deletions default_python/fixtures/.gitkeep
@@ -1,22 +1,9 @@
# Fixtures
# Test fixtures directory

This folder is reserved for fixtures, such as CSV files.

Below is an example of how to load fixtures as a data frame:
Add JSON or CSV files here. In tests, use them with `load_fixture()`:

```
import pandas as pd
import os

def get_absolute_path(*relative_parts):
if 'dbutils' in globals():
base_dir = os.path.dirname(dbutils.notebook.entry_point.getDbutils().notebook().getContext().notebookPath().get()) # type: ignore
path = os.path.normpath(os.path.join(base_dir, *relative_parts))
return path if path.startswith("/Workspace") else "/Workspace" + path
else:
return os.path.join(*relative_parts)

csv_file = get_absolute_path("..", "fixtures", "mycsv.csv")
df = pd.read_csv(csv_file)
display(df)
def test_using_fixture(load_fixture):
data = load_fixture("my_data.json")
assert len(data) >= 1
```