Replace Makefile with Taskfile.yml and move all generation-related logic there #5050

Open: denik wants to merge 43 commits into main from denik/task-file

@denik commented Apr 21, 2026

Each task describes its inputs (sources) and outputs as precisely as possible.

The tasks mostly match what we had in the Makefile, but with a slightly more consistent approach:

  • All Go-related tasks (tidy, lint, test) now have three subtasks, one per module (root, tools, codegen), which are aggregated into the main task. Previously we did not run tests and linters for the tools and codegen modules.
  • Targets that use lintdiff.py for an incremental run have a -q suffix (old fmtfull -> new fmt, old fmt -> new fmt-q).
  • generate:genkit is the task that runs genkit plus its follow-ups. "generate" is the aggregate over all generation work.

This also consolidates all knowledge about generation in one place:

  • tools/post-process.sh is removed; the follow-up commands are now part of the generate:genkit task.
  • testmask no longer hardcodes the dependencies of CI targets; it reads them from Taskfile.yml.

The runner (go-task) is packaged as a Go tool, so no installation is needed. The ./task shortcut is available to run it. The Makefile is temporarily a wrapper that calls ./task.

This enables caching: go-task only re-runs a task if its sources changed. This helps when checking after agent work: if the agent already ran the linters and tests, running ./task is very quick.

~/work/cli-trees/task-file % time ./task

./task 612.63s user 310.22s system 726% cpu 2:06.95 total

~/work/cli-trees/task-file % time ./task

./task 2.43s user 10.58s system 133% cpu 9.727 total
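
The gap between the two timings above comes from skipping tasks whose sources are unchanged. The idea can be sketched in a few lines of Python. This is not go-task's implementation: the function names (`sources_digest`, `run_if_changed`), the stamp file, and the demo file names are hypothetical and only illustrate why a second run with unchanged sources is cheap.

```python
import hashlib
import tempfile
from pathlib import Path


def sources_digest(paths):
    """Hash the names and contents of all source files in a stable order."""
    h = hashlib.sha256()
    for p in sorted(paths):
        h.update(str(p).encode())
        h.update(b"\0")
        h.update(Path(p).read_bytes())
        h.update(b"\0")
    return h.hexdigest()


def run_if_changed(paths, stamp, action):
    """Run action only when the digest differs from the stored one.

    Returns True if the action ran, False if it was skipped.
    """
    digest = sources_digest(paths)
    stamp = Path(stamp)
    if stamp.exists() and stamp.read_text() == digest:
        return False  # sources unchanged: skip the work
    action()
    stamp.write_text(digest)  # record the digest only after the action succeeded
    return True


# Demo in a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as d:
    src = Path(d) / "main.go"
    src.write_text("package main\n")
    stamp = Path(d) / "lint.stamp"
    runs = []
    first = run_if_changed([src], stamp, lambda: runs.append("lint"))
    second = run_if_changed([src], stamp, lambda: runs.append("lint"))
    src.write_text("package main // edited\n")
    third = run_if_changed([src], stamp, lambda: runs.append("lint"))
```

The same reasoning explains why a task's own outputs should stay out of its source list: if the action rewrites a hashed file, the digest changes on every run and the task never becomes up to date.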

Additionally, all golangci-lint tasks are fixed to use a per-worktree and per-submodule tmp directory, which enables parallel runs within a worktree and across worktrees.

denik added 30 commits April 21, 2026 14:08
Replace the hardcoded prefix table with parsing of Taskfile.yml so the
list of paths that triggers each CI test target is derived from the
test:exp-aitools, test:exp-ssh, and test:pipelines tasks' `sources:`.
Also makes the tool CWD-independent via `git rev-parse --show-toplevel`.

Co-authored-by: Isaac
Introduce Taskfile.yml as the single source of truth for build, test,
lint, format, and codegen commands. Provide a thin Makefile wrapper that
forwards all targets to task so existing `make <target>` invocations keep
working for single-word targets. Move python/Makefile into the same
Taskfile under python:* tasks.

Task is vendored via a separate tools/task module to avoid its
charmbracelet dependencies conflicting with tools/go.mod (golangci-lint
pins older charmbracelet versions).

Also update acceptance test helper to build yamlfmt directly — the
Makefile wrapper can no longer forward the old `tools/yamlfmt` target
since it's now a Task-only dependency.

Co-authored-by: Isaac
Replace `make <target>` with `go tool -modfile=tools/task/go.mod task <target>`
in CI workflows and the setup-build-environment action. This skips the
Makefile wrapper and surfaces task's real target names (e.g. test:exp-aitools,
fmt:full) in CI logs.

release-build.yml inlines the python wheel build steps (which used to live
in python/Makefile) so the release job does not depend on task.

Co-authored-by: Isaac
Replace make commands with their task equivalents in the build/test and
code-quality sections so AI agents and human readers see the current
invocation style.

Co-authored-by: Isaac
Each submodule gets its own minimal .golangci.yaml so lint:tools and
lint:codegen can run from those directories without inheriting the root
config. The root config enables the ruleguard gocritic check against
`libs/gorules/rule_*.go`, which golangci-lint resolves relative to the
working directory — from inside tools/ or bundle/internal/tf/codegen/
that path does not exist and lint aborts before running.

Co-authored-by: Isaac
Shortens `go tool -modfile=tools/task/go.mod task <target>` to
`./task <target>` and resolves the modfile via `git rev-parse
--show-toplevel` so the script works from any subdirectory.

Co-authored-by: Isaac
Replace every `go tool -modfile=tools/task/go.mod task` invocation in CI
workflows, the Makefile wrapper, and the dbr:test deco command with the
shorter `./task`. The dbr:test command previously pointed at tools/go.mod,
which does not include task as a tool — the wrapper fixes that too.

Co-authored-by: Isaac
- build: correct generates from `databricks` to `cli` (Go module name)
- drop build:vm (unused)
- snapshot, snapshot:release, schema:for-docs, docs: add sources/generates
- lint:tools, lint:codegen: include .golangci.yaml, go.mod, go.sum in sources
- python:docs, python:codegen: include docs and databricks inputs
- split fmt into fmt:python, fmt:go, fmt:yaml; fmt and fmt:full dispatch
  the three in parallel via deps

Co-authored-by: Isaac
Renames lint:check/lint:tools/lint:codegen to lint:go:root/lint:go:tools/
lint:go:codegen to make it explicit they're Go-specific. Adds lint:go that
runs all three in parallel so a single invocation covers the whole repo.

Co-authored-by: Isaac
The `fmt` aggregate uses the incremental `fmt:go` (via lintdiff.py) for a
fast interactive loop, while `fmt:full` uses `fmt:go:full` to sweep the
whole tree. Python and YAML tasks are shared between the two aggregates.

Co-authored-by: Isaac
The root CLI binary only imports from bundle/, cmd/, experimental/,
internal/, libs/, so test-only trees (acceptance/, integration/), the
tools/ and bundle/internal/tf/codegen/ submodules, and _test.go files
are excluded from build/snapshot/snapshot:release source globs. This
lets Task's checksum cache skip rebuilds when only tests change.

Task v3 uses `- exclude: <pattern>` for glob exclusion (the `!` prefix
from doublestar isn't honored by Task's source matching).

Also remove the _build-yamlfmt task — nothing referenced it, and
acceptance tests build yamlfmt directly via BuildYamlfmt with exeSuffix
for Windows.

Co-authored-by: Isaac
Introduce `generate` as an aggregator that runs each generator in
sequence:
- `generate:commands` (universe + genkit; was top-level `generate`)
- `generate:schema` (was `schema`; now has tight sources, no longer
  gated by `git diff` against merge-base)
- `generate:schema-docs` (was `schema:for-docs`)
- `generate:docs` (was `docs`)
- `generate:validation` (now has tight sources)
- `generate:direct` (unchanged)
- `generate:openapi-json` (was `codegen:openapi-json`)

Each reflection-based generator (schema, schema-docs, validation, docs)
lists its Go sources explicitly and includes `go.mod`/`go.sum`, so the
aggregator picks up SDK version bumps from `generate:commands`
automatically. Test files are excluded from the source globs.

Trim `tools/post-generate.sh` to only post-process genkit output
(consistency check, tagging.py relocation, whitespace fix). The former
`make schema`/`make schema-for-docs`/`make generate-validation`/
`make -C python codegen` calls are gone — those run as standalone tasks
in the aggregator now.

Update `.github/workflows/push.yml`, `AGENTS.md`,
`.agent/rules/auto-generated-files.md`,
`.agent/skills/pr-checklist/SKILL.md`, `bundle/docsgen/README.md`, and
schema test comments to reference the new task names.

Also add `.databricks/` (snapshot binary output) to `.gitignore`.

Co-authored-by: Isaac
…efault

- Move refschema regeneration into a standalone generate:refschema task
  with a tight sources list so the cache tracks what actually affects
  `bundle debug refschema` (cmd, dresources adapters, libs/structs, go.mod).
- generate:commands now only runs genkit + post-process.
- Aggregator runs generate:refschema between commands and the other
  generators so fresh out.fields.txt feeds generate:direct.
- default now invokes the full generate aggregator instead of just
  generate:schema + generate:validation.
- Fix acceptance/install_terraform.py to skip doc-comment lines when
  reading ProviderVersion (regressed by the recent doc comment added to
  bundle/internal/tf/codegen/schema/version.go).

Co-authored-by: Isaac
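
The install_terraform.py fix in the commit above (skip doc-comment lines when reading ProviderVersion) might look roughly like the sketch below. It assumes the constant is written as `ProviderVersion = "..."` on a code line. The regex, the function name `read_provider_version`, and the sample file contents (including the version numbers) are illustrative assumptions, not the actual script.

```python
import re

# Hypothetical pattern: matches `ProviderVersion = "x.y.z"` on a code line.
VERSION_RE = re.compile(r'ProviderVersion\s*=\s*"([^"]+)"')


def read_provider_version(go_source):
    """Return the ProviderVersion value, ignoring // comment lines."""
    for line in go_source.splitlines():
        if line.lstrip().startswith("//"):
            continue  # a doc comment may mention ProviderVersion too
        m = VERSION_RE.search(line)
        if m:
            return m.group(1)
    raise ValueError("ProviderVersion not found")


# Made-up file contents for illustration only.
sample = '''package schema

// ProviderVersion = "0.0.0" is what older docs showed; see below.
const ProviderVersion = "1.2.3"
'''
version = read_provider_version(sample)
```

Without the comment check, the first regex match would come from the doc comment and the wrong version would be returned.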
Convention change: the plain task name is the full variant. Quick /
incremental variants carry a -q suffix on the top-level namespace.

- fmt:full -> fmt, fmt:go:full -> fmt:go (full is the default)
- fmt (old incremental) -> fmt-q; fmt:go (old incremental) -> fmt-q:go
- lint:full removed; lint is now the aggregator across all Go modules
- lint (old incremental root+fix) -> lint-q:go:root; lint-q alias added
- lint:go aggregator runs modules sequentially (concurrent golangci-lint
  invocations have shown unreliable behavior)
- default task now runs the -q variants for a quick dev loop
- new `all` task runs the full suite (checks, fmt, lint, test, generate,
  python:lint, python:test) — replaces old default semantics

Close the gap where CI only linted the root module: CI now runs
`./task lint` which covers root + tools + bundle/internal/tf/codegen.

lintdiff.py gains --root-module to skip paths under nested go.mod files;
needed for `golangci-lint run` (typechecks) but not for `fmt` (filesystem
walk), so only lint-q:go:root opts in.

Co-authored-by: Isaac
Drop the --root-module flag; infer from the wrapped subcommand instead.
`golangci-lint run` typechecks and needs the filter; `golangci-lint fmt`
walks the filesystem and wants to see changed files across modules.
Callers no longer need to know which mode applies.

Co-authored-by: Isaac
generate:commands is expensive (bazel/genkit build) and requires a
universe checkout. Its outputs are committed, so a fresh worktree at any
commit already has them. Run `./task generate` explicitly when codegen
inputs change.

Co-authored-by: Isaac
- Use {{.GO_TOOL}} everywhere (redefine with {{.ROOT_DIR}} so tasks with
  `dir:` can reference it).
- Drop dead `{{.X | default ...}}` fallbacks in task-local vars; the
  top-level vars are always set and pass through via `deps:` + `vars:`.
- Serialize `checks` (tidy/ws/links) and `test:update-all` — they were
  racing on the same files via parallel `deps:`.
- Add `deps: [tidy]` to `snapshot` and `snapshot:release` (matches
  `build`).
- Add `desc:` to `generate:direct-apitypes` and
  `generate:direct-resources`.
- Expand `lint-q:go:root` desc to note it skips tools/ and codegen/.
- Drop `sources:` from `ws` (script is fast; no point caching).
- Fix code-generation section comment to list all reflection-based
  generators (refschema, schema-docs missing).

Co-authored-by: Isaac
- `lint:go:root` and `lint-q:go:root` now exclude tools/** and
  bundle/internal/tf/codegen/** from sources. golangci-lint `./...` stops
  at module boundaries, so changes there never affect root-module lint
  output — no reason to invalidate the cache.

- Split `tidy` by module, matching the `lint:go` layout:
  - `tidy` — aggregator across all three modules
  - `tidy:root`  — root (previously `tidy`)
  - `tidy:tools` — tools/
  - `tidy:codegen` — bundle/internal/tf/codegen

  `build`, `snapshot`, `snapshot:release` switch to `deps: [tidy:root]`
  (they only build the root module); `checks` still calls the
  aggregator.

Co-authored-by: Isaac
Mirrors main's e24ae4e (Remove Slow tests #5032): drop test:slow,
test:slow-unit, test:slow-acc, and the `-short` flag from test:unit
and test:acc. Slow tests support in the acceptance runner was removed
because the split between slow and fast tests adds complexity and
makes it possible to miss failures.

Co-authored-by: Isaac
Switch lint:go and tidy aggregators from sequential cmds: to parallel
deps:. Each lint:go module sets its own TMPDIR under .tmp/ so concurrent
golangci-lint invocations — both siblings here and sibling worktrees
running lint at the same time — don't serialize on the shared /tmp lock.
The TMPDIR path lives under the repo (but not at ROOT_DIR itself, which
would trip Go's "go.mod in os.TempDir" check). Add /.tmp/ to .gitignore.

Co-authored-by: Isaac
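
A minimal sketch of the per-module TMPDIR idea from the commit above, written in Python for illustration. The helper name `lint_env` and the `.tmp/<module>` layout are assumptions; the commit only states that each module gets its own TMPDIR under `.tmp/`.

```python
import os
import tempfile
from pathlib import Path


def lint_env(root, module, base_env=None):
    """Return an environment whose TMPDIR is private to one module."""
    tmp = Path(root) / ".tmp" / module  # under the repo, not the repo root itself
    tmp.mkdir(parents=True, exist_ok=True)
    env = dict(os.environ if base_env is None else base_env)
    env["TMPDIR"] = str(tmp)
    return env


# Demo with a throwaway directory standing in for a worktree root.
with tempfile.TemporaryDirectory() as worktree:
    env_root = lint_env(worktree, "root", base_env={})
    env_tools = lint_env(worktree, "tools", base_env={})
    distinct = env_root["TMPDIR"] != env_tools["TMPDIR"]
    created = (
        Path(env_root["TMPDIR"]).is_dir() and Path(env_tools["TMPDIR"]).is_dir()
    )
```

Such an environment could be passed to `subprocess.run(..., env=env)` so that two concurrent linter processes never share a temp directory.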
Rename generate:direct-{apitypes,resources,clean} to
generate:direct:{apitypes,resources,clean} so the direct engine
subtasks follow the same colon-namespaced convention as lint:go:root,
tidy:root, etc.

Also broaden generate:refschema sources to all of bundle/ (excluding
_test.go) plus go.mod / go.sum — the command reflects over types
reachable from bundle/, so narrowing to a handful of packages risks
missing changes that affect the schema.

Co-authored-by: Isaac
…nerate.sh

Renamed because the task does more than generate CLI commands: it also emits
the tagging workflow + tagging.py, bumps the SDK version, and post-processes
the output. Moved the steps from tools/post-generate.sh (previously invoked
via .codegen.json post_generate hook) inline into the task so ./task generate
is self-contained and the tree is left clean afterwards. Also dropped
generate:direct:clean which was out of place.

Co-authored-by: Isaac
The tasks target the python/databricks-bundles package (pydabs), not
Python tooling in general (fmt:python stays, since that one does apply
repo-wide). Renaming avoids confusion and matches how the project
refers to the package elsewhere.

Co-authored-by: Isaac
The old `test:unit` only ran `go test ./...` in the root module; tests in
tools/ and bundle/internal/tf/codegen weren't run. Split into per-module
tasks plus an aggregator that concatenates their JSON outputs.

Also makes tools/testmask tolerate non-string `sources:` entries
(e.g. `- exclude: tools/**`), which previously made yaml.Unmarshal fail
once those excludes were introduced.

Co-authored-by: Isaac
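
The mixed `sources:` entries mentioned in the commit above can be handled by accepting both shapes explicitly. testmask itself is a Go tool; the Python sketch below only illustrates the idea, and the function name `split_sources` is hypothetical.

```python
def split_sources(entries):
    """Split a parsed `sources:` list into include and exclude globs.

    Entries are either plain strings or one-key mappings such as
    {"exclude": "tools/**"}; anything else is rejected.
    """
    includes, excludes = [], []
    for entry in entries:
        if isinstance(entry, str):
            includes.append(entry)
        elif isinstance(entry, dict) and set(entry) == {"exclude"}:
            excludes.append(entry["exclude"])
        else:
            raise ValueError(f"unsupported sources entry: {entry!r}")
    return includes, excludes


# Shape a YAML parser would produce for:
#   sources:
#     - "**/*.go"
#     - go.mod
#     - exclude: tools/**
parsed = ["**/*.go", "go.mod", {"exclude": "tools/**"}]
includes, excludes = split_sources(parsed)
```

Decoding every entry as a string is what breaks once the first `exclude:` mapping appears, so the type check has to happen per entry.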
Previously `cat … > test-output-unit.json` (and the top-level `test`
merge) ran every invocation even when all sub-tasks were cache-hit.
Declaring the inputs as `sources:` and the merged file as `generates:`
lets Task skip the cat when nothing upstream changed.

Co-authored-by: Isaac
Consistent with other colon-namespaced tasks (test:unit:*, test:acc).
testmask now derives the Taskfile task name from the CI output name by
replacing "-" with ":", so the list of targets no longer needs both
forms — a single slice suffices.

CI target identifiers (job names, cache keys) remain dash-separated.

Co-authored-by: Isaac
Rename to integration-short, dbr-integration, dbr-test.
These are single-task variants without a parent aggregator, so dash
suffixes fit better than colon namespacing.

Co-authored-by: Isaac
Redundant — the `deps:` structure already makes parallel execution
obvious to anyone reading the task definition, and users running
`./task --list` don't need that implementation detail in the summary.

Co-authored-by: Isaac
Previous comments claimed generate:genkit bumps the SDK version in
go.mod/go.sum; it doesn't. Genkit regenerates CLI command stubs from
.codegen/_openapi_sha and refreshes .gitattributes, but SDK bumps are
a manual `go get` step. TestConsistentDatabricksSdkVersion is what
keeps the two in sync.

Co-authored-by: Isaac
denik added 7 commits April 21, 2026 14:08
Replace every `:` with `-` in task names, so every task is a valid
target name for tools that don't allow `:` — primarily `make`, where
`make test:unit` parses as target `test` with prerequisite `unit`
rather than as a single target `test:unit`.

With dashes everywhere:
  * `make X` and `./task X` invoke the same task for every X, making
    the Makefile wrapper a true drop-in for former `make` users.
  * testmask no longer needs the CI-name ↔ Taskfile-name translation:
    the CI job output names are the task names.

CI job identifiers (job names, cache-key strings) are unchanged; they
were already dash-separated.

Co-authored-by: Isaac
The previous `%: @./task "$@"` pattern rule didn't trigger for phony
targets in GNU Make 3.81 (Apple's default); `make test-unit` printed
"Nothing to be done". `.DEFAULT` is the built-in hook for unmatched
targets and fires reliably.

Co-authored-by: Isaac
Mirrors `default` but swaps in the non-incremental fmt/lint variants.
Also tightens the `all:` description to mention regeneration scope
(skips genkit) and test updates.

Co-authored-by: Isaac
Drop the leading `-` on test-update and test-update-templates cmds.
The dash tells Taskfile to ignore a non-zero exit, which masked broken
acceptance tests; we want these to fail the dev loop.

Co-authored-by: Isaac
Removing as unused. out.test.toml is regenerated by the normal
test-update flow.

Co-authored-by: Isaac
Replace the broad `**/*.go + acceptance/**` source globs with the
same curated list the `build` task uses (acceptance_test.go builds
the CLI in-process via BuildCLI), plus acceptance/** for the test
harness and fixtures, plus go.mod/go.sum.

Two separate source lists because out.* files play different roles:
- test-acc: out.* are golden inputs. A change to a committed out.*
  must re-run the test to confirm it still matches.
- test-update*: out.* are outputs. The task rewrites them, so
  keeping them in sources would change the checksum each run and
  force re-execution on every invocation.

The latter uses a shared &ACC_SOURCES_UPDATE YAML anchor across
test-update, test-update-templates, and test-update-aws.

Co-authored-by: Isaac
@@ -1,3 +1,3 @@
---
description: Rules for how to deal with auto-generated files
globs:
Contributor

Let's add a task that parses Taskfile.yml to auto-gen this?

Contributor Author

this could be a follow up

Comment thread Taskfile.yml
generates:
- test-output.json
cmds:
- cat test-output-unit.json test-output-acc.json > test-output.json
Contributor

cat-ing two JSON files doesn't create valid JSON. Better to use jq here (ditto for the others)

Contributor Author

these are json-lines files
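
The distinction in this thread can be checked with a short Python sketch. The records below are made up for illustration; the point is only that concatenating two JSON Lines streams yields a valid JSON Lines stream, even though the result is not a single JSON document.

```python
import json

# Two hypothetical JSON Lines streams, one record per line.
unit = '{"Action":"pass","Test":"TestA"}\n{"Action":"fail","Test":"TestB"}\n'
acc = '{"Action":"pass","Test":"TestC"}\n'

merged = unit + acc  # what `cat a.json b.json > out.json` produces

# Each line parses on its own, so the merged stream is valid JSON Lines.
events = [json.loads(line) for line in merged.splitlines() if line.strip()]

# As a single JSON document the merged text is invalid.
try:
    json.loads(merged)
    whole_document_ok = True
except json.JSONDecodeError:
    whole_document_ok = False
```

A consumer that reads the merged file line by line is unaffected by the concatenation; only a consumer that expects one top-level JSON value would fail.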

Contributor

should we use .jsonl then?

Contributor Author

we could rename, but that's out of scope here (and might need updating other places, like the deco repo)

Comment thread Taskfile.yml Outdated
sed -i 's|tagging.py|internal/genkit/tagging.py|g' .github/workflows/tagging.yml
fi
- "{{.GO_TOOL}} yamlfmt .github/workflows/tagging.yml"
- ./tools/validate_whitespace.py --fix
Contributor

can we reference ws-fix task?

Contributor Author

yes, 4d19c6b

denik added 2 commits April 21, 2026 15:39
The test failed on shallow CI clones (default fetch-depth: 1) because
HEAD~2 doesn't resolve. The well-known empty tree SHA
(4b825dc) is always available to git
regardless of clone depth, and diffing HEAD against it produces the
full file set — still exercising the non-empty result path.

Co-authored-by: Isaac
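
The empty tree SHA referenced in the commit above is not a magic constant: in a SHA-1 repository, git hashes an object as the header `<type> <size>\0` followed by the body, and the empty tree has no body. A short Python check, given as an illustration rather than as part of the test in question:

```python
import hashlib

# Git object id = SHA-1 over "<type> <size>\0" + body.
# The empty tree has a zero-length body, so only the header is hashed.
empty_tree = hashlib.sha1(b"tree 0\0").hexdigest()
short = empty_tree[:7]  # the abbreviated form quoted in the commit message
```

Because the id is derived from content alone, git can resolve it at any clone depth, which is why `git diff --name-only <empty-tree> HEAD` lists every tracked file even in a shallow clone.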
Addresses review feedback from @janniklasrose — dedupe the direct
validate_whitespace.py --fix call.

Co-authored-by: Isaac

Drops the `git rev-parse --show-toplevel` subshell in favor of
`dirname "$0"`, which is simpler and works even outside a git repo
(e.g. in archive extractions or release artifacts).

Co-authored-by: Isaac
@denik requested review from pietern and simonfaltum April 21, 2026 14:44
denik added 3 commits April 21, 2026 16:46
The system `task` binary in some PATHs is broken / missing; the
repo's wrapper at ./task always works. Update AGENTS.md / CLAUDE.md,
.agent/rules/auto-generated-files.md, bundle/docsgen/README.md, and
the comment block in bundle/internal/schema/main_test.go.

Co-authored-by: Isaac
Why: the minimal configs in tools/ and bundle/internal/tf/codegen/
existed only to opt nested modules out of root's ruleguard check
(whose rules path is cwd-relative and unresolvable from those dirs).
Disabling gocritic via --disable is equivalent and removes the
duplicated enable lists.

Co-authored-by: Isaac