acceptance: route bundle test clusters through the shared instance pool by renaudhartert-db · Pull Request #5461 · databricks/cli

renaudhartert-db · 2026-06-07T13:16:53Z

The cli-isolated integration tests launch about 30 ephemeral clusters per run, and each one cold-pulls the multi-GB Databricks runtime image from the internet through the NAT gateway in the deco AWS test account. That NAT egress is the main driver of an opex.eng.deco budget overspend (ES-1912931): about 99.6% of the traffic is inbound, roughly 3 GB per node, which lines up with one runtime image per cold node.

Most cluster-launching bundle acceptance templates set node_type_id directly, which bypasses the shared warm instance pool. The pool already exists and is already exported to CI as TEST_INSTANCE_POOL_ID, and two templates (spark-jar-task and integration_whl/base) already use it. This change applies the same one-line pattern to the remaining cluster-launching templates, so their nodes come from the pool and reuse a cached runtime image instead of re-pulling it through NAT on every launch.

When instance_pool_id is set the cluster takes its node type from the pool, so the bundle drops node_type_id and driver_node_type_id; that accounts for the golden updates here. The affected acceptance tests were regenerated and pass locally.
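As a rough illustration of the pattern described above, a cluster-launching template might look like the fragment below. Only the `instance_pool_id: $TEST_INSTANCE_POOL_ID` line is taken from this PR; the resource name, the `$UNIQUE_NAME`, `$DEFAULT_SPARK_VERSION` and `$NODE_TYPE_ID` placeholders, and the worker count are assumptions made for the sketch and may not match any actual template.

```yaml
# Hypothetical bundle template fragment; names and values are illustrative.
resources:
  clusters:
    test_cluster:                               # hypothetical resource name
      cluster_name: test-cluster-$UNIQUE_NAME   # assumed placeholder
      spark_version: $DEFAULT_SPARK_VERSION     # assumed placeholder
      node_type_id: $NODE_TYPE_ID               # assumed placeholder; set directly before this change
      # The one-line addition from this PR: take nodes from the shared warm pool.
      instance_pool_id: $TEST_INSTANCE_POOL_ID
      num_workers: 1                            # assumed value
```

With the pool set, the node type comes from the pool, so node_type_id and driver_node_type_id no longer appear in the resolved configuration, which is the difference the regenerated goldens record.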

One autoscale template (resources/clusters/deploy/update-and-resize-autoscale) is intentionally left out for now because its golden has environment-dependent fields that did not regenerate cleanly outside CI; it can be added in a follow-up. A separate optional follow-up in eng-dev-ecosystem can preload more Spark versions on the pool to warm first-use, but it is not required for the reuse win.

The cli-isolated integration tests launch ~30 ephemeral clusters per run, each cold-pulling the multi-GB DBR runtime image over the NAT gateway in the deco AWS test account. That NAT egress is the bulk of an opex.eng.deco budget overspend (ES-1912931); the traffic is ~99.6% inbound download, ~3 GB per node.

These bundle acceptance templates set node_type_id directly, bypassing the existing warm instance pool that is already exported to CI as TEST_INSTANCE_POOL_ID and already used by spark-jar-task and integration_whl/base. Routing them through the pool lets nodes reuse a cached runtime image instead of re-pulling it through NAT on every launch.

Adds instance_pool_id: $TEST_INSTANCE_POOL_ID to the cluster-launching templates, matching the existing pattern, and regenerates the affected acceptance goldens.

Co-authored-by: Isaac

github-actions · 2026-06-07T13:17:24Z

Waiting for approval

Based on git history, these people are best suited to review:

@andrewnester -- recent work in acceptance/bundle/resources/clusters/lifecycle-started-terraform-error/, acceptance/bundle/resources/clusters/lifecycle-started-toggle/, acceptance/bundle/resources/clusters/lifecycle-started/

Eligible reviewers: @anton-107, @denik, @janniklasrose, @lennartkats-db, @pietern, @shreyas-goenka

Suggestions based on git history. See OWNERS for ownership rules.

eng-dev-ecosystem-bot · 2026-06-07T13:48:06Z

Commit: 354063e

Run: 27093685326

	Env	🪲BUG	🟨KNOWN	💚RECOVERED	🙈SKIP	✅pass	🙈skip	Time
🪲	aws linux	2	7		15	259	923	9:57
🪲	aws windows	2	7		15	261	921	12:20
🪲	aws-ucws linux	2	1	6	15	355	837	6:55
🪲	aws-ucws windows	2	1	6	15	357	835	11:01
🪲	azure linux	2	1		17	262	921	9:22
🪲	azure windows	2	1		17	264	919	10:28
🪲	azure-ucws linux	2	1		17	360	833	18:17
🪲	azure-ucws windows	2	1		17	362	831	11:20
🪲	gcp linux	2	1		17	258	924	9:49
🪲	gcp windows	2	1		17	260	922	11:30

24 interesting tests: 15 SKIP, 7 KNOWN, 2 BUG

	Test Name	aws linux	aws windows	aws-ucws linux	aws-ucws windows	azure linux	azure windows	azure-ucws linux	azure-ucws windows	gcp linux	gcp windows
🟨	TestAccept	🟨K	🟨K	🟨K	🟨K	🟨K	🟨K	🟨K	🟨K	🟨K	🟨K
🙈	TestAccept/bundle/invariant/no_drift	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/permissions	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions	🟨K	🟨K	💚R	💚R	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct	🟨K	🟨K	💚R	💚R
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform	🟨K	🟨K	💚R	💚R
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions	🟨K	🟨K	💚R	💚R	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct	🟨K	🟨K	💚R	💚R
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform	🟨K	🟨K	💚R	💚R
🙈	TestAccept/bundle/resources/postgres_branches/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/recreate	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/replace_existing	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/update_protected	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/without_branch_id	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_endpoints/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_endpoints/recreate	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_projects/update_display_name	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/synced_database_tables/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/vector_search_endpoints/drift/recreated_same_name	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/vector_search_indexes/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/vector_search_indexes/grants/select	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🪲	TestAccept/bundle/run_as/job_default	🪲B	🪲B	🪲B	🪲B	🪲B	🪲B	🪲B	🪲B	🪲B	🪲B
🪲	TestAccept/bundle/run_as/job_default/DATABRICKS_BUNDLE_ENGINE=direct	🪲B	🪲B	🪲B	🪲B	🪲B	🪲B	🪲B	🪲B	🪲B	🪲B
🙈	TestAccept/ssh/connection	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S

Top 25 slowest tests (at least 2 minutes):

duration	env	testname
7:52	azure-ucws linux	TestSQLExecScalar
6:23	gcp linux	TestSecretsPutSecretStringValue
5:39	aws linux	TestSecretsPutSecretStringValue
5:38	azure linux	TestSecretsPutSecretStringValue
5:01	gcp windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:47	gcp windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
4:42	gcp linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:40	azure-ucws linux	TestSecretsPutSecretStringValue
4:05	gcp linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:36	azure-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:31	aws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:21	azure linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:14	azure windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:08	aws-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:05	aws-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:01	azure-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:58	aws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:53	aws-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:49	azure linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:46	aws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:42	azure-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:37	aws-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:35	azure windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:24	aws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:21	azure-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct

renaudhartert-db temporarily deployed to test-trigger-is June 7, 2026 13:17 — with GitHub Actions Inactive

renaudhartert-db deployed to test-trigger-is June 7, 2026 13:17 — with GitHub Actions Active

renaudhartert-db wants to merge 1 commit into main from nat-pool-routing
