chore: vm size as configurable e2e pipeline variable #7887

Draft
lilypan26 wants to merge 5 commits into main from lily/add-vm-size-e2e-pipeline-variable

Conversation

@lilypan26
Contributor

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Copilot AI review requested due to automatic review settings February 17, 2026 19:25
Contributor

Copilot AI left a comment

Pull request overview

This PR modifies the e2e test infrastructure to add support for configurable VM size and location variables, and increases timeout values to accommodate longer-running test scenarios. The changes span the e2e configuration, pipeline template, and the e2e run script.

Changes:

  • Increases cluster creation timeout from 20 to 30 minutes in e2e config
  • Doubles the e2e pipeline timeout from 90 to 180 minutes
  • Adds VM_SIZE and LOCATION as optional pipeline variables that can override defaults

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| e2e/config/config.go | Increases TestTimeoutCluster default from 20m to 30m |
| .pipelines/templates/e2e-template.yaml | Doubles timeoutInMinutes from 90 to 180 |
| .pipelines/scripts/e2e_run.sh | Adds exports for VM_SIZE→DEFAULT_VM_SKU and LOCATION variables with echo statements for debugging |

@@ -77,7 +77,7 @@ type Configuration struct {
TestGalleryNamePrefix string `env:"TEST_GALLERY_NAME_PREFIX" envDefault:"abe2etest"`
TestPreProvision bool `env:"TEST_PRE_PROVISION" envDefault:"false"`

Copilot AI Feb 17, 2026

The PR description is empty and doesn't explain:

  1. Why the timeout needs to be increased from 20m to 30m (50% increase)
  2. Why the pipeline timeout needs to be doubled from 90m to 180m
  3. What use case requires configurable VM_SIZE and LOCATION variables
  4. Whether this is related to specific test scenarios that are timing out

Without context, it's difficult to assess whether these timeout increases are addressing root causes or masking underlying performance issues. Please provide details about what drove these changes.

Suggested change
- TestPreProvision bool `env:"TEST_PRE_PROVISION" envDefault:"false"`
+ TestPreProvision bool `env:"TEST_PRE_PROVISION" envDefault:"false"`
+ // Test timeout defaults are intentionally conservative and are based on
+ // historical e2e runs in our CI environment:
+ // - TestTimeout: overall per-test timeout, capped at 35m to catch hangs.
+ // - TestTimeoutCluster: operations that create or modify clusters, which
+ // are more expensive and can take longer under load, default to 30m.
+ // - TestTimeoutVMSS: VMSS-specific operations are typically faster, so a
+ // lower 17m default is used to surface regressions sooner.
+ //
+ // These values are not intended to mask underlying performance issues:
+ // if scenarios begin to approach these limits, we should first investigate
+ // and address the root cause before increasing the defaults. They are
+ // configurable via environment variables (TEST_TIMEOUT, TEST_TIMEOUT_CLUSTER,
+ // TEST_TIMEOUT_VMSS) so that different environments (e.g. local runs vs CI)
+ // can tune timeouts without changing code.

Copilot uses AI. Check for mistakes.
pool:
name: $(E2E_POOL_NAME)
- timeoutInMinutes: 90
+ timeoutInMinutes: 180

Copilot AI Feb 17, 2026

Doubling the e2e pipeline timeout from 90 to 180 minutes is a significant increase that could mask performance issues or test inefficiencies. Before making this change, consider:

  1. Are specific tests consistently timing out? If so, those tests should be optimized or their timeout increased individually via TEST_TIMEOUT environment variable.
  2. Is this related to the VM_SIZE changes being introduced? If larger VMs are needed for specific scenarios, those should be isolated.
  3. A 180-minute timeout means failed pipelines will take 3 hours to fail, which significantly impacts CI/CD feedback loops.

Please provide justification for why this doubling is necessary and whether optimization alternatives were considered.

Suggested change
- timeoutInMinutes: 180
+ timeoutInMinutes: 90
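
The job timeout also interacts with the script's own E2E_GO_TEST_TIMEOUT, which defaults to 90m in e2e_run.sh. Below is a minimal sketch of a guard that compares the two. It assumes the value is passed to `go test -timeout` and is written in plain `<N>m` form. The function name `timeout_fits_job` is hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical guard, not part of e2e_run.sh.
# Usage: timeout_fits_job <go-test-timeout> <job-timeout-minutes>
# Returns 0 if the go test timeout is strictly below the job timeout,
# 1 if it is not, and 2 if the value is not in plain "<N>m" form
# (this sketch does not parse values such as "1h30m").
timeout_fits_job() {
  local minutes="${1%m}"
  # Reject values without a trailing "m" or with non-digit characters.
  if [ "$1" != "${minutes}m" ]; then return 2; fi
  case "$minutes" in
    ''|*[!0-9]*) return 2 ;;
  esac
  [ "$minutes" -lt "$2" ]
}

# Capture the exit status of each check without aborting under `set -e`.
fits_180=0; timeout_fits_job 90m 180   || fits_180=$?
fits_90=0;  timeout_fits_job 90m 90    || fits_90=$?
fits_odd=0; timeout_fits_job 1h30m 180 || fits_odd=$?
```

With the 90-minute job timeout and the 90m default, the guard returns 1: both limits expire together, so a hung run may be cancelled by the agent before `go test` can report its own timeout. This suggests some headroom is useful, but it does not by itself justify doubling the job timeout.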

E2E_GO_TEST_TIMEOUT="${E2E_GO_TEST_TIMEOUT:-90m}"
GALLERY_NAME="${GALLERY_NAME:-}"
SIG_GALLERY_NAME="${SIG_GALLERY_NAME:-}"
+ export DEFAULT_VM_SKU="${VM_SIZE:-}"

Copilot AI Feb 17, 2026

Variable naming inconsistency: the script reads VM_SIZE from the pipeline variables but exports it as the DEFAULT_VM_SKU environment variable, while LOCATION maps to itself. For consistency, either:

  1. The pipeline variable should be named DEFAULT_VM_SKU to match the Go config field, OR
  2. The environment variable export should match the pipeline variable name (i.e., export VM_SIZE="${VM_SIZE:-}")

The Go config in e2e/config/config.go expects the environment variable DEFAULT_VM_SKU (line 59), so option 1 maintains better naming consistency across the stack.

Suggested change
- export DEFAULT_VM_SKU="${VM_SIZE:-}"
+ DEFAULT_VM_SKU="${DEFAULT_VM_SKU:-${VM_SIZE:-}}"
+ export DEFAULT_VM_SKU="${DEFAULT_VM_SKU:-}"
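
The precedence in this suggested change can be checked in isolation. Below is a minimal sketch, assuming bash. The helper name `resolve_vm_sku` is hypothetical and not part of e2e_run.sh, and the SKU names are example values only.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of e2e_run.sh: prints the value that the
# suggested fallback chain would export.
# Precedence: DEFAULT_VM_SKU first, then VM_SIZE, then the empty string.
resolve_vm_sku() {
  printf '%s' "${DEFAULT_VM_SKU:-${VM_SIZE:-}}"
}

unset DEFAULT_VM_SKU VM_SIZE

# Neither variable set: the result is empty.
sku_unset="$(resolve_vm_sku)"

# Only the pipeline-style name is set: it is used as the fallback.
sku_from_vm_size="$(VM_SIZE=Standard_D2s_v3 resolve_vm_sku)"

# Both set: DEFAULT_VM_SKU takes priority over VM_SIZE.
sku_both="$(DEFAULT_VM_SKU=Standard_D4s_v3 VM_SIZE=Standard_D2s_v3 resolve_vm_sku)"
```

Under this chain a pipeline that already sets DEFAULT_VM_SKU keeps working, and VM_SIZE acts only as a fallback.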

GALLERY_NAME="${GALLERY_NAME:-}"
SIG_GALLERY_NAME="${SIG_GALLERY_NAME:-}"
+ export DEFAULT_VM_SKU="${VM_SIZE:-}"
+ export LOCATION="${LOCATION:-}"

Copilot AI Feb 17, 2026

Variable naming inconsistency: While LOCATION is exported to match an environment variable name used elsewhere in the codebase, there's a mismatch with the Go config expectation. The e2e/config/config.go file expects E2E_LOCATION (line 56), not LOCATION. This means the exported LOCATION variable won't be picked up by the e2e test framework.

Either:

  1. Change line 44 to: export E2E_LOCATION="${LOCATION:-}" to properly set the variable the Go code expects, OR
  2. If the intention is to pass a different LOCATION variable for other purposes, document what it's used for since it won't affect the Go e2e config.
Suggested change
- export LOCATION="${LOCATION:-}"
+ export E2E_LOCATION="${LOCATION:-}"
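
A variant of this mapping exports the variable only when a value was actually supplied. The sketch below assumes, as the comment above states, that the Go config reads E2E_LOCATION. The helper name `map_location` is hypothetical and `westus3` is only an example value.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of e2e_run.sh.
# Exports E2E_LOCATION only when the pipeline variable LOCATION is
# non-empty, so an unset LOCATION leaves E2E_LOCATION untouched.
map_location() {
  if [ -n "${LOCATION:-}" ]; then
    export E2E_LOCATION="$LOCATION"
  fi
}

unset LOCATION E2E_LOCATION
map_location
state_when_unset="${E2E_LOCATION+set}"   # expands to "" while E2E_LOCATION is unset

LOCATION=westus3                         # example region, for illustration only
map_location
```

Whether an empty exported value would override the `envDefault` on the Go side depends on how the env-parsing library treats empty strings, which is not shown here. Exporting conditionally avoids relying on that behavior.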

@lilypan26 changed the title from "Lily/add vm size e2e pipeline variable" to "chore: vm size as configurable e2e pipeline variable" Feb 18, 2026