fix: separate execution modes and fix credential wiring (#59)
Conversation
Both credential paths wrote K8s Secret keys that did not match the env var names config.py reads via envFrom. Keycloak wrote "username"/"password"; interactive wrote "connect-api-key"/"pm-token"/etc. Rename all keys to VIP_TEST_USERNAME, VIP_TEST_PASSWORD, VIP_CONNECT_API_KEY, VIP_WORKBENCH_API_KEY, VIP_PM_TOKEN so the Secret mounts correctly as environment variables.
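A minimal sketch of the renamed Secret data, assuming the standard Kubernetes convention that Secret `data` values are base64-encoded and that `envFrom` exposes each key verbatim as an environment variable (the literal values here are illustrative, not from the repo):

```python
import base64

# Illustrative only: after the rename, the Secret data keys equal the env
# var names config.py reads once Kubernetes mounts the Secret via envFrom.
secret_data = {
    "VIP_TEST_USERNAME": base64.b64encode(b"test-user").decode(),
    "VIP_TEST_PASSWORD": base64.b64encode(b"s3cret").decode(),
}
```

Because the key names are the env var names, no `env`/`valueFrom` indirection is needed in the Job spec.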
The envFrom mount was conditionally skipped when interactive_auth=True, meaning credentials written by the interactive path were never exposed to the Job container. Both credential paths write to the same Secret, so always mount it. Remove the interactive_auth parameter from create_job.
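A hypothetical slice of the container spec after this fix, showing the shape of an unconditional `envFrom` mount (the `build_container_spec` helper is invented for illustration; only the `vip-test-credentials` Secret name comes from the PR):

```python
def build_container_spec(image: str) -> dict:
    """Sketch: the credentials Secret is mounted unconditionally; there is
    no interactive_auth branch deciding whether envFrom is present."""
    return {
        "name": "vip-verify",
        "image": image,
        # Every key in the Secret becomes an env var in the Job container.
        "envFrom": [{"secretRef": {"name": "vip-test-credentials"}}],
    }
```

Since both credential paths write to the same Secret, mounting it always is safe regardless of which path provisioned it.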
Introduce Mode(str, Enum) with local, k8s_job, and config_only values and a VIPConfig.validate_for_mode() method that raises ValueError when required fields are missing for the requested mode. k8s_job and config_only modes require cluster configuration.
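A minimal sketch of the described shape, with a hypothetical stripped-down `VIPConfig` (the real class in src/vip/config.py carries far more fields; only the enum values and the raise-on-missing-cluster behavior come from the PR):

```python
from enum import Enum

class Mode(str, Enum):
    local = "local"
    k8s_job = "k8s_job"
    config_only = "config_only"

class VIPConfig:
    """Hypothetical minimal stand-in for the real config class."""
    def __init__(self, cluster_configured: bool):
        self.cluster_configured = cluster_configured

    def validate_for_mode(self, mode: Mode) -> None:
        # k8s_job and config_only both need cluster configuration;
        # local mode has no cluster requirement.
        if mode in (Mode.k8s_job, Mode.config_only) and not self.cluster_configured:
            raise ValueError(f"mode={mode.value!r} requires cluster configuration")
```

Subclassing `str` keeps the enum values directly comparable with plain strings from CLI flags or TOML.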
Break the monolithic run_verify into _resolve_mode, _phase_generate_config, _phase_provision_credentials, and _phase_run_tests. run_verify becomes a thin orchestrator that resolves the mode, validates config, and delegates to each phase. Update CLI help text to document execution and credential modes.
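The phase split can be sketched as follows, with hypothetical stub phases standing in for the real implementations in src/vip/cli.py (the mode-resolution rule shown is an assumption; only the function names and the thin-orchestrator shape come from the PR):

```python
def _resolve_mode(cli_flags: dict) -> str:
    # Hypothetical resolution: an explicit flag wins, otherwise default to local.
    if cli_flags.get("config_only"):
        return "config_only"
    return "k8s_job" if cli_flags.get("k8s") else "local"

# Stub phases that record their invocation order.
def _phase_generate_config(log): log.append("generate_config")
def _phase_provision_credentials(log): log.append("provision_credentials")
def _phase_run_tests(log): log.append("run_tests")

def run_verify(cli_flags: dict) -> list[str]:
    """Thin orchestrator: resolve the mode, then delegate to each phase."""
    log = [f"mode={_resolve_mode(cli_flags)}"]
    _phase_generate_config(log)
    _phase_provision_credentials(log)
    _phase_run_tests(log)
    return log
```

Keeping each phase as a free function makes the selftests able to exercise one phase at a time without driving the whole CLI.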
Add mode/credential matrix to vip.toml.example explaining the three execution modes and two credential approaches. Update cluster section comments to clarify when it is required. Update key source files table to reflect Mode enum in config.py.
Pull request overview
This PR refactors vip verify into explicit execution phases/modes and fixes Kubernetes Secret key wiring so credentials written by both Keycloak and interactive-auth paths match the environment variables consumed by config.py.
Changes:
- Introduces `Mode` and per-mode config validation, and restructures `run_verify` into discrete phases.
- Fixes credential Secret key names and ensures the credentials Secret is always mounted into K8s Jobs via `envFrom`.
- Updates example config/docs and adds selftests covering job spec and credential key behavior.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| vip.toml.example | Documents execution/credential modes and clarifies cluster section applicability. |
| src/vip/verify/job.py | Always mounts vip-test-credentials via envFrom; removes interactive_auth parameter. |
| src/vip/verify/credentials.py | Renames Secret keys to VIP_* names to match config.py env var lookups. |
| src/vip/config.py | Adds Mode enum and VIPConfig.validate_for_mode(). |
| src/vip/cli.py | Splits run_verify into mode resolution + phases and updates CLI help text. |
| selftests/test_job.py | Adds tests asserting job signature change and envFrom Secret mount behavior. |
| selftests/test_credentials.py | Adds tests asserting Secret key names match config.py env vars. |
| selftests/test_config.py | Adds tests for Mode values and validate_for_mode() behavior. |
| AGENTS.md | Updates repo doc to mention Mode and per-mode validation in config.py. |
src/vip/cli.py
Outdated
```python
# Connect to cluster for modes that need it
if config.cluster.is_configured:
```
In local mode, run_verify still calls _connect_cluster whenever config.cluster.is_configured is true, even though local mode is documented/validated as not requiring cluster configuration. This can make vip verify --local fail if the [cluster] section is partially filled (e.g., provider+name but missing region/resource_group) because validate_cluster_config will raise. Gate the cluster connection by mode (connect only for k8s_job/config_only), or validate/ignore cluster config entirely when mode==Mode.local.
```diff
-# Connect to cluster for modes that need it
-if config.cluster.is_configured:
+# Connect to cluster only for non-local modes that need it
+if mode != Mode.local and config.cluster.is_configured:
```
```python
        k8s_job: cluster.is_configured (provider + name)
        config_only: cluster.is_configured (need to reach the API server)
        """
        if mode in (Mode.k8s_job, Mode.config_only):
            if not self.cluster.is_configured:
                raise ValueError(
                    f"mode={mode.value!r} requires cluster configuration "
                    "(set [cluster] provider and name in vip.toml or via env vars "
                    "VIP_CLUSTER_PROVIDER and VIP_CLUSTER_NAME)"
                )
```
validate_for_mode currently only checks cluster.is_configured (provider+name), but _connect_cluster/validate_cluster_config also requires provider-specific fields (AWS: region, Azure: resource_group). As a result, config can pass validate_for_mode but still fail immediately when connecting. Consider either calling validate_cluster_config here (no I/O) or extending this validation to include provider-specific required fields so the error is raised consistently at this stage.
```diff
-        k8s_job: cluster.is_configured (provider + name)
-        config_only: cluster.is_configured (need to reach the API server)
-        """
-        if mode in (Mode.k8s_job, Mode.config_only):
-            if not self.cluster.is_configured:
-                raise ValueError(
-                    f"mode={mode.value!r} requires cluster configuration "
-                    "(set [cluster] provider and name in vip.toml or via env vars "
-                    "VIP_CLUSTER_PROVIDER and VIP_CLUSTER_NAME)"
-                )
+        k8s_job: cluster.is_configured (provider + name) plus provider-specific
+            fields (e.g. AWS: region; Azure: resource_group)
+        config_only: same cluster requirements as k8s_job (need to reach the API server)
+        """
+        if mode in (Mode.k8s_job, Mode.config_only):
+            cluster = self.cluster
+            if not cluster.is_configured:
+                raise ValueError(
+                    f"mode={mode.value!r} requires cluster configuration "
+                    "(set [cluster] provider and name in vip.toml or via env vars "
+                    "VIP_CLUSTER_PROVIDER and VIP_CLUSTER_NAME)"
+                )
+            # Validate provider-specific required fields so that configuration
+            # errors are caught before attempting any I/O/cluster connection.
+            provider = getattr(cluster, "provider", None)
+            if provider == "aws":
+                region = getattr(cluster, "region", None)
+                if not region:
+                    raise ValueError(
+                        f"mode={mode.value!r} with AWS cluster requires a region "
+                        "(set [cluster] region in vip.toml or via env var "
+                        "VIP_CLUSTER_REGION)"
+                    )
+            elif provider == "azure":
+                resource_group = getattr(cluster, "resource_group", None)
+                if not resource_group:
+                    raise ValueError(
+                        f"mode={mode.value!r} with Azure cluster requires a resource "
+                        "group (set [cluster] resource_group in vip.toml or via env "
+                        "var VIP_CLUSTER_RESOURCE_GROUP)"
+                    )
```
Intentionally keeping validate_for_mode minimal here. Provider-specific field validation (AWS region, Azure resource_group) is already handled by _connect_cluster / validate_cluster_config which runs immediately after validation. Duplicating that logic in validate_for_mode would create two places to maintain provider-specific requirements. The error messages from the cluster connection layer are clear enough — if region is missing, the user gets told exactly what's missing at the point where it matters.
```diff
     },
     "type": "Opaque",
     "data": {
-        "username": base64.b64encode(username.encode()).decode(),
-        "password": base64.b64encode(password.encode()).decode(),
+        "VIP_TEST_USERNAME": base64.b64encode(username.encode()).decode(),
+        "VIP_TEST_PASSWORD": base64.b64encode(password.encode()).decode(),
     },
```
_create_credentials_secret now writes VIP_TEST_USERNAME/VIP_TEST_PASSWORD, but ensure_keycloak_test_user short-circuits when the vip-test-credentials Secret already exists. If a user previously created the Secret with the old keys (username/password), upgrading will keep the incompatible Secret and Keycloak provisioning will continue to skip, leaving config.py unable to read credentials. Consider detecting the old key names and recreating/patching the Secret, or at least updating the skip message to instruct running vip verify cleanup to regenerate it.
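The detection the reviewer suggests could look like the following sketch (the helper name and dict-based interface are hypothetical; the old key names `username`/`password` and the regeneration advice come from this thread):

```python
# Keys written by versions of _create_credentials_secret before the rename.
OLD_KEYS = {"username", "password"}

def uses_old_key_format(secret_data: dict) -> bool:
    """Return True if an existing vip-test-credentials Secret still uses the
    pre-rename key names, meaning it should be regenerated (e.g. by running
    `vip verify cleanup`) rather than silently skipped."""
    return bool(OLD_KEYS & set(secret_data))
```

Checking for the old keys before short-circuiting lets `ensure_keycloak_test_user` either recreate the Secret or at least emit an actionable warning.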
…keys

Skip cluster connection in local mode since it's unnecessary. Detect old-format Secret key names (`username`/`password`) and warn users to run `vip verify cleanup` to regenerate.
Summary
- Both credential paths wrote K8s Secret keys that did not match the env var names `config.py` reads. Renamed all keys to `VIP_TEST_USERNAME`, `VIP_TEST_PASSWORD`, `VIP_CONNECT_API_KEY`, `VIP_WORKBENCH_API_KEY`, `VIP_PM_TOKEN`.
- `create_job` conditionally skipped mounting the credentials Secret when `interactive_auth=True`, making interactive credentials invisible to the Job. Now always mounts the Secret. Removed the `interactive_auth` parameter.
- Added `Mode(str, Enum)` with `local`, `k8s_job`, `config_only` values and `VIPConfig.validate_for_mode()` that raises on missing required fields per mode.
- Split `run_verify` into `_resolve_mode`, `_phase_generate_config`, `_phase_provision_credentials`, `_phase_run_tests`, with `run_verify` as a thin orchestrator.
- Documented modes in `vip.toml.example` and updated CLI help text.

Test plan

- Selftests pass (`uv run pytest selftests/ -v`)
- Checks pass (`just check`)
- Run `vip verify --help` and confirm mode/credential docs appear
- Run `vip verify <target> --config-only` against a real cluster
- Run `vip verify <target>` (K8s Job mode) and confirm credentials mount correctly