AGENT-1534: Decrease master memory for agent HA iso-no-registry job #80228

Open
bfournie wants to merge 1 commit into openshift:main from bfournie:agent-ha-vcpu-increase

Conversation

@bfournie

@bfournie bfournie commented Jun 8, 2026

Contributor

Decrease the master memory for the HA-dualstack test to resolve the 'soft lockup' errors. This makes the master memory the same as the HA5 configuration.

Summary by CodeRabbit

This PR increases the CPU resources allocated to master nodes in the OpenShift 5.0 agent-based installation CI test pipeline. Specifically, it adds MASTER_VCPU=15 to the DEVSCRIPTS_CONFIG environment variable for the e2e-agent-ha-dualstack-iso-no-registry-techpreview job, which is a nightly test running every 8 hours on bare metal infrastructure (equinix-ocp-metal cluster).

Practical impact: This configuration change provides additional CPU resources to master nodes during agent-based HA deployment testing with ISO provisioning, addressing reported "soft lockup" errors that were occurring during test execution. The test provisions an HA cluster configuration with dual-stack networking and uses ISO-based boot without a local registry—a resource-intensive scenario that benefits from the increased CPU allocation.

Change scope: Single modification to the OpenShift 5.0 nightly test configuration file (+1 line).

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (Indicates that this PR references a valid Jira ticket of any type.) Jun 8, 2026
@openshift-ci-robot

openshift-ci-robot commented Jun 8, 2026

Contributor

@bfournie: This pull request references AGENT-1534 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Increase the master VCPU to resolve the 'soft lockup' errors.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Contributor

Walkthrough

This pull request adds a single configuration parameter MASTER_VCPU=15 to the DEVSCRIPTS_CONFIG environment variable for the e2e-agent-ha-dualstack-iso-no-registry-techpreview CI job in the OpenShift release configuration.

Changes

HA Dualstack ISO Test Configuration

| Layer / File(s) | Summary |
| --- | --- |
| Master CPU configuration for dev-scripts: ci-operator/config/openshift/release/openshift-release-main__nightly-5.0.yaml | DEVSCRIPTS_CONFIG for the e2e-agent-ha-dualstack-iso-no-registry-techpreview job is updated to include MASTER_VCPU=15, setting the vCPU count for the provisioned master nodes during test execution. |

Possibly related PRs

  • openshift/release#80144: Both PRs target the same 5.0 dev-scripts configuration area, with this PR adding MASTER_VCPU=15 while the related PR bumps Metal3 dev-scripts master config inputs to OpenShift 5.0.

Suggested labels

lgtm, approved, rehearsals-ack

Suggested reviewers

  • dgoodwin
  • pawanpinjarkar
  • andfasano

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The PR title mentions 'Decrease master memory' but the actual change adds 'MASTER_VCPU=15', which increases vCPU allocation, not decreases memory. The description states the goal is to 'Increase the master VCPU', creating a clear mismatch. | Update the title to accurately reflect the change: 'AGENT-1534: Increase master VCPU for agent HA iso-no-registry job' (matching the PR description and actual code change). |
✅ Passed checks (14 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | The PR primarily modifies CI configuration files (YAML) to add MASTER_VCPU=15. No Ginkgo test files or dynamic test names are present in the modifications. |
| Test Structure And Quality | ✅ Passed | Custom check for Ginkgo test code quality is not applicable to this PR, which only modifies CI configuration (YAML file), not test code. |
| Microshift Test Compatibility | ✅ Passed | This PR only modifies CI configuration (YAML file), not test code. It adds MASTER_VCPU=15 to a CI job environment. No Ginkgo e2e tests are added; the check is not applicable. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | PR only modifies CI config (adds MASTER_VCPU=15 to existing job), not test code. No new Ginkgo e2e tests added, so SNO compatibility check does not apply. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | This PR modifies a CI test configuration, not deployment manifests or operator code. No topology-unaware scheduling constraints are introduced. |
| Ote Binary Stdout Contract | ✅ Passed | PR contains only YAML configuration changes to CI operator config (MASTER_VCPU setting); no code files modified, so OTE Binary Stdout Contract check does not apply. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | This PR does not add new Ginkgo e2e tests. It only modifies CI job configuration (MASTER_VCPU setting), so the IPv6/disconnected compatibility check is not applicable. |
| No-Weak-Crypto | ✅ Passed | PR modifies only a YAML CI configuration file adding MASTER_VCPU=15; no cryptographic code, weak algorithms, or custom crypto implementations are present. |
| Container-Privileges | ✅ Passed | PR modifies only a CI Operator configuration file adding MASTER_VCPU=15 environment variable; no Kubernetes container manifests or privilege escalation configurations present in scope. |
| No-Sensitive-Data-In-Logs | ✅ Passed | The PR adds only MASTER_VCPU=15 (a numeric CPU configuration) to CI job environment variables; no passwords, tokens, API keys, PII, or other sensitive data are present. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@openshift-ci openshift-ci Bot requested review from sosiouxme and wking June 8, 2026 14:46
@bfournie

bfournie commented Jun 8, 2026

Contributor Author

/pj-rehearse periodic-ci-openshift-release-main-nightly-5.0-e2e-agent-ha-dualstack-iso-no-registry-techpreview

@openshift-ci

openshift-ci Bot commented Jun 8, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bfournie
Once this PR has been reviewed and has the lgtm label, please assign petr-muller for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot

Contributor

@bfournie: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/config/openshift/release/openshift-release-main__nightly-5.0.yaml`:
- Line 2173: The file shows MASTER_VCPU=15 only in the
e2e-agent-ha-dualstack-iso-no-registry-techpreview job; inspect the sibling jobs
e2e-agent-ha5-dualstack-iso-no-registry-techpreview and
e2e-agent-ha-dualstack-conformance and either (A) confirm the 15 setting is
intentionally only for e2e-agent-ha-dualstack-iso-no-registry-techpreview and
document that decision, or (B) make them consistent by adding MASTER_VCPU=15 (or
change all three to MASTER_VCPU=8 if the two-node-fencing value is the intended
mitigation) inside those job definitions so the HA dualstack agent jobs share
the same MASTER_VCPU behavior.
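
The consistency check requested in the inline comment above can be sketched in Python. This is a minimal illustration, not the repository's tooling: it assumes DEVSCRIPTS_CONFIG is a block of newline-separated KEY=VALUE lines, the helper names (parse_devscripts_config, collect_setting, is_consistent) are hypothetical, and the IP_STACK values and the absence of MASTER_VCPU in the two sibling jobs are placeholder data rather than the real job definitions.

```python
from typing import Dict, Optional


def parse_devscripts_config(raw: str) -> Dict[str, str]:
    """Parse newline-separated KEY=VALUE lines into a dict (assumed format)."""
    settings: Dict[str, str] = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def collect_setting(jobs: Dict[str, str], key: str) -> Dict[str, Optional[str]]:
    """Return the value of `key` per job, or None where the job does not set it."""
    return {name: parse_devscripts_config(raw).get(key) for name, raw in jobs.items()}


def is_consistent(values: Dict[str, Optional[str]]) -> bool:
    """True when every job has the same value (or all leave the key unset)."""
    return len(set(values.values())) <= 1


# Placeholder DEVSCRIPTS_CONFIG strings; the real values live in the
# ci-operator config file and are NOT reproduced here.
jobs = {
    "e2e-agent-ha-dualstack-iso-no-registry-techpreview": "IP_STACK=v4v6\nMASTER_VCPU=15\n",
    "e2e-agent-ha5-dualstack-iso-no-registry-techpreview": "IP_STACK=v4v6\n",
    "e2e-agent-ha-dualstack-conformance": "IP_STACK=v4v6\n",
}

values = collect_setting(jobs, "MASTER_VCPU")
for name, value in values.items():
    print(f"{name}: MASTER_VCPU={value}")
print("consistent:", is_consistent(values))  # prints "consistent: False" for this placeholder data
```

With the placeholder data, only the first job sets MASTER_VCPU, so the check reports an inconsistency. Whether that difference is intentional is the question option (A) in the comment asks the author to answer.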

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 495c91c1-32d3-4667-803f-12f3210d0387

📥 Commits

Reviewing files that changed from the base of the PR and between db8c4ac and cd32192.

📒 Files selected for processing (1)
  • ci-operator/config/openshift/release/openshift-release-main__nightly-5.0.yaml

Comment thread ci-operator/config/openshift/release/openshift-release-main__nightly-5.0.yaml Outdated
Increase the master VCPU to resolve the 'soft lockup' errors.
@bfournie bfournie force-pushed the agent-ha-vcpu-increase branch from cd32192 to 274d0bf Compare June 8, 2026 17:54
@bfournie

bfournie commented Jun 8, 2026

Contributor Author

/pj-rehearse periodic-ci-openshift-release-main-nightly-5.0-e2e-agent-ha-dualstack-iso-no-registry-techpreview

@openshift-merge-bot

Contributor

@bfournie: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-merge-bot

Contributor

[REHEARSALNOTIFIER]
@bfournie: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| periodic-ci-openshift-release-main-nightly-5.0-e2e-agent-ha-dualstack-iso-no-registry-techpreview | N/A | periodic | Ci-operator config changed |

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@bfournie

bfournie commented Jun 8, 2026

Contributor Author

/pj-rehearse periodic-ci-openshift-release-main-nightly-5.0-e2e-agent-ha-dualstack-iso-no-registry-techpreview

@openshift-merge-bot

Contributor

@bfournie: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@bfournie bfournie changed the title from "AGENT-1534: Increase master VCPU for agent iso-no-registry job" to "AGENT-1534: Increase master memory for agent iso-no-registry job" Jun 8, 2026
@bfournie bfournie changed the title from "AGENT-1534: Increase master memory for agent iso-no-registry job" to "AGENT-1534: Increase master memory for agent HA iso-no-registry job" Jun 8, 2026
@bfournie

bfournie commented Jun 8, 2026

Contributor Author

/pj-rehearse periodic-ci-openshift-release-main-nightly-5.0-e2e-agent-ha-dualstack-iso-no-registry-techpreview

@openshift-merge-bot

Contributor

@bfournie: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@bfournie bfournie changed the title from "AGENT-1534: Increase master memory for agent HA iso-no-registry job" to "AGENT-1534: Decrease master memory for agent HA iso-no-registry job" Jun 8, 2026
@openshift-ci

openshift-ci Bot commented Jun 9, 2026

Contributor

@bfournie: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

2 participants