OCPNODE-4125: Promote non-techpreview disruptive longrunning jobs to standard tier for component readiness #3537

Open
QiWang19 wants to merge 1 commit into openshift:main from QiWang19:longrunning-tier

Conversation

@QiWang19

@QiWang19 QiWang19 commented May 18, 2026

Member

Summary by CodeRabbit

  • Chores
    • Updated job-tier classification to recognize disruptive long-running jobs, assigning the candidate tier to techpreview variants and the standard tier to the rest, and ensuring these rules take precedence over hidden-job patterns.
    • Adjusted several job entries from candidate to standard to reflect the new tiering.
  • Tests
    • Updated expectations to match the new job-tier assignments.

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@coderabbitai

coderabbitai Bot commented May 18, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 59ca4c4e-a697-4194-93c8-5ee32773faba

📥 Commits

Reviewing files that changed from the base of the PR and between a5f8d11 and 480bdfb.

📒 Files selected for processing (3)
  • pkg/variantregistry/ocp.go
  • pkg/variantregistry/ocp_test.go
  • pkg/variantregistry/snapshot.yaml
💤 Files with no reviewable changes (1)
  • pkg/variantregistry/snapshot.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/variantregistry/ocp.go

Walkthrough

Pattern matching in setJobTier now recognizes -disruptive-longrunning-techpreview as candidate and -disruptive-longrunning as standard. Four snapshot JobTier entries were changed from candidate to standard. One test expectation was updated to match the new tier.
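
The ordering of the two rules matters because -disruptive-longrunning is itself a substring of -disruptive-longrunning-techpreview. The following is a minimal Go sketch of first-match substring rules. The tierRule type, the jobTier function, and the "unknown" fallback are hypothetical names used for illustration; they are not taken from pkg/variantregistry/ocp.go, and the real setJobTier may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// tierRule pairs a job-name substring with the tier it implies.
// Hypothetical type for illustration only.
type tierRule struct {
	substring string
	tier      string
}

// Rules are checked in order, so the more specific techpreview pattern
// must come before the generic pattern that it contains.
var tierRules = []tierRule{
	{"-disruptive-longrunning-techpreview", "candidate"},
	{"-disruptive-longrunning", "standard"},
}

// jobTier returns the tier of the first matching rule, or fallback
// when no rule matches.
func jobTier(jobName, fallback string) string {
	for _, r := range tierRules {
		if strings.Contains(jobName, r.substring) {
			return r.tier
		}
	}
	return fallback
}

func main() {
	jobs := []string{
		"periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning",
		"5.0-e2e-aws-disruptive-longrunning-techpreview-2of2",
	}
	for _, j := range jobs {
		fmt.Printf("%s -> %s\n", j, jobTier(j, "unknown"))
	}
}
```

If the two rules were listed in the opposite order, the techpreview job would hit the generic rule first and be classified as standard.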

Changes

Job-tier pattern and snapshot updates

| Layer / File(s) | Summary |
|---|---|
| Job-tier pattern and test update (pkg/variantregistry/ocp.go, pkg/variantregistry/ocp_test.go) | Adds two setJobTier substring rules for -disruptive-longrunning-techpreview -> candidate and -disruptive-longrunning -> standard; updates test expectation for a disruptive-longrunning job. |
| Snapshot JobTier updates (pkg/variantregistry/snapshot.yaml) | Updates four JobTier entries (specific snapshot lines) from candidate to standard. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • openshift/sippy#3561: Modifies pkg/variantregistry/ocp.go job-tier pattern matching for -disruptive-longrunning* jobs in the same rule area.

Suggested reviewers

  • petr-muller
🚥 Pre-merge checks | ✅ 19 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (19 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: promoting disruptive longrunning jobs (non-techpreview) to standard tier based on stability confirmation. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Go Error Handling | ✅ Passed | PR only adds job-tier pattern entries to data structures without modifying error handling logic. No ignored errors, unsafe dereferences, or panic statements introduced. |
| Sql Injection Prevention | ✅ Passed | The PR modifies only job tier classifications and test expectations—no SQL queries are constructed, modified, or impacted by these changes. |
| Excessive Css In React Should Use Styles | ✅ Passed | Check not applicable. PR only modifies Go backend files and YAML configuration in pkg/variantregistry/ for job tier classification, with no React components or JSX files present. |
| Test Coverage For New Features | ✅ Passed | New job tier patterns are covered by TestVariantSyncer test cases with hardcoded expectations; snapshot.yaml is auto-generated and validated by TestVariantsSnapshot test. |
| Single Responsibility And Clear Naming | ✅ Passed | PR adds specific pattern entries to setJobTier method with clear names (-disruptive-longrunning), maintaining single responsibility, not using generic names, and preserving package coherence. |
| Stable And Deterministic Test Names | ✅ Passed | The PR uses standard Go testing (testing.T), not Ginkgo. All test names in the modified test case are static string literals with no dynamic components, timestamps, UUIDs, or generated identifiers. |
| Test Structure And Quality | ✅ Passed | The PR does not modify Ginkgo tests. The changed test file (ocp_test.go) uses standard Go testing package with testify/assert, not Ginkgo. The custom check is not applicable to this PR. |
| Microshift Test Compatibility | ✅ Passed | This PR does not add new Ginkgo e2e tests. It modifies job tier classification logic and updates related unit test data in a non-Ginkgo test repository. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | PR does not add new Ginkgo e2e tests; it only updates job tier classifications and variant registry metadata in the Sippy test analysis tool. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR modifies test job classification metadata in Sippy (CI analysis tool), not deployment manifests, operator code, or pod scheduling constraints. Check not applicable. |
| Ote Binary Stdout Contract | ✅ Passed | This PR is in Sippy (job analysis tool), not OTE binary code. Changes are limited to variant registry data structures and patterns with no stdout writes in process-level code. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR contains no new Ginkgo e2e tests; it only updates job tier classification patterns and test expectations in Sippy variant registry package. IPv6/disconnected network check is not applicable. |
| No-Weak-Crypto | ✅ Passed | No weak cryptography detected. PR changes only job tier pattern classification entries and metadata; no crypto algorithms or insecure operations involved. |
| Container-Privileges | ✅ Passed | PR contains no container/Kubernetes manifests or Pod specs—only Go source code and job variant metadata. Custom check for privileged container configurations is not applicable. |
| No-Sensitive-Data-In-Logs | ✅ Passed | No logging of sensitive data found. PR only adds job tier pattern rules and updates test expectations. No new logging statements, password/token/API key logs, or sensitive data exposure detected. |


@openshift-ci openshift-ci Bot requested review from deepsm007 and petr-muller May 18, 2026 18:02
@openshift-ci

openshift-ci Bot commented May 18, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: QiWang19
Once this PR has been reviewed and has the lgtm label, please assign neisw for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label May 18, 2026
@openshift-merge-bot

Contributor

Scheduling required tests:
/test e2e

@QiWang19 QiWang19 changed the title from "Promote disruptive longrunning jobs to standard tier for component readiness" to "OCPNODE-4125: Promote disruptive longrunning jobs to standard tier for component readiness" May 18, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 18, 2026
@openshift-ci-robot

openshift-ci-robot commented May 18, 2026

@QiWang19: This pull request references OCPNODE-4125 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

  • Chores
    • Updated job tier classifications across platform components to optimize resource management and improve scheduling of disruptive long-running jobs.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@QiWang19

Member Author

Once openshift/origin#31185 fixes the current failure in the disruptive-longrunning jobs, I will trigger several runs to confirm that the jobs are stable enough to promote their JobTier.

@ngopalak-redhat

Contributor

@QiWang19

  1. Add links showing that the tests pass reliably for both the regular and techpreview jobs.
  2. The process: if there is a failure, who will track it, and how will the node team be notified? I think this should be automated, with a Slack or email notification sent to node team members.

@QiWang19 QiWang19 force-pushed the longrunning-tier branch from aa3cad4 to a5f8d11 Compare May 19, 2026 13:38
@openshift-merge-bot

Contributor

Scheduling required tests:
/test e2e

@QiWang19

Member Author

OCP 5.0 Disruptive Longrunning Job Health: Non-TechPreview AWS

Job: periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning
Sippy: Job view in Sippy

Overview

14 runs of the non-techpreview job were manually triggered. Sippy reports 21 total runs over the last 7 days (including scheduled periodics):

  • Passes: 16
  • Failures: 5
  • Pass Rate: 76.19%
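
As a quick check of the arithmetic, 16 passes out of 21 total runs gives the reported rate. The sketch below uses a hypothetical passRate helper; it is not a Sippy function and only reproduces the calculation from the numbers above.

```go
package main

import "fmt"

// passRate returns passes as a percentage of all runs.
// It returns 0 when there are no runs to avoid dividing by zero.
func passRate(passes, failures int) float64 {
	total := passes + failures
	if total == 0 {
		return 0
	}
	return 100 * float64(passes) / float64(total)
}

func main() {
	// 16 passes and 5 failures over 21 runs, as reported above.
	fmt.Printf("%.2f%%\n", passRate(16, 5)) // prints 76.19%
}
```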

Test Failure Analysis

Only one distinct test failed across all runs:
[Monitor:legacy-networking-invariants][sig-network] pods should successfully create sandboxes by writing network status

This test is not specific to the disruptive job. Across all OCP 5.0 jobs:

  • Pass rate: 99.45% (5,492 runs)
  • Failure rate: 0.18%
  • Flake rate: 0.36%
  • Component: Networking / cluster-network-operator
  • Open bugs: 0

Failure Pattern

All recent failure logs show the error etcdserver: request timed out during SetPodNetworkStatusAnnotation when attempting to create the pod sandbox. Some failures are tagged "race condition: sandbox failure at pod creation time" and others "never deleted."

The same failure pattern appears across multiple disruptive-longrunning variants:

  • 5.0-e2e-aws-disruptive-longrunning (non-techpreview)
  • 5.0-e2e-aws-disruptive-longrunning-techpreview-2of2
  • 5.0-e2e-azure-disruptive-longrunning-techpreview-2of2
  • 4.23-e2e-aws-disruptive-longrunning

Conclusion

The disruptive-longrunning job is in good shape. The only observed test failure is a low-frequency, cross-variant flake tied to transient etcd timeouts during disruptive testing, not a job-specific regression.

[OpenShift AI Helpdesk] AI-generated. Review for accuracy.

@QiWang19

Member Author

The process: if there is a failure, who will track it, and how will the node team be notified? I think this should be automated, with a Slack or email notification sent to node team members.

The component readiness review should automatically map the bug to the node team, since the job has been annotated [sig-node] or [Jira:Node].
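
To illustrate what annotation-based routing could look like, here is a rough Go sketch that pulls the bracketed tags out of a test name and checks for the two annotations mentioned above. The annotations and ownedByNode functions are hypothetical; this thread does not show how component readiness actually performs the mapping, so treat this as an assumption about the general idea rather than the real logic.

```go
package main

import (
	"fmt"
	"regexp"
)

// bracketTag matches a single [annotation] in a test name.
var bracketTag = regexp.MustCompile(`\[([^\]]+)\]`)

// annotations returns the bracketed tags of a test name, in order.
func annotations(testName string) []string {
	var tags []string
	for _, m := range bracketTag.FindAllStringSubmatch(testName, -1) {
		tags = append(tags, m[1])
	}
	return tags
}

// ownedByNode reports whether the test name carries either of the
// annotations mentioned in this thread. Hypothetical helper.
func ownedByNode(testName string) bool {
	for _, t := range annotations(testName) {
		if t == "sig-node" || t == "Jira:Node" {
			return true
		}
	}
	return false
}

func main() {
	// The failing test reported earlier in this thread.
	name := "[Monitor:legacy-networking-invariants][sig-network] pods should successfully create sandboxes by writing network status"
	fmt.Println(annotations(name), ownedByNode(name))
}
```

Under this sketch, the networking test reported earlier in the thread would not be attributed to the node team, because it carries [sig-network] rather than [sig-node] or [Jira:Node].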

Signed-off-by: Qi Wang <qiwan@redhat.com>
@QiWang19 QiWang19 force-pushed the longrunning-tier branch from a5f8d11 to 480bdfb Compare May 31, 2026 00:36
@QiWang19 QiWang19 changed the title from "OCPNODE-4125: Promote disruptive longrunning jobs to standard tier for component readiness" to "OCPNODE-4125: Promote non-techpreview disruptive longrunning jobs to standard tier for component readiness" May 31, 2026
@openshift-merge-bot

Contributor

Scheduling required tests:
/test e2e

@ngopalak-redhat

ngopalak-redhat commented Jun 1, 2026

Contributor

The last two runs before the PR merge are good:
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning/2061055472533770240
I also checked the 4.21 and 4.22 runs; they are running fine.
/lgtm

@openshift-ci

openshift-ci Bot commented Jun 8, 2026

Contributor

@QiWang19: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review

3 participants