Add a monitortest for cluster region/zone/instance type autodl data #31305

Open
dgoodwin wants to merge 2 commits into openshift:main from dgoodwin:autodl-region-instance-types

dgoodwin commented Jun 15, 2026

The goal is to begin driving our support matrix from actual data on what was tested. These autodl artifacts are automatically ingested into BigQuery tables, giving us an easy path to query what was successfully tested (joining in details for the job run and limiting to only job runs that fully passed).

We will soon explore porting the QE job, which slowly churns through the requested regions and instance types over time, to an equivalent on the main Prow system.

With this we can query for where we successfully validated that OpenShift was working and generate whatever artifacts and customer docs are needed.

Example output:

aws: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/31305/pull-ci-openshift-origin-main-e2e-aws-ovn-fips/2067008173180456960/artifacts/e2e-aws-ovn-fips/openshift-e2e-test/artifacts/junit/cluster-instance-types_20260616-233046-autodl.json
gcp: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/31305/pull-ci-openshift-origin-main-e2e-gcp-ovn-upgrade/2067008173406949376/artifacts/e2e-gcp-ovn-upgrade/openshift-e2e-test/artifacts/junit/cluster-instance-types_20260616-232245-autodl.json
azure: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/openshift-origin-31305-ci-5.0-e2e-azure-ovn/2067295976640483328/artifacts/e2e-azure-ovn/openshift-e2e-test/artifacts/junit/cluster-instance-types_20260617-183416-autodl.json

Summary by CodeRabbit

Release Notes

  • New Features

    • Added a new “cluster instance types” monitor test that collects unique machine instance types across AWS, Azure, and GCP, grouped by platform, region, and machine role (control-plane vs worker). Results are exported as a table and an indented JSON file.
  • Tests

    • Expanded unit tests to verify deduplication, mixed worker instance handling, control-plane-first sorting, platform/region propagation, and skipping machines with empty provider configuration.

@openshift-merge-bot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will use /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: automatic mode

openshift-ci Bot requested review from deads2k and sjenning on June 15, 2026 22:58

openshift-ci Bot commented Jun 15, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dgoodwin

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci Bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Jun 15, 2026

coderabbitai Bot commented Jun 15, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

New clusterinstancetypes monitor test collects cloud instance types and machine roles for AWS, Azure, and GCP. It queries cluster Infrastructure for platform and region, lists machines in openshift-machine-api, classifies roles via labels, extracts InstanceType from provider specs, deduplicates, and exports as AutoDL dataset and JSON. Registered in universal monitor registry.

Changes

Cluster Instance Types Monitor Implementation

Layer / File(s) | Summary

Data model and lifecycle foundation
pkg/monitortests/testframework/clusterinstancetypes/monitortest.go
Defines instanceTypeRow struct holding Platform, Region, Role, InstanceType tuple, and clusterInstanceTypes state holder maintaining REST config and collected rows. Constructor NewClusterInstanceTypes() returns MonitorTest interface. Lifecycle methods: PrepareCollection() no-op, StartCollection() stores REST config, CollectData() invokes collector, ConstructComputedIntervals() and EvaluateTestsFromConstructedIntervals() return nil, Cleanup() no-op.

Collection orchestration and data gathering
pkg/monitortests/testframework/clusterinstancetypes/monitortest.go
collect() creates config and machine clients, fetches cluster Infrastructure CR, filters to aws/azure/gcp platforms, derives region from infrastructure status or node topology label fallback, lists machines in openshift-machine-api namespace, and invokes buildRows() for classification and extraction.

Instance type extraction and row building
pkg/monitortests/testframework/clusterinstancetypes/monitortest.go
getRegionFromInfrastructure() extracts AWS/GCP region from Infrastructure CR; isMaster() classifies via machine role label; extractInstanceType() unmarshals provider-spec JSON into platform-specific config type and returns instance type field, skipping unparseable specs; buildRows() classifies as control-plane or worker, extracts instance type, deduplicates by (role, instanceType), constructs rows, and sorts by role then instance type.

Storage and dataset export
pkg/monitortests/testframework/clusterinstancetypes/monitortest.go
WriteContentToStorage() exports collected rows as AutoDL dataset file (cluster_instance_types) and indented JSON file to storage directory with Platform, Region, Role, InstanceType columns.

Monitor test registration
pkg/defaultmonitortests/types.go
Imports clusterinstancetypes package and registers cluster-instance-types monitor test under "Test Framework" component in universal monitor registry.

Unit test suite
pkg/monitortests/testframework/clusterinstancetypes/monitortest_test.go
awsMachine() helper constructs test machines with AWS provider specs. TestBuildRowsDeduplicates() validates deduplication. TestBuildRowsMixedWorkerTypes() verifies distinct worker instance types. TestBuildRowsSortsControlPlaneFirst() asserts control-plane row precedes worker rows. TestBuildRowsPropagatesPlatformAndRegion() confirms value propagation. TestBuildRowsSkipsEmptyProviderSpec() verifies machines with missing specs are ignored.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (14 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately describes the main change: adding a new monitor test that collects cluster instance type data for AutoDL storage, which aligns with the code additions across three files.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names | ✅ Passed | PR uses standard Go testing package, not Ginkgo. Test names (TestBuildRowsDeduplicates, TestBuildRowsMixedWorkerTypes, TestBuildRowsSortsControlPlaneFirst, TestBuildRowsPropagatesPlatformAndRegion,...
Test Structure And Quality | ✅ Passed | The custom check requires reviewing "Ginkgo test code" but this PR contains only standard Go unit tests (using testing.T). Since no Ginkgo tests are present, the Ginkgo-specific check is not applic...
Microshift Test Compatibility | ✅ Passed | No Ginkgo e2e tests were added in this PR. The changes add monitor test infrastructure (unit tests and configuration), not Ginkgo e2e tests, so the MicroShift compatibility check does not apply.
Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No Ginkgo e2e tests added in this PR. Code adds MonitorTest framework components and standard Go unit tests (testing.T), not Ginkgo tests (It/Describe/Context/etc.), so SNO compatibility check does...
Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds a monitor test framework component (data collection only) in pkg/monitortests/testframework/, not deployment manifests, operator code, or controllers with scheduling constraints.
Ote Binary Stdout Contract | ✅ Passed | All logrus calls occur in runtime MonitorTest interface methods (CollectData, collect) called during test execution, not process-level initialization code (main, init, TestMain, etc.). No direct st...
Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No Ginkgo e2e tests are added in this PR. The code contains only a monitor test framework implementation and standard Go unit tests with no IPv4 assumptions or external connectivity requirements.
No-Weak-Crypto | ✅ Passed | No weak cryptography, custom crypto implementations, or non-constant-time secret comparisons found in the added code.
Container-Privileges | ✅ Passed | PR contains only Go source files implementing a monitor test; no container security configurations, Kubernetes manifests, or privileged container settings are present.
No-Sensitive-Data-In-Logs | ✅ Passed | No sensitive data (passwords, tokens, API keys, PII, session IDs, hostnames, or customer data) is logged. All logging statements use only non-sensitive metadata like platform types, machine names,...

coderabbitai Bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (1)
pkg/monitortests/testframework/clusterinstancetypes/monitortest_test.go (1)

92-97: ⚡ Quick win

Implement or remove the placeholder role test.

TestGetMachineRole has no assertions, so it always passes and currently provides no coverage for role-classification behavior.
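
For illustration, a role-classification check with real assertions might look something like the sketch below. It is not the PR's code: `roleFromLabels` is a hypothetical pure helper, and the label key `machine.openshift.io/cluster-api-machine-role` with the value `master` is an assumption about how Machine objects are labeled that should be verified against the actual implementation.

```go
package main

import "fmt"

// machineRoleLabel is an assumed label key; confirm it against the real code.
const machineRoleLabel = "machine.openshift.io/cluster-api-machine-role"

// roleFromLabels is a hypothetical pure helper: it reports "control-plane"
// when the role label is "master" and "worker" in every other case.
func roleFromLabels(labels map[string]string) string {
	// Indexing a nil map is safe in Go and yields the zero value.
	if labels[machineRoleLabel] == "master" {
		return "control-plane"
	}
	return "worker"
}

func main() {
	cases := []struct {
		name   string
		labels map[string]string
		want   string
	}{
		{"master label", map[string]string{machineRoleLabel: "master"}, "control-plane"},
		{"worker label", map[string]string{machineRoleLabel: "worker"}, "worker"},
		{"no labels", nil, "worker"},
	}
	for _, tc := range cases {
		got := roleFromLabels(tc.labels)
		status := "ok"
		if got != tc.want {
			status = "MISMATCH"
		}
		fmt.Printf("%s: got %q, want %q (%s)\n", tc.name, got, tc.want, status)
	}
}
```

In a real `_test.go` file the same table would drive `t.Run` subtests with `t.Errorf` on mismatch, so the test fails instead of passing vacuously.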

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/monitortests/testframework/clusterinstancetypes/monitortest_test.go`
around lines 92 - 97, The TestGetMachineRole function is currently a placeholder
with no test logic or assertions, so it provides no actual test coverage despite
always passing. Either implement the test by adding actual test cases with
assertions that verify the role-classification behavior indirectly through the
pure functions classifyAWSZone and buildAutodlRows as noted in the comments, or
remove the TestGetMachineRole function entirely if the role-classification logic
is sufficiently covered by tests of those underlying pure functions elsewhere in
the test suite.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/monitortests/testframework/clusterinstancetypes/monitortest.go`:
- Around line 189-198: The getRegionFromInfrastructure function currently
handles AWS and GCP platform types in its switch statement but does not include
a case for Azure, even though Azure is explicitly supported as a platform. Add a
case for AzurePlatformType in the switch statement that checks if the Azure
platform status is not nil and returns the Region field from it, following the
same pattern used for AWS and GCP cases.
- Around line 217-249: The extractInstanceInfo function is silently ignoring
json.Unmarshal errors by only processing the specification when err == nil,
which causes invalid or missing data to be persisted. Modify the
extractInstanceInfo function signature to return both instanceInfo and an error.
For each platform case (aws, azure, gcp), when json.Unmarshal fails, return the
zero-value instanceInfo along with the error instead of silently proceeding.
Update all call sites of extractInstanceInfo throughout the file to handle the
returned error appropriately.
- Around line 143-149: The fallback path for region resolution silently ignores
errors from kubernetes.NewForConfig and Nodes().List operations by using if err
== nil checks, which violates the coding guideline to never ignore error
returns. Refactor this block to explicitly log or handle the errors from both
the kubernetes.NewForConfig call and the kubeClient.CoreV1().Nodes().List call
instead of discarding them. This will make region-resolution failures visible
and easier to diagnose.
- Around line 133-137: The code at line 133 dereferences
infra.Status.PlatformStatus.Type without first verifying that PlatformStatus is
not nil, which can cause a panic if the status is unset. Add a nil check for
infra.Status.PlatformStatus before attempting to access its Type field, and
return early (such as returning nil, nil) if the check fails, similar to the
existing platform type validation logic that follows.
- Around line 61-65: The collect method in the monitortest.go file currently
returns nil for the error when w.collect(ctx) fails, which masks the failure and
reports success instead of surfacing the error. Change the return statement in
the error handling block to return the actual err value as the third return
value instead of nil, so that collection failures are properly propagated to the
caller as expected by the monitor test contract.
- Around line 83-86: The early return when `len(rows) == 0` skips the JSON
snapshot output, losing valid collected metadata like platform and region
information. Remove the early return check after calling buildAutodlRows in the
code block containing the length check, allowing the JSON snapshot to be written
even when there are no AutoDL rows. This ensures that all collected metadata is
preserved in the output regardless of whether any instances were found.
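
Several of the inline comments above ask that unmarshal failures be returned rather than swallowed. A minimal sketch of that pattern follows, assuming trimmed-down, hypothetical provider-spec structs; the real types live in the OpenShift API packages, and the JSON field names used here (`instanceType`, `vmSize`, `machineType`) are assumptions to check against those types.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Hypothetical, trimmed-down provider specs used only for this sketch.
type awsSpec struct {
	InstanceType string `json:"instanceType"`
}
type azureSpec struct {
	VMSize string `json:"vmSize"`
}
type gcpSpec struct {
	MachineType string `json:"machineType"`
}

// extractInstanceType returns the instance type for the platform, or an
// error when the spec is empty, unparseable, or the platform is unknown.
func extractInstanceType(platform string, raw []byte) (string, error) {
	if len(raw) == 0 {
		return "", errors.New("provider spec is empty")
	}
	switch platform {
	case "aws":
		var spec awsSpec
		if err := json.Unmarshal(raw, &spec); err != nil {
			return "", fmt.Errorf("unmarshal aws provider spec: %w", err)
		}
		return spec.InstanceType, nil
	case "azure":
		var spec azureSpec
		if err := json.Unmarshal(raw, &spec); err != nil {
			return "", fmt.Errorf("unmarshal azure provider spec: %w", err)
		}
		return spec.VMSize, nil
	case "gcp":
		var spec gcpSpec
		if err := json.Unmarshal(raw, &spec); err != nil {
			return "", fmt.Errorf("unmarshal gcp provider spec: %w", err)
		}
		return spec.MachineType, nil
	default:
		return "", fmt.Errorf("unsupported platform %q", platform)
	}
}

func main() {
	it, err := extractInstanceType("aws", []byte(`{"instanceType":"m6i.xlarge"}`))
	fmt.Println(it, err)

	// A malformed spec now surfaces an error instead of an empty value.
	_, err = extractInstanceType("gcp", []byte(`{not json`))
	fmt.Println(err != nil)
}
```

Whether a caller logs and skips the machine or aborts collection on such an error is a separate design decision; the sketch only makes the failure visible to the caller.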

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: b48dd313-3cf3-4641-8fc8-302fce9ad76e

📥 Commits

Reviewing files that changed from the base of the PR and between c7deaba and 3e3c214.

📒 Files selected for processing (3)
  • pkg/defaultmonitortests/types.go
  • pkg/monitortests/testframework/clusterinstancetypes/monitortest.go
  • pkg/monitortests/testframework/clusterinstancetypes/monitortest_test.go

dgoodwin force-pushed the autodl-region-instance-types branch from 3e3c214 to 009b20d on June 16, 2026 11:33

@openshift-merge-bot

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

openshift-trt Bot commented Jun 16, 2026

Risk analysis has seen new tests most likely introduced by this PR.
Please ensure that new tests meet guidelines for naming and stability.

New tests seen in this PR at sha: 009b20d

  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types cleanup" [Total: 10, Pass: 10, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types collection" [Total: 10, Pass: 10, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types interval construction" [Total: 10, Pass: 10, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types preparation" [Total: 10, Pass: 10, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types setup" [Total: 10, Pass: 10, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types test evaluation" [Total: 10, Pass: 10, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types writing to storage" [Total: 10, Pass: 10, Fail: 0, Flake: 0]

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/monitortests/testframework/clusterinstancetypes/monitortest_test.go`:
- Around line 8-21: These tests are tautological as they construct literal rows
slices and then assert those same literals without exercising the actual dedup,
sort, or role extraction logic from the collect function. Refactor both test
cases to instead call the real collect function (or a factored pure helper used
by collect) with duplicate and mixed input data, and then assert the
deduplicated and properly sorted output matches the expected result. This
ensures the tests actually verify the dedup and sorting behavior rather than
just testing hard-coded literals.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 6e0a5fd3-f5dc-49a7-b0e5-f05a532ea139

📥 Commits

Reviewing files that changed from the base of the PR and between 009b20d and 75a4890.

📒 Files selected for processing (2)
  • pkg/monitortests/testframework/clusterinstancetypes/monitortest.go
  • pkg/monitortests/testframework/clusterinstancetypes/monitortest_test.go

dgoodwin force-pushed the autodl-region-instance-types branch from 75a4890 to 554daf9 on June 16, 2026 17:27

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/monitortests/testframework/clusterinstancetypes/monitortest_test.go`:
- Line 16: The json.Marshal call on line 16 is discarding the error return value
using a blank identifier, which can produce invalid fixtures and mask test
failures. Instead of using the blank identifier, capture the error return value
from json.Marshal into a named variable and add proper error handling to check
if the error is non-nil. If an error occurs, handle it appropriately for the
test context (such as using a fatal assertion to fail the test with the error
details).
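
One way to act on this comment is a small fixture helper that fails the test when marshaling fails. The sketch below is an assumption-laden illustration rather than the PR's code: `mustMarshal` and the `failer` interface are hypothetical names, and `failer` is declared only so the example runs outside of `go test`; `*testing.T` already provides both methods.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// failer is the subset of *testing.T the helper needs.
type failer interface {
	Helper()
	Fatalf(format string, args ...any)
}

// mustMarshal marshals v and reports a fatal failure instead of
// discarding the error with a blank identifier.
func mustMarshal(t failer, v any) []byte {
	t.Helper()
	raw, err := json.Marshal(v)
	if err != nil {
		t.Fatalf("marshal fixture: %v", err)
	}
	return raw
}

// recorder is a stand-in for *testing.T that records failures so the
// sketch can run as a plain program.
type recorder struct{ msgs []string }

func (r *recorder) Helper() {}
func (r *recorder) Fatalf(format string, args ...any) {
	r.msgs = append(r.msgs, fmt.Sprintf(format, args...))
}

func main() {
	rec := &recorder{}
	raw := mustMarshal(rec, map[string]string{"instanceType": "m6i.xlarge"})
	fmt.Println(string(raw), len(rec.msgs))

	// Channels cannot be encoded as JSON, so this records one failure.
	mustMarshal(rec, make(chan int))
	fmt.Println(len(rec.msgs))
}
```

With a real `*testing.T`, `Fatalf` stops the test immediately, so a broken fixture can never reach the code under test.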

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 67316fb2-8d12-4a90-a624-fdfbf4830b1b

📥 Commits

Reviewing files that changed from the base of the PR and between 75a4890 and 554daf9.

📒 Files selected for processing (2)
  • pkg/monitortests/testframework/clusterinstancetypes/monitortest.go
  • pkg/monitortests/testframework/clusterinstancetypes/monitortest_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/monitortests/testframework/clusterinstancetypes/monitortest.go

dgoodwin force-pushed the autodl-region-instance-types branch from 554daf9 to 8ebc142 on June 16, 2026 17:36

@openshift-merge-bot

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@dgoodwin

/retest

openshift-trt Bot commented Jun 17, 2026

Risk analysis has seen new tests most likely introduced by this PR.
Please ensure that new tests meet guidelines for naming and stability.

New Test Risks for sha: 8ebc142

Job Name New Test Risk
pull-ci-openshift-origin-main-e2e-aws-csi High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types cleanup" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-csi High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types collection" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-csi High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types interval construction" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-csi High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types preparation" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-csi High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types setup" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-csi High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types test evaluation" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-csi High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types writing to storage" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-fips High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types cleanup" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-fips High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types collection" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-fips High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types interval construction" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-fips High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types preparation" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-fips High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types setup" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-fips High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types test evaluation" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-fips High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types writing to storage" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-microshift High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types cleanup" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-microshift High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types collection" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-microshift High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types interval construction" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-microshift High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types preparation" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-microshift High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types setup" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-microshift High - "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types test evaluation" is a new test that was not present in all runs against the current commit.
(...showing 20 of 84 rows)

New tests seen in this PR at sha: 8ebc142

  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types cleanup" [Total: 24, Pass: 24, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types collection" [Total: 24, Pass: 24, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types interval construction" [Total: 24, Pass: 24, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types preparation" [Total: 24, Pass: 24, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types setup" [Total: 24, Pass: 24, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types test evaluation" [Total: 24, Pass: 24, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types writing to storage" [Total: 24, Pass: 24, Fail: 0, Flake: 0]

@dgoodwin

/payload-job periodic-ci-openshift-release-main-ci-5.0-e2e-azure-ovn

openshift-ci Bot commented Jun 17, 2026

@dgoodwin: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-ci-5.0-e2e-azure-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/a06cead0-6a70-11f1-9349-b8dc57b0112b-0

@dgoodwin

/test unit

@openshift-merge-bot

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

openshift-ci Bot commented Jun 18, 2026

@dgoodwin: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-ovn-fips 57eb894 link true /test e2e-aws-ovn-fips
ci/prow/e2e-vsphere-ovn-upi 57eb894 link true /test e2e-vsphere-ovn-upi
ci/prow/e2e-gcp-ovn-upgrade 57eb894 link true /test e2e-gcp-ovn-upgrade
ci/prow/e2e-gcp-ovn 57eb894 link true /test e2e-gcp-ovn

openshift-trt Bot commented Jun 18, 2026

Job Failure Risk Analysis for sha: 57eb894

Job Name Failure Risk
pull-ci-openshift-origin-main-e2e-vsphere-ovn Low
[Feature:NetworkSegmentation][ovn-kubernetes-ote][sig-network] Network Segmentation UserDefinedNetwork CRD Controller for primary UDN without required namespace label should not be able to update the namespace and add the UDN label [Suite:openshift/conformance/parallel]
This test has passed 62.52% of 3191 runs on release 5.0 [Overall] in the last week.
---
[Feature:NetworkSegmentation][ovn-kubernetes-ote][sig-network] Network Segmentation when primary network exist, ClusterUserDefinedNetwork status should report not-ready [Suite:openshift/conformance/parallel]
This test has passed 62.52% of 3191 runs on release 5.0 [Overall] in the last week.
---
[Feature:NetworkSegmentation][ovn-kubernetes-ote][sig-network] Network Segmentation when primary network exist, UserDefinedNetwork status should report not-ready [Suite:openshift/conformance/parallel]
This test has passed 62.52% of 3191 runs on release 5.0 [Overall] in the last week.
---
[Feature:NetworkSegmentation][ovn-kubernetes-ote][sig-network] Network Segmentation: services on a user defined primary network should be reachable through their cluster IP, node port and load balancer L2 primary UDN with custom network, cluster-networked pods, NodePort service [Suite:openshift/conformance/parallel]
This test has passed 0.00% of 1 runs on release 5.0 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:standard Network:ovn NetworkStack:ipv4 OS:rhcos9 Owner:eng Platform:vsphere Procedure:none SecurityMode:default Topology:ha Upgrade:none] in the last week.
---
Showing 4 of 10 test results

Risk analysis has seen new tests most likely introduced by this PR.
Please ensure that new tests meet guidelines for naming and stability.

New tests seen in this PR at sha: 57eb894

  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types cleanup" [Total: 12, Pass: 12, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types collection" [Total: 12, Pass: 12, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types interval construction" [Total: 12, Pass: 12, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types preparation" [Total: 12, Pass: 12, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types setup" [Total: 12, Pass: 12, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types test evaluation" [Total: 12, Pass: 12, Fail: 0, Flake: 0]
  • "[Monitor:cluster-instance-types][Jira:"Test Framework"] monitor test cluster-instance-types writing to storage" [Total: 12, Pass: 12, Fail: 0, Flake: 0]

@tthvo tthvo left a comment

Member

I just have a few questions, but this is really cool 🤘✨

Comment on lines +152 to +155
machines, err := machineClientSet.MachineV1beta1().Machines("openshift-machine-api").List(ctx, metav1.ListOptions{})
if err != nil {
	return nil, fmt.Errorf("failed to list machines: %w", err)
}

Member

I think we can get all the machine details from node labels. For example, we can extract the arch, instance type, and region on all three platforms (AWS, Azure, and GCP) 👇:

kubernetes.io/arch: <arch>
node.kubernetes.io/instance-type: <type>
topology.kubernetes.io/region: <region>

For control plane nodes, they will have these labels:

node-role.kubernetes.io/control-plane: ""
node-role.kubernetes.io/master: ""

For worker nodes, they will have:

node-role.kubernetes.io/worker: ""
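
The label-based approach suggested above could look roughly like the sketch below. It is not the PR's code: `nodeInfo` and `infoFromLabels` are hypothetical names, the label values in `main` are made-up sample data, and a plain `map[string]string` stands in for a node's labels so the example runs without a cluster. Treating a node that carries either the control-plane or the master label as control-plane is an assumption, not something the review settles.

```go
package main

import "fmt"

// Well-known node label keys quoted in the review comment.
const (
	labelArch         = "kubernetes.io/arch"
	labelInstanceType = "node.kubernetes.io/instance-type"
	labelRegion       = "topology.kubernetes.io/region"
	labelControlPlane = "node-role.kubernetes.io/control-plane"
	labelMaster       = "node-role.kubernetes.io/master"
	labelWorker       = "node-role.kubernetes.io/worker"
)

// nodeInfo is a hypothetical holder for the values read from labels.
type nodeInfo struct {
	Arch         string
	InstanceType string
	Region       string
	Role         string
}

// infoFromLabels reads arch, instance type, region, and role from one
// node's label map. Role labels carry an empty value, so only the
// presence of the key is checked.
func infoFromLabels(labels map[string]string) nodeInfo {
	info := nodeInfo{
		Arch:         labels[labelArch],
		InstanceType: labels[labelInstanceType],
		Region:       labels[labelRegion],
		Role:         "unknown",
	}
	_, controlPlane := labels[labelControlPlane]
	_, master := labels[labelMaster]
	_, worker := labels[labelWorker]
	switch {
	case controlPlane || master:
		// Assumption: control-plane wins if a node has both roles.
		info.Role = "control-plane"
	case worker:
		info.Role = "worker"
	}
	return info
}

func main() {
	// Sample labels for illustration only.
	labels := map[string]string{
		labelArch:         "amd64",
		labelInstanceType: "m6i.xlarge",
		labelRegion:       "us-west-2",
		labelWorker:       "",
	}
	fmt.Printf("%+v\n", infoFromLabels(labels))
}
```

In a real monitor test the map would come from each Node object's labels rather than a literal.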

Member

My thought is that we will likely need to switch to checking Cluster API machines in the near future, so inspecting nodes instead would ease our maintenance. WDYT?

Platform string `json:"platform"`
Region string `json:"region"`
Role string `json:"role"`
InstanceType string `json:"instance_type"`

Member

On AWS, we also support Local Zones and Wavelength Zones via the edge compute pool.

Let's also collect zone details (for AWS only?). We can get them by inspecting the node label topology.kubernetes.io/zone. For example, in this run:

topology.kubernetes.io/zone: us-west-2-pdx-1a
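
One way the zone collection could work is sketched below, assuming nodes are represented by their label maps. `uniqueZones` is a hypothetical helper, not part of the PR, and the sample data pairs the zone quoted above with an assumed region and a second made-up zone purely for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Topology label keys referenced in the review comments.
const (
	labelRegion = "topology.kubernetes.io/region"
	labelZone   = "topology.kubernetes.io/zone"
)

// uniqueZones returns the sorted, de-duplicated "region/zone" pairs
// found across the given node label maps. Nodes without a zone label
// are skipped.
func uniqueZones(nodes []map[string]string) []string {
	seen := map[string]struct{}{}
	for _, labels := range nodes {
		zone := labels[labelZone]
		if zone == "" {
			continue
		}
		seen[labels[labelRegion]+"/"+zone] = struct{}{}
	}
	out := make([]string, 0, len(seen))
	for pair := range seen {
		out = append(out, pair)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Sample data for illustration only; the region is an assumption.
	nodes := []map[string]string{
		{labelRegion: "us-west-2", labelZone: "us-west-2a"},
		{labelRegion: "us-west-2", labelZone: "us-west-2a"},
		{labelRegion: "us-west-2", labelZone: "us-west-2-pdx-1a"},
		{labelRegion: "us-west-2"}, // no zone label, skipped
	}
	for _, pair := range uniqueZones(nodes) {
		fmt.Println(pair)
	}
}
```

Keeping the zone next to its region means an edge zone such as a Local Zone stays attributable to its parent region when the data is queried later.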

@openshift-ci openshift-ci Bot added the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 20, 2026

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review
