Skip to content

Conversation

@mgencur
Copy link
Contributor

@mgencur mgencur commented Aug 27, 2025

This PRs builds on top of #1921 and adds one more commit with the changes described below. Please review that PR first and then the last commit from the current PR. When pull/1921 is merged the current PR needs to be rebased.

  • Add external-rosa mode for HC_BACKUP_RESTORE_MODE to support existing ROSA clusters
  • Introduce HC_NAMESPACE parameter for configurable cluster namespace management
  • Add service cluster kubeconfig support via SC_KUBECONFIG parameter for ROSA ManifestWork operations
  • Implement ManifestWork backup/deletion functionality for ROSA cluster lifecycle management
  • Add open-cluster-management.io/api dependency to support ManifestWork operations
  • Create separate OADP deployment operations for default vs ROSA scenarios
  • Skip DPA HCP plugin modification for ROSA where DPA is managed via ManifestWork
  • Add VSL_AWS_PROFILE parameter for volume snapshot location AWS profile configuration
  • Refactor backup/restore suite to use pluggable deployment strategies
  • Update test configuration to handle both regular HCP and ROSA cluster workflows

Why the changes were made

  • This is important for testing OADP hypershift plugin in ROSA and for testing bug fixes such as OCPBUGS-56430. So far it was a manual process, but this PR extends the test suite so that the whole procedure can be run automatically.

How to test the changes made

  • Steps are described in TESTING.md where a new section was added for HCP/ROSA.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 27, 2025
@openshift-ci
Copy link

openshift-ci bot commented Aug 27, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mgencur
Once this PR has been reviewed and has the lgtm label, please assign mrnold for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mgencur mgencur force-pushed the hcp_oadp_full_rosa branch from 2765acd to cfd509b Compare August 27, 2025 06:49
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 27, 2025
@mgencur mgencur changed the title feat: Enable ROSA cluster support in HCP backup/restore tests [WIP] feat: Enable ROSA cluster support in HCP backup/restore tests Aug 27, 2025
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 27, 2025
@mgencur mgencur force-pushed the hcp_oadp_full_rosa branch 2 times, most recently from 78ef121 to d8296b4 Compare September 3, 2025 06:54
@mgencur mgencur changed the title [WIP] feat: Enable ROSA cluster support in HCP backup/restore tests feat: Enable ROSA cluster support in HCP backup/restore tests Sep 18, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 18, 2025
@kaovilai
Copy link
Member

go dependencies are out of date, run 'go mod tidy' to update are in order

@mgencur
Copy link
Contributor Author

mgencur commented Sep 19, 2025

/retest

@mgencur
Copy link
Contributor Author

mgencur commented Sep 19, 2025

IMO, the failure are not related to my changes.

@weshayutin
Copy link
Contributor

@mgencur we're close to getting CI fixed. Appologies we've had a lot of interrupts the last couple weeks.

@mgencur
Copy link
Contributor Author

mgencur commented Nov 10, 2025

/retest

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 10, 2025

Walkthrough

Adds Makefile variables for HCP external testing and AWS profile propagation; introduces OADPDeploymentOperation with Deploy/Undeploy flows; extends e2e test harness to support Service Cluster kubeconfig, ROSA external mode, ManifestWork management, new HCP helper methods, and adjusted timeouts.

Changes

Cohort / File(s) Summary
Build & test config
Makefile, tests/e2e/scripts/aws_settings.sh
Adds public Makefile vars HC_BACKUP_RESTORE_MODE, HC_NAME, HC_NAMESPACE; propagates VSL_AWS_PROFILE into test env and uses it in Velero snapshot location config.
Module dependency
go.mod
Adds dependency open-cluster-management.io/api v0.15.0.
OADP deployment abstraction
tests/e2e/backup_restore_suite_test.go, tests/e2e/backup_restore_cli_suite_test.go, tests/e2e/virt_backup_restore_suite_test.go
Introduces OADPDeploymentOperation type with NewOADPDeploymentOperationDefault/ROSA and Deploy/Undeploy; replaces prior readiness/wait/teardown helpers with these operations.
DPA / Velero helpers
tests/e2e/lib/dpa_helpers.go
Extracts BSL spec accessor and adds lifecycle helpers: BackupStorageLocationSpec, Create/DeleteBackupStorageLocation, Create/DeleteVolumeSnapshotLocation.
E2E infra & Service Cluster kubeconfig
tests/e2e/e2e_suite_test.go
Adds sc_kubeconfig/SC_KUBECONFIG handling, builds a second CR client (crClientForServiceCluster), preserves QPS/Burst, and wires ClientServiceCluster; selects OADP deployment op based on mode for cleanup.
HCP backup/restore tests
tests/e2e/hcp_backup_restore_suite_test.go, tests/e2e/hcp_external_cluster_backup_restore_suite_test.go
Adds HCModeExternalROSA ("external-rosa"), UUID-based backup/restore names, conditional DPA/plugin handling for ROSA vs non-ROSA, ManifestWork backup/deletion flows, namespace helpers (createTestNamespace, validateTestNamespace), and updated verification timing.
HCP library helpers & types
tests/e2e/lib/hcp/hcp.go, tests/e2e/lib/hcp/types.go
DeployHCManifest now accepts hcNamespace; adds ReadKubeconfigFromBytes, GetManifestWorkNamespace, DeleteManifestWork, BackupManifestWork, IsManifestWorkDeleted; adds ClientServiceCluster to HCHandler; introduces Wait30Min and updates HCPBackupTimeout.
Docs
docs/developer/testing/TESTING.md
Adds subsection documenting running HCP tests against external HostedControlPlane on ROSA with prerequisites and example invocation.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Review focus:
    • OADPDeploymentOperation Deploy/Undeploy behavior and side effects in test lifecycle.
    • HCP manifest/ManifestWork namespace changes and correctness of Get/Delete/Backup ManifestWork implementations.
    • Service Cluster kubeconfig parsing and creation of second CR client (scheme registration, QPS/Burst preservation).
    • DPA/Velero BSL & VSL helpers for idempotency, error handling, and interactions with Velero resources.
    • Branching between HCModeExternal vs HCModeExternalROSA across tests and teardown paths.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between c9e9437 and 6c07593.

📒 Files selected for processing (3)
  • tests/e2e/hcp_backup_restore_suite_test.go (6 hunks)
  • tests/e2e/lib/dpa_helpers.go (2 hunks)
  • tests/e2e/lib/hcp/hcp.go (11 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • tests/e2e/lib/hcp/hcp.go
  • tests/e2e/lib/dpa_helpers.go
  • tests/e2e/hcp_backup_restore_suite_test.go

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/e2e/lib/hcp/hcp.go (1)

728-741: Clarify the purpose of the config read-write cycle.

The GetHostedClusterKubeconfig method now reads the kubeconfig, then immediately writes it back before building the rest.Config. This appears to be a no-op unless there's a side effect in the Load/Write cycle (e.g., normalization, validation, or default-filling).

If this is intentional to normalize the kubeconfig data, consider adding a comment explaining why. Otherwise, this could be simplified to directly call buildConfigFromBytes(kubeconfigData).

If the read-write cycle serves no purpose, simplify:

 func (h *HCHandler) GetHostedClusterKubeconfig(hc *hypershiftv1.HostedCluster) (*rest.Config, error) {
 	kubeconfigSecret := &corev1.Secret{}
 	err := h.Client.Get(h.Ctx,
 		types.NamespacedName{
 			Namespace: hc.Namespace,
 			Name:      hc.Status.KubeConfig.Name},
 		kubeconfigSecret)
 	if err != nil {
 		return nil, err
 	}
 	kubeconfigData := kubeconfigSecret.Data["kubeconfig"]
-
-	config, err := ReadKubeconfigFromBytes(kubeconfigData)
-	if err != nil {
-		return nil, err
-	}
-
-	modifiedBytes, err := clientcmd.Write(*config)
-	if err != nil {
-		return nil, err
-	}
-
-	return buildConfigFromBytes(modifiedBytes)
+	return buildConfigFromBytes(kubeconfigData)
 }
docs/developer/testing/TESTING.md (1)

113-128: LGTM! Clear documentation for ROSA testing.

The new section properly documents:

  • Required kubeconfig setup (management and service clusters)
  • The destructive nature of the tests (deleting ManifestWork resources)
  • A complete example with all necessary environment variables

Consider simplifying line 117 per the static analysis hint:

-* In order to break the guest cluster, the tests delete ManifestWork resources on the Service Cluster.
+* To break the guest cluster, the tests delete ManifestWork resources on the Service Cluster.

This is a minor style improvement and completely optional.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 038659f and e31b5f7.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (13)
  • Makefile (4 hunks)
  • docs/developer/testing/TESTING.md (1 hunks)
  • go.mod (1 hunks)
  • tests/e2e/backup_restore_cli_suite_test.go (1 hunks)
  • tests/e2e/backup_restore_suite_test.go (4 hunks)
  • tests/e2e/e2e_suite_test.go (7 hunks)
  • tests/e2e/hcp_backup_restore_suite_test.go (6 hunks)
  • tests/e2e/hcp_external_cluster_backup_restore_suite_test.go (4 hunks)
  • tests/e2e/lib/dpa_helpers.go (2 hunks)
  • tests/e2e/lib/hcp/hcp.go (11 hunks)
  • tests/e2e/lib/hcp/types.go (1 hunks)
  • tests/e2e/scripts/aws_settings.sh (1 hunks)
  • tests/e2e/virt_backup_restore_suite_test.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • docs/developer/testing/TESTING.md
  • tests/e2e/lib/dpa_helpers.go
  • Makefile
  • tests/e2e/virt_backup_restore_suite_test.go
  • tests/e2e/backup_restore_cli_suite_test.go
  • tests/e2e/backup_restore_suite_test.go
  • tests/e2e/scripts/aws_settings.sh
  • tests/e2e/hcp_backup_restore_suite_test.go
  • go.mod
  • tests/e2e/lib/hcp/hcp.go
  • tests/e2e/hcp_external_cluster_backup_restore_suite_test.go
  • tests/e2e/e2e_suite_test.go
  • tests/e2e/lib/hcp/types.go
🪛 LanguageTool
docs/developer/testing/TESTING.md

[style] ~117-~117: Consider a more concise word here.
Context: ...e Cluster with ManifestWork resources * In order to break the guest cluster, the tests dele...

(IN_ORDER_TO_PREMIUM)

🔇 Additional comments (27)
go.mod (1)

25-25: LGTM!

The addition of the open-cluster-management.io/api dependency is necessary to support the new ManifestWork-based operations for ROSA cluster lifecycle management introduced in this PR.

Makefile (3)

68-74: LGTM! Well-documented parameters.

The new HC_BACKUP_RESTORE_MODE, HC_NAME, and HC_NAMESPACE parameters are properly introduced with clear comments explaining their purpose and default values.


754-754: LGTM! Consistent with existing AWS profile configuration.

The VSL_AWS_PROFILE parameter follows the established pattern of BSL_AWS_PROFILE and properly defaults to "default".


845-845: Documentation for SC_KUBECONFIG is adequate—no action required.

The documentation in docs/developer/testing/TESTING.md clearly states that "SC_KUBECONFIG must point to the Service Cluster with ManifestWork resources" and provides an example showing its usage in the HC_BACKUP_RESTORE_MODE=external-rosa context. Users are adequately informed of the requirement and how to set it.

tests/e2e/lib/hcp/hcp.go (6)

390-475: LGTM! Proper parameterization of namespace.

The DeployHCManifest function now correctly accepts hcNamespace as a parameter, making it more flexible for different deployment scenarios (default vs ROSA). The namespace is properly used throughout the function for all resource creations.


698-704: LGTM! Clean utility function.

ReadKubeconfigFromBytes is a well-defined utility function that properly encapsulates kubeconfig loading logic.


755-767: LGTM! Clean namespace resolution.

The GetManifestWorkNamespace method properly lists ManifestWorks and returns the namespace for the matching cluster ID.


769-818: LGTM! Comprehensive ManifestWork deletion with proper polling.

The DeleteManifestWork method properly:

  • Extracts the cluster ID from labels
  • Resolves the ManifestWork namespace
  • Deletes all three ManifestWork resources (main, workers, namespaces)
  • Polls for deletion completion with timeout
  • Handles not-found errors gracefully

820-877: LGTM! Well-structured backup functionality.

The BackupManifestWork method properly:

  • Creates timestamped subdirectories to avoid conflicts
  • Sets APIVersion and Kind for proper resource serialization
  • Uses proper file permissions (0755 for directories, 0644 for files)
  • Respects TMP_DIR environment variable with sensible fallback

879-893: LGTM! Consistent deletion check pattern.

IsManifestWorkDeleted follows the same pattern as IsHCDeleted and IsHCPDeleted, properly handling not-found errors and propagating other errors.

tests/e2e/backup_restore_cli_suite_test.go (1)

234-234: LGTM! Consistent with deployment operation refactoring.

The change from waitOADPReadiness to NewOADPDeploymentOperationDefault().Deploy aligns with the broader refactoring to use pluggable deployment strategies across the test suite.

tests/e2e/virt_backup_restore_suite_test.go (1)

220-220: LGTM! Consistent with deployment operation refactoring.

This change matches the refactoring pattern applied across all test suites to use the new deployment operation approach.

tests/e2e/scripts/aws_settings.sh (1)

43-43: LGTM! Proper parameterization of AWS profile.

The change from hardcoded "default" to $VSL_AWS_PROFILE makes the snapshot location profile configurable and aligns with the existing $BSL_AWS_PROFILE pattern for backup locations.

tests/e2e/lib/hcp/types.go (2)

135-138: LGTM! Good use of named constant.

Introducing Wait30Min as a named constant and using it for HCPBackupTimeout improves code maintainability and consistency with other timeout constants like Wait10Min.


143-148: LGTM! Proper extension for ROSA support.

The addition of the ClientServiceCluster field to HCHandler enables the new ManifestWork operations required for ROSA cluster management, while maintaining backward compatibility with existing code that doesn't use this field.

tests/e2e/e2e_suite_test.go (2)

157-169: LGTM: Service Cluster kubeconfig initialization is well-structured.

The conditional initialization properly:

  • Constructs the config using clientcmd.BuildConfigFromFlags
  • Preserves QPS/Burst settings from the main kubeconfig
  • Installs the workv1 scheme for ManifestWork operations
  • Creates a dedicated CR client for Service Cluster operations

240-244: LGTM: Dynamic deployment operation selection for teardown.

The teardown correctly selects between Default and ROSA deployment operations based on the HCBackupRestoreMode, ensuring proper cleanup for each scenario.

tests/e2e/lib/dpa_helpers.go (2)

113-133: LGTM: Clean extraction of BSL spec construction.

The BackupStorageLocationSpec() method properly encapsulates BSL configuration, enabling reuse in both Build() (line 79) and CreateBackupStorageLocation() (line 141).


135-203: LGTM: Idempotent BSL/VSL lifecycle methods.

All four methods (Create/Delete for BSL and VSL) are properly idempotent:

  • Create methods return nil on AlreadyExists
  • Delete methods return nil on NotFound
  • CreateVolumeSnapshotLocation validates SnapshotLocations presence before accessing the first element

This design supports safe retries and multiple test execution scenarios.

tests/e2e/backup_restore_suite_test.go (2)

46-70: LGTM: Well-designed deployment abstraction.

The OADPDeploymentOperation struct provides clear separation between default HCP testing (DPA/VolumeSnapshotClass managed) and ROSA scenarios (BSL/VSL managed). The boolean flags make the deployment strategy explicit and testable.


72-116: LGTM: Deploy() method properly orchestrates resource creation.

The deployment sequence is well-structured:

  1. Conditionally create DPA and wait for reconciliation
  2. Wait for Node Agent pods (when needed for KOPIA/CSIDataMover)
  3. Wait for Velero pod
  4. Conditionally create VolumeSnapshotClass for CSI scenarios
  5. Create BSL and wait for availability
  6. Conditionally create VSL

Note: The BSL availability check (lines 106-107) runs regardless of CreateBSL flag, which is correct since BSL exists in both default (via DPA) and ROSA (explicit creation) modes.

tests/e2e/hcp_external_cluster_backup_restore_suite_test.go (2)

83-118: LGTM: Clean refactoring of verification helpers.

The extraction of createTestNamespace() and validateTestNamespace() improves code organization and reusability. The use of errors.Join() for aggregating verification errors is appropriate, and createTestNamespace() is properly idempotent by ignoring AlreadyExists errors.


45-60: LGTM: Proper ROSA mode integration.

The HCHandler now receives ClientServiceCluster for Service Cluster operations, and the AfterEach correctly selects the deployment operation based on whether the mode is ExternalROSA, ensuring proper cleanup for each scenario.

tests/e2e/hcp_backup_restore_suite_test.go (4)

20-24: LGTM: Clear ROSA mode constant addition.

The HCModeExternalROSA constant is well-documented and properly added to the existing mode enumeration with consistent formatting.


44-61: LGTM: Proper OADP deployment orchestration for ROSA.

The code correctly:

  • Selects ROSA deployment operation when mode is HCModeExternalROSA
  • Deletes backup repositories to ensure clean state
  • Skips HCP plugin addition for ROSA (DPA is managed via ManifestWork in Service Cluster)
  • Retains HCP plugin addition and readiness wait for non-ROSA modes

110-121: LGTM: Conditional ManifestWork operations for ROSA.

The switch statement properly isolates ROSA-specific logic:

  • HCModeExternalROSA: backs up and deletes ManifestWork (DPA managed in Service Cluster)
  • Other modes: use standard RemoveHCP() for cleanup

This ensures ManifestWork operations only run in the appropriate context.


86-98: LGTM: Unified guest verification for external modes.

Both HCModeExternal and HCModeExternalROSA now support guest cluster verification with proper kubeconfig retrieval and client validation. The post-restore verification uses updated timeout constants (ValidateHCPTimeout, WaitForNextCheckTimeout) for more flexible timing control.

Also applies to: 134-146

- Add external-rosa mode for HC_BACKUP_RESTORE_MODE to support existing
ROSA clusters
- Introduce HC_NAMESPACE parameter for configurable cluster namespace
management
- Add service cluster kubeconfig support via SC_KUBECONFIG parameter for
ROSA ManifestWork operations
- Implement ManifestWork backup/deletion functionality for ROSA cluster
lifecycle management
- Add open-cluster-management.io/api dependency to support ManifestWork
operations
- Create separate OADP deployment operations for default vs ROSA
scenarios
- Skip DPA HCP plugin modification for ROSA where DPA is managed via
ManifestWork
- Add VSL_AWS_PROFILE parameter for volume snapshot location AWS profile
configuration
- Refactor backup/restore suite to use pluggable deployment strategies
- Update test configuration to handle both regular HCP and ROSA cluster
workflows

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between e31b5f7 and c9e9437.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (13)
  • Makefile (4 hunks)
  • docs/developer/testing/TESTING.md (1 hunks)
  • go.mod (1 hunks)
  • tests/e2e/backup_restore_cli_suite_test.go (1 hunks)
  • tests/e2e/backup_restore_suite_test.go (4 hunks)
  • tests/e2e/e2e_suite_test.go (7 hunks)
  • tests/e2e/hcp_backup_restore_suite_test.go (6 hunks)
  • tests/e2e/hcp_external_cluster_backup_restore_suite_test.go (4 hunks)
  • tests/e2e/lib/dpa_helpers.go (2 hunks)
  • tests/e2e/lib/hcp/hcp.go (11 hunks)
  • tests/e2e/lib/hcp/types.go (1 hunks)
  • tests/e2e/scripts/aws_settings.sh (1 hunks)
  • tests/e2e/virt_backup_restore_suite_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/e2e/scripts/aws_settings.sh
  • go.mod
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • docs/developer/testing/TESTING.md
  • tests/e2e/virt_backup_restore_suite_test.go
  • tests/e2e/lib/dpa_helpers.go
  • tests/e2e/backup_restore_cli_suite_test.go
  • tests/e2e/lib/hcp/types.go
  • tests/e2e/lib/hcp/hcp.go
  • tests/e2e/e2e_suite_test.go
  • tests/e2e/hcp_external_cluster_backup_restore_suite_test.go
  • Makefile
  • tests/e2e/backup_restore_suite_test.go
  • tests/e2e/hcp_backup_restore_suite_test.go
🔇 Additional comments (5)
tests/e2e/lib/dpa_helpers.go (5)

79-79: LGTM! Clean refactoring.

Delegating BSL spec construction to BackupStorageLocationSpec() improves code organization and reusability.


113-133: LGTM! Well-structured helper method.

The BackupStorageLocationSpec() method properly constructs a complete BSL spec with all necessary fields, improving code maintainability.


135-151: LGTM! Idempotent creation helper.

The method correctly creates a BSL resource and handles AlreadyExists errors gracefully, making it safe for repeated test execution.


153-166: LGTM! Idempotent deletion helper.

The method correctly deletes a BSL resource and handles NotFound errors gracefully, ensuring safe cleanup in tests.


190-203: LGTM! Idempotent deletion helper.

The method correctly deletes a VSL resource and handles NotFound errors gracefully, ensuring safe cleanup in tests.

Comment on lines 391 to 395
func tearDownBackupAndRestore(brCase BackupRestoreCase, installTime time.Time, report ginkgo.SpecReport) {
log.Println("Post backup and restore state: ", report.State.String())
gatherLogs(brCase, installTime, report)
tearDownDPAResources(brCase)
NewOADPDeploymentOperationDefault().Undeploy(brCase.BackupRestoreType)
deleteNamespace(brCase.Namespace)
Copy link
Contributor

@coderabbitai coderabbitai bot Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Ensure ROSA teardown uses matching deployment operation

tearDownBackupAndRestore always instantiates the default operation, so when the suite runs in external-rosa mode we end up calling Undeploy with CreateDPA=true, which deletes the DPA and any CSI resources that are supposed to be preserved by the ManifestWork. That leaves the ROSA environment in a bad state for the next test run. Please select the ROSA variant when the mode is external-rosa, e.g.:

 func tearDownBackupAndRestore(brCase BackupRestoreCase, installTime time.Time, report ginkgo.SpecReport) {
 	log.Println("Post backup and restore state: ", report.State.String())
 	gatherLogs(brCase, installTime, report)
-	NewOADPDeploymentOperationDefault().Undeploy(brCase.BackupRestoreType)
+	op := NewOADPDeploymentOperationDefault()
+	if HCBackupRestoreMode(hcBackupRestoreMode) == HCModeExternalROSA {
+		op = NewOADPDeploymentOperationROSA()
+	}
+	op.Undeploy(brCase.BackupRestoreType)
 	deleteNamespace(brCase.Namespace)
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func tearDownBackupAndRestore(brCase BackupRestoreCase, installTime time.Time, report ginkgo.SpecReport) {
log.Println("Post backup and restore state: ", report.State.String())
gatherLogs(brCase, installTime, report)
tearDownDPAResources(brCase)
NewOADPDeploymentOperationDefault().Undeploy(brCase.BackupRestoreType)
deleteNamespace(brCase.Namespace)
func tearDownBackupAndRestore(brCase BackupRestoreCase, installTime time.Time, report ginkgo.SpecReport) {
log.Println("Post backup and restore state: ", report.State.String())
gatherLogs(brCase, installTime, report)
op := NewOADPDeploymentOperationDefault()
if HCBackupRestoreMode(hcBackupRestoreMode) == HCModeExternalROSA {
op = NewOADPDeploymentOperationROSA()
}
op.Undeploy(brCase.BackupRestoreType)
deleteNamespace(brCase.Namespace)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. The function tearDownBackupAndRestore is only called in the usual/classic cases when HCP is not external. The test cases which work with external HCP choose the operation explicitly there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mgencur Thank you for the clarification! You're correct. Looking at the code structure, tearDownBackupAndRestore is only called in the AfterEach hook of the "Backup and restore tests" describe block, which contains regular application backup/restore test cases (MySQL CSI, Mongo KOPIA, etc.) that don't involve HCP or ROSA modes.

The external HCP tests that use ROSA mode are in separate test files (hcp_external_cluster_backup_restore_suite_test.go) and have their own AfterEach hooks that explicitly select the appropriate deployment operation based on the mode. So the concern about ROSA teardown doesn't apply to this function.

The original review comment was based on an incorrect assumption about the scope of this function. Your implementation is correct as-is.

@weshayutin
Copy link
Contributor

/retest

@weshayutin
Copy link
Contributor

/retest all

@openshift-ci
Copy link

openshift-ci bot commented Nov 21, 2025

@weshayutin: The /retest command does not accept any targets.
The following commands are available to trigger required jobs:

/test 4.20-ci-index
/test 4.20-e2e-test-aws
/test 4.20-e2e-test-hcp-aws
/test 4.20-e2e-test-kubevirt-aws
/test 4.20-images
/test images
/test unit-test

Use /test all to run all jobs.

In response to this:

/retest all

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@weshayutin
Copy link
Contributor

/test all

@kaovilai
Copy link
Member

/retest after #2017 merge

@openshift-ci
Copy link

openshift-ci bot commented Nov 21, 2025

@kaovilai: The /retest command does not accept any targets.
The following commands are available to trigger required jobs:

/test 4.20-ci-index
/test 4.20-e2e-test-aws
/test 4.20-e2e-test-hcp-aws
/test 4.20-e2e-test-kubevirt-aws
/test 4.20-images
/test images
/test unit-test

Use /test all to run all jobs.

In response to this:

/retest after #2017 merge

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kaovilai
Copy link
Member

/retest

1 similar comment
@kaovilai
Copy link
Member

kaovilai commented Dec 5, 2025

/retest

@openshift-ci
Copy link

openshift-ci bot commented Dec 5, 2025

@mgencur: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/4.19-e2e-test-cli-aws 6c07593 link true /test 4.19-e2e-test-cli-aws
ci/prow/4.19-e2e-test-aws 6c07593 link true /test 4.19-e2e-test-aws
ci/prow/4.20-e2e-test-kubevirt-aws 6c07593 link true /test 4.20-e2e-test-kubevirt-aws
ci/prow/4.20-e2e-test-cli-aws 6c07593 link true /test 4.20-e2e-test-cli-aws

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants