
HYPERFLEET-532 | docs: E2E test run strategy and resource management spike#84

Open
yasun1 wants to merge 2 commits into openshift-hyperfleet:main from yasun1:HYPERFLEET-532

Conversation

@yasun1
Contributor

@yasun1 yasun1 commented Jan 30, 2026

Summary by CodeRabbit

  • Documentation
    • Added an extensive E2E test run strategy document describing goals, problem statements, core design principles, a test-run model and lifecycle, fixture-adapter concepts, deployment/namespace lifecycle and naming, component ownership, resource isolation and race-condition prevention, test-suite organization and state management, cleanup/retention policies, observability and debugging guidance, open questions, and a phased implementation plan.

@coderabbitai
Contributor

coderabbitai bot commented Jan 30, 2026

Walkthrough

This change adds a Spike Report Markdown that defines an end-to-end (E2E) test-run strategy for HyperFleet. It documents problem statements, goals, core design principles, an E2E Test Run model (definition, identification, lifecycle), fixture-adapter concepts, deployment and component lifecycle ownership, resource isolation and naming conventions, race-condition mitigation, test scenario organization, state management and suite independence, cleanup and retention policies, observability/debugging guidance, open questions, and a phased action plan (Phase 1: adapter-landing-zone MVP; Phase 2: Fixture Adapter; Phase 3+: enhancements).

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant CI as CI / Test Runner
  participant Controller as Run Controller
  participant Adapter as Fixture Adapter
  participant K8s as Kubernetes API
  participant Core as Core Suite
  participant AdapterSuite as Adapter Suite

  CI->>Controller: Create TestRun CR (labels, retention)
  Controller->>K8s: Create Namespace (one-per-run)
  Controller->>Adapter: Request fixture provisioning (fixture refs)
  Adapter->>K8s: Provision adapter resources in run-namespace
  Adapter->>AdapterSuite: Deploy & trigger Adapter Suite
  Controller->>Core: Trigger Core Suite (after adapter readiness)
  Core->>K8s: Validate topology across namespaces
  AdapterSuite->>Controller: Report status (ready/passed/failed)
  Core->>Controller: Report status (passed/failed)
  Controller->>K8s: Begin teardown per retention policy (or retain)
  Controller->>Adapter: Notify cleanup
  Adapter->>K8s: Delete adapter resources
  Controller->>K8s: Delete Namespace (if not retained)
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • xueli181114
  • rh-amarin
  • ciaranRoche
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding documentation for an E2E test run strategy and resource management spike. It is concise, directly related to the changeset content, and provides meaningful context. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |




**Test Suite Execution**:
- Suites execute sequentially
- Each suite may deploy/remove production adapters as needed
Contributor

I assume it's not only suite-level; it's also case-level for adapter removal. And what does production mean here?

Contributor Author

@yasun1 yasun1 Feb 2, 2026

Yes, depending on the testing requirements: the Adapter Execution Suite, which validates adapter runtime behavior, is suite-level; the Adapter Deployment Suite, which validates the adapter deployment process, is case-level.

Production refers to customized adapters. Perhaps changing it to functional would reduce confusion.

Contributor

Is it better to separate deployment logic from test cases?
We could maintain both in the E2E repository; however, across different deployment scenarios, we should be able to reuse the same E2E test cases, since these tests can be compatible with different numbers and types of adapters.

We may have multiple scenarios. For example, we could deploy various dummy adapters covering all supported Kubernetes resource types. This would require defining the API-required adapters and their corresponding configurations. Based on the deployed environment, we would then run the appropriate E2E test cases as needed.

| Component | Lifecycle Owner | Scope | Notes |
|---|---|---|---|
| Sentinel | Test Framework | Per Test Run | |
| Fixture Adapter | Test Framework | Per Test Run | Infrastructure component |
| Production Adapter | Test Suite | Suite-scoped | Dynamically managed |
| Broker Resources | Adapter/Sentinel | Per Test Run | |
Contributor

I would say topic and subscription create-if-missing should be enabled, and at the same time the SA should have edit permission to make that happen. A sketch of what that could look like follows.
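
For illustration, a minimal create-if-missing sketch against the Go Pub/Sub client; the topic/subscription IDs are hypothetical, and it assumes the service account holds Pub/Sub editor rights:

```go
// Hedged sketch: create-if-missing for a topic and subscription with the
// cloud.google.com/go/pubsub client. IDs are illustrative; the create calls
// assume the SA holds pubsub.editor (or equivalent) on the project.
package pubsubsetup

import (
	"context"

	"cloud.google.com/go/pubsub"
)

func EnsureTopicAndSub(ctx context.Context, client *pubsub.Client, topicID, subID string) error {
	topic := client.Topic(topicID)
	exists, err := topic.Exists(ctx)
	if err != nil {
		return err
	}
	if !exists {
		if topic, err = client.CreateTopic(ctx, topicID); err != nil {
			return err
		}
	}
	sub := client.Subscription(subID)
	if exists, err = sub.Exists(ctx); err != nil {
		return err
	}
	if !exists {
		_, err = client.CreateSubscription(ctx, subID, pubsub.SubscriptionConfig{Topic: topic})
	}
	return err
}
```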


### 4.4 Fixture Adapter

The **Fixture Adapter** is a minimal test infrastructure component deployed as part of the Test Run.
Contributor

Can you extend this section a little? I would like to determine what the 'core' is here: what events does the Fixture Adapter consume, what status is reported back, does it provide latency and failure modes, etc.

Contributor Author

Good suggestion. I added more detail on the Fixture Adapter thinking; please help review.


How about using "system-level" test suites or "e2e" test suites instead of "core" to avoid confusion?


## 13. Action Items and Next Steps

**Prerequisites**: This test strategy assumes HyperFleet system supports runtime adapter hot-plugging (dynamic adapter deployment without API/Sentinel restart). If this capability does not exist, it should be implemented as part of HyperFleet core development (separate from E2E framework work).
Contributor

No, it does not fully support hot-swapping, as the API requires a list of expected adapters as config. This will be extended post-MVP to add more configuration options to the top-level conditions of API objects.

Contributor Author

Yes, we currently use static adapter configuration. Hot-plugging is post-MVP work, but included in the spike to guide future E2E test design. For MVP, we'll focus on Core Suite testing with pre-deployed adapters.


### 6.2 Messaging and Broker Isolation

Messaging resources (Topics / Subscriptions) are isolated using the Test Run ID.
Contributor

Can you clarify how the cleanup of brokers happens when testing beyond the namespace, e.g. with GCP Pub/Sub?

Contributor Author

Thank you. Our documentation doesn't cover this aspect. For these cloud resources, such as GCP Pub/Sub resources, we will explicitly call the CLI to delete cloud resources tagged with test_run_id in the teardown section.
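
As a sketch of that teardown step (the `test_run_id` label key and the exact gcloud invocation are assumptions for illustration, not the framework's actual teardown code):

```go
// Hedged sketch: delete Pub/Sub topics or subscriptions labeled with the
// Test Run ID by shelling out to the gcloud CLI, as described above.
package teardown

import (
	"fmt"
	"os/exec"
	"strings"
)

// kind is "topics" or "subscriptions".
func CleanupPubSub(kind, runID string) error {
	list := exec.Command("gcloud", "pubsub", kind, "list",
		"--filter=labels.test_run_id="+runID, "--format=value(name)")
	out, err := list.Output()
	if err != nil {
		return fmt.Errorf("listing %s: %w", kind, err)
	}
	for _, name := range strings.Fields(string(out)) {
		if err := exec.Command("gcloud", "pubsub", kind, "delete", name).Run(); err != nil {
			return fmt.Errorf("deleting %s %s: %w", kind, name, err)
		}
	}
	return nil
}
```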


### 4.2 Test Run Identification

Each Test Run generates a unique identifier (Test Run ID) derived from:
Contributor

What is the UUID here? Can you provide an example, and ensure it does not hit k8s resource name restrictions, please 🙏

Contributor Author

Either a CI-provided timestamp or a Unix timestamp; it uses time.Now().UnixNano(), resulting in a 19-digit number, e.g. 1738152345678901234.
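
A minimal sketch of that scheme, including a check against Kubernetes naming restrictions (an RFC 1123 label: lowercase alphanumerics and '-', 63 characters max); the helper itself is illustrative:

```go
// Hedged sketch: derive the Test Run ID from time.Now().UnixNano() and verify
// the resulting namespace name satisfies Kubernetes RFC 1123 label rules.
package runid

import (
	"fmt"
	"regexp"
	"time"
)

var dns1123Label = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`)

func NewRunNamespace() (string, error) {
	runID := fmt.Sprintf("%d", time.Now().UnixNano()) // e.g. 1738152345678901234 (19 digits)
	ns := "e2e-" + runID                              // 4 + 19 = 23 chars, well under the 63-char limit
	if len(ns) > 63 || !dns1123Label.MatchString(ns) {
		return "", fmt.Errorf("namespace %q violates k8s naming rules", ns)
	}
	return ns, nil
}
```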


All policy decisions are encoded in namespace annotations. Reconciler is stateless.

#### 9.3.2 Cross-Namespace Correlation
Contributor

The more I am thinking about it, is this something we want to support 🤔 I was under the assumption we would always deploy to a single namespace. Might reduce the complexity for now if we just focus on single namespace, and wait and see how the framework is consumed by users

Contributor Author

Yeah, you’re right — this was an over-assumption on my side. When I was writing the doc, this idea came up and I added it, but it’s indeed out of scope for now. Let’s remove this part and focus on the single-namespace assumption for the time being.

```
Create Namespace
Deploy Infrastructure
```
Contributor

What happens here if the infra fails? Does it retry, or fail fast and clean up, or leave some artifacts for debugging?

Contributor Author

There is no retry; it fails fast. In my current design (Section 9, Resource Management and Cleanup), since the e2e jobs are cron jobs, I'd like to keep the whole environment around for hours for debugging.

**Suite Independence**:

- Suites can run independently if infrastructure is ready
- Suite failures do not block subsequent suites (collect all failures)
Contributor

Could this lead to noise? If the core suite fails, e.g. with API failures, surely the rest of the tests will fail also 🤔

Contributor Author

If the core suite fails (in your example, an API failure), the subsequent suites are skipped.
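
A sketch of that skip-on-core-failure ordering (the suite type and runner shape are illustrative, not the framework's actual entry point):

```go
// Hedged sketch: run suites sequentially and skip the rest once a suite
// (e.g. the core suite, placed first) fails.
package runner

import "log"

type Suite struct {
	Name string
	Run  func() error
}

func RunSequential(suites []Suite) {
	for i, s := range suites {
		if err := s.Run(); err != nil {
			log.Printf("suite %q failed: %v; skipping %d remaining suite(s)", s.Name, err, len(suites)-i-1)
			return
		}
		log.Printf("suite %q passed", s.Name)
	}
}
```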



Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🤖 Fix all issues with AI agents
In `@hyperfleet/e2e-testing/e2e-run-strategy-spike-report.md`:
- Around line 650-654: Add a new subsection titled "Cleanup Failure Handling"
under Section 8.5 (currently referenced as 8.6) that specifies: test suites must
implement timeout-based forced cleanup with a default 60s, cleanup failures must
be detected and logged (but should not mark the test as failed—the test result
should reflect validation only), reconciler/orphan-detection should surface
failed cleanups, and the Test Framework must perform final
namespace/infrastructure deletion regardless of suite cleanup status; include
these four bullet points verbatim and ensure the subsection name is "Cleanup
Failure Handling".
- Around line 323-336: Update the "Component Lifecycle Ownership" table row for
"Broker Resources" and expand Section 6.2 to explicitly state the cleanup
responsibilities and sequence: declare that adapter-created subscriptions are
adapter-owned and must be removed by the adapter on removal (e.g., During its
BeforeEach/AfterAll hooks), sentinel-owned topics are removed by the Sentinel
during Test Run teardown, and the cleanup order is adapters-first then Sentinel
to avoid races; also require idempotent cleanup operations and a simple
coordination mechanism (e.g., atomic marker or lock) to avoid conflicts when
hot-plugging adapters as described in Section 8.2.2.
- Around line 309-311: The markdown code block showing the namespace example
currently has no language specifier; update the fenced block around the example
string "e2e-{TEST_RUN_ID}" to include a language (e.g., change the opening fence
from ``` to ```text) so the snippet is marked as plain text for rendering and
syntax highlighting.
- Around line 581-613: The fenced code block showing the Ginkgo test structure
should include a language specifier so syntax highlighting is enabled; update
the opening fence of the block that begins with Describe("DNS Adapter Suite",
func() { to use ```go instead of ```; leave the rest of the block unchanged so
the Ginkgo/Gomega test symbols (Describe, BeforeAll, It, AfterAll, BeforeEach,
AfterEach) are highlighted correctly.
- Around line 486-494: The fenced code block showing the example flow lacks a
language tag; update the block surrounding the numbered steps (the
triple-backtick block containing "1. Test creates Cluster via API...") to
include a language specifier such as ```text (or ```markdown) so the block is
explicitly marked and rendered correctly; ensure the opening fence becomes
```text and leave the content unchanged.
- Around line 139-149: The fenced lifecycle diagram block that currently starts
with ``` and contains the steps "Create Namespace → Deploy Infrastructure →
Infrastructure Ready → Execute Test Suites → Cleanup" should include a language
specifier (e.g., change the opening fence to ```text or ```mermaid) so the
diagram renders with proper formatting; update the code block opening fence
accordingly while leaving the block content unchanged.
- Around line 565-655: Add a new Section 8.5 between the existing "### 8.4 Test
Organization Guidelines" and "### 8.6 State Management and Suite Independence"
headers to restore contiguous numbering; create a brief header "### 8.5 [Title]"
(e.g., "Adapter Lifecycle Policies" or "Test Isolation Patterns") and include
2–4 sentences summarizing the intended content (or a placeholder note) so the
document numbering is corrected and readers aren’t confused by the gap; update
any cross-references if present.
🧹 Nitpick comments (3)
hyperfleet/e2e-testing/e2e-run-strategy-spike-report.md (3)

716-720: Consider reconciler frequency for active development environments.

The 30-minute reconciler frequency may be too infrequent for developers running tests locally in rapid succession. In resource-constrained development clusters, this could lead to:

  • Quota exhaustion while waiting for cleanup
  • Developer friction (manual cleanup required)
  • Reduced parallel test capacity

Consider either:

  • Shortening the default frequency for development (e.g., 5-10 minutes)
  • Making frequency configurable per environment
  • Adding immediate cleanup for passed tests (10-minute retention might still be too long)
⚙️ Proposed enhancement

Update the reconciler description:

 - Runs periodically (frequency configurable, typically 30 minutes)
+  - Default: 30 minutes (CI), 5 minutes (local development)
+  - Configurable via environment variable: E2E_RECONCILER_FREQUENCY
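
For illustration, a minimal sketch of the proposed knob (E2E_RECONCILER_FREQUENCY is the name suggested in this nitpick, not an existing configuration option):

```go
// Hedged sketch: configurable reconciler frequency with a 30-minute default,
// per the suggestion above.
package reconciler

import (
	"log"
	"os"
	"time"
)

func Frequency() time.Duration {
	if v := os.Getenv("E2E_RECONCILER_FREQUENCY"); v != "" {
		if d, err := time.ParseDuration(v); err == nil {
			return d // e.g. "5m" for local development
		}
		log.Printf("invalid E2E_RECONCILER_FREQUENCY %q; falling back to default", v)
	}
	return 30 * time.Minute // CI default
}
```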

674-678: Reconsider default 2-hour TTL for development environments.

The 2-hour default TTL may be excessive for local development scenarios where:

  • Developers iterate rapidly (multiple test runs per hour)
  • Local clusters have limited resources
  • Immediate feedback on quota issues is preferred

While 2 hours provides safety for CI interruptions, it could cause resource exhaustion in active development. Consider a shorter default (e.g., 30-60 minutes) or making it environment-specific.

⏱️ Proposed refinement

Update the default retention section:

-- **Default TTL**: 2 hours from creation
+- **Default TTL**: 
+  - CI context (`ci=yes`): 2 hours from creation
+  - Local context (`ci=no`): 1 hour from creation

684-696: Consider shortening passed test retention.

The 10-minute retention for passed tests may still be too long in high-velocity development environments. Since passed tests have "minimal debugging value" (as stated), consider:

  • Reducing to 2-5 minutes (still prevents race conditions)
  • Immediate deletion by E2E flow after test completion
  • Making this configurable per environment

This would improve resource turnover and reduce quota pressure.

Comment on lines +139 to +151
```
Create Namespace
Deploy Infrastructure
Infrastructure Ready
Execute Test Suites
Cleanup
```
Contributor

⚠️ Potential issue | 🟡 Minor

Add language specifier to code block.

The lifecycle diagram code block should specify a language for proper rendering and syntax highlighting. Consider using text or mermaid (if using a diagram syntax).

📝 Proposed fix
-```
+```text
 Create Namespace
       ↓
🧰 Tools
🪛 markdownlint-cli2 (0.20.0)

[warning] 139-139: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In `@hyperfleet/e2e-testing/e2e-run-strategy-spike-report.md` around lines 139 -
149, The fenced lifecycle diagram block that currently starts with ``` and
contains the steps "Create Namespace → Deploy Infrastructure → Infrastructure
Ready → Execute Test Suites → Cleanup" should include a language specifier
(e.g., change the opening fence to ```text or ```mermaid) so the diagram renders
with proper formatting; update the code block opening fence accordingly while
leaving the block content unchanged.

Comment on lines +309 to +319
```
e2e-{TEST_RUN_ID}
```
Contributor

⚠️ Potential issue | 🟡 Minor

Add language specifier to code block.

The namespace naming convention example should specify a language (e.g., text or shell).

📝 Proposed fix
-```
+```text
 e2e-{TEST_RUN_ID}

🧰 Tools
🪛 markdownlint-cli2 (0.20.0)

[warning] 309-309: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In `@hyperfleet/e2e-testing/e2e-run-strategy-spike-report.md` around lines 309 -
311, The markdown code block showing the namespace example currently has no
language specifier; update the fenced block around the example string
"e2e-{TEST_RUN_ID}" to include a language (e.g., change the opening fence from
``` to ```text) so the snippet is marked as plain text for rendering and syntax
highlighting.

Comment on lines +486 to +511
```
1. Test creates Cluster via API with labels: fixture.control/mode=delayed-success, fixture.control/delay-seconds=30
2. API persists Cluster (including labels)
3. Sentinel polls, detects new Cluster, publishes event
4. Fixture Adapter consumes event, reads labels, waits 30s
5. Fixture Adapter reports success status to API
6. API updates Cluster status
7. Test validates: Cluster phase = Ready (after ~30s)
```
Contributor

⚠️ Potential issue | 🟡 Minor

Add language specifier to code block.

The example flow should specify a language (e.g., text or markdown).

📝 Proposed fix
-```
+```text
 1. Test creates Cluster via API with labels: fixture.control/mode=delayed-success, fixture.control/delay-seconds=30
 2. API persists Cluster (including labels)
🧰 Tools
🪛 markdownlint-cli2 (0.20.0)

[warning] 486-486: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In `@hyperfleet/e2e-testing/e2e-run-strategy-spike-report.md` around lines 486 -
494, The fenced code block showing the example flow lacks a language tag; update
the block surrounding the numbered steps (the triple-backtick block containing
"1. Test creates Cluster via API...") to include a language specifier such as
```text (or ```markdown) so the block is explicitly marked and rendered
correctly; ensure the opening fence becomes ```text and leave the content
unchanged.

Comment on lines +581 to +630
```
Describe("DNS Adapter Suite", func() {

  // Test Group 1: Shared adapter (Test Group-level)
  Describe("Deployment Validation", Ordered, func() {
    var adapter *Adapter
    BeforeAll: Deploy adapter once
    It: Test deployment correctness
    It: Test configuration loading
    It: Test subscription registration
    AfterAll: Remove adapter
  })

  // Test Group 2: Shared adapter (Test Group-level)
  Describe("Functional Tests", Ordered, func() {
    var adapter *Adapter
    BeforeAll: Deploy adapter + create test data
    It: Validate DNS record creation
    It: Validate status reporting
    It: Validate metrics
    AfterAll: Cleanup test data + remove adapter
  })

  // Test Group 3: Isolated adapters (Test Case-level)
  Describe("Error Scenarios", func() {
    var adapter *Adapter
    BeforeEach: Deploy fresh adapter
    It: Test error handling (invalid domain)
    It: Test retry logic
    AfterEach: Remove adapter
  })
})
```
Contributor

⚠️ Potential issue | 🟡 Minor

Add language specifier to code block.

The conceptual test structure example should specify a language. Consider using go since this appears to be Ginkgo/Gomega test syntax.

📝 Proposed fix
-```
+```go
 Describe("DNS Adapter Suite", func() {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
```
Describe("DNS Adapter Suite", func() {
  // Test Group 1: Shared adapter (Test Group-level)
  Describe("Deployment Validation", Ordered, func() {
    var adapter *Adapter
    BeforeAll: Deploy adapter once
    It: Test deployment correctness
    It: Test configuration loading
    It: Test subscription registration
    AfterAll: Remove adapter
  })
  // Test Group 2: Shared adapter (Test Group-level)
  Describe("Functional Tests", Ordered, func() {
    var adapter *Adapter
    BeforeAll: Deploy adapter + create test data
    It: Validate DNS record creation
    It: Validate status reporting
    It: Validate metrics
    AfterAll: Cleanup test data + remove adapter
  })
  // Test Group 3: Isolated adapters (Test Case-level)
  Describe("Error Scenarios", func() {
    var adapter *Adapter
    BeforeEach: Deploy fresh adapter
    It: Test error handling (invalid domain)
    It: Test retry logic
    AfterEach: Remove adapter
  })
})
```
🧰 Tools
🪛 markdownlint-cli2 (0.20.0)

[warning] 581-581: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In `@hyperfleet/e2e-testing/e2e-run-strategy-spike-report.md` around lines 581 -
613, The fenced code block showing the Ginkgo test structure should include a
language specifier so syntax highlighting is enabled; update the opening fence
of the block that begins with Describe("DNS Adapter Suite", func() { to use
```go instead of ```; leave the rest of the block unchanged so the Ginkgo/Gomega
test symbols (Describe, BeforeAll, It, AfterAll, BeforeEach, AfterEach) are
highlighted correctly.

Comment on lines +650 to +671
**Cleanup Responsibility**:

- Test cases and suites clean their own state (adapters, test data)
- Infrastructure cleanup handled by Test Framework (see Section 9 for retention policy)

Contributor

⚠️ Potential issue | 🟠 Major

Address test cleanup failure scenarios.

Section 8.6 states "Test cases and suites clean their own state (adapters, test data)" but doesn't address what happens if cleanup fails. Failed cleanup could:

  • Leave orphaned adapters consuming resources
  • Pollute the namespace for subsequent test suites
  • Interfere with the Test Framework's final teardown

Consider adding guidance for:

  • Cleanup failure detection and logging
  • Whether cleanup failures should fail the test
  • Fallback mechanisms (e.g., force-delete with timeout)
🛡️ Proposed addition

Add a subsection under 8.5 (currently 8.6):

**Cleanup Failure Handling**:
- Test suites implement timeout-based forced cleanup (default: 60s)
- Cleanup failures are logged but don't fail the test (test result reflects actual validation)
- Failed cleanups are detected by reconciler via orphaned resources
- Test Framework performs final namespace deletion regardless of suite cleanup status
🤖 Prompt for AI Agents
In `@hyperfleet/e2e-testing/e2e-run-strategy-spike-report.md` around lines 650 -
654, Add a new subsection titled "Cleanup Failure Handling" under Section 8.5
(currently referenced as 8.6) that specifies: test suites must implement
timeout-based forced cleanup with a default 60s, cleanup failures must be
detected and logged (but should not mark the test as failed—the test result
should reflect validation only), reconciler/orphan-detection should surface
failed cleanups, and the Test Framework must perform final
namespace/infrastructure deletion regardless of suite cleanup status; include
these four bullet points verbatim and ensure the subsection name is "Cleanup
Failure Handling".

| Broker Resources | Adapter/Sentinel | Per Test Run | |

**Rule:**
No component may create resources outside its Test Run Namespace without Test Run–level isolation.
Contributor

An adapter may create a new namespace named after the clusterID.

**Conceptual Structure**:

```
Describe("DNS Adapter Suite", func() {
```
Contributor

We don't involve any business adapter, so why define it as a DNS Adapter Suite?


Test infrastructure is managed at the **Test Run level**, not per test case.

- Infrastructure is deployed once per Test Run
The reason will be displayed to describe this comment to others. Learn more.

Is the adapter considered infrastructure, or is it treated differently?


### 4.4 Fixture Adapter

**Problem**: Core Suite needs to validate HyperFleet framework behavior (event flow, status aggregation, error handling). Functional adapters have external dependencies (cloud APIs, GCP projects) and cannot provide the controlled, repeatable scenarios needed for framework testing. What type of adapter should Core Suite use?
Contributor

I don’t see a fundamental difference between the fixture adapter and the functional adapters. To me, they seem to be the same, as both are dummy adapters.


Is it a problem? We may need a test adapter to cover all the logic, but it is not a problem IMO. : )

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@hyperfleet/e2e-testing/e2e-run-strategy-spike-report.md`:
- Around line 351-356: Change the requirement text for "All E2E test namespaces"
so it says namespaces must "include the following labels" (or "must include at
least these labels") rather than "must carry exactly three labels"; keep the
three listed labels (`ci`, `test-run-id`, `managed-by`) and their allowed values
but remove the constraint that no other labels may be present to allow
platform-injected labels.

Comment on lines +351 to +356
All E2E test namespaces must carry exactly three labels:

1. **`ci`**: Execution context (`yes` | `no`)
2. **`test-run-id`**: Test Run identifier
3. **`managed-by`**: Ownership marker (`e2e-test-framework`)

Contributor

⚠️ Potential issue | 🟡 Minor

Avoid “exactly three labels” to allow platform-required labels.

Kubernetes or cluster policies may inject additional labels; require “must include” instead of “exactly three” to prevent conflicts.

✅ Suggested edit
-All E2E test namespaces must carry exactly three labels:
+All E2E test namespaces must carry at least the following three labels:
🤖 Prompt for AI Agents
In `@hyperfleet/e2e-testing/e2e-run-strategy-spike-report.md` around lines 351 -
356, Change the requirement text for "All E2E test namespaces" so it says
namespaces must "include the following labels" (or "must include at least these
labels") rather than "must carry exactly three labels"; keep the three listed
labels (`ci`, `test-run-id`, `managed-by`) and their allowed values but remove
the constraint that no other labels may be present to allow platform-injected
labels.
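
For illustration, a client-go sketch that creates a run namespace carrying at least the three required labels; the helper and clientset wiring are assumptions, not the framework's actual code:

```go
// Hedged sketch: create the test-run namespace with the three required labels
// via client-go. Platform-injected labels may coexist ("must include", not
// "exactly three").
package nsfactory

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func CreateRunNamespace(ctx context.Context, cs kubernetes.Interface, runID string, inCI bool) error {
	ci := "no"
	if inCI {
		ci = "yes"
	}
	ns := &corev1.Namespace{ObjectMeta: metav1.ObjectMeta{
		Name: "e2e-" + runID,
		Labels: map[string]string{
			"ci":          ci,
			"test-run-id": runID,
			"managed-by":  "e2e-test-framework",
		},
	}}
	_, err := cs.CoreV1().Namespaces().Create(ctx, ns, metav1.CreateOptions{})
	return err
}
```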

- Ensures **strong resource isolation** between test runs
- Clearly defines **deployment lifecycle ownership**
- Prevents race conditions by design
- Supports **dynamic adapter deployment and removal** (hot-plugging)

I would suggest isolating deployment and testing. This increases complexity: now the API needs to know the number of adapters to aggregate status.

Comment on lines +137 to +138
### 4.3 Test Run Lifecycle


Infra deployment and teardown are in a separate, independent CI workflow in Prow. The run strategy here, in my understanding, is about test case execution. How about moving this part into the CI/CD section?

Comment on lines +172 to +175
- Delete cloud messaging resources (topics/subscriptions) tagged with test_run_id via cloud CLI
- Uninstall infrastructure components via helm
- Delete namespace
- See Section 9 for detailed cleanup and retention policy

This can be in the test pipeline. I think this mixes "test case run strategy" and "CI execution strategy" a little. : )




**Problem**: Core Suite needs to validate HyperFleet framework behavior (event flow, status aggregation, error handling). Functional adapters have external dependencies (cloud APIs, GCP projects) and cannot provide the controlled, repeatable scenarios needed for framework testing. What type of adapter should Core Suite use?

**Decision**: Fixture Adapter in dedicated repository

I would suggest using "test adapter". And it is not a static test adapter; its configuration can be customized.


#### 4.4.2 Design Approach

**Repository**: `openshift-hyperfleet/adapter-fixture`

How about the name "test-adapter"? It is straightforward.


### 5.1 One Namespace per Test Run

Each Test Run is assigned a **dedicated Kubernetes Namespace**.

Is it because we are using the landing-zone adapter? In one test env, why are we not able to run different test suites? Sorry, maybe I misunderstood.

Comment on lines +333 to +338
| Component | Lifecycle Owner | Scope | Notes |
|--------------------|------------------|--------------|-------|
| Namespace | Test Framework | Per Test Run | |
| API | Test Framework | Per Test Run | |
| Sentinel | Test Framework | Per Test Run | |
| Fixture Adapter | Test Framework | Per Test Run | Infrastructure component |

A similar question: why does deployment belong to the automation framework?


Test suites represent **validation focus**, not environment configurations.

#### 8.2.1 Core Suite

Could you elaborate on why we need this concept? Are there attributes in a test case that indicate what belongs to the core suite?

Comment on lines +834 to +839
## 12. Open Questions and Follow-Ups

No open questions at this time. Fixture Adapter design is covered in Section 4.4.

---


Can we remove this part since there are no questions? I guess it was generated by AI. : )
