HYPERFLEET-531 - feat: Add Hyperfleet release process spike report #83
86254860 wants to merge 2 commits into openshift-hyperfleet:main
Walkthrough
Adds a new comprehensive release-process spike document at hyperfleet/docs/release-process-spike-report.md specifying release entry criteria, branching and code-freeze workflows (release branches, RCs, GA), a 3‑week sprint cadence with ad‑hoc releases, multi‑gate readiness checks, post‑freeze bug and hotfix handling, release artifacts (container images, Helm charts, adapters, Git tags, release repo), documentation and testing requirements, security gates, governance, MVP (Prow manual releases) → Post‑MVP (Konflux) migration plan, templates, appendices, and success metrics.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Dev as Developer
participant Repo as Git Repo
participant CI as CI System (Prow / Konflux)
participant Registry as Image Registry
participant ReleaseRepo as Release Repository
participant Ops as Operations
Dev->>Repo: Push feature branch / open PR
Repo->>CI: Trigger CI (tests, build)
CI->>Repo: Report status (pass/fail)
CI->>Registry: Publish build artifact (on success)
Note over CI,Repo: Evaluate release entry criteria & readiness gates
Dev->>Repo: Create release branch / tag (RC)
Repo->>CI: Trigger release pipeline
CI->>ReleaseRepo: Publish release artifacts (charts, manifests, notes)
CI->>Registry: Push release images (tagged)
ReleaseRepo->>Ops: Provide release bundle & release notes
Ops->>Registry: Deploy to environments (canary -> GA)
Ops->>Repo: Report deployment status / feedback
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@hyperfleet/docs/release-process-spike-report.md`:
- Around line 638-665: Replace emphasized section titles that use bold (e.g.,
"**1. Conduct Retrospectives and Identify Improvements**", "**2. Migrate to
Konflux for Official Releases**", "**Why Konflux:**", "**Migration Approach:**",
"**3. Additional Process Improvements**") with proper Markdown headings (e.g.,
"#", "##", or "###" as appropriate) so they are true headings instead of
emphasis; update the three numbered/section headings to at least "##" and the
subheadings like "Why Konflux:" and "Migration Approach:" to "###" for
consistent document structure and to satisfy MD036.
- Line 79: The fenced code block at the shown diff is missing a language
identifier and triggers MD040; update the opening fence from ``` to include a
language (e.g., change the opening fence to ```text or ```bash) so the block
becomes a tagged code fence; ensure the corresponding closing fence remains ```
and leave the block contents unchanged.
- Line 250: The fenced code block in
hyperfleet/docs/release-process-spike-report.md is missing a language identifier
(triggering MD040); update the opening fence from ``` to ```text (or another
appropriate language) for the code block that contains the flow
diagram (the sequence starting with "Bug Reported") so the linter recognizes it
as a code block and MD040 is satisfied.
- Line 303: A fenced code block in the markdown ends with a bare triple backtick
(```); MD040 requires a language identifier—update that fenced block (the
trailing/backtick-only block shown in the diff) to include a language token like
text (e.g., change ``` to ```text) so the block becomes ```text and closes
properly; ensure the same fenced block that contains "Developer → Code Review →
Release Owner → Automated Tests → Merge" is updated.
🧹 Nitpick comments (1)
hyperfleet/docs/release-process-spike-report.md (1)
400-400: Consider tightening wording (“under discussion”).
LanguageTool notes this as wordy; “proposed” or “being considered” reads tighter.
✅ Suggested tweak
-- Note: Umbrella chart strategy (hyperfleet-chart repo) is under discussion
+- Note: Umbrella chart strategy (hyperfleet-chart repo) is being considered
Force-pushed from 10f3fc8 to 692ff64
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@hyperfleet/docs/release-process-spike-report.md`:
- Line 199: Rename the section header "### 3.4 Documentation Completeness
(Mandatory)" to "### 3.3 Documentation Completeness (Mandatory)" and renumber
the following headers accordingly: change "3.5 Cross-Team Coordination" to "3.4
Cross-Team Coordination", "3.6 Security & Compliance" to "3.5 Security &
Compliance", and "3.7 Release Artifacts Verification" to "3.6 Release Artifacts
Verification" so the sequence reads 3.2 → 3.3 → 3.4 → 3.5 → 3.6; update any
in-file references to these section numbers if present.
Force-pushed from 0d6b5cb to 5d202ca
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@hyperfleet/docs/release-process-spike-report.md`:
- Around line 98-104: The document has conflicting support-window language
between the "After GA" branching diagram and the "Section 2.4" paragraph; pick
one explicit policy and update both places to match (e.g., replace "support
window: 12 months" in the "After GA" diagram with the exact wording used in
Section 2.4, or update Section 2.4 to state "12 months" if that is the chosen
policy), and ensure the same canonical phrase appears in both the "After GA"
diagram block and the Section 2.4 text so the policy is unambiguous throughout.
🧹 Nitpick comments (1)
hyperfleet/docs/release-process-spike-report.md (1)
416-456: Make templates explicitly “examples” or use placeholders.
The release notes and appendix templates use concrete versions/dates (e.g., v1.5.0, 2026‑05‑12, April 14, 2026), which can be misread as committed schedules. Consider labeling these blocks as “Example” and/or replacing with placeholders (e.g., vX.Y.Z, YYYY‑MM‑DD).
♻️ Example tweak (one option)
-# HyperFleet v1.5.0 Release Notes
+# HyperFleet vX.Y.Z Release Notes (Example)
-## [1.5.0] - 2026-05-12
+## [X.Y.Z] - YYYY-MM-DD
-- Feature Freeze: April 14, 2026
-- Code Freeze: April 28, 2026
-- GA Target: May 12, 2026
+- Feature Freeze: YYYY-MM-DD
+- Code Freeze: YYYY-MM-DD
+- GA Target: YYYY-MM-DD
Also applies to: 508-529, 732-736
Force-pushed from 5d202ca to 2f53472
> ### 2.4 Release Branch Maintenance
>
> **Support Policy:** N-2 OR 6 months (whichever is longer)
I am not sure if my math is mathing here, but with N-2 and a 6-week release cycle, doesn't this mean 12 weeks of support vs. 6 months? 🤔 So wouldn't 6 months be the de facto window?
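The arithmetic in this comment can be sketched as follows. This is a minimal illustration only: the function name is hypothetical, six months is approximated as 26 weeks, and the N-2 arm is taken to span two release cycles, as the comment reads it.

```python
# Hypothetical helper comparing the two arms of
# "N-2 OR 6 months (whichever is longer)".
WEEKS_PER_SIX_MONTHS = 26  # assumption: 6 months approximated as 26 weeks


def support_window_weeks(release_cycle_weeks: int, releases_back: int = 2) -> int:
    """Return the effective support window in weeks under 'whichever is longer'."""
    # Assumption taken from the comment: N-2 covers two release cycles.
    n_minus_window = releases_back * release_cycle_weeks
    return max(n_minus_window, WEEKS_PER_SIX_MONTHS)


# With a 6-week cycle the N-2 arm is only 12 weeks, so the 6-month arm always wins.
print(support_window_weeks(6))  # 26
```

Under these assumptions the N-2 arm only matters once the release cycle exceeds 13 weeks, which is why the policy collapses to a flat 6 months at the proposed cadence.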
Good catch - the math wasn't working as intended! Therefore, I've updated the policy to use lifecycle stages instead:
- Every release gets exactly 6 months of support from GA date
- Phase 1 (months 0-3): Full support - all Major+ bug fixes
- Phase 2 (months 3-6): Security maintenance - only CRITICAL/HIGH CVEs and Blockers
- After 6 months: EOL
This:
- Gives users a clear, predictable 6-month upgrade window regardless of our release cadence
- Naturally reduces maintenance burden for developers (security-only after month 3)
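The lifecycle stages described in this reply could be expressed as a small lookup. This is a sketch under stated assumptions: the function names and phase labels are hypothetical, months are counted as whole calendar months since GA, and the overlapping boundary at month 3 is assigned to Phase 2.

```python
from datetime import date


def months_since(ga: date, today: date) -> int:
    """Whole calendar months elapsed between the GA date and today."""
    months = (today.year - ga.year) * 12 + (today.month - ga.month)
    if today.day < ga.day:
        months -= 1  # the current month is not yet complete
    return months


def support_phase(ga: date, today: date) -> str:
    """Map a date onto the proposed lifecycle stages (labels are illustrative)."""
    if today < ga:
        return "pre-GA"
    elapsed = months_since(ga, today)
    if elapsed < 3:
        return "full-support"          # Phase 1: all Major+ bug fixes
    if elapsed < 6:
        return "security-maintenance"  # Phase 2: CRITICAL/HIGH CVEs and Blockers only
    return "EOL"


# Example dates are made up for illustration.
print(support_phase(date(2026, 1, 15), date(2026, 5, 1)))  # security-maintenance
```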
> - [ ] Other: [specify]
>
> ## Rollback Plan
> How will we rollback if issues are discovered?
Are rollbacks something we want to support, or should we always roll forward?
Taking the hyperfleet-api database as an example: if a database schema change is involved and the offering team encounters an emergency in production, the fastest and simplest mitigation is usually to roll back. Without a defined rollback plan, how is the offering team expected to handle such a situation?
Here's some analysis from AI.
Roll-forward (Always fix forward):
- ✅ Safer for Kubernetes environments (CRDs, state changes, data migrations)
- ✅ Aligns with GitOps and immutable infrastructure principles
- ✅ Forces proper testing and validation of fixes
- ✅ Simpler support matrix (no need to maintain rollback compatibility)
- ❌ Slower response time (need to build/test/release new version)
- ❌ Requires discipline to fix issues quickly
Rollback (Support reverting):
- ✅ Faster emergency response (immediate revert)
- ✅ Buys time to properly fix issues
- ❌ Complex with CRDs, state changes, database schemas
- ❌ Requires bi-directional migration support (N → N-1)
- ❌ More testing burden (must test both upgrade and downgrade paths)
My opinion for HyperFleet, WDYT?
- Primary strategy: Roll-forward (fix issues with new patch release)
- Emergency escape hatch: Rollback support (only for critical/blocker scenarios where roll-forward would take too long)
I would lean towards roll-forward until such time as we require rollback support. Adding rollback support right now would stretch the testing capacity we have before the end of Q1. I would focus this doc on roll-forward only and create a separate epic in our backlog for rollback support post-Q1.
The version rollback procedures from Alex are documented here: https://github.com/openshift-hyperfleet/architecture/blob/main/hyperfleet/docs/versioning-trade-offs.md#4-database-migration-and-rollback-procedures-post-mvp
> ### 2.2 Timeline and Freeze Process
>
> **Sprint-Based Release Cycle (6 weeks / 2 sprints):**
6 weeks seems pretty long to me. Considering how new the project is, I think it would be better to tighten this up and push releases a lot quicker so that we get early feedback loops between us and the pillar teams. As the project matures we can look to extend the release cadence, but for now I think we should focus on making the release cycle as fast as possible.
With 2 weeks given to stabilization and code freeze, I think we can tighten this up a lot once the right CI/CD is in place.
Just to add: if we can ship a hotfix in 48 hours, what is the reason 'features' take 6 weeks? 🤔
I'm not saying we need to release every 2 days; I'm just worried that things could bypass the process by being labeled as a hotfix to get a feature out ASAP.
My initial thought is that, because the CI/CD and E2E testing automation is still being built, the Stabilization & Release Phase might take longer in the initial releases.
And yes, as you mentioned, it's very important for our new product to get early and rapid feedback from the pillar teams. Therefore, I've updated the cadence from 6 weeks to 3 weeks. I also added a note: "Since CI/CD automation is currently being built, the Stabilization & Release Phase (Week 3) may take longer in the initial releases. As the product matures and automation capabilities improve, the team should continuously refine and optimize the release cadence based on actual data and lessons learned from each release cycle."
Yeah that sounds good to me, a release per sprint feels a lot better. We should definitely refine this as we go 🙏
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@hyperfleet/docs/release-process-spike-report.md`:
- Around line 171-176: The bold phase labels "**Phase 1: Full Support (first 3
months)**" and "**Phase 2: Security Maintenance (months 3-6)**" should be
converted to Markdown subheadings (e.g., use "#### Phase 1: Full Support (first
3 months)" and "#### Phase 2: Security Maintenance (months 3-6)") so the section
titles are proper headings; keep the subsequent bullet lists unchanged and
ensure there is a blank line before each new heading to satisfy markdownlint
MD036.
Force-pushed from f6fdcb6 to 8e8b7fb
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@hyperfleet/docs/release-process-spike-report.md`:
- Around line 755-763: The timeline section under "Release v1.5.0 Tracking
Issue" contains incorrect weekday labels for dates; update the lines "Feature
Freeze: April 15, 2026 (Monday, Week 3)", "Code Freeze: April 17, 2026
(Thursday, Week 3)", and optionally "GA Target: April 19, 2026 (Friday, Week 3)"
by either removing the parenthesized weekday labels entirely (recommended) or
replacing them with the correct weekdays for those dates, keeping the date
strings "Sprint Start: April 1, 2026", "Feature Freeze: April 15, 2026", "Code
Freeze: April 17, 2026", and "GA Target: April 19, 2026" intact so the timeline
no longer shows inconsistent weekday/date pairs.
🧹 Nitpick comments (1)
hyperfleet/docs/release-process-spike-report.md (1)
471-480: Use placeholders for sample versions/dates in templates.
The example release notes and dependency versions (e.g., “Go: 1.25”, “2026-05-12”) may go stale or imply commitments. Prefer placeholders so teams fill in current values per release.
✅ Suggested fix (placeholder examples)
-## [1.5.0] - 2026-05-12
+## [X.Y.Z] - YYYY-MM-DD
...
-- Go: 1.25
-- Base image: gcr.io/distroless/static-debian12:nonroot
-- Helm: 3.14+
+- Go: X.Y
+- Base image: <image-ref>
+- Helm: X.Y+
Also applies to: 533-555
Force-pushed from 8e8b7fb to 940a918
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@hyperfleet/docs/release-process-spike-report.md`:
- Line 217: Section 3.2 and 4.2 contain conflicting policies about
Major-severity bugs (3.2 at line 217 says "No Major severity bugs" as a hard GA
gate while 4.2 at lines 304-307 allows deferral); reconcile them by choosing one
of the three recommended resolutions and apply consistent wording: either soften
3.2 to "No Blocker/Critical bugs; Major bugs evaluated by Release Owner and may
be deferred with documented risk", or strengthen 4.2 to require Major fixes
before GA (or mandate release delay/severity downgrade if not fixable), or add
an explicit exception clause in 3.2 stating Major bugs may be accepted only with
documented stakeholder approval and workarounds; update both Section 3.2 and 4.2
text so they match the chosen policy and add a short sentence referencing
"Release Owner" and "stakeholder approval" where applicable.
🧹 Nitpick comments (2)
hyperfleet/docs/release-process-spike-report.md (2)
428-428: Provide timeline for umbrella chart strategy decision.
The note "under discussion" is clear, but a release process document ideally should have resolved strategies. Consider adding a timeline for when this decision will be made (e.g., "Decision expected by vX.Y.0 release" or "To be resolved in Q2 2026").
This helps teams understand whether they should plan for a unified chart or continue with individual component charts.
759-762: Use placeholder date format for template consistency.
The Release Tracking Issue template uses specific dates (April 1, 15, 17, 19, 2026) while the Ad-Hoc Release Request template (Appendix D) correctly uses "YYYY-MM-DD" placeholders. For consistency and to prevent copy-paste errors, consider using:
-- Sprint Start: April 1, 2026
-- Feature Freeze: April 15, 2026
-- Code Freeze: April 17, 2026
-- GA Target: April 19, 2026
+- Sprint Start: YYYY-MM-DD (Week 1, Day 1)
+- Feature Freeze: YYYY-MM-DD (Week 3, Day 1)
+- Code Freeze: YYYY-MM-DD (Week 3, Day 3)
+- GA Target: YYYY-MM-DD (Week 3, Day 5)
Or use relative notation like "Sprint Day 1", "Sprint Day 15", etc.
Force-pushed from 940a918 to d1297c6
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@hyperfleet/docs/release-process-spike-report.md`:
- Around line 756-796: The Release Candidates checklist contains incorrect
weekday labels for the April 2026 dates (e.g., "v1.5.0-rc.1 (April 15,
Monday)"). Update the text under the "Release Candidates" section to either
remove all weekday names or correct them to the proper weekdays (April 15 →
Wednesday, April 16 → Thursday, April 17 → Friday, April 18 → Saturday) so the
date/weekday entries in the v1.5.0-rc.1/rc.2/rc.3 lines are accurate and won't
drift.
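The weekday corrections listed above can be checked mechanically rather than by hand. A minimal sketch using only the standard library (the helper name is hypothetical, and a fixed English name table is used to avoid locale-dependent output):

```python
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def weekday_name(d: date) -> str:
    """Return the English weekday name for a date, independent of locale."""
    return WEEKDAY_NAMES[d.weekday()]


for day in (15, 16, 17, 18):
    d = date(2026, 4, day)
    print(d.isoformat(), weekday_name(d))
# 2026-04-15 Wednesday
# 2026-04-16 Thursday
# 2026-04-17 Friday
# 2026-04-18 Saturday
```

A check like this could run against any template that pairs dates with weekday labels, though removing the labels avoids the problem entirely.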
Force-pushed from d1297c6 to 48ec6e2
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@hyperfleet/docs/release-process-spike-report.md`:
- Around line 762-766: Update the template timeline to match the 3-week sprint
cadence: keep Sprint Start as April 1, 2026, change Feature Freeze to April
15–16, 2026, change Code Freeze to April 17–18, 2026, and change GA Target to
April 22, 2026 so the dates across the timeline in
release-process-spike-report.md align with the 3-week (21-day) sprint cadence
referenced elsewhere.
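One way to keep template dates aligned with the cadence is to derive them from the sprint start instead of typing them in. The sketch below is illustrative only: the day offsets are assumptions chosen to reproduce the first day of each range suggested above, and the names are hypothetical.

```python
from datetime import date, timedelta

# Assumed offsets in days from sprint start for a 3-week (21-day) cadence.
MILESTONE_OFFSETS = {
    "Sprint Start": 0,
    "Feature Freeze": 14,
    "Code Freeze": 16,
    "GA Target": 21,
}


def release_timeline(sprint_start: date) -> dict[str, date]:
    """Derive every milestone date from a single sprint start date."""
    return {
        name: sprint_start + timedelta(days=offset)
        for name, offset in MILESTONE_OFFSETS.items()
    }


timeline = release_timeline(date(2026, 4, 1))
for name, day in timeline.items():
    print(f"{name}: {day.isoformat()}")
# Sprint Start: 2026-04-01
# Feature Freeze: 2026-04-15
# Code Freeze: 2026-04-17
# GA Target: 2026-04-22
```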
🧹 Nitpick comments (3)
hyperfleet/docs/release-process-spike-report.md (3)
536-536: Date inconsistency between changelog example and template.
The changelog example shows release date `2026-05-12` (May 12), but the release tracking template in Appendix C uses April 2026 dates (lines 762-766). For consistency and to avoid confusion, consider using the same month in both examples.
🗓️ Suggested fix
-## [1.5.0] - 2026-05-12
+## [1.5.0] - 2026-04-22
Or update both to use the same example month consistently throughout the document.
780-782: Update RC checklist dates to match corrected timeline.
If the main timeline is adjusted to properly reflect a 3-week sprint (per previous comment), update the RC checklist dates accordingly.
📅 Suggested alignment
 ## Release Candidates
-- [ ] v1.5.0-rc.1 (April 15, at Feature Freeze)
-- [ ] v1.5.0-rc.2 (April 16-17, if needed)
-- [ ] v1.5.0-rc.3 (April 18, if needed)
+- [ ] v1.5.0-rc.1 (April 16, at Feature Freeze)
+- [ ] v1.5.0-rc.2 (April 17-18, if needed)
+- [ ] v1.5.0-rc.3 (April 20-21, if needed)
431-431: Optional: Consider more concise wording.
The phrase "is under discussion" could be simplified to "is being discussed" or "is TBD" for brevity, though the current wording is acceptable for a spike document.
✍️ More concise alternatives
-- Note: Umbrella chart strategy (hyperfleet-chart repo) is under discussion
+- Note: Umbrella chart strategy (hyperfleet-chart repo) is being discussed
Or:
-- Note: Umbrella chart strategy (hyperfleet-chart repo) is under discussion
+- Note: Umbrella chart strategy (hyperfleet-chart repo) is TBD
Force-pushed from 48ec6e2 to 6bc2282
ciaranRoche left a comment
Left a small NP, but otherwise looks good to me
> What testing will be deferred to next regular release?
> - [ ] Full exploratory testing
> - [ ] Performance regression testing
> - [ ] Cross-browser testing
NP: cross-browser testing is not required for our system.
> - ✓ Performance regression tests show no degradation vs. previous release
>
> **Build & CI Health:**
> - ✓ Prow CI pipeline is green for all components on the main branch
In my understanding, a green Prow CI pipeline is equivalent to the automated tests passing, so this part mixes testing and building. Suggested split:
**CI/CD Pipeline Health:**
- ✓ Prow CI pipeline green on main branch for all components
- Unit tests: >70% coverage for new code
- Integration tests: Passing consistently
- E2E tests: Critical user journeys validated
- Performance tests: No degradation vs. previous release
**Build Artifacts:**
- ✓ Container images build successfully for all target architectures
- ✓ Helm charts package without errors
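The proposed split between pipeline health and build artifacts could be evaluated separately, as in the sketch below. All class, field, and function names here are hypothetical; only the check descriptions and the 70% coverage threshold come from the lists above.

```python
from dataclasses import dataclass

MIN_NEW_CODE_COVERAGE = 70.0  # percent; the checklist asks for >70% on new code


@dataclass
class ReleaseCandidateStatus:
    prow_green_on_main: bool
    new_code_coverage: float        # percent of new code covered by unit tests
    integration_tests_passing: bool
    e2e_critical_journeys_ok: bool
    performance_degraded: bool      # True if worse than the previous release
    images_built_all_arches: bool
    helm_charts_packaged: bool


def failed_gates(status: ReleaseCandidateStatus) -> dict[str, list[str]]:
    """Group failing checks under the two headings proposed in the comment."""
    pipeline_checks = {
        "Prow CI pipeline green on main": status.prow_green_on_main,
        "Unit tests: >70% coverage for new code":
            status.new_code_coverage > MIN_NEW_CODE_COVERAGE,
        "Integration tests passing": status.integration_tests_passing,
        "E2E critical user journeys validated": status.e2e_critical_journeys_ok,
        "Performance: no degradation": not status.performance_degraded,
    }
    build_checks = {
        "Container images build for all architectures": status.images_built_all_arches,
        "Helm charts package without errors": status.helm_charts_packaged,
    }
    return {
        "CI/CD Pipeline Health": [n for n, ok in pipeline_checks.items() if not ok],
        "Build Artifacts": [n for n, ok in build_checks.items() if not ok],
    }
```

Keeping the two groups apart makes it visible whether a release is blocked by test results or by packaging, which is the distinction the comment asks for.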
> - Image naming: `registry.ci.openshift.org/hyperfleet/{component}:v{version}`
>   - api-service:v1.5.0
>   - sentinel:v1.5.0
>   - adapter-framework:v1.5.0
Current examples show all components using the same version (v1.5.0). I think we may need to clarify the versioning strategy here. Specifically, whether we intend to use a unified version across all components for each release, or allow components to maintain independent versions while the HyperFleet release defines a validated version set.
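The two versioning options raised in this comment can be contrasted with a small sketch. The image naming template is taken from the quoted document; the function name and the independent component versions are made up purely for illustration.

```python
IMAGE_TEMPLATE = "registry.ci.openshift.org/hyperfleet/{component}:v{version}"


def image_refs(version_set: dict[str, str]) -> list[str]:
    """Render one image reference per component from a component-to-version map."""
    return [
        IMAGE_TEMPLATE.format(component=component, version=version)
        for component, version in sorted(version_set.items())
    ]


components = ["api-service", "sentinel", "adapter-framework"]

# Option 1: a unified version shared by every component in the release.
unified = dict.fromkeys(components, "1.5.0")

# Option 2: independent component versions, with the HyperFleet release
# pinning a validated set (these version numbers are hypothetical).
validated_set = {"api-service": "1.5.0", "sentinel": "1.4.2", "adapter-framework": "0.9.1"}

for ref in image_refs(validated_set):
    print(ref)
```

Either option fits the same naming template; the difference is whether the release is identified by one version or by a pinned map of versions.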
> ### 3.2 Bug Severity Gates (Mandatory)
>
> - ✓ No open bugs with severity **Major** or above (Blocker, Critical, Major)
> - ✓ Normal and Minor bugs: No gate, tracked for future releases
For MVP, we can skip Normal bugs. Post-MVP, let us set stricter quality criteria that require fixing Normal bugs before release.
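The difference between the MVP gate and the stricter post-MVP gate suggested here amounts to moving one threshold. A minimal sketch, assuming the severity ordering Blocker > Critical > Major > Normal > Minor implied by the quoted section (the enum and function names are hypothetical):

```python
from enum import IntEnum


class Severity(IntEnum):
    MINOR = 1
    NORMAL = 2
    MAJOR = 3
    CRITICAL = 4
    BLOCKER = 5


def release_blockers(open_bugs: list[Severity], gate: Severity) -> list[Severity]:
    """Return the open bugs at or above the gate severity."""
    return [bug for bug in open_bugs if bug >= gate]


open_bugs = [Severity.MINOR, Severity.NORMAL, Severity.NORMAL]

# MVP gate: only Major and above block the release.
print(len(release_blockers(open_bugs, gate=Severity.MAJOR)))   # 0

# Stricter post-MVP gate: Normal bugs also block the release.
print(len(release_blockers(open_bugs, gate=Severity.NORMAL)))  # 2
```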
> │ │
> │ │ Code Freeze (critical fixes only)
> │ │
> │ ├─── vX.Y.0-rc.1 (Release Candidate 1)
Is vX.Y... a branch from a branch, or a tag?
Maybe it's worth adding that it's a tag, for example: (Tag - Release Candidate 1)
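If release candidates are tags, as this comment suggests, their names can be told apart from branch names by pattern. The sketch below assumes the `vX.Y.Z-rc.N` shape shown in the diagram; the function name and the `release-1.5` branch name are hypothetical.

```python
import re
from typing import Optional

# Matches vX.Y.Z with an optional -rc.N suffix, e.g. v1.5.0 or v1.5.0-rc.1.
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:-rc\.(\d+))?$")


def parse_release_tag(name: str) -> Optional[tuple]:
    """Return (major, minor, patch, rc) for a release tag, or None otherwise.

    rc is None for a GA tag such as v1.5.0.
    """
    match = TAG_PATTERN.match(name)
    if match is None:
        return None
    major, minor, patch, rc = match.groups()
    return (int(major), int(minor), int(patch), int(rc) if rc is not None else None)


print(parse_release_tag("v1.5.0-rc.1"))  # (1, 5, 0, 1)
print(parse_release_tag("v1.5.0"))       # (1, 5, 0, None)
print(parse_release_tag("release-1.5"))  # None
```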
Summary by CodeRabbit