Add design for managing Kopia repositories via BSL and new BSLR #1827
base: oadp-dev
sseago
left a comment
One thing that isn't clear from the design doc is what changes may be needed upstream to allow BackupRepositories to be managed/created from outside Velero. I don't think Velero has any pluggability here currently, but if you want two different VM backups in the same namespace and BSL to use a different kopia repository, then I would think some velero-level integration would be required.
| ## Background | ||
| The current architecture of OADP tightly couples each BackupStorageLocation (BSL) with a single Kopia repository. This repository is provisioned and controlled entirely by Velero’s core components. While this setup is adequate for standard backup scenarios, it introduces significant limitations when more flexible or granular configurations are needed. |
kopia repos are scoped to BSL+namespace, not just BSL.
@sseago true, will update.
| ## Goals | ||
| * **Support Multiple Repository Instances per BSL** |
"Support Multiple Repository Instances per BSL in the same namespace"
|
woot! very excited about this one, thank you @mpryc |
@sseago currently, the design does not include any pluggability within Velero itself. The main goal is to enable managing Kopia repositories independently from Velero by using OADP. Specifically, the OADP controller will initialize and manage Kopia repositories through functions that interact with the Velero codebase. This creates an abstraction layer that lets users access and manage Kopia server repositories outside of Velero. Another goal is to expose Kopia repositories outside of Velero, for example to retrieve files with the Kopia CLI. |
|
@mpryc How do we keep velero and OADP from stepping on each other if they're both trying to manage kopia repos? i.e. if we have 2 VMs in the same namespace and they need to be stored in different kopia repos, how does that work if Velero creates one repo and uses it for both? I have a feeling I'm missing something here. |
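To make the concern concrete, here is a minimal Python sketch of the collision being described. It assumes the simplified `<namespace>-<bsl>-<type>` naming that appears in this thread (for example `ns1-default-kopia`); the function name `repo_key` and the VM names are hypothetical, and Velero's real naming logic may differ in detail.

```python
def repo_key(namespace: str, bsl: str, repo_type: str = "kopia") -> str:
    """Simplified repository identity: one repository per namespace + BSL + type."""
    return f"{namespace}-{bsl}-{repo_type}"


# Two hypothetical VMs living in the same namespace and backed up to the same BSL.
workloads = [
    {"vm": "vm-a", "namespace": "ns1", "bsl": "default"},
    {"vm": "vm-b", "namespace": "ns1", "bsl": "default"},
]

keys = {w["vm"]: repo_key(w["namespace"], w["bsl"]) for w in workloads}

# The workload name never enters the key, so both VMs share one repository.
shared = keys["vm-a"] == keys["vm-b"]
print(keys, shared)
```

Because nothing workload-specific is part of the key, separating the two VMs requires an identity that lives outside this scheme, which is the gap the question points at.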
|
Thank you for this proposal! After analyzing the design and comparing it with Velero's current BackupRepository implementation, I have some findings and questions that need clarification.
Current Understanding
Based on the clarification that "the design does not include any pluggability within Velero itself" and that OADP will manage Kopia repositories independently:
Current Velero Architecture
In Velero today, BackupRepository and BackupStorageLocation have a many-to-one relationship:
graph TB
subgraph "Current Velero Architecture"
BSL["BackupStorageLocation"]
BR1["BackupRepository: ns1-default-kopia"]
BR2["BackupRepository: ns2-default-restic"]
BR3["BackupRepository: ns3-default-kopia"]
BR1 -->|References| BSL
BR2 -->|References| BSL
BR3 -->|References| BSL
BR1 -.->|Stores data in| S3[("S3/Azure/GCS Bucket")]
BR2 -.->|Stores data in| S3
BR3 -.->|Stores data in| S3
BSL -->|Points to| S3
end
style BSL fill:#1976d2,stroke:#0d47a1,stroke-width:2px,color:#fff
style BR1 fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff
style BR2 fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff
style BR3 fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff
style S3 fill:#616161,stroke:#424242,stroke-width:2px,color:#fff
Key characteristics:
Proposed OADP BSLR Architecture
Based on my understanding, BSLR operates as a parallel system:
graph TB
subgraph "OADP Layer"
OADP["OADP Controller"]
BSLR1["BSLR: vm-backups"]
BSLR2["BSLR: database-backups"]
OADP -->|Manages| BSLR1
OADP -->|Manages| BSLR2
OADP -->|Initializes| KopiaRepos[("Independent Kopia Repositories")]
end
subgraph "Velero Layer - Unchanged"
Velero["Velero"]
BSL["BackupStorageLocation"]
BR["BackupRepository"]
Velero -->|Uses| BSL
Velero -->|Creates| BR
BR -->|References| BSL
end
subgraph "External Access"
KopiaCLI["Kopia CLI"]
KopiaCLI -.->|Direct access| KopiaRepos
end
OADP -.->|Interacts with| Velero
BSLR1 -.->|Maps to| BSL
BSLR2 -.->|Maps to| BSL
style OADP fill:#00796b,stroke:#004d40,stroke-width:2px,color:#fff
style BSLR1 fill:#388e3c,stroke:#1b5e20,stroke-width:2px,color:#fff
style BSLR2 fill:#388e3c,stroke:#1b5e20,stroke-width:2px,color:#fff
style Velero fill:#1976d2,stroke:#0d47a1,stroke-width:2px,color:#fff
style BSL fill:#1565c0,stroke:#0d47a1,stroke-width:2px,color:#fff
style BR fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff
style KopiaCLI fill:#5d4037,stroke:#3e2723,stroke-width:2px,color:#fff
style KopiaRepos fill:#424242,stroke:#212121,stroke-width:2px,color:#fff
Critical Questions
1. Data Flow and Integration
How does OADP intercept or redirect the actual backup data flow? Since Velero will continue creating its own BackupRepository objects and managing backups as usual, how does OADP ensure that data goes to the appropriate BSLR-managed repository instead?
2. Namespace Conflict Resolution
As @sseago pointed out: If two VMs in the same namespace need different Kopia repositories, but Velero creates one repository per namespace-BSL combination, how does OADP handle this conflict? For example:
3. Repository Initialization Race Condition
Who initializes the Kopia repositories - OADP or Velero? If both systems try to initialize repositories in the same storage location, how are conflicts avoided?
4. Backup Operation Flow
During
sequenceDiagram
participant User
participant Velero
participant OADP
participant Storage
User->>Velero: velero backup create
Velero->>Velero: Create Backup CR
Note over Velero: Creates PodVolumeBackup CRs
rect rgb(200, 50, 50)
Note right of Velero: UNCLEAR: How does OADP<br/>intercept this flow?
Velero-->>OADP: ???
OADP-->>Storage: Use BSLR repository?
end
alt Current Understanding
Velero->>Storage: Writes to namespace-based repo
else Proposed BSLR
OADP->>Storage: Writes to workload-based repo
end
5. Repository Mapping
How does OADP map between:
6. Credential Management
If BSLR repositories have independent credentials, but Velero is still performing the actual backup operations, how are the BSLR credentials injected into the backup process?
Suggested Clarifications
This would help address the "stepping on each other" concern and clarify how these two systems can coexist.
Summary
The core architectural question is: How do OADP and Velero coexist without conflicts when they have fundamentally different repository models?
Understanding this interaction is crucial for evaluating the design's feasibility and implementation approach. |
This design introduces the BackupStorageLocationRepository (BSLR) as a new custom resource that models and manages Kopia repositories on a per-BSL basis. Signed-off-by: Michal Pryc <mpryc@redhat.com>
Velero and the new OADP BSLR mechanism avoid stepping on each other by clearly separating responsibility through the use of default vs non-default BackupStorageLocationRepository (BSLR) objects. Velero continues to manage the default BSLR (associated with the default BSL), while OADP only manages non-default BSLRs. In those cases, BSLR acts as a pointer to the Kopia repository, but OADP takes over orchestration and lifecycle management. This separation is handled explicitly in the logic: and
Velero itself will not create or manage the Kopia repositories for the above use case in this design; that responsibility lies with the OADP controller, via the new BackupStorageLocationRepository (BSLR) custom resource. So to address your example: if you have 2 VMs in the same namespace and want them to be stored in separate Kopia repositories, you would define 2 separate BSLRs (each tied to the same or a different BSL, depending on your setup). Each VM would then reference its own BSLS, ensuring separation at the repository level. The BSLS is another layer, covered in a separate design: #1830. Another possibility is to have one BSLR with one BSLS and the users specified within the BSLS. Each VM user would get their own credentials, and the Kopia repository would separate their access to the backups (snapshots). |
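The separation of responsibility described above could be sketched roughly as follows. This is an illustrative Python model, not the controller code: the `BSLR` dataclass, its `is_default` field, and the manager names are assumptions made for the example, since the thread does not show how a default BSLR is marked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BSLR:
    """Hypothetical stand-in for a BackupStorageLocationRepository object."""
    name: str
    namespace: str
    bsl: str
    is_default: bool  # assumption: how "default" is tracked is not specified


def manager_for(bslr: BSLR) -> str:
    """Default BSLRs stay with Velero; non-default BSLRs are handled by OADP."""
    return "velero" if bslr.is_default else "oadp"


# Two VMs in one namespace, each pointed at its own non-default BSLR on the same BSL.
bslrs = [
    BSLR("default", "ns1", "default", is_default=True),
    BSLR("vm-a-repo", "ns1", "default", is_default=False),
    BSLR("vm-b-repo", "ns1", "default", is_default=False),
]

for b in bslrs:
    print(f"{b.namespace}/{b.name} -> managed by {manager_for(b)}")
```

In this model the two VM repositories never pass through Velero's namespace-based repository, so the two managers do not touch the same object.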
|
@kaovilai Thank you for your detailed review. I've added an additional design proposal that may address some of your concerns and complements the current BSLR design: #1830 This is a separate design, but it is intended to work in tandem with the current BSLR approach. Below are answers to your questions. Because your review was outside of the code I will copy-paste some of them inline:
All three points are correct, so your understanding matches the design, which suggests the design is self-explanatory.
First of all, let's drop restic from the equation, as it is not relevant to current or future OADP nor to the BSLR (as per the design). The current Velero architecture is pretty much correct; the BSLR will be similar to BackupRepository, just created and managed by the OADP controller. We do not want to use the same CR as Velero for this parallel mechanism. It would also be possible to use BackupRepository and Velero as the management mechanism and drop the BSLR altogether, but that would require BSLS (from another design) to point to a BR and have its own naming mechanism so that the two do not step on each other. In that case, per-BR repository password management would also need to be included.
Yes, you are correct: it is very much a parallel mechanism, similar to BR.
OADP does not redirect or intercept the actual backup data flow managed by Velero. Instead, it introduces the option for cluster administrators to explicitly enable the BackupStorageLocationServer (BSLS) for a given BackupStorageLocationRepository (BSLR), as outlined in the BSLS design proposal. Velero continues to manage its own backup lifecycle as usual, and any BSLR-enabled repository is managed in parallel by OADP. OADP does not override or influence which repository Velero writes to. Administrators could optionally configure access control (ACL) to allow read-only access to a repository created by Velero, for example enabling users to restore specific files without allowing them to write backups. However, such ACL support is outside the scope of the current design. In short, it is up to the cluster administrator to enable BSLS for selected BSLRs and manage access credentials accordingly.
With the new OADP design:
For example:
Velero will still perform the backup and restore operations, but it will operate against repositories that Velero itself created and that are represented by BackupRepository objects. OADP provisions the repositories referenced by the BSLR mechanism and makes them available for backup/restore via the BSLS proxy mechanism. Velero doesn't directly know about the BSLRs; OADP configures and controls access to them, resolving the namespace-level conflict externally.
OADP does not rely on Velero’s namespace-based repository model (namespace-bsl-kopia) for repository management. Instead, it decouples repository creation and selection from Velero entirely by introducing a new model:
|
|
cool cool, interested in seeing the updates we discussed today. Thank you @mpryc really cool work here! |
|
@mpryc This design is gonna be updated/revised, right ? (Keeping the Backup Repo component pluggable) |
Yes. I am putting the final touches on a new PR. |
shawn-hurley
left a comment
I would love to see the spec (api) of the BSLR added to the enhancement to get a better understanding of the mechanism for linking the two together.
| ### BSLR Controller Responsibilities | ||
| * **BSL Monitoring**: The controller observes BSLs and ensures a BSLR exists per BSL. | ||
| * **Kopia Repository Management**: For BSLRs not directly tied to a BSL, the controller creates and manages Kopia repositories. |
In the default BSL?
| - Check if a corresponding default `BackupStorageLocationRepository` (BSLR) exists for the BSL. | ||
| - If a default BSLR exists: | ||
| - Ensure it is not marked for deletion. | ||
| - Determine if the BSLR spec needs to be updated based on changes in the BSL. |
So we would have a default BSLR for every BSL. Could multiple BSL use the same BSLR?
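The "Reconcile on BSL Changes" steps quoted above can be read as a small decision function. The Python sketch below is only one interpretation: the dictionary shapes, the `deleting` flag, and the returned action names are hypothetical, and the excerpt does not say what should happen when the default BSLR is marked for deletion, so that branch is an assumption.

```python
from typing import Optional


def reconcile_bsl_change(bsl_spec: dict, default_bslr: Optional[dict]) -> str:
    """Return the action the BSL watcher would take for one BSL.

    `default_bslr` is None when no default BSLR exists yet for the BSL.
    """
    if default_bslr is None:
        # The controller "ensures a BSLR exists per BSL", so create one.
        return "create"
    if default_bslr.get("deleting", False):
        # Assumption: the excerpt does not specify this case; we only requeue.
        return "requeue"
    if default_bslr.get("spec") != bsl_spec:
        # The BSL changed, so the derived BSLR spec is out of date.
        return "update"
    return "noop"


bsl = {"bucket": "backups", "prefix": "velero"}
print(reconcile_bsl_change(bsl, None))
print(reconcile_bsl_change(bsl, {"spec": {"bucket": "old"}}))
print(reconcile_bsl_change(bsl, {"spec": dict(bsl)}))
```

Whether several BSLs could share one BSLR, as asked above, is not covered by this sketch; it models one default BSLR per BSL.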
|
I would like to understand the mechanism for when a Backup is created, how the Filesystem backup or the Data mover knows to use a particular BSLR. If I just missed it, I am sorry. |
|
@mpryc: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
Walkthrough
Introduces a new design document for the BackupStorageLocationRepository (BSLR) concept, an OADP CRD for managing multiple Kopia repositories within a single BackupStorageLocation. The document outlines purpose, architecture, reconciliation flows, security considerations, and implementation alternatives.
Changes
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 1
🧹 Nitpick comments (1)
docs/design/backupstoragelocationrepository-design.md (1)
51-77: Consider adding a CRD schema outline to the design for clarity.
The High-Level and Detailed Design sections describe reconciliation flows and controller responsibilities but do not include a CRD schema or field structure. Showing a simplified CRD outline (or at least the key fields) would help clarify:
- How BSLR references its associated BSL (e.g., by name, namespace, owner reference).
- How the "default" designation is tracked (label, annotation, or spec field).
- What Kopia-specific configuration is stored in the BSLR spec.
- How credentials are referenced and scoped.
This would reduce ambiguity during implementation and help reviewers/maintainers understand the design intent.
Consider adding a section like:
### BSLR CRD Structure (outline)
apiVersion: oadp.openshift.io/v1alpha1
kind: BackupStorageLocationRepository
metadata:
  name: <bslr-name>
  namespace: <namespace>
  labels:
    oadp.openshift.io/bsl: <bsl-name>
    oadp.openshift.io/default: "false" # or "true" for default BSLR
spec:
  backupStorageLocationRef:
    name: <bsl-name>
  kopiaConfig:
    # Kopia-specific settings (encryption, compression, etc.)
  credentialRef:
    name: <secret-name>
status:
  phase: Available | Failed | ...
  conditions: [...]
🪛 LanguageTool
docs/design/backupstoragelocationrepository-design.md
[grammar] ~83-~83: Ensure spelling is correct
Context: ...## Reconcile on BSL Changes _Note: The defult BSLR spec is marked as managed by the O...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🔇 Additional comments (2)
docs/design/backupstoragelocationrepository-design.md (2)
65-66: Clarify namespace scoping of BSLR identity and reconciliation logic.
Past feedback noted that Kopia repositories are scoped to BSL + namespace, not just BSL alone. While the document mentions namespace in places (line 9, 23), the reconciliation flows and controller responsibilities don't consistently emphasize namespace as part of the BSLR identity. Specifically:
- Line 65: "ensures a BSLR exists per BSL" → Should clarify this is per BSL per namespace.
- Lines 85–93: The "Reconcile on BSL Changes" flow checks for a "corresponding default BSLR" without filtering by namespace, which could be ambiguous in multi-namespace clusters.
- The CRD structure (ownership labels, selectors, etc.) needed to enforce namespace isolation is not shown in the design.
Please clarify:
- Is the unique BSLR identity (namespace, BSL name, bslr-name) or is there a flatter structure?
- How does the reconciler ensure it only processes BSLRs in the same namespace as the BSL it monitors?
- What labels or fields identify a "default" BSLR, and are they namespace-scoped?
Also applies to: 85-93
99-113: Clarify reconciliation semantics for the "default" BSLR.
The reconciliation flow states (line 99–100): "If the BSLR is the default for its associated BackupStorageLocation (BSL): Exit reconciliation (handled by BSL watcher)."
This logic assumes a way to determine whether a BSLR is "the default" for a BSL, but the mechanism is not specified in the design. Please clarify:
- How is a BSLR marked or identified as "default" for its BSL (label, naming convention, field in spec)?
- Can a namespace have multiple default BSLRs for the same BSL name, and if so, how is the conflict resolved?
- Should the logic at line 99–100 also check that the BSLR and BSL share the same namespace?
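One way the early exit quoted above might look, if the "default" marker were a label, is sketched below in Python. The label key is borrowed from the CRD outline suggested earlier in this review and is hypothetical, as is the same-namespace check; the design does not confirm either choice.

```python
# Hypothetical label key, borrowed from the CRD outline suggested in this review.
DEFAULT_LABEL = "oadp.openshift.io/default"


def should_reconcile_bslr(bslr: dict, bsl_namespace: str) -> bool:
    """Decide whether the BSLR reconciler should act on this object."""
    meta = bslr.get("metadata", {})
    if meta.get("labels", {}).get(DEFAULT_LABEL) == "true":
        # Default BSLR: exit, the BSL watcher handles it.
        return False
    # Assumption: a BSLR is only reconciled against a BSL in its own namespace.
    return meta.get("namespace") == bsl_namespace


default_bslr = {"metadata": {"namespace": "ns1", "labels": {DEFAULT_LABEL: "true"}}}
custom_bslr = {"metadata": {"namespace": "ns1", "labels": {DEFAULT_LABEL: "false"}}}

print(should_reconcile_bslr(default_bslr, "ns1"))  # default BSLR is skipped
print(should_reconcile_bslr(custom_bslr, "ns1"))   # non-default, same namespace
print(should_reconcile_bslr(custom_bslr, "ns2"))   # non-default, other namespace
```

A label keeps the check cheap and selectable, but a spec field or naming convention would serve equally well; the questions above remain open either way.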
| #### Reconcile on BSL Changes | ||
| _Note: The defult BSLR spec is marked as managed by the OADP controller._ |
Fix spelling error: "defult" → "default"
-_Note: The defult BSLR spec is marked as managed by the OADP controller._
+_Note: The default BSLR spec is marked as managed by the OADP controller._
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mpryc, weshayutin The full list of commands accepted by this bot can be found here. The pull request process is described here
|
/hold being replaced by #1845 |
|
@mpryc close? |
This design introduces the BackupStorageLocationRepository (BSLR) as a new custom resource that models and manages Kopia repositories on a per-BSL basis.
Why the changes were made
Proposes a new design that provides a clear separation between BSL config and repository state, and enables early provisioning of Kopia repos before backups run. In the future, this will allow a BSL Server to be created on top of BSLR.
How to test the changes made
Read the design.