-
Notifications
You must be signed in to change notification settings - Fork 54
[sha512] image/copy: allow users to force digest algorithm in CLI (e.g. skopeo copy --force-digest)
#552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
The goal is to allow building the digest-choice-dependent machinery from the bottom up without committing to an API while we don't understand the problem space; for now, we don't expose any way for users to actually make a choice. This code is never called at this point, so this should not change behavior. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
for now, purely to let us internally work on implementing digest choice / conversions; there is no way for a caller to set it. Should not change behavior, the field has no users. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This is: - UNUSABLE (no external API) - INCOMPLETE (manifest digests, TryReusingBlob, ...) - BROKEN (various code assumes that digest returned from upload matches input) - INCONSISTENT (PutBlobOptions.Digests is only implemented in one transport, and silently ignored elsewhere) Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…eo copy Changes to image/copy/blob.go: - Use digests.Options.Choose() to validate digest changes Changes to image/copy/copy.go: - Add SetForceDigestAlgorithm() helper to configure digest forcing Changes to image/copy/single.go: - Add updateManifestConfigDigest() helper that uses typed manifest structures (OCI1/Schema2) to update config digest, avoiding generic JSON manipulation - Modify copyConfig() to return the new config digest when changed - Update copyUpdatedConfigAndManifest() to handle config digest changes by calling updateManifestConfigDigest() when needed Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
|
✅ A new PR has been created in buildah to vendor these changes: containers/buildah#6597 |
mtrmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is valuable.
On the process side, I’m not quite sure how to go about building this — I’d like to avoid exposing and API that plain doesn’t work for some transports. I think I’d prefer:
- Building this ~one complete feature at a time, e.g. if we add
PutBlobOptions.Digests, then actually implement it in all transports immediately. We should never be failing to enforceForceDigest…and reporting success. (In many cases, aSituation{CannotChangeDigestReason: "a sha256-only transport not updated yet", digest: sha256Value}in a transport would be fine, that’s what the tracker is for, as long as the enforcement of the rules is all there immediately.) - Not providing the public API until all pieces are in place. If we need something for testing, a temporary
BrokenSetForceDigestAlgorithm /* Unstable API, may be changed or removed any time */?
Without a deep review or revisiting earlier PRs, I think missing pieces here are
- multi-platform images: should not be too difficult, but is fairly orthogonal
TryReusingBlob: this +BlobInfoCachewill be significant, it has ~the same prerequisite question “what layer reuse is valid cross-digest” as c/storage — and that’s a major reason why I’d prefer to build this a feature at a time, so that we can get plainPutBlobor configs done without blocking on this- The
PutBlobPartialcode path (plausibly should be just a failure becausepull --force-digest=makes little sense, and would be difficult to implement, I’m not sure) - DiffID computation (maybe keep sha256 and fail on schema1??)
ZstdChunkedcompression using sha512 digests for all its data
| } | ||
| if stream.info.Digest != "" && uploadedInfo.Digest != stream.info.Digest { | ||
| return types.BlobInfo{}, fmt.Errorf("Internal error writing blob %s, blob with digest %s saved with digest %s", srcInfo.Digest, stream.info.Digest, uploadedInfo.Digest) | ||
| expectedAlgo, err := ic.c.options.digestOptions.Choose(digests.Situation{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is redundant / conflicting with PutBlob… making its own choice. If we had the generic code choosing a specific algorithm and expecting precisely that one to be used, then it should do that before PutBlob and provide that to transports as a parameter.
I’m not sure where the decision should be, at this point [and it’s a private interface, so we can change our mind] — I guess the transport might have some information (e.g. about remote registry capabilities) that the generic code doesn’t — but, for now, we don’t have all transports updated yet, either way…
AFAICS this only exists as a sanity check (both config and layer copies later enforce equality if cannotModifyManifest…), so I think an “if algorithms match the whole values must match” check here would be sufficient.
| } | ||
|
|
||
| // SetForceDigestAlgorithm forces the use of a specific digest algorithm for this copy operation. | ||
| func (o *Options) SetForceDigestAlgorithm(algo digest.Algorithm) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Yes, I’m generally in favor of adding more options via methods rather than fields. And we can always add something vaguely like SetForceDigestOptions later if we decided to make that fully public.)
I’m wondering about naming this ForceDestinationDigest… — we might eventually want a “pulls must consume sha512 or stronger digests and refuse to work on sha256”, and that’s a fairly unrelated feature.
| if err != nil { | ||
| return fmt.Errorf("failed to set force-digest algorithm: %w", err) | ||
| } | ||
| o.digestOptions = digestOpts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fail if this is already set? It might protect against oversights. OTOH it can also prevent some code patterns — if that turned out to be troublesome, we can always remove that enforcement later.
| // only to allow gradually building the feature set. | ||
| // After c/image/copy consistently implements it, provide a public digest options API of some kind. | ||
| optionsCopy := *options | ||
| // Only set default if not already configured |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(We can avoid the whole optionsCopy if digestOptions != nil.)
| if destInfo.Digest != srcInfo.Digest { | ||
| return fmt.Errorf("Internal error: copying uncompressed config blob %s changed digest to %s", srcInfo.Digest, destInfo.Digest) | ||
| // Allow digest algorithm changes when forcing a specific digest algorithm | ||
| forcingDifferentAlgo := ic.c.options.digestOptions.MustUseSet() != "" && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- See the similar check
copyBlobFromStream - “MustUse” Is not the only reason a copy might choose a different algorithm (e.g. we might be pushing to a sha256-only-transport or registry), hard-coding a check only for
MustUsehere is overly specific - Compare the end of
copyLayers, where the equality check is immediately paired with manifest edits [and see elsewhere about manifest edits], so that we always set up edits if we know they are needed, and we minimize the risk of the two getting out of sync. I’m not sure whether the check+update should live incopyConfigorcopyUpdatedConfigAndManifest(guessing the latter), but they should be together.
|
|
||
| // updateManifestConfigDigest uses typed manifest structures instead of generic JSON manipulation. | ||
| // This leverages the existing manifest parsing and serialization infrastructure. | ||
| func (ic *imageCopier) updateManifestConfigDigest(manifestBlob []byte, src types.Image, newConfigDigest digest.Digest) ([]byte, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to go through the manifest abstraction in c/image/internal/image and not be hard-coded in c/image/copy.
(It can be a method on image.Sourced/genericManifest, like CanChangeLayerCompression, without making it a public API, at least for now.)
| } | ||
|
|
||
| ic.c.Printf("Writing manifest to image destination\n") | ||
| manifestDigest, err := manifest.Digest(man) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This is another part that will need updating, for multi-arch images.)
|
FWIW containers/buildah#6597 hits |
I think #512 has a fix, but I’m not going to stay around today to wait for containers/buildah#6552 to confirm. |
I can go ahead with this. Let me know if you'd much rather the other preference though. |
|
I didn’t mean the two bullet points as alternatives, but as two parts of the same plan. My main concern here is having some way of tracking what doesn’t work yet; all of that will need to either be implemented or at least be updated to cleanly fail. So I was thinking “one feature at a time, in all transports”. Doing it that way also means we immediately run into difficulties if some transports’ structures make a design difficult. There may be other ways to track things or to structure the implementation. And maybe that whole discussion is premature, and we need a prototype PR, or a few, sort of like this, testing the full workflow to just run into things and see what are the things that will need to be done. |
Depends on #512 . Only see HEAD commit here.