fix(controller): reference agent images by tag for private registries #2076
igabi wants to merge 1 commit into
Declarative agent images are pinned by their link-time digest (registry/repository@sha256:...) injected via controller-digest-ldflags.sh. When --image-registry / --image-repository point to a private registry, only the registry and repository are rewritten while the upstream digest is kept, so the controller emits a reference that is not resolvable in registries that do not preserve the upstream manifest digest. The agent pods then fail with ImagePullBackOff / manifest unknown, and --image-tag is silently ignored for declarative agents.

Add a --pin-runtime-image-digest flag (env PIN_RUNTIME_IMAGE_DIGEST), default true to preserve current behaviour. When set to false, runtime image resolution falls back to a tag reference (registry/repository:tag) via the existing ImageConfig.PinnedImage(), restoring the pre-0.9.7 behaviour for operators mirroring images into a private registry.

The Helm chart exposes this as controller.pinRuntimeImageDigest and plumbs it to the controller ConfigMap.

Fixes kagent-dev#2055

Signed-off-by: Gabriel Ichim <gaby.10b@gmail.com>
Pull request overview
This PR fixes declarative agent ImagePullBackOff / manifest unknown failures when using private registries that don’t preserve upstream image digests, by allowing the controller to emit tag-based runtime image references instead of always using link-time digests.
Changes:
- Added a `--pin-runtime-image-digest` / `PIN_RUNTIME_IMAGE_DIGEST` switch (default `true`) to control digest vs tag references for declarative runtime images.
- Refactored Go/Python runtime image resolution to share a `resolveRuntimeImage` helper and improved the “digest not set” error guidance.
- Exposed `controller.pinRuntimeImageDigest` in the Helm chart and added Helm + Go unit tests covering tag fallback behavior.
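As a rough illustration of the switch described in the first bullet, the sketch below shows one way a boolean flag can take its default from an environment variable, with an explicit flag still winning. It is not the code in `app.go`: `parsePinFlag` and its `lookupEnv` parameter are hypothetical names, and the precedence shown (flag, then env, then `true`) is an assumption about how the existing env → flag loading behaves.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strconv"
)

// parsePinFlag is a hypothetical helper, not the controller's actual wiring.
// It resolves the pinning switch with the precedence:
// command-line flag > environment variable > default (true).
// lookupEnv is injected so the function can be exercised without touching
// the real process environment.
func parsePinFlag(args []string, lookupEnv func(string) (string, bool)) (bool, error) {
	def := true // assumed default, matching the PR description
	if raw, ok := lookupEnv("PIN_RUNTIME_IMAGE_DIGEST"); ok {
		// Unparsable values are ignored and the default is kept.
		if parsed, err := strconv.ParseBool(raw); err == nil {
			def = parsed
		}
	}

	fs := flag.NewFlagSet("controller", flag.ContinueOnError)
	pin := fs.Bool("pin-runtime-image-digest", def,
		"reference declarative runtime images by digest instead of tag")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *pin, nil
}

func main() {
	pin, err := parsePinFlag(os.Args[1:], os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Println("pin runtime image digest:", pin)
}
```

Injecting the environment lookup keeps the precedence rules testable in isolation; the real controller may load its configuration differently.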
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| helm/kagent/values.yaml | Adds chart value controller.pinRuntimeImageDigest (default true) with operator-facing docs. |
| helm/kagent/tests/controller-configmap_test.yaml | Adds Helm unit tests asserting the default and overridden PIN_RUNTIME_IMAGE_DIGEST ConfigMap value. |
| helm/kagent/templates/controller-configmap.yaml | Plumbs controller.pinRuntimeImageDigest into the controller ConfigMap as PIN_RUNTIME_IMAGE_DIGEST. |
| go/core/pkg/app/app.go | Introduces the --pin-runtime-image-digest flag wired to the translator package global. |
| go/core/internal/controller/translator/agent/runtime_test.go | Adds unit tests validating that disabling pinning produces tag-based image references (no @sha256:). |
| go/core/internal/controller/translator/agent/deployments.go | Implements shared resolveRuntimeImage and updates runtime image resolution to honor the new flag. |
| go/core/internal/controller/translator/agent/adk_api_translator.go | Adds the PinRuntimeImageDigest default and documentation for the new behavior. |
EItanya left a comment
So I actually think that a different fix is in order here. We switched to digests because substrate does not allow tags, but that surfaced an implementation detail of that system into the broader kagent. Rather, I think we should switch back to tags for standard deployments, and then allow overriding with digests for substrate. What do you think?
runtime_test.go: `PinRuntimeImageDigest=false` is tested for the golang-adk path but not for the golang-adk-full path (skills-enabled agents). Is this intentional?
What
Declarative agent pods fail with `ImagePullBackOff` / `manifest unknown` when the controller is configured to use a private registry that does not preserve the upstream image manifest digest.

Since v0.9.7, declarative agent images are pinned by their link-time digest (`registry/repository@sha256:...`), injected via `controller-digest-ldflags.sh`. When `--image-registry` / `--image-repository` point to a private registry, the controller rewrites only the registry and repository segments but keeps the hardcoded upstream digest. Many mirrors (e.g. `az acr import` targets, pull-through proxies that re-push) assign a different manifest digest, so the emitted reference does not resolve. `--image-tag` is silently ignored for declarative agents, so there is no way to fall back to a tag.

This is the behaviour reported in #2055.
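To make the two reference shapes concrete, here is a minimal sketch of how a digest-pinned reference and a tag reference differ once the registry and repository have been rewritten. It is not the translator's code: `imageConfig`, `imageRef`, the registry host, and the digest placeholder are all hypothetical, chosen only to show why a rewritten registry combined with an upstream digest may not resolve in a mirror.

```go
package main

import (
	"errors"
	"fmt"
)

// imageConfig is a hypothetical stand-in for the controller's image settings
// (the values of --image-registry, --image-repository and --image-tag).
type imageConfig struct {
	Registry   string
	Repository string
	Tag        string
}

// imageRef builds a runtime image reference.
// With pinDigest set, the tag is ignored and the link-time digest is appended
// to whatever registry/repository is configured. Without it, the reference
// uses the configured tag instead.
func imageRef(cfg imageConfig, linkTimeDigest string, pinDigest bool) (string, error) {
	if pinDigest {
		if linkTimeDigest == "" {
			return "", errors.New("digest not set at link time")
		}
		return fmt.Sprintf("%s/%s@%s", cfg.Registry, cfg.Repository, linkTimeDigest), nil
	}
	return fmt.Sprintf("%s/%s:%s", cfg.Registry, cfg.Repository, cfg.Tag), nil
}

func main() {
	// Illustrative values only; the digest is a placeholder, not a real manifest.
	cfg := imageConfig{Registry: "registry.example.internal", Repository: "mirror/app", Tag: "0.9.7"}
	upstreamDigest := "sha256:upstream-digest-placeholder"

	// Digest form: only resolvable if the mirror kept the upstream manifest digest.
	pinned, _ := imageRef(cfg, upstreamDigest, true)
	fmt.Println(pinned)

	// Tag form: resolvable in any mirror that carries the tag.
	tagged, _ := imageRef(cfg, upstreamDigest, false)
	fmt.Println(tagged)
}
```

The digest form stays byte-for-byte tied to the upstream manifest, which is exactly what breaks when a mirror re-pushes the image under a new digest; the tag form trades that immutability for resolvability.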
Fix
Add a `--pin-runtime-image-digest` flag (env `PIN_RUNTIME_IMAGE_DIGEST`), defaulting to `true` so current digest-pinning behaviour is unchanged.

When set to `false`, the Python and Go runtime image resolvers fall back to a tag reference (`registry/repository:tag`) through the existing `ImageConfig.PinnedImage()` helper, restoring the pre-0.9.7 behaviour for operators mirroring images into a registry that does not preserve digests.

Both runtime resolvers now share a small `resolveRuntimeImage` helper, and the "digest not set at link time" error now points operators at the new flag.

The Helm chart exposes this as `controller.pinRuntimeImageDigest` and plumbs it to the controller `ConfigMap` (picked up by the existing env → flag loading).

Testing
- `go test ./core/internal/controller/translator/agent/...`: added `TestRuntime_PythonRuntime_TagReferenceWhenDigestPinningDisabled` and `TestRuntime_GoRuntime_TagReferenceWhenDigestPinningDisabled`, asserting the rendered deployment references the image by tag (`/app:<tag>`, `/golang-adk:<tag>`) and contains no `@sha256:` when pinning is disabled. Existing digest-pinning tests continue to pass.
- `tests/controller-configmap_test.yaml`: asserting `PIN_RUNTIME_IMAGE_DIGEST` is `"true"` by default and `"false"` when `controller.pinRuntimeImageDigest=false`.
- `go vet` and `gofmt` clean on the changed packages.

Usage
Fixes #2055
🤖 Generated with Claude Code