ci: publish CI images to internal registry first #4002
- Update .gitlab/ci-images.yml to change the default CI_REGISTRY to registry.ddbuild.io and target the ddbuild registry path registry.ddbuild.io/ci/dd-trace-php/dd-trace-ci.
- Make docker logins dynamic to support local builds, Docker Hub logins, and AWS ECR logins depending on the target registry server.
- Bypass runner credential helper issues in Linux container environments by resetting ~/.docker/config.json.
- Make registry and base image names fully configurable in docker-compose.yml and Dockerfiles, allowing parent base images to be dynamically resolved from ddbuild during child compilation steps.

- Update all GitLab CI generator scripts (.gitlab/generate-*.php) to use internal CI images from registry.ddbuild.io/ci/dd-trace-php/dd-trace-ci instead of pulling from Docker Hub via the mirror path.
- This ensures test jobs use the newly compiled images directly from our project's ECR registry namespace.

- Add a new 'ci-publish' stage to .gitlab-ci.yml.
- Implement 4 parallel matrix trigger jobs in .gitlab/ci-images.yml (Publish CentOS, Publish Bookworm, Publish Alpine, and Publish Windows) to run automatically after their respective build jobs succeed.
- Each trigger calls the DataDog/public-images pipeline, passing the corresponding internal ddbuild ECR image as source and targeting public Docker Hub as destination under the exact same tag.

- Update all occurrences of bookworm-8 and shared-ext-8 to bookworm-9 and shared-ext-9 globally across .gitlab CI test generators, .gitlab/ci-images.yml, and .github workflows.
- Update BOOKWORM_VERSION from 8 to 9 in tooling/bin/build-debug-artifact to ensure local debug builds pull and compile with the new version.

- Export MAKEFLAGS=-j at the top of build-extensions.sh.
- This forces all underlying make invocations triggered by pecl install (including the heavy single-threaded gRPC, MongoDB, and parallel builds) to compile in parallel, drastically reducing build times on multi-core runner environments.
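
To illustrate the MAKEFLAGS change, here is a minimal sketch of how an exported MAKEFLAGS reaches child processes. It is not the actual contents of build-extensions.sh, which this PR text does not show. The `child_flags` and `cpus` variables and the bounded `-j<N>` variant are assumptions added for illustration; the commit itself uses a bare `-j`, which places no limit on the number of make jobs.

```shell
#!/bin/sh
# Hypothetical excerpt, not the real build-extensions.sh.
# make reads MAKEFLAGS from the environment, so exporting it once affects
# every make process spawned later (for example by pecl install).
export MAKEFLAGS=-j

# Any child process inherits the exported variable.
child_flags=$(sh -c 'printf %s "$MAKEFLAGS"')
echo "child processes see MAKEFLAGS=$child_flags"

# Bounded alternative (an assumption, not what the commit does):
# cap the job count at the number of online CPUs.
cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "bounded alternative: MAKEFLAGS=-j$cpus"
```

A bounded value may be worth considering if unlimited parallelism exhausts memory on smaller runners, though nothing in this PR reports that problem.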
- Remove obsolete CI_REGISTRY, CI_REGISTRY_USER, and CI_REGISTRY_TOKEN from .gitlab/ci-images.yml.
- Remove all complex, dynamic ECR/Docker Hub login shell blocks and AWS CLI installations from CentOS, Alpine, Bookworm, and Windows build jobs.
- Rely entirely on the runner's native, pre-configured credentials for registry.ddbuild.io, significantly simplifying the pipeline configuration.

- Clean up dockerfiles/ci/README.md to document the new automated, secure internal ECR build flow.
- Clarify that project collaborators no longer need to configure Personal Access Tokens (PATs) or credentials when building CI images.
- Document how to trigger the manual sync to the public Docker Hub registry via downstream triggers in the 'ci-publish' stage.
228828a to ba9d133
The image list (PHP versions and tags) is derived from the docker-compose.yml
+ .env files in each dockerfiles/ci/<os>/ dir (single source of truth).
.gitlab/generate-ci-images.php renders .gitlab/ci-images.yml.tpl, emitting per
Linux OS:
- <OS> build: one matrix job over PHP version; 'docker buildx bake --no-cache --pull --push' builds both arches (x-bake platforms from compose) on the amd64 runner's managed ci builder and pushes a multi-arch manifest to registry.ddbuild.io
- <OS> publish:<v>: manual mirror to Docker Hub via DataDog/public-images, dependency-free (just syncs whatever is in the internal registry)
Static preamble + Windows jobs live in .gitlab/ci-images.static.yml (Windows
is single-arch). The generator runs in generate-templates and is triggered as
a child pipeline via the manual 'ci-images' job; the old .gitlab/ci-images.yml
local include is removed.
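
The build step described above can be sketched as a small wrapper around the bake command quoted in the commit message. This is an illustration only: the `bake_images` function, the `DRY_RUN` switch, and the `php-8.4` target name are hypothetical and do not come from the generated pipeline. The wrapper defaults to printing the command so it can be tried without a Docker daemon.

```shell
#!/bin/sh
# Hypothetical helper: build and push the images defined in one OS directory.
# With DRY_RUN=1 (the default here) the command is printed, not executed.
bake_images() {
  dir=$1
  shift
  # bake reads its targets from the existing compose file; remaining
  # arguments are target names (compose services).
  set -- docker buildx bake --file "$dir/docker-compose.yml" \
    --no-cache --pull --push "$@"
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Illustrative target name; real names come from docker-compose.yml.
bake_images dockerfiles/ci/bookworm php-8.4
```

Setting `DRY_RUN=0` would run the command for real, which requires a configured buildx builder and push access to the target registry.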
ba9d133 to 8d5b7d5
Point the php-8.5 image at the 8.5.8RC1 RC sources (php-8.5_bookworm tracks the latest 8.5.x). Revert to a distributions/ tarball once 8.5.8 ships GA (~2 Jul 2026); that only requires updating phpTarGzUrl + phpSha256Hash.
bake delegates the compile to the managed "ci" buildx builder instance, so the job pod only orchestrates and doesn't need 8 CPU / 16Gi. Master set no KUBERNETES_* on these jobs either — fall back to cluster defaults. MAKE_JOBS (builder compile parallelism) is kept, pinned to a literal 8 since it no longer derives from KUBERNETES_CPU_LIMIT.
Drop comments that referenced earlier in-branch states (per-arch + manifest fuse design, master's KUBERNETES_* settings, the old MAKE_JOBS derivation) and fix the generator docblock that still said manifest / per-service publish. Comments now describe only the current state.
Add a short 'How it works' overview (source of truth, generator, buildx-bake multi-arch build, public-images mirror) and how-to sections for adding/updating a PHP version and the Docker Hub UNAUTHORIZED publish gotcha, while keeping the local-build instructions.
The bookworm-9 CI images are rebuilt with 'pecl install parallel' (latest, >= 1.2.14), so the workaround that reinstalled the fixed parallel over the old 1.2.13 from the image is no longer needed.
The 8.5 tailcall VM crash is fixed in 8.5.8 (now built for bookworm via 8.5.8RC1), so 8.5 no longer needs excluding from the profiler language tests. The dedicated .php_language_profiler_targets anchor only existed for that exclusion and is now identical to .all_profiler_targets, so the language-test job uses that directly.
Windows images aren't built/pushed/mirrored to the internal registry.ddbuild.io/ci/dd-trace-php/dd-trace-ci (only the Linux images were migrated), so pulls 404'd (manifest unknown for php-8.4_windows). Revert the Windows image refs in the tracer/package generators back to the registry.ddbuild.io/images/mirror/datadog/dd-trace-ci mirror, matching master.
Merge ci-images.static.yml into ci-images.yml.tpl: the literal preamble (stages, job templates, Windows jobs) now lives at the top of the template, above the PHP loops that generate the Linux jobs. Literal text in a .tpl is emitted verbatim, so the Windows PowerShell needs no escaping — the separate file and the file_get_contents indirection bought nothing. Generated output is unchanged.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d019ffb8eb
    "Bookworm" => "dockerfiles/ci/bookworm",
    "CentOS" => "dockerfiles/ci/centos/7",
    "Alpine" => "dockerfiles/ci/alpine_compile_extension",
];
We should also add Windows here, to avoid listing every last one of them explicitly. Should be trivial to do?
I am currently trying to onboard with ci-identities, so we can have the same workflow for those windows jobs
Fold ci-images.yml.tpl into generate-ci-images.php: the parsing/logic runs at the top, then a single `?>` drops into the literal pipeline preamble (stages, job templates, Windows jobs) followed by the PHP loops that emit the Linux jobs. One file instead of two; generated pipeline is unchanged. Docblock, emitted header comment and README updated to drop the now-gone .tpl.
…s jobs
- centos/alpine compose now pass CI_REGISTRY_IMAGE as a build arg (anchor on base, merged into php services) so PHP images build FROM the freshly-built internal base instead of the Docker Hub fallback (matches bookworm BUILD_BASE).
- Windows build + publish jobs are generated from docker-compose.yml; the build matrix now includes the windows-base-* services so they exist in the internal registry before publish. Linux + Windows publish share .image_publish.
Now that the Windows build jobs push php-*_windows to registry.ddbuild.io/ci/dd-trace-php/dd-trace-ci, point the tracer/package generators at the internal registry instead of the Docker Hub mirror, matching the Linux images. Re-applies what ce65269 reverted (the blocker - Windows not pushed to the internal registry - is fixed by this PR). The httpbin-windows and php-request-replayer-2.0-windows helper images stay on the mirror; they aren't built here.
The Windows shell runner uses the host docker config, whose ecr-login credsStore fails `list` with MissingRegion during the anonymous mcr.microsoft.com base-image pull. Give the helper a region so it stops erroring, leaving the rest of the host config (incl. ambient registry.ddbuild.io auth) untouched.
…ing from mirror
The Windows shell runners have no working registry.ddbuild.io creds (host docker config only has the ECR cred helper, which fails with NoCredentialProviders). Drop the dead-end AWS_REGION workaround and add the CI Identities id_tokens to the Windows build jobs, the supported auth for non-K8s runners (onboarding pending, #ci-identities). The assume-role + docker-config script wiring is deferred until it can be verified post-onboarding. Windows jobs are all manual, so they don't block the pipeline. Until internal Windows images exist, revert the tracer/package generators to consume Windows images from the Docker Hub mirror (ce65269 state) to avoid 404s.
64b4e3c to 68c282a
The Windows jobs failed pulling the ECR base image with 'MissingRegion' because 'docker compose' shells out and doesn't forward the runner identity to the registry credential helpers. Replace the docker-compose build + push two-step (and the docker-compose.exe download) with a single 'docker buildx bake --no-cache --pull --push', which runs through the docker CLI so the ECR credential helper is used as configured. bake reads the existing docker-compose.yml, so the targets are unchanged.
…d.io
Roll the Windows build back to docker-compose (revert the bake attempt) and:
* Scope Docker auth to a job-local DOCKER_CONFIG with only a ci-identities credHelper for registry.ddbuild.io. The ECR 'MissingRegion' came from Docker calling list on the default credsStore (ecr-login) during compose init, regardless of any Dockerfile FROM address (a runner-level config issue).
* Pass --build-arg CI_REGISTRY_IMAGE so the Dockerfile FROM (ARG default datadog/dd-trace-ci) matches the registry.ddbuild.io image: push target that the inherited global CI_REGISTRY_IMAGE selects.
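
A job-local Docker config of the kind described above might look like the following sketch. It is written in POSIX shell for illustration, whereas the real Windows jobs presumably use PowerShell. The helper name `ci-identities` in the JSON is an assumption: Docker resolves a credHelpers entry named X to an executable called `docker-credential-X`, and the actual helper binary name is not given in this PR text.

```shell
#!/bin/sh
# Use a throwaway config directory so the host's default credsStore
# (ecr-login in the failure described above) is never consulted.
DOCKER_CONFIG=$(mktemp -d)
export DOCKER_CONFIG

# Only registry.ddbuild.io gets a credential helper; there is deliberately
# no global "credsStore" key. The helper name is hypothetical.
cat > "$DOCKER_CONFIG/config.json" <<'EOF'
{
  "credHelpers": {
    "registry.ddbuild.io": "ci-identities"
  }
}
EOF

cat "$DOCKER_CONFIG/config.json"
```

Because the docker CLI honors the DOCKER_CONFIG environment variable, any docker or docker-compose command started afterwards in the same job reads this file instead of the host configuration.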
Now that the Windows build pushes php-*_windows to registry.ddbuild.io/ci/dd-trace-php/dd-trace-ci, point the Windows tracer test and package-build jobs at the internal images directly (like the Linux jobs), instead of the Docker Hub mirror round-trip. The httpbin-windows and request-replayer helper images are not built here, so they stay on the mirror.
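
The resulting split between internal images and mirrored helper images can be summarized with a short sketch. The `image_ref` function and the variable names are hypothetical, and treating the two helper names as tags of the same repository is an assumption; the two registry paths and the `php-8.4_windows` tag are taken from the commit messages above.

```shell
#!/bin/sh
INTERNAL=registry.ddbuild.io/ci/dd-trace-php/dd-trace-ci
MIRROR=registry.ddbuild.io/images/mirror/datadog/dd-trace-ci

# Map an image tag to the registry path the generators would reference.
image_ref() {
  case "$1" in
    # Helper images are not built by this pipeline, so they stay on the mirror.
    httpbin-windows|php-request-replayer-2.0-windows)
      echo "$MIRROR:$1" ;;
    *)
      echo "$INTERNAL:$1" ;;
  esac
}

image_ref php-8.4_windows
image_ref httpbin-windows
```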
Description
When building CI docker images, this PR changes the process to:
- registry.ddbuild.io (Datadog internal container registry)
- public-images downstream job to magically sync those images to Docker Hub for usage with GitHub CI and external contributors
- docker-compose.yml + .env files
- docker-compose.yml too (previously hand-written)
So how do I build images?
- ci-images job in the GitLab CI Pipeline
Wins
Reviewer checklist