fix(GH-2284): Put docker:dind in proper cgroup parent#4480

Open
marverix wants to merge 1 commit into actions:master from cdqag:GH-2284

@marverix marverix commented May 4, 2026

Hello,

This PR is a fix for a ticket that has been open since 2023: #2284.

The idea is to use the dockerd --cgroup-parent flag to put dind containers in the proper control groups.

Copilot AI left a comment

Pull request overview

This PR aims to address GH-2284 by configuring the Docker-in-Docker (docker:dind) daemon to place nested Docker containers under the runner Pod’s cgroup via dockerd --cgroup-parent, improving cgroupv2 resource accounting/limits behavior.

Changes:

  • Add a --cgroup-parent=... flag to the dind dockerd args.
  • Inject the Pod UID into the dind container via the Downward API (metadata.uid) for use in the cgroup parent path.
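
As a rough illustration of the second change, the snippet below shows what the `${POD_UID//-/_}` expansion in the flag is meant to produce. It is a sketch, not the PR's code: the function name `burstable_cgroup_parent` and the sample UID are made up for illustration, and it assumes the systemd cgroup driver layout that the PR's path implies. Note that `${...//...}` is shell parameter expansion rather than Kubernetes `$(VAR)` substitution, so the sketch assumes the value is computed by a shell before dockerd starts.

```shell
#!/bin/sh
# Hypothetical helper: build the systemd slice path for a Burstable Pod.
# Assumes the systemd cgroup driver naming shown in the PR's flag.
burstable_cgroup_parent() {
    pod_uid="$1"
    # In systemd slice names '-' separates hierarchy levels, so the dashes
    # in the UID are replaced with underscores. tr keeps this POSIX-sh safe.
    escaped_uid=$(printf '%s' "$pod_uid" | tr '-' '_')
    printf '/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod%s.slice\n' "$escaped_uid"
}
```

For example, `burstable_cgroup_parent "0a1b2c3d-1111-2222-3333-444455556666"` prints a path ending in `kubepods-burstable-pod0a1b2c3d_1111_2222_3333_444455556666.slice`, which could then be passed to `dockerd --cgroup-parent=...`.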

Comment on lines +118 to +125
- --cgroup-parent=/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod${POD_UID//-/_}.slice
env:
- name: DOCKER_GROUP_GID
  value: "123"
- name: POD_UID
  valueFrom:
    fieldRef:
      fieldPath: metadata.uid
- dockerd
- --host=unix:///var/run/docker.sock
- --group=$(DOCKER_GROUP_GID)
- --cgroup-parent=/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod${POD_UID//-/_}.slice
Author


@copilot apply changes based on this feedback

fstr commented May 11, 2026

This patch works, but containers started by the dind dockerd process can only read the cgroup v2 slice of the Docker sidecar container (or the pod-level resources, if you run Kubernetes 1.34+ and set pod memory limits) when --default-cgroupns-mode=host is set.

This allows the nested container created by dind to see the host's full cgroup tree and find its parent cgroup (the pod or container cgroup).

Using --default-cgroupns-mode=host leaks some information into the runner about what else runs on the system, which is useful to an attacker if the runner is compromised, so use it with care.
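
To check which situation a nested container is in, one can inspect `/proc/self/cgroup` from inside it. The sketch below assumes cgroup v2, where the file holds a single `0::<path>` line; both function names are hypothetical. With a private cgroup namespace the path is typically just `/`, whereas with `--default-cgroupns-mode=host` it typically shows the longer path that includes the Pod slice.

```shell
#!/bin/sh
# Hypothetical helper: print the cgroup v2 path from a /proc/<pid>/cgroup-style
# file. cgroup v2 entries have the form "0::<path>".
cgroup_v2_path() {
    cgroup_file="${1:-/proc/self/cgroup}"
    sed -n 's/^0:://p' "$cgroup_file"
}

# Hypothetical helper: succeed if the visible path contains a kubepods
# component, which suggests the process can see its Pod-level cgroup.
sees_pod_cgroup() {
    case "$(cgroup_v2_path "$1")" in
        *kubepods*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Running `cgroup_v2_path` with no argument inside a nested container prints the path that container sees for itself; `sees_pod_cgroup` turns that into an exit status for use in scripts.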

marverix (Author) commented

After deeper investigation, this will not work properly, for two reasons:

  • As Copilot pointed out, this only works for Burstable Pods, so it must be made more generic.
  • The second, more problematic reason is the docker:dind image itself. The image contains the /usr/local/bin/dind script, which modifies /sys/fs/cgroup; more precisely, it moves processes between cgroups. If you run a single runner (with dind), that is, let's say, all right, although I despise such hacks when dind has privileged rights. But when a number of runners, all using the same script and all with root privileges, tinker with the host's /sys/fs/cgroup, it's a nightmare. At our company we ended up forking docker:dind and building our own image that supports K8s cgroups v2.
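
The first point could in principle be addressed by deriving the parent slice from the Pod's QoS class instead of hard-coding burstable. The sketch below is one way to do that under stated assumptions: it assumes the kubelet uses the systemd cgroup driver's slice naming, and the function name and the way the QoS class reaches the script (here, a plain argument) are hypothetical, since the PR does not specify either.

```shell
#!/bin/sh
# Hypothetical helper: map a Pod QoS class and UID to a systemd slice path.
# Assumes the kubelet uses the systemd cgroup driver.
pod_cgroup_parent() {
    qos_class="$1"
    pod_uid="$2"
    # Dashes in the UID become underscores in systemd slice names.
    escaped_uid=$(printf '%s' "$pod_uid" | tr '-' '_')
    case "$qos_class" in
        Guaranteed)
            # Guaranteed Pods sit directly under kubepods.slice.
            printf '/kubepods.slice/kubepods-pod%s.slice\n' "$escaped_uid" ;;
        Burstable)
            printf '/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod%s.slice\n' "$escaped_uid" ;;
        BestEffort)
            printf '/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod%s.slice\n' "$escaped_uid" ;;
        *)
            # Unknown class: report on stderr and fail rather than guess.
            echo "unknown QoS class: $qos_class" >&2
            return 1 ;;
    esac
}
```

This covers only the path naming. It does not address the second point about the dind script moving processes between cgroups.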

I don't see much of an option other than GitHub ARC going in the same direction: shipping its own dind image with a modified dockerd-entrypoint and dind.

After these modifications, all of the OOM issues on our nodes stopped. Previously it was "WTF" all the time.
