Intermittent create-workdir-failed for Docker images using system users with no home directory #1089

@plamen-bardarov

Description

Current behavior

When running a Docker-based application that defines a system user without a home directory (using --no-create-home), Guardian intermittently fails to create the working directory for lifecycle processes (such as diego-sshd or launcher).

Context

The issue arises when a Dockerfile is configured as follows:

RUN adduser --system --no-create-home --uid 1000 robot
USER 1000

Because the DesiredLRP for these actions does not explicitly set a Dir (working directory) property, Guardian appears to default to the home directory specified in /etc/passwd. If that directory does not exist and the user lacks write permissions to /home, the process fails.
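To check whether an image is affected, the lookup described above can be approximated outside of Guardian. The following is a minimal sketch, not Guardian's actual logic: it reads passwd-format text, finds the home directory for a UID, and reports whether that directory exists under a given root filesystem. The function names, the `rootfs` parameter, and the sample passwd entry are assumptions made for illustration; the real entry depends on the base image.

```python
import os


def home_dir_for_uid(passwd_text, uid):
    """Return the home directory field for `uid` in passwd-format text, or None."""
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # passwd(5) layout: name:password:UID:GID:GECOS:home:shell
        fields = line.split(":")
        if len(fields) >= 7 and fields[2] == str(uid):
            return fields[5]
    return None


def check_home(passwd_text, uid, rootfs="/"):
    """Return (home, exists) for `uid`, resolving home relative to `rootfs`."""
    home = home_dir_for_uid(passwd_text, uid)
    if home is None:
        return (None, False)
    path = os.path.join(rootfs, home.lstrip("/"))
    return (home, os.path.isdir(path))


# Illustrative passwd content only (assumption); not taken from a real image.
SAMPLE_PASSWD = """root:x:0:0:root:/root:/bin/bash
robot:x:1000:65534::/home/robot:/usr/sbin/nologin
"""
```

For example, `check_home(open("/etc/passwd").read(), 1000)` run inside the image would show whether the home directory recorded for UID 1000 is actually present.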

Error Logs

Guardian Error:

{
  "timestamp": "2025-11-12T07:16:20.429992127Z",
  "level": "error",
  "source": "guardian",
  "message": "guardian.run.exec-with-bndl.create-workdir-failed",
  "data": {
    "error": "exit status 2",
    "handle": "a65e7033-2307-4e8f-70af-d8e5",
    "path": "/tmp/lifecycle/diego-sshd",
    "session": "13323029.2"
  }
}

App Logs (CF CLI):

2025-11-07T14:35:50.64+0200 [CELL/SSHD/0] OUT failed-creating-process: exit status 2
...
2025-11-07T14:36:04.83+0200 [APP/PROC/WEB/0] OUT failed-creating-process: exit status 2

Observations

  1. Intermittency: The failure is intermittent. Sometimes the container starts successfully, while other times it fails for either /tmp/lifecycle/launcher or /tmp/lifecycle/diego-sshd.
  2. Default Behavior: It appears Guardian tries to mkdir the home directory found in /etc/passwd. If --no-create-home was used, this path usually points to a non-existent directory under /home, where a non-root user (UID 1000) typically has no write permissions.
  3. Documentation Gap: Current CF documentation mentions that Docker images must contain an /etc/passwd entry for root and that the home directory for root must be present. However, it is unclear if Cloud Foundry supports system users without home directories for non-root workloads.
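To see which lifecycle binary is hit and how often (observation 1), the Guardian log can be tallied by the `path` field of the failure entries. This is a rough sketch that assumes the log is available as one JSON object per line with the same fields as the entry shown under Error Logs; the function name is hypothetical.

```python
import json
from collections import Counter

# Message string copied from the Guardian error shown in this issue.
FAILURE_MESSAGE = "guardian.run.exec-with-bndl.create-workdir-failed"


def tally_workdir_failures(log_lines):
    """Count create-workdir-failed entries per data.path value."""
    counts = Counter()
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or non-JSON lines
        if not isinstance(entry, dict):
            continue
        if entry.get("message") != FAILURE_MESSAGE:
            continue
        data = entry.get("data")
        path = data.get("path", "<unknown>") if isinstance(data, dict) else "<unknown>"
        counts[path] += 1
    return counts
```

Passing an open log file to `tally_workdir_failures` returns a `Counter` keyed by path, for example `/tmp/lifecycle/diego-sshd` and `/tmp/lifecycle/launcher`.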

Steps to Reproduce

  1. Create a Docker-based app with a non-root system user and no home directory (adduser --system --no-create-home).
  2. Push the app to Cloud Foundry.
  3. Enable SSH for the app.
  4. Restage/Restart several times to observe the intermittent exit status 2 during the creation of the diego-sshd or launcher processes.
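Because the failure is intermittent, step 4 is easier to run in a loop. The sketch below is one possible way to do that, assuming the `cf` CLI is installed and already logged in and targeted; it uses `cf restart` and `cf logs --recent`, and the app name passed to `reproduce` is whatever name the app was pushed under. The function names are hypothetical. Since `--recent` can return lines from earlier attempts, only failure lines not seen before are counted for each attempt.

```python
import subprocess

# Marker copied from the app log lines shown in this issue.
FAILURE_MARKER = "failed-creating-process"


def failure_lines(log_text):
    """Return the log lines that contain the failure marker."""
    return [line for line in log_text.splitlines() if FAILURE_MARKER in line]


def reproduce(app_name, attempts=10):
    """Restart the app `attempts` times; return how many attempts logged new failures."""
    seen = set()
    failed_attempts = 0
    for _ in range(attempts):
        subprocess.run(["cf", "restart", app_name], check=False)
        recent = subprocess.run(
            ["cf", "logs", app_name, "--recent"],
            capture_output=True, text=True, check=False,
        )
        new = set(failure_lines(recent.stdout)) - seen
        if new:
            failed_attempts += 1
        seen |= new
    return failed_attempts
```

Calling `reproduce("my-docker-app", attempts=20)` (with a placeholder app name) gives a rough failure rate to attach to the report.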

Possible Cause

This might be a race condition or a specific logic path in Guardian. See https://github.com/cloudfoundry/guardian/blob/bc2c20cdaedbdd6082fde908909b84ab804a981b/rundmc/runrunc/execer.go#L75

Desired behavior

Docker apps should be able to run as system users that have no home directory. If that is not supported, the documentation should say so explicitly.

Affected Version

develop
