Commit 14f716e

RFC 140: Runner Host Mounts
Signed-off-by: Aria <me@aria.rip>
1 parent 66dc7ff commit 14f716e

1 file changed

+76
-0
lines changed

140-runner-host-mounts/proposal.md

@@ -0,0 +1,76 @@
# Summary

This proposal introduces a mechanism through which tasks can request files, directories, or
block devices to be mounted from the host the worker is running on into the task's container.

The worker must be explicitly configured to allow tasks to mount a given location. We will make clear
that this is a last-resort option, for use when other mechanisms are unsuitable.

# Motivation

When a task requires access to a device on the host, such as a GPU, the device files that the
kernel creates must be made available in the task's container.

For instance, to use an AMD GPU within a container, the container needs access to these paths from the host:

- `/dev/kfd` - GPU compute interface
- `/dev/dri` - directory containing an interface for each GPU

Currently, it is not possible to run these workloads using Concourse.

# Proposal

## Worker configuration option `--allowed-host-mounts`

Add a new configuration option to the worker command, `--allowed-host-mounts`.
The value should be a regular expression. Any host path that the expression fully matches will be *allowed* to be
mounted into a task's container if it is requested.

This option can be specified only once, but can use `|` to match separate paths as required.

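As a minimal sketch of what "fully matches" could mean in practice, the Go snippet below anchors the operator-supplied expression before compiling it. The helper name `compileFullMatch` and the example pattern are hypothetical and not part of this proposal; Go's standard `regexp` package has no dedicated full-match method, so the anchors are added by hand.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileFullMatch (hypothetical helper) wraps the expression so that it must
// match the entire path. The non-capturing group keeps a top-level `|` inside
// the anchors, so `a|b` means "exactly a or exactly b".
func compileFullMatch(expr string) (*regexp.Regexp, error) {
	return regexp.Compile(`^(?:` + expr + `)$`)
}

func main() {
	// Hypothetical value for --allowed-host-mounts.
	allowed, err := compileFullMatch(`/dev/kfd|/dev/dri(/.*)?`)
	if err != nil {
		panic(err)
	}
	for _, p := range []string{"/dev/kfd", "/dev/dri/card0", "/dev/kfd2", "/etc/passwd"} {
		fmt.Printf("%-16s allowed=%v\n", p, allowed.MatchString(p))
	}
}
```

Note that a permissive pattern such as `/dev/dri(/.*)?` would also match `/dev/dri/../sda`. Whether paths are normalised before matching is not specified here, so operators may prefer to list exact paths.
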
## Task configuration option `host_mounts`

Add a new configuration key (`host_mounts`) to the `task-config` schema, which can be used to request that paths from the host be
mounted into that task's container.

Formally:

- `host_mounts` is a list of `host-mount-config`s, defaulting to `[]`
- A `host-mount-config` is either:
  - An object, with the following keys:
    - `host` (required, non-empty): The desired path to mount from the host
    - `container` (optional): The mount's location in the container. Defaults to the same value as `host`
    - More options may be added down the line, as required.
  - A string, in one of these formats:
    - `host:container`, where `host` and `container` are strings not containing `:`, and have the meanings defined above
    - or, `host`, leaving `container` to default to `host` as above
    - This is just a shorthand syntax, similar to the one Docker and Docker Compose use.

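The string shorthand could be parsed along the following lines. This is only a sketch: the `HostMount` type and `parseShorthand` function are hypothetical names, and rejecting a trailing `:` with an empty container part is an assumption, since the proposal does not say how that case should be handled.

```go
package main

import (
	"fmt"
	"strings"
)

// HostMount is a hypothetical in-memory form of a `host-mount-config`.
type HostMount struct {
	Host      string // path on the worker's host
	Container string // mount location inside the task's container
}

// parseShorthand parses the `host` or `host:container` string form.
func parseShorthand(s string) (HostMount, error) {
	parts := strings.Split(s, ":")
	if len(parts) > 2 {
		return HostMount{}, fmt.Errorf("host mount %q: paths must not contain ':'", s)
	}
	if parts[0] == "" {
		return HostMount{}, fmt.Errorf("host mount %q: host path must be non-empty", s)
	}
	m := HostMount{Host: parts[0], Container: parts[0]} // container defaults to host
	if len(parts) == 2 {
		if parts[1] == "" {
			// Assumption of this sketch: "host:" is treated as a mistake.
			return HostMount{}, fmt.Errorf("host mount %q: container path is empty", s)
		}
		m.Container = parts[1]
	}
	return m, nil
}

func main() {
	for _, s := range []string{"/dev/kfd", "/dev/dri:/mnt/dri", "a:b:c"} {
		m, err := parseShorthand(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("host=%s container=%s\n", m.Host, m.Container)
	}
}
```
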
## New behaviour when spawning task containers

If the task's `host_mounts` list is empty, then behaviour is unchanged.

If not, then each element in the list must be validated:

- The value of the `host` key must be fully matched by the `--allowed-host-mounts` regex.
  - If the `--allowed-host-mounts` option was not specified, validation fails automatically.
- The path referred to by the `host` key must exist from the worker's perspective.
  - Permissions, writability, etc., are not checked.

If any element fails validation, the task fails with an error message explaining why.

If validation succeeds, each requested path is mounted into the task's container, using the container runtime's native feature for doing so.

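The validation rules above might be sketched in Go as follows. The function name `validateHostMounts`, its signature, and the use of an empty string to mean "option not specified" are all assumptions made for illustration; the actual worker code may be structured quite differently.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// validateHostMounts checks each requested host path against the rules above.
// allowedExpr stands in for the value of --allowed-host-mounts; the empty
// string represents "option not specified" (an assumption of this sketch).
func validateHostMounts(allowedExpr string, hosts []string) error {
	if len(hosts) == 0 {
		return nil // empty host_mounts: behaviour is unchanged
	}
	if allowedExpr == "" {
		return fmt.Errorf("host mounts requested, but --allowed-host-mounts is not set on this worker")
	}
	// Anchor the expression so that it must match the whole path.
	allowed, err := regexp.Compile(`^(?:` + allowedExpr + `)$`)
	if err != nil {
		return fmt.Errorf("invalid --allowed-host-mounts expression: %w", err)
	}
	for _, host := range hosts {
		if !allowed.MatchString(host) {
			return fmt.Errorf("host path %q is not permitted by --allowed-host-mounts", host)
		}
		// Existence only: permissions and writability are deliberately not checked.
		if _, err := os.Stat(host); err != nil {
			return fmt.Errorf("host path %q is not available on the worker: %w", host, err)
		}
	}
	return nil
}

func main() {
	// Hypothetical worker setting and task request.
	err := validateHostMounts(`/dev/kfd|/dev/dri`, []string{"/dev/kfd", "/dev/dri"})
	if err != nil {
		fmt.Println("task fails:", err)
		return
	}
	fmt.Println("all host mounts validated")
}
```

The sketch stops at the first failing element so that the error message names a specific path; collecting every failure before returning would be an equally reasonable choice.
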
# Open Questions

- If the worker is itself running in a Docker container, the mount will also need to be specified there. This will need to be made clear to users of the feature.
- Mounting directories from the host exposes users to permissions errors that are often counterintuitive. For instance, group membership is not inherited from the worker user, so if the mounted directory is not world-accessible and its user/group ID doesn't match what's specified in the container, users will get permission errors.
- As it stands, host mounts are not considered during scheduling. On heterogeneous clusters, worker tags can be used for this, but is this good enough?

# Answered Questions
71+
72+
# New Implications
73+
74+
Adding this feature allows users to run new, increasingly common, types of workloads using Concourse.
75+
Whilst it could also be used to introduce worker state, which we consider an antipattern, the intended use and limitations
76+
of this feature should discourage users from doing so unnecessarily.
