overlay: tolerate non-must-copy xattr failures during copy-up #13262

Open
drewmacneil wants to merge 1 commit into google:master from drewmacneil:drew/overlay-copyup-xattr-tolerance

Conversation

@drewmacneil
Contributor

Summary

Fixes mkdir, open(O_CREAT), and similar operations failing with EOPNOTSUPP inside the sandbox when the lower-layer filesystem exposes a non-must-copy security.* xattr (e.g. security.selinux on a container rootfs labeled by Docker/containerd on an SELinux-enabled host).

copyXattrsLocked in pkg/sentry/fsimpl/overlay/copy_up.go iterates over the lower layer's xattrs and copies each one to the upper layer. Until 4e559c2af ("gofer: Add full xattr support"), the gofer's ListXattrAt was a stub returning EOPNOTSUPP, so the function's early return short-circuited the loop and copy-up never saw real host xattrs. Since 4e559c2af, the gofer calls unix.Flistxattr on the host file, so copy-up now sees names like security.selinux. The VFS permission check (pkg/sentry/vfs/permissions.go:317-324) rejects writes to security.* xattrs other than security.capability with EOPNOTSUPP, and the previously unconditional return err in the loop propagated that error all the way out to userspace. The "should not regress" claim in the commit message of 4e559c2af considered GetXattrAt for user.overlay.opaque lookups, not the new ListXattrAt exposing arbitrary host xattrs to copy-up.

This change matches Linux's fs/overlayfs/copy_up.c:ovl_copy_xattr, which tolerates per-xattr failures unless the xattr is in ovl_must_copy_xattr (POSIX ACLs and user.*). For SELinux specifically, Linux relies on the LSM to re-label the upper file independently rather than copying the xattr verbatim — gVisor's tmpfs upper has no SELinux LSM, so the xattr is simply absent on upper, which matches Linux behavior with SELinux disabled.
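A minimal Go sketch of the tolerance policy described above, not the actual gVisor change. The xattrStore type, copyXattrs, mustCopyPerPR, and rejectSecurity are hypothetical names made up for illustration; the must-copy set follows this PR's description, and only EOPNOTSUPP is treated as tolerable here.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
	"syscall"
)

// xattrStore is a hypothetical stand-in for one overlay layer's xattrs.
type xattrStore struct {
	attrs map[string]string
	// reject returns the error the layer gives when asked to set name,
	// or nil if the write is accepted. A nil func accepts everything.
	reject func(name string) error
}

func (s *xattrStore) set(name, value string) error {
	if s.reject != nil {
		if err := s.reject(name); err != nil {
			return err
		}
	}
	s.attrs[name] = value
	return nil
}

// copyXattrs copies every xattr from lower to upper. A failed write is
// skipped only if the xattr is not must-copy and the error is EOPNOTSUPP;
// any other failure aborts the copy. It returns the skipped names.
func copyXattrs(lower, upper *xattrStore, mustCopy func(string) bool) ([]string, error) {
	names := make([]string, 0, len(lower.attrs))
	for name := range lower.attrs {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order for the example

	var skipped []string
	for _, name := range names {
		err := upper.set(name, lower.attrs[name])
		if err == nil {
			continue
		}
		if !mustCopy(name) && errors.Is(err, syscall.EOPNOTSUPP) {
			skipped = append(skipped, name)
			continue
		}
		return skipped, fmt.Errorf("copy xattr %q: %w", name, err)
	}
	return skipped, nil
}

// mustCopyPerPR encodes the must-copy set as this PR describes it
// (POSIX ACLs and user.*). This is an assumption taken from the PR text.
func mustCopyPerPR(name string) bool {
	return name == "system.posix_acl_access" ||
		name == "system.posix_acl_default" ||
		strings.HasPrefix(name, "user.")
}

// rejectSecurity mimics the VFS check described above: writes to
// security.* other than security.capability fail with EOPNOTSUPP.
func rejectSecurity(name string) error {
	if strings.HasPrefix(name, "security.") && name != "security.capability" {
		return syscall.EOPNOTSUPP
	}
	return nil
}

func main() {
	lower := &xattrStore{attrs: map[string]string{
		"security.selinux": "system_u:object_r:container_ro_file_t:s0",
		"user.test_marker": "1",
	}}
	upper := &xattrStore{attrs: map[string]string{}, reject: rejectSecurity}

	skipped, err := copyXattrs(lower, upper, mustCopyPerPR)
	fmt.Println("skipped:", skipped, "err:", err, "upper:", upper.attrs)
}
```

The predicate is passed in as a parameter so that the must-copy set can be swapped without touching the loop.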

Repro

Triggered by any host filesystem that exposes a security.* xattr (other than security.capability) on the lower layer. SELinux labeling of container rootfs files is the common in-the-wild trigger — every file under a containerd-managed bundle on Amazon Linux 2023 carries security.selinux="system_u:object_r:container_ro_file_t:s0". Observed sentry log and strace excerpt:

copy_up.go:402] [1:1] failed to copy up xattrs because SetXattrAt failed:
                       operation not supported on transport endpoint
strace.go:602]  [1:1] node X mkdir(/app/sandbox-data/workspace, 0o777) = -1 errno=95

The user-visible symptom is mkdir(2) (or any operation that triggers copy-up of the parent) returning EOPNOTSUPP, after which the in-sandbox process typically exits because it can't create its workspace.
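To check whether a given rootfs path would hit this code path, one can list its xattrs on the host and look for security.* names other than security.capability. The sketch below is an illustrative helper, not part of this PR; listXattrs, splitNames, and rejectedSecurityNames are hypothetical names, and it assumes a Linux host (it uses syscall.Listxattr from the Go standard library).

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"strings"
	"syscall"
)

// splitNames splits the NUL-separated name list returned by listxattr(2).
func splitNames(buf []byte) []string {
	var names []string
	for _, b := range bytes.Split(buf, []byte{0}) {
		if len(b) > 0 {
			names = append(names, string(b))
		}
	}
	return names
}

// listXattrs returns the xattr names on path. It first asks for the
// required buffer size, then reads the list. A concurrent change between
// the two calls may surface as an error, which is simply returned.
func listXattrs(path string) ([]string, error) {
	size, err := syscall.Listxattr(path, nil)
	if err != nil {
		return nil, err
	}
	if size == 0 {
		return nil, nil
	}
	buf := make([]byte, size)
	n, err := syscall.Listxattr(path, buf)
	if err != nil {
		return nil, err
	}
	return splitNames(buf[:n]), nil
}

// rejectedSecurityNames returns the names in security.* other than
// security.capability, i.e. the ones the description above says the
// VFS permission check rejects with EOPNOTSUPP.
func rejectedSecurityNames(names []string) []string {
	var out []string
	for _, name := range names {
		if strings.HasPrefix(name, "security.") && name != "security.capability" {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	path := "."
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	names, err := listXattrs(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "listxattr:", err)
		os.Exit(1)
	}
	fmt.Println("all xattrs:", names)
	fmt.Println("would be rejected on copy-up:", rejectedSecurityNames(names))
}
```

Running it against a file under the container bundle on an SELinux-enabled host should show whether security.selinux is present on the lower layer.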

Test

Two regression tests in test/syscalls/linux/mount.cc:

  • OverlayfsCopyUpSkipsUnsupportedSecurityXattr — sets security.selinux and user.test_marker on a lower directory, mounts overlay with a tmpfs upper, triggers copy-up via mkdir on a child path, and asserts that mkdir succeeds, the must-copy user.test_marker was copied to upper, and the skipped security.selinux was not. Without the fix, mkdir fails with EOPNOTSUPP.
  • OverlayfsCopyUpFailsForMustCopyXattr — guards the must-copy boundary. Sets system.posix_acl_default on a lower directory and triggers copy-up against an upper that rejects POSIX ACLs (probed at setup time). Asserts copy-up still aborts as before, so the fix isn't blanket-permissive.

Verified with:

  • bazel test //test/syscalls:mount_test_runsc_systrap --test_filter="*OverlayfsCopyUp*"
  • bazel test //test/syscalls:mount_test_runsc_ptrace --test_filter="*OverlayfsCopyUp*"
  • bazel test //pkg/sentry/fsimpl/overlay:overlay_test

Assisted-by: Claude Opus 4.7

@drewmacneil drewmacneil marked this pull request as ready for review May 23, 2026 05:01
Collaborator

@ayushr2 ayushr2 left a comment

Thanks for the bug report. I can write the fix. I will add you as a co-author.

  const char kSelinuxLabel[] = "system_u:object_r:container_ro_file_t:s0";
  if (setxattr(lower_subdir.c_str(), "security.selinux", kSelinuxLabel,
               sizeof(kSelinuxLabel) - 1, 0) < 0) {
    SKIP_IF(errno == EOPNOTSUPP || errno == ENOTSUP);
Collaborator

I think this test will effectively not run in gVisor. I think tmpfs will also return EOPNOTSUPP for this:

return linuxerr.EOPNOTSUPP

Comment on lines +372 to +377
// abort the copy-up. Mirrors Linux's fs/overlayfs/util.c:ovl_must_copy_xattr.
func mustCopyXattr(name string) bool {
	return name == "system.posix_acl_access" ||
		name == "system.posix_acl_default" ||
		strings.HasPrefix(name, linux.XATTR_USER_PREFIX)
}
Collaborator

Linux's fs/overlayfs/util.c:ovl_must_copy_xattr() function actually does not include XATTR_USER_PREFIX. It includes XATTR_SECURITY_PREFIX.
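To make the difference concrete, here is a small Go sketch comparing the predicate quoted from the diff with one that follows the comment above (POSIX ACLs plus the security.* prefix). Both function names are hypothetical, and the literal prefixes "user." and "security." stand in for the linux.XATTR_USER_PREFIX and XATTR_SECURITY_PREFIX constants.

```go
package main

import (
	"fmt"
	"strings"
)

// mustCopyAsInPR reproduces the predicate from the diff under review.
func mustCopyAsInPR(name string) bool {
	return name == "system.posix_acl_access" ||
		name == "system.posix_acl_default" ||
		strings.HasPrefix(name, "user.")
}

// mustCopyAsInLinux follows the review comment: POSIX ACLs plus security.*.
func mustCopyAsInLinux(name string) bool {
	return name == "system.posix_acl_access" ||
		name == "system.posix_acl_default" ||
		strings.HasPrefix(name, "security.")
}

func main() {
	names := []string{
		"system.posix_acl_default",
		"user.test_marker",
		"security.selinux",
		"security.capability",
		"trusted.overlay.opaque",
	}
	for _, name := range names {
		fmt.Printf("%-26s PR=%-5t Linux=%t\n",
			name, mustCopyAsInPR(name), mustCopyAsInLinux(name))
	}
}
```

The two predicates agree on the POSIX ACL names and disagree on every user.* and security.* name, so under the second one a failed copy of security.selinux could not be tolerated on the basis of the predicate alone.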

			// without CAP_SYS_ADMIN; ENODATA covers an xattr that
			// disappears between ListXattrAt and GetXattrAt.
			if !mustCopyXattr(name) && (linuxerr.Equals(linuxerr.EOPNOTSUPP, err) ||
				linuxerr.Equals(linuxerr.EPERM, err) ||
Collaborator

Why do we ignore the EPERM here? Linux doesn't ignore it.

Comment on lines +2707 to +2708
    GTEST_SKIP() << "upper filesystem accepts POSIX ACLs; cannot test "
                    "must-copy failure";
Collaborator

This test is also moot in gVisor since we don't support system.posix_acl_default xattr.

Collaborator

@ayushr2 ayushr2 left a comment

#13289 supersedes this

@drewmacneil does it fix your reproducer?
