CVE-2026-31431 ("Copy Fail") is a Linux kernel privilege escalation vulnerability
in the algif_aead cryptographic interface. An attacker uses AF_ALG sockets with
the authencesn algorithm and splice() to corrupt arbitrary files in the kernel
page cache — including setuid binaries like /usr/bin/su.
This document provides a zero-reboot remediation using a BPF LSM DaemonSet
that blocks all AF_ALG AEAD binds — the subsystem exploited by Copy Fail. This
prevents bypasses via crypto template nesting (e.g. pcrypt(authencesn(...))).
Other AF_ALG usage (hash, skcipher) is unaffected. Tested end-to-end on three
separate OCP 4.22 clusters.
# 1. Verify BPF LSM is enabled (RHEL CoreOS 9.8 on OCP 4.22 enables it by default)
oc debug node/<any-node> -- chroot /host cat /sys/kernel/security/lsm
# Must contain "bpf"
# 2. Deploy the namespace and grant privileged SCC
oc apply -f daemonset.yaml
# 3. DaemonSet pods will start automatically on all nodes
# 4. Verify
oc get pods -n cve-2026-31431-mitigation-ebpf # All nodes should show Running
oc logs -n cve-2026-31431-mitigation-ebpf -l app=block-copyfail
# Expected: "block-copyfail: blocker active — all AF_ALG AEAD binds blocked"
No reboots. No node drains. No pod restarts. Protection is immediate and covers all processes on all nodes (100% coverage).
- How the Exploit Works
- Confirming Vulnerability on Your Cluster
- BPF LSM DaemonSet Deployment
- Post-Deployment Verification
- Building the Image from Source
- Removal
The exploit chains three kernel features:
- AF_ALG socket: creates a userspace handle to kernel crypto via socket(AF_ALG, SOCK_SEQPACKET, 0)
- AEAD bind: binds to authencesn(hmac(sha256),cbc(aes)), a specific authenticated encryption algorithm
- splice() + sendmsg(): the kernel incorrectly performs an "in-place" operation where source and destination page mappings differ, corrupting the page cache of a read-only file
The attacker corrupts /usr/bin/su in the page cache (without write access to
the file), then executes it to gain root.
Create a new cve-2026-31431-test namespace on your cluster and run the test script by applying the manifests in the test directory:
oc apply -f test
Check the results:
oc wait pod/cve-test -n cve-2026-31431-test \
--for=jsonpath='{.status.phase}'=Succeeded --timeout=120s
oc -n cve-2026-31431-test logs -l app=cve-2026-31431-test
On a vulnerable cluster you will see:
=== CVE-2026-31431 Vulnerability Test ===
Target: /usr/bin/su
Original SHA256: 8969560ae8e6e21c6184c1451f59418822ee69dd5d946d71987b55236bbc0feb
Attempting splice + AF_ALG page-cache corruption (160 bytes in 40 chunks)...
After SHA256: 30b0f5b5a054c4df65b48ca792863bf7054b4d793f15f57163792ba6c2b151ae
PAGE CACHE CORRUPTION: YES - /usr/bin/su was modified in the page cache
Attempting to execute corrupted /usr/bin/su ...
exit code: 0
RESULT: PARTIALLY MITIGATED
Page-cache corruption succeeded (kernel is vulnerable)
Privilege escalation blocked (allowPrivilegeEscalation=false)
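The "PAGE CACHE CORRUPTION" verdict above rests on comparing a digest of the target file before and after the attempt. The following is a minimal sketch of that comparison, assuming the test script hashes the file through ordinary reads (which the kernel serves from the page cache). The helper names `sha256_of` and `page_cache_changed` are hypothetical and are not taken from the test script.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Hash a file through normal reads, so the result reflects the page cache."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large binaries do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def page_cache_changed(before, after):
    """A read-only file whose digest changed was modified behind the reader's back."""
    return before != after
```

In the vulnerable run shown above, the two digests differ, so a check like this would report a change even though the file was never opened for writing.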
Clean up the test namespace when finished:
oc delete namespace cve-2026-31431-test
The BPF LSM approach hooks socket_bind at the kernel level and blocks all
AF_ALG AEAD binds regardless of template nesting. It is based on
block-copyfail, rewritten in C with libbpf for OCP deployment.
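Why template nesting does not help an attacker can be illustrated with a small model of the decision such a hook makes. This is a sketch in Python, not the BPF program from this repository: it assumes the hook inspects only the salg_type field of struct sockaddr_alg (layout as in linux/if_alg.h) and never looks at the algorithm name. The function names are hypothetical.

```python
import errno
import struct

AF_ALG = 38  # Linux address family number for the kernel crypto API

# struct sockaddr_alg from <linux/if_alg.h>:
#   __u16 salg_family; __u8 salg_type[14]; __u32 salg_feat;
#   __u32 salg_mask;   __u8 salg_name[64];
SOCKADDR_ALG = struct.Struct("=H14sII64s")


def pack_sockaddr_alg(alg_type, alg_name):
    """Build the bytes a process would pass to bind() on an AF_ALG socket."""
    return SOCKADDR_ALG.pack(AF_ALG, alg_type.encode(), 0, 0, alg_name.encode())


def bind_decision(sockaddr):
    """Model of the hook: return 0 to allow the bind, -EPERM to deny it."""
    if len(sockaddr) < SOCKADDR_ALG.size:
        return 0  # too short to be a sockaddr_alg; not our concern
    family, salg_type, _feat, _mask, _name = SOCKADDR_ALG.unpack_from(sockaddr)
    if family != AF_ALG:
        return 0  # ordinary sockets (TCP, UNIX, ...) are untouched
    if salg_type.rstrip(b"\0") == b"aead":
        return -errno.EPERM  # every AEAD bind is denied, whatever the name
    return 0  # hash, skcipher and other types pass through
```

Because the modeled decision ignores salg_name, pcrypt(authencesn(...)) and plain authencesn(...) are treated identically, while hash and skcipher binds are allowed.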
BPF LSM must be enabled. RHEL CoreOS 9.8 (OCP 4.22) has it enabled by default. Verify with:
oc debug node/<any-node> -- chroot /host cat /sys/kernel/security/lsm
Expected output includes bpf:
lockdown,capability,landlock,yama,selinux,bpf
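If this check needs to be automated across many nodes, the comparison itself is simple. A minimal sketch, assuming the contents of /sys/kernel/security/lsm have already been collected as a string; the helper name `bpf_lsm_enabled` is hypothetical and not part of this repository.

```python
def bpf_lsm_enabled(lsm_list):
    """Return True if 'bpf' is a whole entry in a comma-separated LSM list."""
    # Split on commas so that a name merely containing "bpf" does not match.
    entries = [entry.strip() for entry in lsm_list.split(",")]
    return "bpf" in entries


if __name__ == "__main__":
    # Input copied from the expected output above.
    print(bpf_lsm_enabled("lockdown,capability,landlock,yama,selinux,bpf"))
```

Matching whole entries rather than substrings avoids a false positive from an LSM whose name only happens to contain the letters "bpf".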
If bpf is not present, a one-time MachineConfig is needed (this is the
only scenario requiring a reboot):
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-enable-bpf-lsm
spec:
  kernelArguments:
    - lsm=lockdown,capability,selinux,bpf
Create a new cve-2026-31431-mitigation-ebpf namespace, grant SCC, and deploy the DaemonSet by applying the daemonset.yaml manifest.
The privileged SCC must be granted before the DaemonSet pods are created,
otherwise pod creation will fail with SCC validation errors.
oc apply -f daemonset.yaml
oc get pods -n cve-2026-31431-mitigation-ebpf -o wide
Expected: one pod per node, all Running:
NAME READY STATUS AGE NODE
block-copyfail-2jhzf 1/1 Running 34s ci-...-master-2
block-copyfail-4dfq7 1/1 Running 34s ci-...-master-1
block-copyfail-c2ts8 1/1 Running 34s ci-...-worker-c
block-copyfail-ctblk 1/1 Running 34s ci-...-worker-a
block-copyfail-m26sx 1/1 Running 34s ci-...-worker-b
block-copyfail-xsh6d 1/1 Running 34s ci-...-master-0
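On larger clusters it may be easier to check pod state programmatically than to read the table. A minimal sketch, assuming the pod list has been exported with the `-o json` output option of `oc get pods`; the helper name `pods_not_running` is hypothetical. It reads the standard Kubernetes fields metadata.name, spec.nodeName, and status.phase.

```python
import json


def pods_not_running(pod_list_json):
    """Return (name, node, phase) for every pod whose phase is not Running."""
    pods = json.loads(pod_list_json)["items"]
    problems = []
    for pod in pods:
        phase = pod.get("status", {}).get("phase")
        if phase != "Running":
            node = pod.get("spec", {}).get("nodeName")
            problems.append((pod["metadata"]["name"], node, phase))
    return problems
```

An empty result means every listed pod reports the Running phase. It does not by itself prove that every node has a pod, so the pod count should still be compared against the node count.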
oc logs -n cve-2026-31431-mitigation-ebpf -l app=block-copyfail
Expected:
block-copyfail: blocker active — all AF_ALG AEAD binds blocked
Re-run the same exploit test from the Confirming Vulnerability section.
After deploying the BPF LSM DaemonSet, the output will be:
=== CVE-2026-31431 Vulnerability Test ===
Target: /usr/bin/su
Original SHA256: 30b0f5b5a054c4df65b48ca792863bf7054b4d793f15f57163792ba6c2b151ae
Attempting splice + AF_ALG page-cache corruption (160 bytes in 40 chunks)...
AF_ALG bind failed: [Errno 1] Operation not permitted
RESULT: CANNOT TEST - AF_ALG or splice not available/permitted
The DaemonSet logs will show the blocked attempt:
oc logs -n cve-2026-31431-mitigation-ebpf -l app=block-copyfail
block-copyfail: blocker active — all AF_ALG AEAD binds blocked
block-copyfail: BLOCKED pid=16777 comm=python3 time=2026-05-01 16:37:23
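The BLOCKED lines are regular enough to be parsed for alerting or auditing. A minimal sketch, assuming the log format is exactly the one shown in the sample line above (pid, comm, and a timestamp without timezone); the helper name `parse_blocked` is hypothetical.

```python
import re
from datetime import datetime

# Pattern derived from the sample log line above; other formats will not match.
BLOCKED_RE = re.compile(
    r"BLOCKED pid=(?P<pid>\d+) comm=(?P<comm>\S+) "
    r"time=(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)


def parse_blocked(line):
    """Return a dict with pid, comm and time for a BLOCKED line, else None."""
    match = BLOCKED_RE.search(line)
    if match is None:
        return None  # banner lines and unrelated output are skipped
    return {
        "pid": int(match["pid"]),
        "comm": match["comm"],
        "time": datetime.strptime(match["time"], "%Y-%m-%d %H:%M:%S"),
    }
```

Feeding each line of the `oc logs` output through a function like this yields one record per blocked bind attempt, which can then be counted per node or per command name.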
Run verify-algos.py on a node to confirm that all AEAD algorithms are blocked
while other AF_ALG types (hash, skcipher) continue to work:
oc debug node/<any-node> -- chroot /host python3 -c "
import socket
tests = [
    ('aead', 'gcm(aes)'),
    ('aead', 'ccm(aes)'),
    ('aead', 'rfc4106(gcm(aes))'),
    ('hash', 'sha256'),
    ('skcipher', 'cbc(aes)'),
    ('aead', 'authencesn(hmac(sha256),cbc(aes))'),
]
for t, n in tests:
    s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
    try:
        s.bind((t, n))
        print(f' ALLOWED {t}/{n}')
    except OSError as e:
        print(f' BLOCKED {t}/{n} -- {e}')
    finally:
        s.close()
"
Expected output:
BLOCKED aead/gcm(aes) -- [Errno 1] Operation not permitted
BLOCKED aead/ccm(aes) -- [Errno 1] Operation not permitted
BLOCKED aead/rfc4106(gcm(aes)) -- [Errno 1] Operation not permitted
ALLOWED hash/sha256
ALLOWED skcipher/cbc(aes)
BLOCKED aead/authencesn(hmac(sha256),cbc(aes)) -- [Errno 1] Operation not permitted
This confirms the BPF LSM blocks all AEAD binds while leaving other AF_ALG types functional.
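The same conclusion can be reached mechanically by checking each output line against the intended policy: AEAD blocked, everything else allowed. A minimal sketch, assuming the output format shown above; the helper name `policy_violations` is hypothetical.

```python
def policy_violations(output):
    """Return the lines that contradict 'aead blocked, other types allowed'."""
    violations = []
    for line in output.splitlines():
        parts = line.split()
        # Expect "<VERDICT> <type>/<name> ..."; ignore anything else.
        if len(parts) < 2 or parts[0] not in ("ALLOWED", "BLOCKED"):
            continue
        verdict, target = parts[0], parts[1]
        alg_type = target.split("/", 1)[0]
        expected = "BLOCKED" if alg_type == "aead" else "ALLOWED"
        if verdict != expected:
            violations.append(line.strip())
    return violations
```

An empty list corresponds to the expected output above. Any returned line points either at an AEAD bind that got through or at a hash or skcipher bind that was blocked unintentionally.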
The BPF LSM blocker source is in block-copyfail/:
block-copyfail/
  block_copyfail.bpf.c   # BPF kernel program (LSM hook)
  block_copyfail.c       # Userspace loader (libbpf skeleton)
  block_copyfail.h       # Shared event struct
  Makefile               # Build pipeline
  Dockerfile             # Multi-stage build
  daemonset.yaml         # Namespace + DaemonSet manifest
  trigger-test.py        # Quick validation script
Build and push:
cd block-copyfail/
podman build -t quay.io/<org>/block-copyfail:latest .
podman push quay.io/<org>/block-copyfail:latest
The Dockerfile uses a multi-stage build: Fedora with clang/bpftool/libbpf-devel for compilation, UBI 9 minimal for the runtime image (~122 MB).
Deleting the DaemonSet immediately removes the mitigation on all nodes:
oc delete -f daemonset.yaml
# or
oc delete namespace cve-2026-31431-mitigation-ebpf
The BPF program detaches automatically when the loader process exits. No reboot or pod restart is needed.