validate_usr_symlinks: match type=link as a whole mtree field#2099
validate_usr_symlinks: match type=link as a whole mtree field#2099MindflareX wants to merge 1 commit into
type=link as a whole mtree field#2099Conversation
There was a problem hiding this comment.
Code Review
This pull request improves the robustness of the symlink validation logic in the AWK script by replacing naive substring matches for type=link and link= with field-bounded regex patterns. This prevents false negatives when metadata fields like uname or gname contain the string 'type=link'. Corresponding regression tests were added to verify the fix. A review comment suggests using the [[:space:]] character class consistently when extracting the link target to ensure all POSIX-defined whitespace characters are handled correctly.
| if ($0 !~ /[[:space:]]type=link([[:space:]]|$)/) { | ||
| VIOLATIONS[original_path] = original_path " is not a symlink (must link to " expected[path] ")" | ||
| } else if (match($0, / link=([^ \t]+)/, dest) && dest[1] != expected[path]) { | ||
| } else if (match($0, /[[:space:]]link=([^ \t]+)/, dest) && dest[1] != expected[path]) { |
There was a problem hiding this comment.
For consistency with the use of [[:space:]] in the type=link check on line 27, consider using [^[:space:]]+ to match the link target. This ensures that any POSIX-defined whitespace character is correctly treated as a field boundary when extracting the link destination, making the regex more robust and consistent with the preceding check.
} else if (match($0, /[[:space:]]link=([^[:space:]]+)/, dest) && dest[1] != expected[path]) {
44fbbf2 to
143878f
Compare
The aspect's awk script used `$0 !~ /type=link/` to decide whether an entry at /bin, /sbin, /lib, /lib32, /lib64, or /libx32 was a symlink. Because the pattern was a plain substring match, any mtree field value elsewhere on the line that happened to contain the literal string "type=link" (for example `uname=type=link`, `gname=type=link`, or a `flags` value spelled the same way) would satisfy the check and the aspect would silently accept a non-symlink at that path. A symmetric issue existed in the ` link=` match, which could be anchored on the prior field's trailing characters rather than a field boundary. Anchor both regexes to whitespace boundaries so the check operates on whole mtree fields, and add regression test cases that fail under the old script and pass under the new one. Signed-off-by: MindflareX <94793217+MindflareX@users.noreply.github.com>
143878f to
8fac8bd
Compare
Summary
The aspect's awk script in
private/util/validate_usr_symlinks.awkused$0 !~ /type=link/to decide whether an entry at/bin,/sbin,/lib,/lib32,/lib64, or/libx32was a symlink. Because the pattern was a plain substring match, any mtree field value elsewhere on the line that happened to contain the literal stringtype=link(e.g.uname=type=link,gname=type=link, aflagsvalue spelled the same way) would satisfy the check, and the aspect would silently accept a non-symlink at that path — breaking the merged-usr invariant that the aspect exists to enforce.A symmetric issue existed in the
link=match used to verify the symlink target, which could in principle be anchored against the trailing characters of a prior field rather than a true field boundary.This PR anchors both regexes to whitespace boundaries so they operate on whole mtree fields and adds regression tests.
Reproduction (against the unfixed script)
The script reports no violation even though
./binis a directory, not a symlink tousr/bin. With this PR:Risk assessment
For images built from the snapshots in
private/repos/deb/*.lock.jsonthis is defense-in-depth: Debian-supplied.debsdo not carryuname/gname/flagsvalues shaped liketype=link, so the bypass cannot be reached from current sources. The fix matters mainly because the aspect's stated job is to catch merged-usr-violating layers regardless of how they end up in the input tars (locally-built tars, custom packages, future non-Debian sources).Test plan
private/util/validate_usr_symlinks_test.shpasses locally (existing cases + 5 new regression cases).