fix: sanitize all invalid characters in checksum filenames #2886

Open
s3onghyun wants to merge 1 commit into go-task:main from s3onghyun:fix-checksum-filename-regex

Description

normalizeFilename (in internal/fingerprint) turns a task name into the on-disk checksum/timestamp filename, replacing invalid characters with -. It uses:

var checksumFilenameRegexp = regexp.MustCompile("[^A-z0-9]")

A-z is the classic range bug: it spans ASCII 65 (A) to 122 (z), which includes the six characters between Z and a: [ \ ] ^ _ `. Those are not sanitized. The intent (per the comment) was clearly to keep only [A-Za-z0-9].

Concrete impact: when a task name contains one of these characters (notably \, a path separator on Windows), the character leaks into the checksum/timestamp file path, which can corrupt the file location and break sources:/generates: up-to-date detection for that task.

Fix

[^A-z0-9] → [^A-Za-z0-9].

Test

Extended TestNormalizeFilename with cases for the leaked characters (\, _, [, ], ^, `). Fails before, passes after.

Note: for any existing task whose name contains one of these characters, the normalized filename changes once, so the first run after upgrade recomputes the checksum and treats the task as not up to date. This is expected and harmless.

normalizeFilename used the regexp [^A-z0-9], but A-z is the ASCII range
65-122, which includes the six characters [ \ ] ^ _ ` between Z and a.
Those were left unsanitized in the on-disk checksum/timestamp filename, so
a task name containing e.g. a backslash could corrupt the file path and
break up-to-date detection. Use [^A-Za-z0-9] as intended.