Skip to content

t: add greplint.pl and convert grep to test_grep#2135

Open
mmontalbo wants to merge 6 commits into
gitgitgadget:masterfrom
mmontalbo:mm/test-grep-docs
Open

t: add greplint.pl and convert grep to test_grep#2135
mmontalbo wants to merge 6 commits into
gitgitgadget:masterfrom
mmontalbo:mm/test-grep-docs

Conversation

@mmontalbo

@mmontalbo mmontalbo commented Jun 2, 2026

Copy link
Copy Markdown

test_grep is a wrapper around grep for test assertions that prints
the file contents on failure for easier debugging. Bare grep fails
silently, making it hard to diagnose what went wrong.

This series converts existing bare grep assertions to test_grep and
adds greplint.pl to prevent new ones from being introduced.

Patch 1 documents test_grep in t/README.

Patch 2 fixes three greps missing file arguments (t2402, t7507,
t7700). They were reading empty stdin and passing vacuously.

Patch 3 extracts chainlint's Lexer, ShellParser, and ScriptParser
into a shared module (lib-shell-parser.pl) so greplint.pl can
reuse the same tokenizer. No functional change to chainlint.

Patch 4 fixes a latent line-counting bug in scan_dqstring where
newlines from $() bodies inside double-quoted strings were counted
twice. This does not affect chainlint (which uses byte offsets)
but matters for greplint.pl's line-number reporting.

Patch 5 converts existing assertion greps to test_grep, including
sourced test helpers. Greps used as data filters or on files
that may not exist are left unconverted with lint-ok annotations.

Patch 6 adds greplint.pl with test fixtures (modeled on chainlint/)
and wires it into the Makefile as test-greplint and check-greplint.

Changes since v2:

  • t3420-rebase-autostash: dropped the change to the '! grep dirty
    file3' line under 'rebase --quit'. As SZEDER pointed out, file3
    never exists in the conflicted state, so that grep was passing only
    because it could not open the file. SZEDER's fix
    (sg/t3420-do-not-grep-in-missing-file, now in 'next') replaces the
    line with 'test_path_is_missing file3', which is the right check;
    this series simply leaves that line to his fix.

  • Audited the remaining '# lint-ok' annotations for the same "grep a
    file that never exists with correctly running Git" gotcha, as Junio
    suggested. The rule the audit applies: 'grep' becomes 'test_grep'
    only where its exit code is the assertion; grep that produces data
    (a filter) or that reads a file whose presence is conditional stays
    a plain 'grep', because test_grep BUGs on a missing file.

    • t5537 (.git/shallow): the file is still present after the repack
      (the client stays shallow), so the assertion is converted to
      'test_grep !' like any other; the "may not exist" note was wrong.

    • t1400 (.git/packed-refs): the file exists only with the files
      backend. Guarded the packed-refs check with a REFFILES
      prerequisite; the backend-agnostic 'git show-ref' check that
      follows still runs under every backend.

    • t7450 (squatting-clone/d/a/git~2): kept as '! grep' with an
      improved '# lint-ok'. 'git~2' is the NTFS 8.3 short name of a
      planted '..git' decoy and only exists when 8.3 short-name
      generation is enabled. Verified on a Windows VM: with 8.3
      disabled (the modern default) the short name is absent, the '!
      grep' correctly tolerates it, and a plain test_grep would BUG.
      So this one deliberately stays a missing-file-tolerant grep.

    • t5326 and t5702 remain annotated: these are genuine data filters
      (grep produces data that is redirected/captured, not an
      assertion).

  • Rebased onto master. The only conflict was a one-line codespell
    typo fix that upstream applied to chainlint.pl in the meantime; it
    now lives in the extracted lib-shell-parser.pl.

Known limitation / follow-up:

Assertions like grep pattern file >/dev/null and
grep pattern <file are not converted because greplint.pl
treats any redirect as a filter, so it does not flag them.
The former could be converted, but test_grep prints the
matching line on success just as grep does, so the >/dev/null
would have to be kept or dropped as a judgment call. The
latter requires turning the <file redirect into a positional
file argument, since test_grep reads a named file rather than
stdin. Both are left as bare grep. A follow-up series can
address these once a convention is established.

Cc: "D. Ben Knoble" ben.knoble@gmail.com, Eric Sunshine sunshine@sunshineco.com, SZEDER Gábor szeder.dev@gmail.com

@mmontalbo mmontalbo force-pushed the mm/test-grep-docs branch 30 times, most recently from 5d5366a to 9354adf Compare June 3, 2026 18:05
@mmontalbo mmontalbo force-pushed the mm/test-grep-docs branch 6 times, most recently from 1f7855c to 937f800 Compare June 11, 2026 17:25
@mmontalbo mmontalbo changed the title t: add check-grep.pl and convert grep to test_grep t: add greplint.pl and convert grep to test_grep Jun 11, 2026
@mmontalbo mmontalbo force-pushed the mm/test-grep-docs branch 5 times, most recently from 45e09e3 to 14114d7 Compare June 11, 2026 22:32
@gitgitgadget

gitgitgadget Bot commented Jun 13, 2026

Copy link
Copy Markdown

There are issues in commit fd9dc7e:
ci: trigger rebuild

  • Commit checks stopped - the message is too short
  • Commit not signed off

@mmontalbo

Copy link
Copy Markdown
Author

/preview

@gitgitgadget

gitgitgadget Bot commented Jun 13, 2026

Copy link
Copy Markdown

Preview email sent as pull.2135.v2.git.1781318996.gitgitgadget@gmail.com

@mmontalbo

Copy link
Copy Markdown
Author

/submit

@gitgitgadget

gitgitgadget Bot commented Jun 13, 2026

Copy link
Copy Markdown

Submitted as pull.2135.v2.git.1781323575.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2135/mmontalbo/mm/test-grep-docs-v2

To fetch this version to local tag pr-2135/mmontalbo/mm/test-grep-docs-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2135/mmontalbo/mm/test-grep-docs-v2

@gitgitgadget

gitgitgadget Bot commented Jun 13, 2026

Copy link
Copy Markdown

Michael Montalbo wrote on the Git mailing list (how to reply to this email):

On Mon, Jun 8, 2026 at 2:36 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> I do not think we want an automated tool that rewrites the source
> files.  I was hoping that we would get a patch or two that _adds_ to
> existing test-lint framework (i.e., 'test-grep' that 'test-lint'
> target depends on in t/Makefile) that gives diagnosis in a similar
> fashion as test-lint-shell-syntax and test-chainlint do.
>
> Also some existing uses of "grep" are not end-user facing and should
> not be rewritten to "test_grep".

Apologies for not responding explicitly before sending a re-roll, I
will do that in
the future. v2 attempts to address these points as noted in the "changes"
section.

@gitgitgadget

gitgitgadget Bot commented Jun 16, 2026

Copy link
Copy Markdown

This branch is now known as mm/test-grep-lint.

@gitgitgadget

gitgitgadget Bot commented Jun 16, 2026

Copy link
Copy Markdown

This patch series was integrated into seen via git@80b9fc2.

mmontalbo added 6 commits July 2, 2026 13:07
test_grep is a wrapper around grep for test assertions that prints
the file contents on failure for easier debugging.  It also accepts
'!' as its first argument for negation, which preserves the
diagnostic output that '! test_grep' would suppress.

Despite being widely used (and the preferred replacement for bare
grep in assertions), test_grep has no entry in t/README alongside
the other documented helpers like test_cmp and test_line_count.
Add one.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Three grep assertions were missing their file arguments, causing
them to read from empty stdin instead of the intended file:

- t2402: '! grep ...' should read from 'out', matching the
  grep on the preceding line.
- t7507: the closing quote is in the wrong place, making the
  entire 'diff --git actual' a single pattern with no file
  argument instead of pattern 'diff --git' and file 'actual'.
- t7700: '! grep ...' should read from 'packlist', matching
  the redirect on the preceding line.

Without file arguments these greps always succeed (empty stdin
matches nothing), so the assertions were not actually checking
anything.  All three tests pass with the corrected file arguments,
confirming the intended behavior is sound.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Move chainlint.pl's Lexer, ShellParser, and ScriptParser into a
shared module (lib-shell-parser.pl) so other lint tools can reuse
the same shell parsing infrastructure.  A subsequent commit adds
greplint.pl, which needs the same tokenizer to correctly identify
command boundaries.

ScriptParser's check_test() becomes a no-op in the shared module.
chainlint.pl defines ChainlintParser (extending ScriptParser)
with the &&-chain check_test() implementation.

No functional change: chainlint produces the same output and
check-chainlint self-tests pass.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
scan_dqstring's post-loop newline counter re-counts newlines that
were already counted during recursive parsing of $() bodies.  This
happens because scan_dollar returns text containing newlines (from
multi-line command substitutions), and the catch-all counter at the
end of scan_dqstring counts all of them again.

Fix this by counting newlines inline as non-special characters are
consumed, and removing the post-loop catch-all.  Each newline is
now counted exactly once: literal newlines at the inline match,
line splices at the backslash handler, and $() newlines by
scan_token during the recursive parse.

This is a latent bug: any consumer that relies on token line
numbers rather than byte offsets would get incorrect results for
tokens following a multi-line $() inside a double-quoted string.
chainlint is not affected because it annotates the original body
text using byte offsets, not token line numbers.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Replace bare grep with test_grep in test assertions across the
suite, including sourced test helpers (lib-*.sh, *-tests.sh).
test_grep prints the contents of the file being searched on
failure, making debugging easier than a bare grep which fails
silently.

Only assertion-style greps are converted: grep used as a filter
in pipelines, command substitutions, conditionals, or with
redirected I/O is left as-is with a "# lint-ok" annotation.
Existing '! test_grep' calls are rewritten to 'test_grep !' so
that the diagnostic output is preserved on failure.

test_grep requires the file it reads to exist, so '! grep'
assertions that inspect a file whose presence is conditional need
care.  In t5537 the '.git/shallow' file is still present after the
repack (the client remains shallow), so the assertion is
converted like any other.  In t1400 the '.git/packed-refs' file
exists only with the files backend, so its check is guarded with a
REFFILES prerequisite; the backend-agnostic 'git show-ref' check
that follows still runs under every backend.  In t7450 'git~2' is
the NTFS 8.3 short name of a decoy '..git' file and only exists
when 8.3 short-name generation is enabled; that '! grep' tolerates
the missing file on purpose, so it is left as-is with a
"# lint-ok" annotation rather than converted (a plain test_grep
would BUG when the short name is absent).

The conversion was generated using a grep-assertion linter
(greplint.pl, added in the following commit) to identify bare
grep calls at command position.  To reproduce, from the t/
directory:

    # Step 1: annotate the two data-filter greps (grep produces
    # data, not a verdict) so the linter skips them.
    sed -i '/grep -vf before commits\.raw/s/$/ # lint-ok: data filter/' \
        t5326-multi-pack-bitmaps.sh
    sed -i '/grep -E "^\[0-9a-f\].*|| :/s/$/ # lint-ok: data filter/' \
        t5702-protocol-v2.sh

    # Step 1b: two '! grep' assertions need more than a mechanical
    # conversion; handle them by hand before the linter-driven steps
    # below so it leaves them alone.
    #
    # t1400: '.git/packed-refs' is absent under reftable, so guard the
    # check with REFFILES (a plain test_grep would BUG on the missing
    # file):
    #
    #      git update-ref -d HEAD $B &&
    #  -   ! grep "$m" .git/packed-refs &&
    #  +   if test_have_prereq REFFILES
    #  +   then
    #  +           test_grep ! "$m" .git/packed-refs
    #  +   fi &&
    #      test_must_fail git show-ref --verify -q $m
    #
    # t7450: git~2 is an NTFS 8.3 short name that exists only when
    # short-name generation is enabled, so this stays a missing-file-
    # tolerant '! grep'; add a comment plus "# lint-ok" so the linter
    # skips it.

    # Step 2: reorder pre-existing '! test_grep' to 'test_grep !'
    # (must come before steps 3-4 so greplint does not see them)
    sed -i 's/! test_grep/test_grep !/' t0031-lockfile-pid.sh
    sed -i 's/! test_grep/test_grep !/' t5300-pack-object.sh
    sed -i 's/! test_grep/test_grep !/' t5319-multi-pack-index.sh

    # Step 3: convert '! grep' -> 'test_grep !'
    perl greplint.pl *.sh 2>&1 | cut -d: -f1,2 |
    while IFS=: read f l; do
        sed -i "${l}s/! *grep/test_grep !/" "$f"
    done

    # Step 4: convert remaining 'grep' -> 'test_grep'
    perl greplint.pl *.sh 2>&1 | cut -d: -f1,2 |
    while IFS=: read f l; do
        sed -i "${l}s/grep/test_grep/" "$f"
    done

To verify, run: make -C t test-greplint

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Without a lint guard, bare grep assertions will creep back into
tests over time, defeating the previous commit's conversion.

Add greplint.pl to catch bare 'grep' used as a test assertion
(where 'test_grep' should be used) and '! test_grep' (where
'test_grep !' should be used).

greplint.pl reuses the shared shell parser from lib-shell-parser.pl
to tokenize test bodies.  The Lexer collapses heredocs, command
substitutions, and quoted strings into single tokens, so 'grep'
appearing inside these contexts is not flagged.  A flat walk over
the token stream tracks command position and pipeline state to
distinguish assertion greps from filter greps.

For double-quoted test bodies, a source-line walk counts
backslash-continuation lines that the Lexer consumes without
emitting into the body text, adjusting the reported line number
accordingly.

Add test fixtures in greplint/ (modeled on chainlint/) covering
detection of bare grep assertions, correct skipping of filters,
pipelines, redirects, command substitutions, and lint-ok annotations.

Wire into the Makefile as:
  - test-greplint: runs greplint.pl on $(T) $(THELPERS) $(TPERF)
  - check-greplint: runs greplint.pl on fixtures, diffs against expected
  - clean-greplint: removes temp dir

Add eol=lf entries in t/.gitattributes for greplint fixtures,
matching chainlint, so that check-greplint passes on Windows
where core.autocrlf would otherwise cause CRLF mismatches
between expected and actual output.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
@mmontalbo

Copy link
Copy Markdown
Author

/preview

@gitgitgadget

gitgitgadget Bot commented Jul 3, 2026

Copy link
Copy Markdown

Preview email sent as pull.2135.v3.git.1783054200.gitgitgadget@gmail.com

@mmontalbo

Copy link
Copy Markdown
Author

/submit

@gitgitgadget

gitgitgadget Bot commented Jul 3, 2026

Copy link
Copy Markdown

Submitted as pull.2135.v3.git.1783054466.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2135/mmontalbo/mm/test-grep-docs-v3

To fetch this version to local tag pr-2135/mmontalbo/mm/test-grep-docs-v3:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2135/mmontalbo/mm/test-grep-docs-v3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant