feat: add eval:inventory CLI command and reporting logic #28009

Open
ved015 wants to merge 2 commits into google-gemini:main from Gsoc26:eval-inventory

@ved015 ved015 commented Jun 18, 2026

Contributor

Summary

Adds npm run eval:inventory, a repo-local command for listing the eval cases currently defined under evals/.

This builds on the static eval analyzer by scanning *.eval.ts and *.eval.tsx files, collecting the cases it finds, and printing a stable text report grouped by policy and suite.
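As a rough illustration of the file-matching step, the sketch below filters a list of paths down to *.eval.ts and *.eval.tsx files and sorts them so the order is the same on every run. The names isEvalFile and selectEvalFiles are hypothetical. The actual discovery code in scripts/utils/eval-inventory.ts is not shown in this conversation and may differ.

```typescript
// Hypothetical helpers: the PR does not show how file names are matched.
const EVAL_FILE_PATTERN = /\.eval\.tsx?$/;

function isEvalFile(fileName: string): boolean {
  return EVAL_FILE_PATTERN.test(fileName);
}

// Keep only eval files and sort by plain string comparison so that
// repeated runs list them in the same order.
function selectEvalFiles(paths: readonly string[]): string[] {
  return paths
    .filter(isEvalFile)
    .sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
}

const found = selectEvalFiles([
  "evals/tools.eval.tsx",
  "evals/helpers.ts",
  "evals/answers.eval.ts",
  "evals/notes.eval.ts.bak",
]);
console.log(found); // [ 'evals/answers.eval.ts', 'evals/tools.eval.tsx' ]
```

Anchoring the pattern at the end of the name keeps backup files such as notes.eval.ts.bak out of the inventory.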

Details

  • Added scripts/eval-inventory-cli.ts as the command entry point.
  • Added scripts/utils/eval-inventory.ts to:
    • find eval files under evals/
    • run analyzeEvalSource for each file
    • aggregate cases and diagnostics
    • format the report by policy and suite
  • Added the eval:inventory npm script.
  • Added tests for file discovery, report formatting, grouping, diagnostics, helper names, and empty-directory behavior.
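The aggregation and formatting steps listed above could look roughly like the sketch below. It is only an illustration under stated assumptions: the EvalCase shape, the formatInventory name, the example policy names, and the indentation of the report are all hypothetical. Only the grouping by policy and suite comes from this PR description.

```typescript
// All names below are illustrative assumptions, not the PR's actual types.
interface EvalCase {
  name: string;
  policy: string;
  suite?: string; // a case defined outside any suite has no suite name
}

const NO_SUITE = "(no suite)";

// Plain string comparison keeps the ordering independent of the locale.
function compareText(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

function formatInventory(cases: readonly EvalCase[]): string {
  // policy -> suite -> case names
  const byPolicy = new Map<string, Map<string, string[]>>();
  for (const c of cases) {
    const suites = byPolicy.get(c.policy) ?? new Map<string, string[]>();
    byPolicy.set(c.policy, suites);
    const suiteName = c.suite ?? NO_SUITE;
    const names = suites.get(suiteName) ?? [];
    suites.set(suiteName, names);
    names.push(c.name);
  }

  // Sort at every level so the report does not depend on discovery order.
  const lines: string[] = [];
  for (const policy of Array.from(byPolicy.keys()).sort(compareText)) {
    lines.push(`${policy}:`);
    const suites = byPolicy.get(policy)!;
    for (const suite of Array.from(suites.keys()).sort(compareText)) {
      lines.push(`  ${suite}`);
      for (const name of suites.get(suite)!.slice().sort(compareText)) {
        lines.push(`    - ${name}`);
      }
    }
  }
  return lines.join("\n");
}

const report = formatInventory([
  { name: "reads a file", policy: "USUALLY_PASSES", suite: "file tools" },
  { name: "answers directly", policy: "ALWAYS_PASSES" },
  { name: "edits a file", policy: "USUALLY_PASSES", suite: "file tools" },
]);
console.log(report);
```

Sorting policies, suites, and case names separately is one way to get a report that stays byte-for-byte identical when files are discovered in a different order.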

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • MacOS
      • npm run
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
      • npm run
      • npx
      • Docker
    • Linux
      • npm run
      • npx
      • Docker

@gundermanc pls have a look

@ved015 ved015 requested review from a team as code owners June 18, 2026 09:48
@github-actions github-actions Bot added the size/l (A large sized PR) label Jun 18, 2026

github-actions Bot commented Jun 18, 2026

📊 PR Size: size/L

  • Lines changed: 416
  • Additions: +412
  • Deletions: -4
  • Files changed: 5

@gemini-code-assist

Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new development utility for the evaluation framework, enabling developers to easily audit and inventory existing test cases. By providing a CLI command to aggregate and categorize evaluation tests, it enhances visibility into the test suite's structure and policy classifications. Additionally, the PR includes necessary stability improvements to the test infrastructure and comprehensive unit tests for the new reporting logic.

Highlights

  • New CLI Tooling: Added a new npm run eval:inventory command to generate a comprehensive report of all evaluation test cases.
  • Inventory Aggregation: Implemented logic to scan for .eval.ts and .eval.tsx files, perform static AST analysis, and group results by policy and suite.
  • Test Infrastructure Fix: Updated scripts/tests/test-setup.ts to use asynchronous Vitest mocks, resolving a critical mock resolution issue.
  • Diagnostic Reporting: Improved output formatting by converting absolute diagnostic paths to relative paths for better readability.
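The path conversion mentioned in the last highlight can be sketched as follows. This is a minimal example under assumptions: the Diagnostic shape and the formatDiagnostic name are hypothetical, and path.posix is used only so the example prints the same text on every platform. The PR's own implementation is not shown here.

```typescript
import * as path from "node:path";

// Hypothetical diagnostic shape; the analyzer's real type is not shown in the PR.
interface Diagnostic {
  file: string; // absolute path to the eval file
  message: string;
}

// Render a diagnostic with its file path relative to the repository root.
// path.posix keeps the output identical across platforms for this example.
function formatDiagnostic(rootDir: string, d: Diagnostic): string {
  const relativeFile = path.posix.relative(rootDir, d.file);
  return `${relativeFile}: ${d.message}`;
}

const line = formatDiagnostic("/repo", {
  file: "/repo/evals/tools.eval.ts",
  message: "could not resolve policy",
});
console.log(line); // evals/tools.eval.ts: could not resolve policy
```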

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces an eval inventory CLI tool and helper utilities to scan, aggregate, and format reports on eval source files, along with comprehensive unit tests. Feedback on the changes highlights an issue in the custom sort comparator within scripts/utils/eval-inventory.ts, where a violation of the reflexivity rule of Array.prototype.sort when comparing identical '(no suite)' values could lead to unstable sorting behavior.
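To make the comparator concern concrete, here is a hedged sketch of the kind of bug described and one possible fix. Both functions are hypothetical reconstructions, since the actual code in scripts/utils/eval-inventory.ts is not shown in this conversation. Placing '(no suite)' last is also an assumption.

```typescript
const NO_SUITE = "(no suite)";

// Inconsistent: when both arguments are "(no suite)" this returns 1, not 0,
// so an element does not compare as equal to itself.
function brokenCompare(a: string, b: string): number {
  if (a === NO_SUITE) return 1;
  if (b === NO_SUITE) return -1;
  return a < b ? -1 : a > b ? 1 : 0;
}

// Consistent: equal values are handled before the special case is applied.
function compareSuites(a: string, b: string): number {
  if (a === b) return 0;
  if (a === NO_SUITE) return 1; // assumption: the unnamed suite sorts last
  if (b === NO_SUITE) return -1;
  return a < b ? -1 : 1;
}

console.log(brokenCompare(NO_SUITE, NO_SUITE)); // 1
console.log(compareSuites(NO_SUITE, NO_SUITE)); // 0

const sorted = ["zeta", NO_SUITE, "alpha", NO_SUITE].sort(compareSuites);
console.log(sorted); // [ 'alpha', 'zeta', '(no suite)', '(no suite)' ]
```

Checking for equality first means the special-case branches only run when exactly one side needs special placement, which keeps the comparator consistent for Array.prototype.sort.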

Comment thread scripts/utils/eval-inventory.ts
@gemini-cli gemini-cli Bot added the status/need-issue (Pull requests that need to have an associated issue.) label Jun 18, 2026