@gspencergoog (Collaborator) commented Jan 22, 2026

Description

This updates the evals to handle client-side functions and adds more test cases for newer features in the spec.

Summary of Changes

This pull request significantly upgrades the v0.9 evaluation framework by integrating client-side function handling and broadening the test suite to cover advanced specification features. The changes ensure that the system can correctly interpret and validate complex UI messages involving function calls, improving both the robustness and expressiveness of the UI generation capabilities.

Highlights

  • Client-Side Function Support: Implemented support for client-side functions within the evaluation system, enabling more dynamic UI generation.
  • Expanded Test Cases: Added new prompt examples to cover recent specification features, including client-side validation, usage of standard functions like pluralize, openUrl actions, and deeply nested UI layouts.
  • Dynamic Function Validation: Enhanced the Validator to dynamically load and validate standard function calls by parsing the standard_catalog.json schema, improving maintainability.
  • Recursive Function Call Validation: Introduced a new recursive validation mechanism to correctly process and validate nested function calls within UI messages.
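To make the recursive validation idea concrete, here is a minimal sketch, not the code from this PR. It assumes a hypothetical message shape in which a function call is any object with a string `call` field and an `args` object; the real v0.9 schema may differ. The names `collectFunctionCallErrors`, `Json`, and `countItems` are invented for illustration.

```typescript
// Hypothetical JSON value type; not taken from the PR.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Walks an arbitrary message tree and records every call whose name is not
// in `knownFunctions`. ASSUMPTION: a call looks like { call: "name", args: {...} }.
function collectFunctionCallErrors(
  node: Json,
  knownFunctions: ReadonlySet<string>,
  errors: string[],
  path: string = "$",
): void {
  if (Array.isArray(node)) {
    node.forEach((item, i) =>
      collectFunctionCallErrors(item, knownFunctions, errors, `${path}[${i}]`),
    );
    return;
  }
  if (node === null || typeof node !== "object") {
    return; // Primitives cannot contain function calls.
  }
  const callName = node["call"];
  if (typeof callName === "string" && !knownFunctions.has(callName)) {
    errors.push(`Unknown function '${callName}' at ${path}`);
  }
  // Recurse into every property so calls nested inside arguments are checked too.
  for (const [key, value] of Object.entries(node)) {
    collectFunctionCallErrors(value, knownFunctions, errors, `${path}.${key}`);
  }
}

// Usage: `pluralize` is known, the nested `countItems` (hypothetical) is not.
const known = new Set<string>(["pluralize"]);
const errors: string[] = [];
collectFunctionCallErrors(
  { text: { call: "pluralize", args: { count: { call: "countItems", args: {} } } } },
  known,
  errors,
);
// errors: ["Unknown function 'countItems' at $.text.args.count"]
```

Collecting errors into a shared array, rather than throwing on the first failure, lets a single pass report every unknown call in a message.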

Changelog

  • specification/v0_9/eval/src/prompts.ts
    • Updated existing promptText examples to clarify component usage, such as explicitly mentioning 'List' components, 'Text' children for buttons, empty labels for CheckBoxes, and valid URLs for Video components.
    • Added new promptText entries to test client-side validation, the pluralize standard function, the openUrl client-side action, and deeply nested UI component structures.
    • Ensured explicit definition of referenced component IDs in prompt descriptions.
  • specification/v0_9/eval/src/validator.ts
    • Introduced a private standardFunctions: Set<string> property to store names of recognized standard functions.
    • Modified the constructor to parse standard_catalog.json and dynamically populate standardFunctions, including a warning if no functions are loaded.
    • Added a new private recursive method validateFunctionCalls(root: any, errors: string[]) to traverse the message structure and identify/validate function calls.
    • Integrated validateFunctionCalls into the main validation flow of the run method.
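The catalog-loading step in validator.ts could look roughly like the sketch below. It is illustrative only: the catalog shape (a top-level `functions` object keyed by function name) is an assumption, as are the names `CatalogSketch`, `loadFunctionNames`, and `exampleFn`; the actual layout of standard_catalog.json is not shown in this PR description.

```typescript
// ASSUMED catalog shape for illustration; the real standard_catalog.json may differ.
interface CatalogSketch {
  functions?: Record<string, unknown>;
}

// Parses catalog JSON text and returns the set of declared function names.
// Warns (rather than throws) when nothing is found, mirroring the warning
// behavior described in the changelog.
function loadFunctionNames(catalogJson: string): Set<string> {
  const catalog = JSON.parse(catalogJson) as CatalogSketch;
  const names = new Set<string>(Object.keys(catalog.functions ?? {}));
  if (names.size === 0) {
    console.warn("No standard functions loaded from catalog.");
  }
  return names;
}

// Usage with an inline stand-in for the catalog file; `exampleFn` is hypothetical.
const sampleCatalog = JSON.stringify({ functions: { pluralize: {}, exampleFn: {} } });
const standardFunctions = loadFunctionNames(sampleCatalog);
```

Deriving the set from the catalog means a function added to the spec is recognized by the validator without a matching code change.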

gemini-code-assist[bot]

This comment was marked as resolved.
