Update evals to handle client side functions #537
+101
−7
Description
This updates the evals to handle client-side functions and adds more test cases for newer features in the spec.
Summary of Changes
This pull request significantly upgrades the v0.9 evaluation framework by integrating client-side function handling and broadening the test suite to cover advanced specification features. The changes ensure that the system can correctly interpret and validate complex UI messages involving function calls, improving both the robustness and expressiveness of the UI generation capabilities.
Highlights
- `pluralize`, `openUrl` actions, and deeply nested UI layouts.
- `Validator` to dynamically load and validate standard function calls by parsing the `standard_catalog.json` schema, improving maintainability.
Changelog
- `promptText` examples to clarify component usage, such as explicitly mentioning 'List' components, 'Text' children for buttons, empty labels for CheckBoxes, and valid URLs for Video components.
- `promptText` entries to test client-side validation, the `pluralize` standard function, the `openUrl` client-side action, and deeply nested UI component structures.
- `private standardFunctions: Set<string>` property to store names of recognized standard functions.
- `constructor` to parse `standard_catalog.json` and dynamically populate `standardFunctions`, including a warning if no functions are loaded.
- `validateFunctionCalls(root: any, errors: string[])` to traverse the message structure and identify/validate function calls.
- `validateFunctionCalls` into the main validation flow of the `run` method.
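The validator changes listed above could look roughly like the following. This is a minimal sketch, not the code from this PR. The catalog layout (a top-level `functions` object keyed by function name), the function-call encoding (an object with a string `call` field), and the class name `FunctionCallValidator` are all assumptions made for illustration. The catalog is passed in as an already-parsed object rather than read from `standard_catalog.json`, so the example stays self-contained.

```typescript
// Hypothetical catalog shape: the real standard_catalog.json layout is not
// shown in this PR summary.
type Catalog = { functions?: Record<string, unknown> };

// Hypothetical stand-in for the Validator changes described in the changelog.
class FunctionCallValidator {
  private standardFunctions: Set<string>;

  constructor(catalog: Catalog) {
    // Collect the names of all functions declared in the catalog.
    this.standardFunctions = new Set(Object.keys(catalog.functions ?? {}));
    if (this.standardFunctions.size === 0) {
      console.warn("No standard functions were loaded from the catalog.");
    }
  }

  // Recursively walk the message and record any call to an unknown function.
  validateFunctionCalls(root: any, errors: string[]): void {
    if (Array.isArray(root)) {
      for (const item of root) this.validateFunctionCalls(item, errors);
      return;
    }
    if (root === null || typeof root !== "object") return;

    // Assumption: a function call is an object with a string `call` field.
    if (typeof root.call === "string" && !this.standardFunctions.has(root.call)) {
      errors.push(`Unknown function call: ${root.call}`);
    }
    for (const value of Object.values(root)) {
      this.validateFunctionCalls(value, errors);
    }
  }
}

// Usage: one known call (pluralize) and one unknown call nested in a child.
const validator = new FunctionCallValidator({ functions: { pluralize: {} } });
const message = {
  component: "Text",
  text: { call: "pluralize", args: { count: 2 } },
  children: [
    { component: "Text", text: { call: "notARealFunction", args: {} } },
  ],
};
const errors: string[] = [];
validator.validateFunctionCalls(message, errors);
```

Passing an `errors` array into the traversal, as the signature in the changelog suggests, lets the caller collect every unknown call in one pass instead of stopping at the first failure.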