-
Notifications
You must be signed in to change notification settings - Fork 0
feat: support union and optional arguments #83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Looks really useful, thanks. @jacanchaplais are you taking this one? |
|
Thanks both @dionhaefner and @jpbrodrick89. Yep, will review this today or tomorrow. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jpbrodrick89. Thanks so much for the PR, these features are obviously useful!
I like using a UI fallback of a textbox for union types. However, since the key changes in what's rendered in the web UI is to replace specialised inputs with a textbox, and an optional type is to add a checkbox to the input container, I think we can probably avoid type checking the unions, and tracking if it could be a number, etc.
Tesseract-Core's reliance on Pydantic gives us type checking for free, and also we can just use a few simple if statements on the strings obtained from the textbox inputs. Python strings have a tonne of useful methods for checking the type of what they contain, if you tab complete str.is you'll get
isalnum, isalpha, isascii, isdecimal, isdigit, isidentifier, islower, isnumeric, isprintable, isspace, istitle, isupper
To make sure we can manage code complexity, I'd be keen to implement this feature with a smaller diff, just focusing on detecting if a union type has turned up, and adding a new "type" expected by the Jinja template of "union". Then in the Jinja template, you'd just need to add one extra if statement to create a text_input field if it's a union type, and you could inject a function into the template source that looks something like this (more explicit compared with attempting to parse as JSON and catching exceptions):
def _autocast(data: str) -> int | float | bool | str:
if data.isdigit():
return int(data)
if data.replace(".", "", 1).isdigit():
return float(data)
if data.lower() == "false":
return False
if data.lower() == "true":
return True
return dataAs for the optional type, rather than adding more UI, I think it might simpler to append "(optional)" to the title of the input field. Then if they leave the field blank, it'll be assumed that the value should be None.
As for the collections of composite types, I think your JSON parsing idea is reasonable, but I'd rather make that its own PR. I documented it as a limitation of Tesseract Streamlit at the bottom of the README, and I'd like to think about how the UX will be impacted on its own terms.
I hope that makes sense! Thanks again for the contribution, I think if we can add this feature with a smaller footprint on the codebase, it'll add a lot of value and will open us up for some great extensibility moving forward! 💙
The type hints were not being respected. NotRequired fields are no longer treated as optional.
jacanchaplais
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Jonathan, this is really useful! I've added a couple of comments along with some minor commits to reformat a couple things, but this is a lovely contribution. Once those last comments are addressed, I'm very happy to approve this for merging. Thanks for all of your work!
| ================================================================ #} | ||
| {% for field in schema %} | ||
| {# prepare display title with (optional) suffix if needed #} | ||
| {% set display_title = field.get('title', field.uid) + (' (optional)' if field.get('optional', False) else '') %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing this! Ideally Jinja templates don't contain much logic, particularly for string wrangling. I'd suggest moving this logic into the _format_field() function to modify the title field directly.
| ### Limitations | ||
|
|
||
| * **Collections of complex types**: Unions like `Model | list[Model]` or `float | list[float]` are not yet supported. These are deferred to a future release. | ||
| * **No specialized widgets**: Union fields always use text inputs. For example, `int | float` uses a text input instead of a number input with spinners. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you shift this down and merge it with the existing "Current limitations" section?
| ## 🔀 Union Type Support | ||
|
|
||
| `tesseract-streamlit` supports Pydantic union types (e.g., `int | str`, `float | None`) with automatic type inference: | ||
|
|
||
| ### Supported Union Patterns | ||
|
|
||
| * **Optional numbers**: `int | None`, `float | None` - Rendered as text inputs (not number inputs) with placeholder text "Leave blank if not needed". This allows the field to truly be empty. The input is parsed as an integer or float when provided. | ||
| * **Optional strings**: `str | None` - Rendered as text inputs with placeholder text. Leave blank to pass `None`. | ||
| * **Mixed scalar types**: `int | float`, `float | str`, `int | str | None` - Rendered as text inputs with automatic type casting. | ||
|
|
||
| ### Auto-Casting Behavior | ||
|
|
||
| Union type fields use a text input with automatic type detection: | ||
|
|
||
| 1. **Integers**: Input like `42` or `-23` is parsed as `int` | ||
| 2. **Floats**: Input like `3.14`, `-0.5`, or `1e-5` is parsed as `float` | ||
| 3. **Strings**: Any other input is treated as a string | ||
| 4. **None**: Leaving the field blank (for optional fields) passes `None` | ||
|
|
||
| **Example:** | ||
| ```python | ||
| class InputSchema(BaseModel): | ||
| threshold: float | str # Can accept 0.5 (float) or "auto" (string) | ||
| count: int | None = None # Optional integer, blank input passes None | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Appreciate the attention to keeping the docs up to date! I do think this section is a bit gratuitous, though. I think a simple sentence somewhere to say that union and optional types in InputSchema are supported would be sufficient. The rest should just be intuitive from using the web UI with the placeholder text, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another solution could be changing the "Current limitations" section to something more positive like "Supported features and limitations", since that's basically what it is anyway. Then you could add a bullet or two there, as you see fit.
| with {{ field.container }}: | ||
| {% set key_prefix = 'string' if field.type == 'string' else ('int' if field.type == 'integer' else 'number') %} | ||
| {% set label = display_title if field.type == 'string' else field.get('title', field.uid) %} | ||
| {% set default_val = field.get('default', '') if (field.type == 'string' or field.get('default') is none) else field.get('default', '') %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just ran your modified goodbyeworld and got "None" as the default text.
I think you may be treating NotRequired fields in TypedDict as thought they're optional. I've corrected this for populating NumberConstraints, but could you catch it for the "default" key in _InputField / JinjaField?
If there's no default, a NotRequired key shouldn't be present in the dictionary, rather than being set with a value of None. 😊
Description of changes
Support union and optional types (e.g.
float | str,str | None,Hobby | list[Hobby]:Resolution rules:
1. If null is in union, remove it and set is_optional=True
2. If a single type remains, return that with is_optional
3. If only int/float types remain, resolve to "number"
4. Otherwise resolve to "union" which is parsed by _autocast (tries to convert to int -> float -> string)
Added checkboxes for optional arguments if unselected then None is passed. Unfortunately, removing the input field conditionally is not possible in a jinja template. Any reason why we're using jinja templates and not pure streamlit here?
Errors now raised for unsupported types
Testing done
dummywhich is not in the OutputSchema and changedint.hobby.nametostr.hobby.name. I have no idea how tests passed previously...Added test cases to mock schemas and goodbyeworld.
Added unit tests for parsing functions.
Check errors raise for unsupported types
Manual testing.