Skip to content

Conversation

@cristian-groza
Copy link
Contributor

@cristian-groza cristian-groza commented Dec 19, 2025

Key Changes

  • New analyze_files tool: Internal tool for analyzing file attachments with configurable analysis types
  • Job attachments module: Extracted job attachment handling logic into dedicated job_attachments.py module with improved organization
  • Job attachment wrapper: New wrapper class for managing file attachments in agent workflows
  • Internal tool factory: Factory pattern implementation for creating and managing internal tools
  • Dependency updates: Updated uipath and jsonschema-pydantic-converter package versions

Technical Highlights

JSON Schema Converter

The converter solves a critical challenge with dynamically generated Pydantic models by:

  • Creating a shared pseudo-module (jsonschema_pydantic_converter._dynamic) in sys.modules that acts as a namespace for all dynamically created types
  • Recursively collecting and registering all nested BaseModel types with proper module attribution
  • Ensuring get_type_hints() can resolve forward references when models are used with external libraries like LangGraph

Why this was needed: Dynamically generated Pydantic models need proper module references for type introspection to work correctly. Without this, frameworks that rely on typing.get_type_hints() (like LangGraph) fail to resolve forward references in nested models, breaking validation and serialization.

@cristian-groza cristian-groza force-pushed the feat/add-analyze-file-tool branch from 58d7fc8 to f31733d Compare December 22, 2025 09:32
@cristian-groza cristian-groza changed the title WIP add analyze file tool feat: add a new internal analyze_files tool with supporting infrastructure for file attachment handling Dec 22, 2025
@cristian-groza cristian-groza force-pushed the feat/add-analyze-file-tool branch from f91813c to 958edcc Compare December 22, 2025 12:15
Comment on lines +20 to +21
class AnalyzeFileTool(StructuredToolWithOutputType, ToolWrapperMixin):
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is ok for now, but we will have to revise this at some point.

  1. It doesn't provide a nice way to compose functionalities (like file handling & static args)
  2. It risks other code relying on isinstance(AnalyzeFileTool). In my design, systems would only search of Mixins

"""Agent Graph state for standard loop execution."""

messages: Annotated[list[AnyMessage], add_messages] = []
job_attachments: Annotated[dict[str, Attachment], add_job_attachments] = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this is ok for now but we will likely change our internal state management in the near future

Comment on lines 301 to 305
results = []
for json_path in json_paths:
expr = parse(json_path)
matches = expr.find(data)
results.extend([match.value for match in matches])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if some of the paths overlap?
ie: $.attachments[*] vs $attachments[0]

In that case we will have duplicate matches in the result. Is that a problem?
If we assume we will get disjoint paths (which is the case for the paths you extract earlier), we could at least make it clear in the docstring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the paths overlap the function will return duplicates. Currently this is not a problem because the get_json_paths_by_type function is producing disjoint paths by design. I will update the docstring to make this behavior clear.

Comment on lines +25 to +29
def create_model(
schema: dict[str, Any],
) -> Type[BaseModel]:
model, namespace = transform_with_modules(schema)
corrected_namespace: dict[str, Any] = {}
Copy link
Contributor

@andreitava-uip andreitava-uip Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm really not a fan of this... but there is no way around this.

  • Our InputModel is dynamically generated from json schema -> unavoidable
  • We need to dynamically extend the AgentGraphState with the InputModel, otherwise we do not have the inputs in the state -> almost unavoidable
  • AgentGraphState is then inspected by langgraph internals with get_type_hints(), which will fail to resolve dynamic types unless we do this. We have no mechanism to pass a local namespace through.

@cristian-groza cristian-groza force-pushed the feat/add-analyze-file-tool branch from 7c81f33 to abcc021 Compare December 23, 2025 09:52
@cristian-groza cristian-groza force-pushed the feat/add-analyze-file-tool branch from 6a9b2fb to 5fe66f5 Compare December 23, 2025 10:02
Copy link
Contributor

@andreitava-uip andreitava-uip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please squash commits before merging

@cristian-groza cristian-groza merged commit 316002d into main Dec 23, 2025
39 checks passed
@cristian-groza cristian-groza deleted the feat/add-analyze-file-tool branch December 23, 2025 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants