refactor: extract payload builders and tracing into reusable modules #1038
Open
Extract evaluation reporting logic into dedicated modules for better
code organization, reusability, and separation of concerns:

- Add _eval_tracing.py: EvalTracingManager class that encapsulates all
  OpenTelemetry tracing logic for evaluation runs including parent trace
  creation, eval run traces, and evaluator span management
- Add _payload_builders package:
  - BasePayloadBuilder: Abstract base class with shared utilities for
    GUID conversion, usage extraction from spans, completion metrics,
    and request spec building
  - CodedPayloadBuilder: Handles coded agent evaluation payloads with
    string IDs and /coded/ endpoint suffix
  - LegacyPayloadBuilder: Handles legacy (low-code) agent payloads with
    GUID conversion and assertionRuns format

These modules provide reusable abstractions that can be used to simplify
the StudioWebProgressReporter and enable easier testing and maintenance
of the evaluation reporting functionality.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
- `EvalTracingManager` class that encapsulates OpenTelemetry tracing logic for evaluation runs
- `_payload_builders` package with an abstract base class and concrete implementations for coded and legacy evaluations

Why This Refactoring Was Needed
The `StudioWebProgressReporter` class had grown to over 1200 lines, with significant code duplication between coded and legacy evaluation handling. This refactoring:
- Improves maintainability: Separates concerns into focused modules, with tracing logic in `_eval_tracing.py` and payload building in `_payload_builders/`
- Enables reusability: The new abstractions can be used independently
- Reduces duplication: Shared utilities like GUID conversion, usage extraction, and completion metrics building are now in a single base class
- Facilitates testing: Smaller, focused classes are easier to unit test in isolation

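To make the split concrete, here is a minimal sketch of a shared base class with a coded and a legacy builder. It is not the code from this PR: only the class names, the `/coded/` suffix, and the `assertionRuns` key come from the description. The method names (`to_guid`, `format_id`, `endpoint`, `build`), the `evalRunId` and `evaluatorRuns` keys, and the `uuid5` fallback for non-GUID IDs are assumptions made for illustration.

```python
import uuid
from abc import ABC, abstractmethod
from typing import Any


class BasePayloadBuilder(ABC):
    """Shared helpers; subclasses choose the ID format and payload shape."""

    # Hypothetical attribute: subclasses override it to change the endpoint.
    endpoint_suffix = ""

    @staticmethod
    def to_guid(value: str) -> str:
        """Return `value` normalized as a GUID string.

        Assumption: IDs that are not already GUIDs are mapped to a
        deterministic uuid5. The PR does not say how conversion is done.
        """
        try:
            return str(uuid.UUID(value))
        except ValueError:
            return str(uuid.uuid5(uuid.NAMESPACE_OID, value))

    def endpoint(self, base: str) -> str:
        """Append the builder-specific suffix to a base path."""
        return base.rstrip("/") + self.endpoint_suffix

    @abstractmethod
    def format_id(self, raw_id: str) -> str:
        """Convert an ID into the form the target endpoint expects."""

    @abstractmethod
    def build(self, eval_run_id: str, results: list[dict[str, Any]]) -> dict[str, Any]:
        """Build the request payload for one evaluation run."""


class CodedPayloadBuilder(BasePayloadBuilder):
    endpoint_suffix = "/coded/"

    def format_id(self, raw_id: str) -> str:
        return raw_id  # coded agents keep string IDs unchanged

    def build(self, eval_run_id: str, results: list[dict[str, Any]]) -> dict[str, Any]:
        # "evaluatorRuns" is a hypothetical key name.
        return {"evalRunId": self.format_id(eval_run_id), "evaluatorRuns": results}


class LegacyPayloadBuilder(BasePayloadBuilder):
    def format_id(self, raw_id: str) -> str:
        return self.to_guid(raw_id)  # legacy payloads carry GUIDs

    def build(self, eval_run_id: str, results: list[dict[str, Any]]) -> dict[str, Any]:
        return {"evalRunId": self.format_id(eval_run_id), "assertionRuns": results}


if __name__ == "__main__":
    for builder in (CodedPayloadBuilder(), LegacyPayloadBuilder()):
        print(builder.endpoint("api/evalRun"), builder.build("eval-1", [{"score": 1.0}]))
```

With this shape, a reporter can hold a single `BasePayloadBuilder` reference and stay unaware of which agent type it is reporting for, and each builder can be unit tested without any HTTP or tracing setup.
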
New Modules
`_eval_tracing.py`
- `EvalTracingManager`: Manages OpenTelemetry tracing for evaluation runs, including parent trace creation, eval run traces, and evaluator span management

`_payload_builders/`
- `BasePayloadBuilder`: Abstract base class with shared utilities for GUID conversion, usage extraction from spans, completion metrics, and request spec building
- `CodedPayloadBuilder`: Handles coded agent evaluation payloads with string IDs and `/coded/` endpoint suffix
- `LegacyPayloadBuilder`: Handles legacy (low-code) agent payloads with GUID conversion and `assertionRuns` format

Test plan
- `uv run just lint`
- `uv run mypy src/uipath/_cli/_evals/`
- `uv run just build`

🤖 Generated with Claude Code