feat: Add evaluations support to ManagedAgent.run() #153
Draft
jsonbailey wants to merge 1 commit into jb/aic-2174/langchain-graph-runner
Conversation
Wire judge evaluations into ManagedAgent.run() via an asyncio.Task, mirroring ManagedModel.run(). Awaiting result.evaluations guarantees both evaluation and tracker.track_judge_result() complete. run() returns immediately; the evaluations task resolves asynchronously.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- Wire judge evaluations into ManagedAgent.run() via asyncio.Task, mirroring ManagedModel.run() (PR 7 / PR 8); see the sketch below
- run() returns immediately; await result.evaluations guarantees both evaluation and tracker.track_judge_result() complete
- Evaluation calls ai_config.evaluator.evaluate(input, content); with Evaluator.noop() it resolves to an empty list
- Failed results (success=False) do NOT call track_judge_result()
- Depends on
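For reference, a minimal sketch of the pattern, assuming the behavior described above: run() schedules the evaluations as an asyncio.Task and returns immediately, and awaiting result.evaluations drives both evaluation and tracking to completion. Names cited in this PR (ManagedAgent.run, result.evaluations, tracker.track_judge_result, ai_config.evaluator.evaluate, Evaluator.noop, success) are kept as-is; everything else (JudgeResult, AIConfig, AgentRunResult, the underlying agent call) is an illustrative assumption, not the actual implementation.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional


@dataclass
class JudgeResult:  # hypothetical shape of a single evaluation result
    name: str
    success: bool
    score: Optional[float] = None


class Evaluator:
    def __init__(self, fn: Callable[[str, str], Awaitable[List[JudgeResult]]]):
        self._fn = fn

    async def evaluate(self, input: str, content: str) -> List[JudgeResult]:
        return await self._fn(input, content)

    @staticmethod
    def noop() -> "Evaluator":
        # The no-op evaluator resolves to an empty list.
        async def _noop(_input: str, _content: str) -> List[JudgeResult]:
            return []
        return Evaluator(_noop)


@dataclass
class AIConfig:  # hypothetical container for the configured evaluator
    evaluator: Evaluator


@dataclass
class AgentRunResult:
    content: str
    # Awaiting this task guarantees both evaluation and tracking completed.
    evaluations: "asyncio.Task[List[JudgeResult]]"


class ManagedAgent:
    def __init__(self, ai_config: AIConfig, tracker) -> None:
        self.ai_config = ai_config
        self.tracker = tracker

    async def _run_evaluations(self, input: str, content: str) -> List[JudgeResult]:
        results = await self.ai_config.evaluator.evaluate(input, content)
        for result in results:
            # Failed results (success=False) do NOT call track_judge_result().
            if result.success:
                self.tracker.track_judge_result(result)
        return results

    async def run(self, input: str) -> AgentRunResult:
        content = await self._invoke(input)  # stand-in for the real agent call
        # Schedule evaluations on the running loop; run() returns immediately
        # rather than awaiting them, mirroring ManagedModel.run().
        task = asyncio.create_task(self._run_evaluations(input, content))
        return AgentRunResult(content=content, evaluations=task)

    async def _invoke(self, input: str) -> str:
        return f"echo: {input}"  # placeholder for the underlying agent invocation
```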
Test plan
- All tests pass (uv run pytest packages/sdk/server-ai/tests/)
- New TestManagedAgentEvaluations tests cover: run returns before evaluations resolve, collect results, tracking fires on await, noop evaluator returns empty list, failed results not tracked (a sketch of the first case follows)

🤖 Generated with Claude Code
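A hedged sketch of the first test case, assuming the sketch above is importable and pytest-asyncio is installed. TestManagedAgentEvaluations is the class name from the test plan; RecordingTracker and the slow judge stub are hypothetical test doubles, not part of the PR.

```python
import asyncio

import pytest

# JudgeResult, Evaluator, AIConfig, ManagedAgent come from the sketch above.


class RecordingTracker:
    """Hypothetical test double that records tracked judge results."""

    def __init__(self):
        self.tracked = []

    def track_judge_result(self, result):
        self.tracked.append(result)


class TestManagedAgentEvaluations:
    @pytest.mark.asyncio
    async def test_run_returns_before_evaluations_resolve(self):
        async def slow_judge(_input, _content):
            await asyncio.sleep(0.05)  # simulate a slow judge model
            return [JudgeResult(name="relevance", success=True, score=1.0)]

        tracker = RecordingTracker()
        agent = ManagedAgent(AIConfig(evaluator=Evaluator(slow_judge)), tracker)

        result = await agent.run("hello")
        # run() returned while the judge is still pending, and nothing
        # has been tracked yet.
        assert not result.evaluations.done()
        assert tracker.tracked == []

        # Awaiting the task guarantees evaluation and tracking completed.
        results = await result.evaluations
        assert [r.success for r in results] == [True]
        assert tracker.tracked == results
```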