Skip to content

Python: Fix Hyperlight CodeAct span parenting#6712

Open
eavanvalkenburg wants to merge 3 commits into
microsoft:mainfrom
eavanvalkenburg:fix-hyperlight-span-parenting
Open

Python: Fix Hyperlight CodeAct span parenting#6712
eavanvalkenburg wants to merge 3 commits into
microsoft:mainfrom
eavanvalkenburg:fix-hyperlight-span-parenting

Conversation

@eavanvalkenburg

@eavanvalkenburg eavanvalkenburg commented Jun 24, 2026

Copy link
Copy Markdown
Member

Motivation & Context

CodeAct host-tool calls made from inside execute_code were losing their parent span when Hyperlight hopped onto its worker thread. That made the execution flow look flattened in OTEL backends and hid the nested relationship between execute_code and the host tool call.

Description & Review Guide

  • What are the major changes? Preserve the active OpenTelemetry context across Hyperlight's worker-thread boundary before running the sandbox callback, and add a regression test that asserts the host-tool span is a child of execute_code.
  • What is the impact of these changes? Hyperlight traces now reflect the real nested execution flow, which makes CodeAct runs easier to inspect and debug in observability tools.
  • What do you want reviewers to focus on? The context handoff in the Hyperlight worker-thread path and the span-parenting assertion in the regression test.

Related Issue

Fixes #6704

Contribution Checklist

  • The code builds clean without any errors or warnings
  • All unit tests pass, and I have added new tests where possible
  • The PR follows the Contribution Guidelines
  • The PR is linked to an issue and there is no other open PR for this issue (see Related Issue above).
  • This is not a breaking change. If it is a breaking change, add the breaking change label (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 24, 2026 15:31
@moonbox3 moonbox3 added the python Usage: [Issues, PRs], Target: Python label Jun 24, 2026
@github-actions github-actions Bot changed the title Fix Hyperlight CodeAct span parenting Python: Fix Hyperlight CodeAct span parenting Jun 24, 2026

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 5 | Confidence: 93% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Failure Modes, Design Approach


Automated review by eavanvalkenburg's agents

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes OpenTelemetry span parenting for CodeAct host-tool calls executed inside execute_code when Hyperlight crosses thread boundaries, so traces reflect the true nested execution flow.

Changes:

  • Propagate the active contextvars context into Hyperlight’s worker thread executor path.
  • Propagate the active contextvars context into the dedicated thread used to run the async sandbox callback (asyncio.run(...)).
  • Add a regression test asserting the host tool span is a child of the execute_code tool span.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
python/packages/hyperlight/agent_framework_hyperlight/_execute_code_tool.py Copies and replays contextvars context across the Hyperlight worker-thread boundary and the sandbox-callback thread boundary to preserve OTEL parent span context.
python/packages/hyperlight/tests/hyperlight/test_hyperlight_codeact.py Adds an in-memory OTEL exporter fixture and a regression test asserting correct parent/child span relationship for host tool calls inside execute_code.

Comment thread python/packages/hyperlight/tests/hyperlight/test_hyperlight_codeact.py Outdated
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

github-actions Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/hyperlight/agent_framework_hyperlight
   _execute_code_tool.py6239385%70, 177, 240, 272, 275, 310–311, 326, 328, 341, 359, 369, 394, 399, 406, 412, 420, 428–430, 432–437, 477, 482, 484, 486, 511–512, 518–519, 524–525, 544–545, 554–555, 590, 621, 627–630, 648–651, 659, 692–693, 700–701, 703, 713–714, 754–755, 762, 808, 865–871, 940, 967, 1037, 1073, 1079–1081, 1110–1114, 1118–1119, 1124, 1141–1145, 1149–1150, 1207–1208
TOTAL42156498188% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
8304 37 💤 0 ❌ 0 🔥 2m 9s ⏱️

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Usage: [Issues, PRs], Target: Python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CodeAct host tool spans should be parented under execute_code

5 participants