.NET: Python: Add dotnet integration test report to CI by giles17 · Pull Request #5515 · microsoft/agent-framework

giles17 · 2026-04-27T15:06:32Z

Motivation and Context

Add visibility into dotnet integration test results across CI runs, mirroring the Python integration test report already in place.

Description

Changes

*Workflow: `dotnet-build-and-test.yml`*

Add `--report-junit` flag to the integration test step to generate JUnit XML alongside TRX
Add `--results-directory ../IntegrationTestResults/` to centralize integration test output (separate from unit test results)
Upload JUnit XML artifacts from each matrix leg (`net10.0/ubuntu`, `net472/windows`)
Add new `dotnet-integration-test-report` job that aggregates results, generates a trend report, and posts to Job Summary

*Script: `python/scripts/flaky_report/aggregate.py`*

Refactor XML file discovery to support both pytest (`pytest.xml`) and xunit (`*.junit.xml`) layouts via new `_discover_xml_files()` function
Handle dotnet artifact naming convention (`dotnet-test-results-{framework}-{os}`)
Fix nodeid collision when the same test runs under multiple frameworks by qualifying keys with provider
Improve module extraction for dotnet C# classnames (recognizes IntegrationTests/UnitTests namespace segments)
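
As a rough illustration of the last two items, the sketch below derives a provider label from an artifact directory name and pulls a module name out of a C# classname. The function names `derive_provider` and `extract_module`, the fallback behaviour, and the sample classnames are assumptions made for this example; they are not taken from the actual `aggregate.py`. The `net10.0 (ubuntu)` label format follows the example given in a later commit message on this PR.

```python
ARTIFACT_PREFIX = "dotnet-test-results-"


def derive_provider(artifact_dir: str) -> str:
    """Map 'dotnet-test-results-{framework}-{os}' to '{framework} ({os})'."""
    if not artifact_dir.startswith(ARTIFACT_PREFIX):
        # Assumption: non-dotnet artifacts keep their directory name as-is.
        return artifact_dir
    # Split on the first hyphen after the prefix: framework, then OS.
    framework, _, os_name = artifact_dir[len(ARTIFACT_PREFIX):].partition("-")
    return f"{framework} ({os_name})" if os_name else framework


def extract_module(classname: str) -> str:
    """Return the namespace up to an IntegrationTests/UnitTests segment."""
    parts = classname.split(".")
    # Only namespace segments are inspected; the final segment is the class.
    for i, part in enumerate(parts[:-1]):
        if part.endswith(("IntegrationTests", "UnitTests")):
            return ".".join(parts[: i + 1])
    # Assumption: otherwise fall back to the namespace without the class name.
    return ".".join(parts[:-1]) or classname
```

With these hypothetical helpers, `dotnet-test-results-net472-windows` would be labelled `net472 (windows)`, and a classname such as `Contoso.Agents.IntegrationTests.ChatClientTests` would be grouped under `Contoso.Agents.IntegrationTests`.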

Architecture

    dotnet-test (net10.0/ubuntu) ──┐
                                   ├──> dotnet-integration-test-report
    dotnet-test (net472/windows) ──┘      (downloads JUnit XML artifacts,
                                           runs aggregate.py, posts to
                                           Job Summary, caches history)

The report job:

Is not in the merge gate (`dotnet-build-and-test-check` doesn't depend on it)
Only runs on non-PR events (matching when integration tests actually execute)
Uses a separate cache key (`dotnet-integration-report-history-`) from the Python report

Key design decisions

*JUnit XML via --report-junit* instead of parsing TRX — xunit v3 supports native JUnit XML generation, allowing reuse of the existing Python report script
Integration-only — unit tests run on a different cadence (PRs vs merges), mixing would make trends incomparable
Separate history stream — distinct cache key prevents dotnet/Python history from interleaving
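
To make the first decision concrete, here is a minimal sketch of reading JUnit XML with the standard library, which is what lets one script consume both pytest and xunit output. The function name `parse_junit`, the `classname::name` nodeid format, and the three outcome labels are assumptions for illustration; the real `aggregate.py` may structure its records differently.

```python
import xml.etree.ElementTree as ET


def parse_junit(xml_text: str) -> list[dict]:
    """Extract one record per <testcase> element from JUnit XML text."""
    root = ET.fromstring(xml_text)
    results = []
    for case in root.iter("testcase"):
        # A <failure> or <error> child marks a failed test, <skipped> a skip.
        if case.find("failure") is not None or case.find("error") is not None:
            outcome = "failed"
        elif case.find("skipped") is not None:
            outcome = "skipped"
        else:
            outcome = "passed"
        results.append(
            {
                # Assumed nodeid format: "<classname>::<name>".
                "nodeid": f"{case.get('classname', '')}::{case.get('name', '')}",
                "outcome": outcome,
            }
        )
    return results
```

Because both toolchains emit the same `testsuite`/`testcase` structure, a parser of this shape does not need to know which framework produced the file.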

Checklist

I've read the contributing guidelines
I've verified the CI workflows are syntactically correct
I've tested aggregate.py locally with mixed Python + dotnet JUnit XML data
I've confirmed collision handling works (same test on net10.0 and net472)

github-actions

Automated Code Review

Reviewers: 4 | Confidence: 90% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Design Approach

Automated review by giles17's agents

Copilot

Pull request overview

This PR adds CI visibility for .NET integration test outcomes by publishing JUnit XML from the existing dotnet-test matrix legs and reusing the existing Python trend-aggregation script to generate a Job Summary report with cached history.

Changes:

Update the .NET integration test step to emit JUnit XML into a dedicated IntegrationTestResults/ directory and upload those XML files as per-matrix artifacts.
Add a new dotnet-integration-test-report job that downloads the artifacts, aggregates them into a trend report, posts it to the GitHub Actions Job Summary, and caches history.
Refactor python/scripts/flaky_report/aggregate.py to discover both pytest.xml and *.junit.xml, derive dotnet “provider” labels, and avoid nodeid collisions across providers.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `python/scripts/flaky_report/aggregate.py` | Extends report discovery/parsing to support dotnet xUnit JUnit XML and multi-provider collision handling. |
| `.github/workflows/dotnet-build-and-test.yml` | Generates/uploads dotnet integration JUnit XML and adds a reporting job to aggregate and publish a trend report. |

Copilot · 2026-04-27T15:12:01Z

+      run:
+        working-directory: python
+    steps:
+      - uses: actions/checkout@v6


In this workflow, other checkout steps explicitly set persist-credentials: false to avoid leaving the GITHUB_TOKEN credentials in the local git config. This new job’s checkout doesn’t set it, so credentials will persist by default. Consider adding with: persist-credentials: false here for consistency and to reduce the blast radius of any later step that might invoke git.

Suggested change

-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v6
+        with:
+          persist-credentials: false

Copilot · 2026-04-27T15:12:02Z

+            # Use provider-qualified key when the same test runs under
+            # multiple providers (e.g. dotnet net10.0 vs net472).  This
+            # prevents later results from silently overwriting earlier ones.
+            raw_id = test["nodeid"]
+            key = raw_id
+            if key in combined_results and combined_results[key]["provider"] != provider:
+                # Collision: re-key existing entry and use qualified key for new one
+                existing = combined_results.pop(key)
+                combined_results[f"{existing['provider']}::{raw_id}"] = existing
+                key = f"{provider}::{raw_id}"
+            elif f"{provider}::{raw_id}" in combined_results:
+                # Provider-qualified key already exists (previous collision)
+                key = f"{provider}::{raw_id}"
+


The nodeid collision handling here is order-dependent when the same raw_id appears under 3+ providers in a single run. After the first collision, the unqualified raw_id key is removed, so a later provider can reintroduce the unqualified key (because key in combined_results is false), which makes run-to-run keys unstable and can skew trend history. Consider always keying results as f"{provider}::{raw_id}" (or track a collided_raw_ids set and always use qualified keys once any collision is detected for that raw_id).
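
The first option suggested here, always using the provider-qualified key, can be sketched as follows. The function name `combine_results` and the input shape (an iterable of `(provider, tests)` pairs) are assumptions for this example and do not mirror the actual loop in `aggregate.py`.

```python
def combine_results(runs):
    """Merge per-provider test lists using provider-qualified keys only.

    `runs` is an iterable of (provider, tests) pairs, where each test is a
    dict with at least a "nodeid" entry (assumed shape for this sketch).
    """
    combined = {}
    for provider, tests in runs:
        for test in tests:
            # Always qualify the key, so it never depends on parse order
            # or on how many providers ran the same test.
            key = f"{provider}::{test['nodeid']}"
            combined[key] = {**test, "provider": provider}
    return combined
```

With unconditional qualification, a test that runs under three providers yields the same three keys regardless of the order in which the files are parsed, at the cost of longer keys for tests that only ever run under one provider.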

Copilot · 2026-04-27T15:12:02Z

+        "# 🔬 Integration Test Report",
        "",
        f"*Generated: {datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M UTC')}*",


This changes the report header from “Flaky Test Report” to “Integration Test Report” for all consumers of this script, including the existing Python CI jobs that are still named “Flaky Test Report” and upload artifacts like flaky-test-report.md. To avoid confusion, either keep the original heading for the Python flaky report, or make the title configurable (e.g., via a CLI arg/env var) and set it appropriately from each workflow.

- Add --report-junit flag to dotnet integration test step to generate JUnit XML alongside TRX, with explicit --results-directory to centralize output in IntegrationTestResults/
- Upload JUnit XML artifacts from each matrix leg (net10.0/ubuntu, net472/windows) as dotnet-test-results-{framework}-{os}
- Add dotnet-integration-test-report job that downloads artifacts, runs the existing aggregate.py script, posts markdown to Job Summary, and saves trend history via actions/cache
- Refactor aggregate.py to discover JUnit XML files recursively, supporting both pytest (pytest.xml) and xunit (*.junit.xml) layouts
- Handle provider name derivation for dotnet artifact naming convention
- Fix nodeid collision when same test runs under multiple frameworks by qualifying keys with provider when collisions are detected
- Improve module extraction for dotnet C# classnames (recognizes IntegrationTests/UnitTests namespace segments)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>


Always prefix dotnet test keys with provider (e.g. net10.0 (ubuntu)::TestName) to ensure stable, comparable counts across runs regardless of file parse order. Also show Executed (passed+failed) instead of Total in summary table. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>


Dotnet tests run on multiple frameworks (net10.0, net472). Instead of one combined table with unstable totals, show separate sections per framework — each with its own summary row and per-test table. Python reports retain the original single-table format. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
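
The per-framework layout described in this commit message could look roughly like the sketch below. The function name `render_sections`, the record fields, and the exact markdown layout are assumptions for illustration and are not the script's actual output format.

```python
from collections import defaultdict


def render_sections(results: dict[str, dict]) -> list[str]:
    """Render one markdown section per provider from combined results.

    Each value is assumed to carry "provider", "nodeid" and "outcome".
    """
    by_provider = defaultdict(list)
    for entry in results.values():
        by_provider[entry["provider"]].append(entry)

    lines: list[str] = []
    for provider in sorted(by_provider):
        entries = sorted(by_provider[provider], key=lambda e: e["nodeid"])
        passed = sum(1 for e in entries if e["outcome"] == "passed")
        # Each framework gets its own summary line and per-test table.
        lines += [
            f"## {provider}",
            "",
            f"Passed: {passed}/{len(entries)}",
            "",
            "| Test | Outcome |",
            "| --- | --- |",
        ]
        lines += [f"| {e['nodeid']} | {e['outcome']} |" for e in entries]
        lines.append("")
    return lines
```

Keeping each framework in its own section means a test that is skipped on one framework does not change the totals shown for the other.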

Copilot AI review requested due to automatic review settings April 27, 2026 15:06

moonbox3 added the python label Apr 27, 2026

github-actions Bot changed the title ~~Add dotnet integration test report to CI~~ Python: Add dotnet integration test report to CI Apr 27, 2026

Copilot started reviewing on behalf of giles17 April 27, 2026 15:07

github-actions Bot reviewed Apr 27, 2026

Copilot AI reviewed Apr 27, 2026

moonbox3 added the documentation and .NET labels Apr 27, 2026

giles17 temporarily deployed to integration April 27, 2026 15:37 — with GitHub Actions Inactive

github-actions Bot changed the title ~~Python: Add dotnet integration test report to CI~~ .NET: Python: Add dotnet integration test report to CI Apr 27, 2026

giles17 had a problem deploying to integration April 27, 2026 15:37 — with GitHub Actions Failure

giles17 marked this pull request as draft April 27, 2026 15:44

giles17 and others added 2 commits April 27, 2026 08:45

chore: trigger dotnet CI for report validation

450eab4

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

giles17 force-pushed the dotnet-test-report branch from 584a2ae to 450eab4 Compare April 27, 2026 15:45

giles17 temporarily deployed to integration April 27, 2026 15:47 — with GitHub Actions Inactive

giles17 temporarily deployed to integration April 27, 2026 16:09 — with GitHub Actions Inactive

giles17 had a problem deploying to integration April 27, 2026 16:09 — with GitHub Actions Failure

fix: use .junit extension (not .junit.xml) for xunit v3 output

f48c8b3

xUnit v3 generates files with .junit extension, not .junit.xml. Update upload glob and aggregate.py discovery to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

giles17 temporarily deployed to integration April 27, 2026 16:29 — with GitHub Actions Inactive

giles17 temporarily deployed to integration April 27, 2026 16:53 — with GitHub Actions Inactive

giles17 temporarily deployed to integration April 27, 2026 19:27 — with GitHub Actions Inactive

giles17 had a problem deploying to integration April 27, 2026 19:29 — with GitHub Actions Failure

giles17 temporarily deployed to integration April 27, 2026 19:29 — with GitHub Actions Inactive

giles17 temporarily deployed to integration April 27, 2026 20:17 — with GitHub Actions Inactive

fix: match Python report summary format (Total, passed/total, etc.)

4ff5130

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

giles17 temporarily deployed to integration April 27, 2026 20:44 — with GitHub Actions Inactive

giles17 temporarily deployed to integration April 27, 2026 20:45 — with GitHub Actions Inactive

giles17 had a problem deploying to integration April 27, 2026 20:45 — with GitHub Actions Failure

giles17 temporarily deployed to integration April 27, 2026 21:01 — with GitHub Actions Inactive

giles17 temporarily deployed to integration April 27, 2026 21:41 — with GitHub Actions Inactive

giles17 temporarily deployed to integration April 27, 2026 21:43 — with GitHub Actions Inactive

giles17 temporarily deployed to integration April 27, 2026 22:07 — with GitHub Actions Inactive

giles17 wants to merge 6 commits into main from dotnet-test-report

3 participants
