Analysis scope
Period: 2026-04-10 to 2026-04-24 (14 days)
Analysis run: 2026-04-24 09:39 UTC
Repository: dotnet/android
PRs analyzed: 57 merged PRs
Data sources
| Resource | Details |
|---|---|
| GitHub REST API | Merged PRs via gh pr list --state merged --limit 100 --json number,title,mergedAt,headRefName,isCrossRepository |
| Azure DevOps pipeline | Xamarin.Android-PR (definition ID 12278) on devdiv.visualstudio.com, project DevDiv |
| AZDO test runs API | GET /DevDiv/_apis/test/runs?buildUri=vstfs:///Build/Build/{buildId}&api-version=7.1 |
| AZDO test results API | GET /DevDiv/_apis/test/runs/{runId}/results?outcomes=Failed&$top=200&api-version=7.1 |
The dotnet-android pipeline on dnceng-public (dev.azure.com) was not used as the primary data source. Since #11153 (merged 2026-04-17), that pipeline runs build-only for direct team-member PRs — device test results for those PRs exist exclusively in the Xamarin.Android-PR devdiv pipeline.
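Of the 57 PRs in the window:
- 55 direct team-member PRs (branches in dotnet/android) → test data in Xamarin.Android-PR
- 2 fork PRs → dnceng-public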
For each merged PR, the script used the highest-numbered build ID for that PR in Xamarin.Android-PR (the last/final build, representing the merge commit). Build-to-PR correlation used the sourceBranch field (refs/pull/{number}/merge).
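The build selection described above can be sketched as follows. This is a minimal illustration, not the script that produced the report: the record shape (dicts with `id`, `sourceBranch`, and `result` keys) is an assumption, and the sample build IDs are made up.

```python
import re
from typing import Dict, Iterable, Mapping, Optional

# Matches the sourceBranch format named in the text: refs/pull/{number}/merge
PR_BRANCH = re.compile(r"^refs/pull/(\d+)/merge$")


def pr_number(source_branch: str) -> Optional[int]:
    """Return the PR number encoded in a sourceBranch, or None for non-PR builds."""
    match = PR_BRANCH.match(source_branch)
    return int(match.group(1)) if match else None


def final_build_per_pr(builds: Iterable[Mapping]) -> Dict[int, Mapping]:
    """Keep, for each PR, the build with the highest numeric build ID."""
    latest: Dict[int, Mapping] = {}
    for build in builds:
        number = pr_number(build["sourceBranch"])
        if number is None:
            continue  # not a PR build, e.g. refs/heads/main
        if number not in latest or build["id"] > latest[number]["id"]:
            latest[number] = build
    return latest


# Hypothetical sample records; IDs and results are illustrative only.
sample = [
    {"id": 101, "sourceBranch": "refs/pull/7/merge", "result": "failed"},
    {"id": 105, "sourceBranch": "refs/pull/7/merge", "result": "succeeded"},
    {"id": 103, "sourceBranch": "refs/pull/8/merge", "result": "failed"},
    {"id": 104, "sourceBranch": "refs/heads/main", "result": "succeeded"},
]
final = final_build_per_pr(sample)
```

Choosing the highest build ID treats IDs as monotonically increasing over time, which is the same assumption the report makes when it equates "highest-numbered" with "last".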
How AZDO auto-retry creates the flakiness signal
When a test fails in Xamarin.Android-PR, the pipeline creates a separate auto-retry run identified by (Auto-Retry) in the test run name (e.g. MSBuildDeviceIntegration On Device - macOS-3 (Auto-Retry)). Only originally-failed tests are re-executed in this run.
Critical data artifact: after an auto-retry run completes, the failedTests counter on the original test run is reset to 0 regardless of the retry outcome. As a result, querying failedTests on standard runs almost always returns 0, even when tests genuinely failed. All test failure evidence lives exclusively in the (Auto-Retry) named runs.
This analysis targets (Auto-Retry) runs and uses the result outcome of each test within those runs as the signal.
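Selecting the auto-retry runs by name might look like the sketch below. The run records are hypothetical dicts with `id` and `name` keys. Only the MSBuildDeviceIntegration run name comes from the example in the text; the WearOS run name is assumed to follow the same pattern.

```python
AUTO_RETRY_MARKER = "(Auto-Retry)"


def is_auto_retry(run_name: str) -> bool:
    """True when the test run name carries the auto-retry marker."""
    return AUTO_RETRY_MARKER in run_name


def suite_of(run_name: str) -> str:
    """Suite name is assumed to be the part before ' - macOS-N'."""
    return run_name.split(" - ", 1)[0]


# Hypothetical run list for one build.
runs = [
    {"id": 1, "name": "MSBuildDeviceIntegration On Device - macOS-3"},
    {"id": 2, "name": "MSBuildDeviceIntegration On Device - macOS-3 (Auto-Retry)"},
    {"id": 3, "name": "WearOS On Device - macOS-1 (Auto-Retry)"},
]
retry_runs = [run for run in runs if is_auto_retry(run["name"])]
```

Filtering on the name rather than on the failedTests counter sidesteps the counter reset described above.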
A PR is classified as "merged with red CI" if the devdiv Xamarin.Android-PR build's result field was "failed" (evaluated at analysis time). This includes builds where the failure was in a non-test step (setup, artifact publishing, infra) — those will have a failed build result but zero auto-retry test records.
Signal categories
| Signal | Symbol | Meaning |
|---|---|---|
| red_auto_retry_passed | 🔴 | Test appeared in auto-retry run AND passed on retry; the overall build result was failed, so the PR was merged past this test failure |
| green_auto_retry_passed | 🟢 | Test appeared in auto-retry run AND passed on retry; the overall build ultimately succeeded |
| auto_retry_failed | ⚠️ | Test appeared in auto-retry run but still failed on retry; the build remained red due to this test |
The 🔴 column is the highest-confidence flakiness signal: the team observed the failure, CI retried, it passed, and the PR was merged — the team explicitly decided the failure was not a blocker.
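The three categories reduce to a small decision function. The sketch below assumes the retry outcome arrives as the string "Passed" or "Failed" and the build result as "failed" or "succeeded"; the `classify` name and the "unclassified" fallback are illustrative and not part of the original analysis.

```python
def classify(build_result: str, retry_outcome: str) -> str:
    """Map one auto-retry test result to a signal category."""
    if retry_outcome == "Failed":
        # Still failing on retry, regardless of the overall build result.
        return "auto_retry_failed"
    if retry_outcome == "Passed":
        if build_result == "failed":
            return "red_auto_retry_passed"
        if build_result == "succeeded":
            return "green_auto_retry_passed"
    # Any other combination (e.g. a canceled build) is left unclassified here.
    return "unclassified"


examples = {
    ("failed", "Passed"): classify("failed", "Passed"),
    ("succeeded", "Passed"): classify("succeeded", "Passed"),
    ("failed", "Failed"): classify("failed", "Failed"),
}
```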
Test run environments
All auto-retry runs executed on macOS agents. The test suite is sharded into parallel slots (macOS-1 through macOS-12). Two suites produced auto-retry data in this window:
- MSBuildDeviceIntegration On Device: primary device integration test suite
- WearOS On Device: WearOS-specific tests (runs on macOS agents with an Android emulator)
A test listed under "Both" was observed failing independently in both suites' auto-retry runs, across different PRs.
Summary statistics
| Metric | Count |
|---|---|
| Merged PRs analyzed | 57 |
| Direct (team-member) PRs | 55 |
| Fork PRs | 2 |
| PRs merged with failed Xamarin.Android-PR build | 38 |
| Unique test names observed in auto-retry runs | 51 |
| Tests with ≥1 🔴 (passed on retry in a red build) | 33 |
| Tests with ≥1 🟢 (passed on retry in a green build) | 14 |
| Tests that appear in both 🔴 and 🟢 | 10 |
Full flaky test table (51 tests)
Sorted by 🔴 count descending, then approximately by total PRs affected descending (rows 11 and 20-25 are not in strict PR-count order).
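A strict version of this ordering can be expressed as a two-key sort. The sketch below uses five rows copied from the table; the `Row` type and its field names are illustrative.

```python
from typing import NamedTuple


class Row(NamedTuple):
    test: str
    red: int      # 🔴 RedRetryPass
    green: int    # 🟢 GreenRetryPass
    failed: int   # ⚠️ RetryFail
    prs: int      # total PRs affected


rows = [
    Row("Install_CSharp_Change", 2, 0, 5, 7),
    Row('ApplicationRunsWithDebuggerAndBreaks(True,null,"apk",True,MonoVM)', 8, 1, 0, 9),
    Row("Build_AndroidManifest_Change", 2, 0, 4, 6),
    Row("DotNetRunWaitForExit", 5, 0, 0, 5),
    Row("Build_XAML_Change(False)", 5, 0, 3, 8),
]

# Negate both keys so that larger counts sort first.
ordered = sorted(rows, key=lambda row: (-row.red, -row.prs))
ordered_names = [row.test for row in ordered]
```

Python's sort is stable, so rows that tie on both keys keep their input order.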
Legend: Suite — MSDI = MSBuildDeviceIntegration, WearOS = WearOS On Device
| # | Test | Suite | 🔴 RedRetryPass | 🟢 GreenRetryPass | ⚠️ RetryFail | PRs |
|---|---|---|---|---|---|---|
| 1 | ApplicationRunsWithDebuggerAndBreaks(True,null,"apk",True,MonoVM) | WearOS | 8 | 1 | 0 | 9 |
| 2 | Build_XAML_Change(False) | MSDI | 5 | 0 | 3 | 8 |
| 3 | DotNetRunWaitForExit | MSDI | 5 | 0 | 0 | 5 |
| 4 | ApplicationRunsWithDebuggerAndBreaks(False,null,"aab",True,MonoVM) | Both | 4 | 1 | 1 | 6 |
| 5 | ApplicationRunsWithDebuggerAndBreaks(False,"guest1","aab",True,MonoVM) | Both | 4 | 1 | 1 | 6 |
| 6 | ApplicationRunsWithDebuggerAndBreaks(False,null,"apk",True,MonoVM) | WearOS | 4 | 0 | 1 | 5 |
| 7 | ApplicationRunsWithDebuggerAndBreaks(True,null,"apk",False,MonoVM) | WearOS | 4 | 0 | 0 | 4 |
| 8 | ApplicationRunsWithDebuggerAndBreaks(False,"guest1","apk",True,MonoVM) | Both | 3 | 1 | 2 | 6 |
| 9 | DesignTimeBuild_CSharp_From_Clean | MSDI | 3 | 0 | 1 | 4 |
| 10 | EnsureUncaughtExceptionWorks(MonoVM) | MSDI | 3 | 0 | 0 | 3 |
| 11 | ApplicationRunsWithDebuggerAndBreaks(True,null,"aab",True,MonoVM) | WearOS | 2 | 2 | 0 | 4 |
| 12 | Install_CSharp_Change | MSDI | 2 | 0 | 5 | 7 |
| 13 | Build_AndroidManifest_Change | MSDI | 2 | 0 | 4 | 6 |
| 14 | BuildBasicApplicationAndAotProfileIt | MSDI | 2 | 1 | 1 | 4 |
| 15 | Build_AndroidResource_Change | MSDI | 2 | 1 | 1 | 4 |
| 16 | Install_CSharp_FromClean | MSDI | 2 | 0 | 0 | 2 |
| 17 | SupportDesugaringStaticInterfaceMethods(MonoVM) | MSDI | 2 | 0 | 0 | 2 |
| 18 | DotNetRunWithDeviceParameter | MSDI | 2 | 0 | 0 | 2 |
| 19 | MonoAndroidExportReferencedAppStarts(False,False,MonoVM) | MSDI | 2 | 0 | 0 | 2 |
| 20 | ApplicationRunsWithDebuggerAndBreaks(True,"guest1","apk",True,MonoVM) | WearOS | 1 | 1 | 1 | 3 |
| 21 | ApplicationRunsWithDebuggerAndBreaks(True,"guest1","aab",True,MonoVM) | WearOS | 1 | 1 | 0 | 2 |
| 22 | SupportDesugaringStaticInterfaceMethods(CoreCLR) | MSDI | 1 | 1 | 0 | 2 |
| 23 | Build_CSharp_Change | MSDI | 1 | 0 | 3 | 4 |
| 24 | Build_AndroidAsset_Change | MSDI | 1 | 0 | 2 | 3 |
| 25 | DotNetInstallAndRunPreviousSdk(True,MonoVM) | WearOS | 1 | 0 | 1 | 2 |
| 26 | MonoAndroidExportReferencedAppStarts(True,False,CoreCLR) | MSDI | 1 | 0 | 0 | 1 |
| 27 | MonoAndroidExportReferencedAppStarts(True,True,CoreCLR) | MSDI | 1 | 0 | 0 | 1 |
| 28 | ApplicationRunsWithoutDebugger(True,True,True,MonoVM) | MSDI | 1 | 0 | 0 | 1 |
| 29 | ApplicationRunsWithoutDebugger(True,False,True,MonoVM) | MSDI | 1 | 0 | 0 | 1 |
| 30 | SubscribeToAppDomainUnhandledException(MonoVM) | MSDI | 1 | 0 | 0 | 1 |
| 31 | ExportedMembersSurviveGarbageCollection(True,CoreCLR) | MSDI | 1 | 0 | 0 | 1 |
| 32 | JsonDeserializationCreatesJavaHandle(True,CoreCLR) | MSDI | 1 | 0 | 0 | 1 |
| 33 | DotNetInstallAndRunPreviousSdk(False,MonoVM) | WearOS | 1 | 0 | 0 | 1 |
| 34 | TypeAndMemberRemapping(False,MonoVM) | MSDI | 0 | 1 | 0 | 1 |
| 35 | SmokeTestBuildAndRunWithSpecialCharacters("随机生成器",CoreCLR) | MSDI | 0 | 1 | 0 | 1 |
| 36 | ApkSet | MSDI | 0 | 1 | 0 | 1 |
| 37 | GradleFBProj(True,CoreCLR) | MSDI | 0 | 1 | 0 | 1 |
| 38 | DotNetNewAndroidTest(MonoVM) | MSDI | 0 | 0 | 1 | 1 |
| 39 | DotNetInstallAndRunMinorAPILevels(True,"net10.0-android36.1",MonoVM) | MSDI | 0 | 0 | 1 | 1 |
| 40 | UnhandledExceptionFromButtonClick(CoreCLR) | MSDI | 0 | 0 | 1 | 1 |
| 41 | InstallWithoutSharedRuntime(CoreCLR) | MSDI | 0 | 0 | 1 | 1 |
| 42 | SkiaSharpCanvasBasedAppRuns(True,True,MonoVM) | MSDI | 0 | 0 | 1 | 1 |
| 43 | FixLegacyResourceDesignerStep(False,CoreCLR) | MSDI | 0 | 0 | 1 | 1 |
| 44 | InstantRunFastDevDexes(False) | MSDI | 0 | 0 | 1 | 1 |
| 45 | CustomLinkDescriptionPreserve(SdkOnly,MonoVM) | MSDI | 0 | 0 | 1 | 1 |
| 46 | GradleFBProj(False,CoreCLR) | MSDI | 0 | 0 | 1 | 1 |
| 47 | DotNetRun(False,"llvm-ir",MonoVM) | MSDI | 0 | 0 | 1 | 1 |
| 48 | TestAndroidStoreKey(False,False,"apk","True","file:android","-keystore test.keystore",True) | MSDI | 0 | 0 | 1 | 1 |
| 49 | SingleProject_ApplicationId(True,MonoVM) | MSDI | 0 | 0 | 1 | 1 |
| 50 | Build_No_Changes | MSDI | 0 | 0 | 1 | 1 |
| 51 | Build_XAML_Change(True) | MSDI | 0 | 0 | 1 | 1 |
Suite totals (tallied from the Suite column above): MSBuildDeviceIntegration, 40 tests; WearOS, 8 tests; Both, 3 tests (appearing in both suites across different PRs)
Tests with elevated ⚠️ retry-fail counts
Six tests have a ⚠️ count of 2 or more, meaning they failed again on retry in multiple builds rather than recovering. These may represent persistent failures in specific configurations or environments, not just flakiness:
| Test | 🔴 | ⚠️ | PRs |
|---|---|---|---|
| Install_CSharp_Change | 2 | 5 | 7 |
| Build_AndroidManifest_Change | 2 | 4 | 6 |
| Build_XAML_Change(False) | 5 | 3 | 8 |
| Build_CSharp_Change | 1 | 3 | 4 |
| Build_AndroidAsset_Change | 1 | 2 | 3 |
| ApplicationRunsWithDebuggerAndBreaks(False,"guest1","apk",True,MonoVM) | 3 | 2 | 6 |
PRs merged with red CI (38)
These PRs had a failed result on their final Xamarin.Android-PR build, as evaluated at analysis time.
| PR | Title |
|---|---|
| #11070 | Guard Mono-specific AOT targets for CoreCLR runtime, add XA1042 warning |
| #11082 | Remove duplicate @ prefix from issueAuthor in GitOps |
| #11084 | Run FixLegacyResourceDesigner before trimming |
| #11109 | Remove broken 'Windows > Tests > Debugging' CI lane |
| #11110 | Use Assert.Inconclusive for emulator acquisition failures |
| #11112 | [Mono.Android] fix global ref leak in TypeManager.Activate |
| #11117 | LEGO: Pull request from juno/hb_6dddf33b-c6da-43d8-ac04-14d2c339cb00_20260415103450130 to main |
| #11121 | Localized file check-in by OneLocBuild Task: Build definition ID 17928: Build ID 13854166 |
| #11122 | [TrimmableTypeMap] Implement alias support in codegen and runtime |
| #11124 | Remove unused HashSet allocations in HashJavaNames |
| #11125 | Bump external/Java.Interop from 85919bb to 7b018fe |
| #11126 | Port TypeMapObjectsXmlFile.Import to XmlReader streaming |
| #11127 | Avoid O(n²) array growth in GenerateTypeMappings |
| #11129 | Document AndroidInstrumentation and EnableMSTestRunner build properties |
| #11132 | Compute Java name hashes on demand instead of pre-computing both |
| #11133 | Use stackalloc in TypeMapHelper to reduce allocations |
| #11141 | Add investigation & debugging practices to copilot-instructions |
| #11142 | [TrimmableTypeMap] Package CoreCLR preserve list in SDK pack |
| #11143 | [TrimmableTypeMap] Manifest generator fixes |
| #11144 | [TrimmableTypeMap] Fix IL1034 by excluding app assembly from trimmer roots |
| #11149 | [copilot] Add /review agentic workflow and update android-reviewer skill |
| #11150 | [copilot] Use Claude Opus 4.6 for android-reviewer workflow |
| #11152 | Set min-integrity: none on /review workflow |
| #11153 | [ci] Only run dnceng-public pipeline for fork PRs |
| #11155 | Localized file check-in by OneLocBuild Task: Build definition ID 17928: Build ID 13878536 |
| #11159 | Bump com.android.tools.build:manifest-merger from 32.1.0 to 32.1.1 |
| #11160 | [main] Update dependencies from dotnet/dotnet |
| #11162 | [tests] Improve NUnit runner reporting and dry-run auditing |
| #11163 | LEGO: Pull request from juno/hb_6dddf33b-c6da-43d8-ac04-14d2c339cb00_20260420103228482 to main |
| #11164 | Add network allowlist to android-reviewer workflow |
| #11168 | [TrimmableTypeMap] Fix UCO boolean return type mismatch causing n_* callback trimming |
| #11169 | Make CoreCLR the default runtime for Debug builds |
| #11171 | Bump external/Java.Interop from 7b018fe to 69c9daa |
| #11173 | Add roles restriction to /review slash command |
| #11175 | Localized file check-in by OneLocBuild Task: Build definition ID 17928: Build ID 13905231 |
| #11178 | [TrimmableTypeMap] Add exception handling to UCO constructor callbacks (nctor_*_uco) |
| #11181 | [TrimmableTypeMap] Per-assembly typemap universes with startup hook initialization |
| #11195 | [Xamarin.Android.Build.Tasks] Retry RemoveDirFixed on ERROR_DIR_NOT_EMPTY |
Caveats and limitations
- Counter reset artifact: the failedTests count on original AZDO test runs is reset to 0 after auto-retry. All failure evidence exists exclusively in (Auto-Retry) named runs. Tests that fail without triggering an auto-retry are not captured by this analysis.
- Single sample per PR: only the last build for each PR is analyzed. If a PR was rebuilt multiple times, only the final build's test records are included.
- No error message content: this analysis captures test names and pass/fail outcomes only. Error messages and stack traces were not collected.
- Infrastructure failures: some of the 38 red-CI PRs had failures in non-test stages (setup, artifact publishing). Those PRs have no associated test records in this dataset. The exact count of infra-vs-test failures was not determined.
- Time window: the 14-day window captures the most recent flaky tests but does not reflect longer-term trends. Tests with ⚠️ retry-fail counts of 1 (rows 38–51) may represent one-off events.
- No per-shard attribution: when a test fails on multiple macOS shards for the same PR, each shard failure is counted as a separate ⚠️ or 🔴 occurrence. The PR count (total_prs) deduplicated this; the signal counts did not.
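The difference between per-shard signal counts and the deduplicated PR count in the last caveat can be illustrated with a small sketch. The observation tuples, PR numbers, and the test name `SomeTest` are hypothetical; only the `total_prs` name and the signal labels come from the report.

```python
from collections import Counter

# Hypothetical observations: (pr_number, test_name, shard, signal)
observations = [
    (1, "SomeTest", "macOS-1", "auto_retry_failed"),
    (1, "SomeTest", "macOS-4", "auto_retry_failed"),  # same PR, second shard
    (2, "SomeTest", "macOS-2", "red_auto_retry_passed"),
]

# Signal counts: every shard-level occurrence is counted.
signal_counts = Counter(
    signal for _pr, test, _shard, signal in observations if test == "SomeTest"
)

# PR count: each PR is counted once, however many shards it failed on.
total_prs = len({pr for pr, test, _shard, _signal in observations if test == "SomeTest"})
```

Under this counting scheme a signal column can exceed the number of distinct PRs behind it, which is worth keeping in mind when comparing the signal columns with the PRs column.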