From 2db045a98d294bd05112ddf3f163f82948201d25 Mon Sep 17 00:00:00 2001
From: Simon Rozsival
Date: Fri, 17 Apr 2026 11:12:34 +0200
Subject: [PATCH 1/3] Add investigation & debugging practices to copilot-instructions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Capture five high-leverage lessons learned while debugging the CoreCLRTrimmable lane failures:

1. Reproduce CI failures locally — don't iterate through CI
2. Nuke bin/ and obj/ when the build enters a weird state
3. Verify code paths with logging before reasoning about them
4. Decompile the generated .dll when generated code misbehaves
5. 'am instrument' going silent means it crashed, not hung

These rules are placed in .github/copilot-instructions.md so every future agent loads them automatically.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/copilot-instructions.md | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 9aca8fa06ae..51389df3403 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -199,6 +199,28 @@ This pattern ensures proper encoding, timestamps, and file attributes are handle
 3. For deep .binlog analysis, use the `azdo-build-investigator` skill.
 4. Only after the skill confirms no Azure DevOps failures should you report CI as passing.
 
+## Investigation & Debugging Practices
+
+When diagnosing runtime, build, or test failures, follow these practices. They exist because the .NET ↔ JNI ↔ C++ ↔ generated-native stack is loosely coupled and static reasoning alone is unreliable.
+
+- **Reproduce CI failures locally — do not iterate through CI.** A clean local test cycle is minutes; a CI iteration is hours. Run device tests the same way CI does:
+  ```bash
+  make prepare && make all CONFIGURATION=Release
+  ./dotnet-local.sh build tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj \
+    -t:RunTestApp -c Release \
+    -p:_AndroidTypeMapImplementation= \
+    -p:UseMonoRuntime=
+  ```
+  Results land in `TestResult-Mono.Android.NET_Tests-*.xml` at the repo root.
+
+- **When the build gets into a weird state, nuke `bin/` and `obj/` and rebuild from scratch.** Stale incremental output causes phantom errors that no amount of code fixing will resolve. A clean `make clean && make prepare && make all CONFIGURATION=Release` is cheap compared to hours chasing ghosts.
+
+- **Verify code paths with logging before reasoning about them.** Loose coupling between .NET, Java, C++, and generated LLVM IR makes "this must be called from X" assumptions unreliable. Add `log_warn (LOG_DEFAULT, "..."sv, ...)` in C++ or `Logger.Log`/`AndroidLog.Print` in C#, rebuild, re-run, and grep `adb logcat -d`. **Absence of log output is itself evidence** — if your log never fires, your mental model of the call graph is wrong.
+
+- **When generated code behaves incorrectly, decompile the produced `.dll` before blaming runtime.** Use `ilspycmd` or `ildasm` to inspect the actual generated IL/metadata (attributes, custom attribute rows, type layout). A single missing attribute or misnamed type in generator output can cascade into opaque runtime failures. Do not trust the generator source to tell you what it emitted.
+
+- **`am instrument` going silent means it crashed, not hung.** If the test runner's output stops mid-run, assume the instrumentation process died. Check `adb logcat -d | grep -E 'FATAL|tombstone|signal'` and look for a native crash dump. Do not wait for a 30-minute CI timeout to "confirm" a hang that was really an instant crash.
+
 ## Troubleshooting
 - **Build:** Clean `bin/`+`obj/`, check Android SDK/NDK, `make clean`
 - **MSBuild:** Test in isolation, validate inputs

From cd7f66fec558a85e3b7314abb7a9cbb0b6affcf2 Mon Sep 17 00:00:00 2001
From: Jonathan Peppers
Date: Fri, 17 Apr 2026 12:30:58 -0500
Subject: [PATCH 2/3] Address review feedback: tighten verbosity, add Windows note, consolidate bin/obj guidance

- Add Windows note for make/dotnet-local.sh commands
- Cross-reference Troubleshooting section instead of duplicating bin/obj cleanup
- Tighten verbose bullets to reduce token cost
- Expand Troubleshooting Build entry with full make command

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/copilot-instructions.md | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 51389df3403..8778648be14 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -211,18 +211,19 @@ When diagnosing runtime, build, or test failures, follow these practices. They e
     -p:_AndroidTypeMapImplementation= \
     -p:UseMonoRuntime=
   ```
+  On Windows, use `build.cmd` and `dotnet-local.cmd` instead of `make`/`dotnet-local.sh`.
   Results land in `TestResult-Mono.Android.NET_Tests-*.xml` at the repo root.
 
-- **When the build gets into a weird state, nuke `bin/` and `obj/` and rebuild from scratch.** Stale incremental output causes phantom errors that no amount of code fixing will resolve. A clean `make clean && make prepare && make all CONFIGURATION=Release` is cheap compared to hours chasing ghosts.
+- **When the build gets into a weird state, nuke `bin/` and `obj/` and rebuild from scratch.** Stale incremental output causes phantom errors. See **Troubleshooting → Build** below.
 
-- **Verify code paths with logging before reasoning about them.** Loose coupling between .NET, Java, C++, and generated LLVM IR makes "this must be called from X" assumptions unreliable. Add `log_warn (LOG_DEFAULT, "..."sv, ...)` in C++ or `Logger.Log`/`AndroidLog.Print` in C#, rebuild, re-run, and grep `adb logcat -d`. **Absence of log output is itself evidence** — if your log never fires, your mental model of the call graph is wrong.
+- **Verify code paths with logging, not reasoning.** Add `log_warn (LOG_DEFAULT, "..."sv, ...)` in C++ or `Logger.Log`/`AndroidLog.Print` in C#, rebuild, re-run, and check `adb logcat -d`. If your log never fires, your call-graph assumption is wrong.
 
-- **When generated code behaves incorrectly, decompile the produced `.dll` before blaming runtime.** Use `ilspycmd` or `ildasm` to inspect the actual generated IL/metadata (attributes, custom attribute rows, type layout). A single missing attribute or misnamed type in generator output can cascade into opaque runtime failures. Do not trust the generator source to tell you what it emitted.
+- **Decompile the produced `.dll` before blaming runtime.** Use `ilspycmd` or `ildasm` to inspect the actual generated IL/metadata. A missing attribute or misnamed type in generator output cascades into opaque runtime failures.
 
-- **`am instrument` going silent means it crashed, not hung.** If the test runner's output stops mid-run, assume the instrumentation process died. Check `adb logcat -d | grep -E 'FATAL|tombstone|signal'` and look for a native crash dump. Do not wait for a 30-minute CI timeout to "confirm" a hang that was really an instant crash.
+- **`am instrument` going silent means it crashed, not hung.** Check `adb logcat -d | grep -E 'FATAL|tombstone|signal'` for a native crash dump. Do not wait for a CI timeout to "confirm" a hang that was really an instant crash.
 
 ## Troubleshooting
-- **Build:** Clean `bin/`+`obj/`, check Android SDK/NDK, `make clean`
+- **Build:** Clean `bin/`+`obj/`, check Android SDK/NDK, `make clean && make prepare && make all`
 - **MSBuild:** Test in isolation, validate inputs
 - **Device:** Use update directories for rapid Debug iteration
 - **Performance:** See `../Documentation/guides/profiling.md` and `../Documentation/guides/tracing.md`

From 5e83d7c67f29df82655602015c095248992cd39a Mon Sep 17 00:00:00 2001
From: Jonathan Peppers
Date: Fri, 17 Apr 2026 12:32:31 -0500
Subject: [PATCH 3/3] Address Copilot review feedback: fix _AndroidTypeMapImplementation values, use Android.Util.Log

- Replace invalid 'legacy' with correct values: llvm-ir|managed|trimmable
- Use Android.Util.Log (general C#) instead of AndroidLog.Print (NativeAOT-internal)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/copilot-instructions.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 8778648be14..83afd8082b7 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -208,7 +208,7 @@ When diagnosing runtime, build, or test failures, follow these practices. They e
   make prepare && make all CONFIGURATION=Release
   ./dotnet-local.sh build tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj \
     -t:RunTestApp -c Release \
-    -p:_AndroidTypeMapImplementation= \
+    -p:_AndroidTypeMapImplementation= \
     -p:UseMonoRuntime=
   ```
   On Windows, use `build.cmd` and `dotnet-local.cmd` instead of `make`/`dotnet-local.sh`.
@@ -216,7 +216,7 @@ When diagnosing runtime, build, or test failures, follow these practices. They e
 
 - **When the build gets into a weird state, nuke `bin/` and `obj/` and rebuild from scratch.** Stale incremental output causes phantom errors. See **Troubleshooting → Build** below.
 
-- **Verify code paths with logging, not reasoning.** Add `log_warn (LOG_DEFAULT, "..."sv, ...)` in C++ or `Logger.Log`/`AndroidLog.Print` in C#, rebuild, re-run, and check `adb logcat -d`. If your log never fires, your call-graph assumption is wrong.
+- **Verify code paths with logging, not reasoning.** Add `log_warn (LOG_DEFAULT, "..."sv, ...)` in C++ or `Android.Util.Log` in C#, rebuild, re-run, and check `adb logcat -d`. If your log never fires, your call-graph assumption is wrong.
 
 - **Decompile the produced `.dll` before blaming runtime.** Use `ilspycmd` or `ildasm` to inspect the actual generated IL/metadata. A missing attribute or misnamed type in generator output cascades into opaque runtime failures.