Discover the migrations-bundle DbContext out-of-process so publish doesn't lock the assembly#38456
Conversation
|
/cc @bricelam, @AndriySvyryd. For more information, see the analysis on #25555. |
|
Note I'm going to do Windows validation on this before taking this out of draft. I don't dev on Windows anymore, so it may take a bit of time to setup. |
There was a problem hiding this comment.
Pull request overview
This PR addresses a Windows-specific failure in dotnet ef migrations bundle where the tool process keeps target/startup assemblies loaded (and therefore file-locked) while subsequently running dotnet publish, which tries to overwrite those same assemblies. The intended fix is to load user assemblies into a collectible AssemblyLoadContext and unload it when the executor is disposed (before publish runs).
Changes:
- Update
ReflectionOperationExecutorto create a collectibleAssemblyLoadContextand unload it duringDispose(). - Add a regression test to verify that the target assembly is no longer loaded after disposing the executor.
- Update
ef.Testsproject settings/references to support the new test scenario (preserved compilation context and assembly aliasing).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
src/ef/ReflectionOperationExecutor.cs |
Introduces collectible ALC creation and explicit unload logic in Dispose() to release user assemblies. |
test/ef.Tests/ReflectionOperationExecutorTest.cs |
Adds a regression test asserting the target assembly is unloaded after executor disposal. |
test/ef.Tests/ef.Tests.csproj |
Enables PreserveCompilationContext (needed by BuildSource/DependencyContext) and adds an alias for the ef tool project reference for the new test. |
| protected override void Execute(string operationName, object resultHandler, IDictionary arguments) | ||
| => Activator.CreateInstance( | ||
| _commandsAssembly.GetType(ExecutorTypeName + "+" + operationName, throwOnError: true, ignoreCase: true)!, | ||
| _executor, | ||
| resultHandler, |
There was a problem hiding this comment.
@ajcvickers I verified that on Windows the fix as it is works fine. I couldn't find a case that would still lock the assembly, but you can apply the Copilot's suggestion here and in CreateResultHandler as defense-in-depth in case something changes in the future.
There was a problem hiding this comment.
@AndriySvyryd I have been working with Claude on this in the background. Our understanding of what is happening has changed quite a bit. We think the fix is still correct, but we now have a better plan to verify on Windows. (Windows verification last weekend was inconclusive.)
Unfortunately, we also identified that there is still a race condition with this fix, so the symptom may still occur, but this should close the window significantly.
Give me a bit longer to go back to Windows this weekend and try things again. Although it was 95 degrees yesterday, so not the best weather for running a thread ripper with no AC!
There was a problem hiding this comment.
Re: @copilot's suggestion. Good observation, and the diagnosis of where the loads bind is correct — but I checked it empirically and it doesn't resolve the issue, for two reasons:
- With the loads resolved via the ALC's
Resolvingevent,EnterContextualReflection()doesn't actually re-route them: resolution order is ALCLoad()→ Default/TPA → ALCResolving, and since the app is on Default's TPA (we launch with the app's--depsfile), it still binds into Default. The collectible context ends up empty and "unloads" without releasing anything. - Even forcing the app into the collectible context (by overriding
Load()) doesn't help: once discovery instantiates theDbContext, the context is rooted and the ALC no longer unloads.
See the notes in my other comment. The conclusion is that in-process unload can't release the handle here, so I'm moving to out-of-process discovery instead.
Status updateI validated this fix on a real Windows machine and it does not resolve the lock — the patched tool fails identically to the shipping tool ( What we now understand that wasn't clear originally
The repro recipe itself is now solid and reproduces reliably on Windows, so we can validate any future fix. Where this is goingSwitching to process isolation: run the assembly-loading discovery ( |
…assembly Fixes dotnet#25555. `dotnet ef migrations bundle` discovered the DbContext type in-process by loading the startup assembly, then ran `dotnet publish` in the same invocation. On Windows the still-loaded assembly was locked, so publish intermittently failed with MSB3027 ("could not be copied ... locked by ... dotnet"). An earlier attempt unloaded the assembly via a collectible AssemblyLoadContext, but the app assembly resolves onto the default (TPA) context and so was never actually unloaded; that approach is reverted here. Instead, discover the DbContext type in a SEPARATE, short-lived process: dotnet exec <host args> ef.dll dbcontext info --json <tool args> The child loads (and, on Windows, locks) the target assembly, emits the context type as JSON, and exits before the parent runs `dotnet publish` — so the assembly is no longer held when publish copies it. Mechanics: * dotnet-ef (RootCommand) forwards the startup project's deps file, runtime config, and additional probing paths to ef.dll as tool options, and pins the shared framework with --fx-version (as --runtime-framework-version) when the project has no runtimeconfig.json — mirroring how it launches the parent host. * ef.dll (ProjectCommandBase) accepts those options and a GetCommonProjectArgs helper rebuilds the project arg set for the child; MigrationsBundleCommand assembles the `dotnet exec` command line and parses the result. Hardening to keep parity with the old in-process path (the child runs the user's application entry point, which behaves like a normal process): 1. Output isolation. The child runs with --prefix-output/--no-color and the parser reads only `data:`-prefixed lines, so arbitrary text the app writes to stdout during discovery cannot corrupt the JSON. 2. Error surfacing. Under --prefix-output the child's `error:` text goes to stdout, so failures are read from stdout (falling back to stderr); actionable guidance (e.g. the multi-context "--context" hint, "No DbContext was found", "Unable to create a 'DbContext'") reaches the user without needing --verbose. 3. Framework roll-forward. --fx-version is forwarded so the child rolls forward to the same shared framework the app targets, not ef.dll's own. 4. Application args. Everything after `--` is forwarded to the child so an arg-dependent IDesignTimeDbContextFactory.CreateDbContext(args) or CreateHostBuilder(args) selects the same context/connection it would in-process. The .NET Framework (net472) build of ef.dll is build-only and never executed; the discovery path is guarded with #if NET and throws NotSupportedException otherwise. Adds unit tests for the discovery arg builder, JSON/error parsing, application arg forwarding, and the common-project-args round trip.
2f9f549 to
71c6351
Compare
|
Curiouser and curiouser.
This part surprises me. Why is |
This is a new approach; see below for why the original approach didn't pan out.
Fixes #25555.
Problem
dotnet ef migrations bundlefailed intermittently on Windows with MSB3027/MSB3021 ("being used by another process").The tool discovered the DbContext type in-process. It loaded the startup assembly. It then ran
dotnet publishin the same process. On Windows the loaded assembly stayed locked. Publish overwrote that file and failed.The lock needs a narrow conjunction, all at once:
--runtime. Publish also defaults to Release; the pre-invoke builddefaults to Debug. So config and RID must both align.
--no-buildis the deterministic trigger.This is why the bug is rare and went years unfixed.
Change of direction
An earlier version of this PR took a different approach. It loaded the design assembly oadContext
and calledUnload()` before publish. That approach is reverted.It was validated on real Windows and found ineffective. The file that locks is the targthe design assembly.
dotnet-eflaunches the tool with--depsfile <app>.deps.json. Sothe app assembly resolves onto the default ALC's TPA list.Assembly.Loadloads it into the non-collectible default context before the collectible ALC ever sees it. Unloading the collectible ALC released nothing.Fix
Discover the DbContext type in a separate, short-lived process:
The child loads (and on Windows locks) the target assembly. It prints the context type parent runs
dotnet publish. The assembly is no longer held when publish copies it.Process exit releases all handles, managed and native. There is no async-unload timing
Mechanics
dotnet-ef(RootCommand) forwards the startup project's deps file, runtime config, toef.dllas tool options. It pins the shared framework with--runtime-framework-versionwhen the project has noruntimeconfig.json. This mirrors how it launches the parent host.ef.dll(ProjectCommandBase) accepts those options.GetCommonProjectArgsrebuild child.MigrationsBundleCommandassembles thedotnet execcommand line and parses the result.GetContextInfo(...)["Type"]— the same source the int info --json` emits it; the parser reads it back. Behavior is unchanged.Hardening (parity with the in-process path)
The child runs the user's application entry point. So it behaves like a normal process.
--prefix-output/--no-color. The parserlines. App output cannot corrupt the JSON.--prefix-outputthe child'serror:text goes to stdout. Failures are read from stdout, falling back to stderr. Actionable guidance (the multi-context--contexthint, "NoDbContext was found", "Unable to create a 'DbContext'") reaches the user without `--ver
--fx-versionis forwarded. The child rolls forward to the app's shared framework, notef.dll's own.--is forwarded. An arg-dependentIDesignTontext(args)orCreateHostBuilder(args)selects the same context as before.Scope
The .NET Framework (net472) build of
ef.dllis build-only and never executed. The dis#if NETand throwsNotSupportedExceptionotherwise.dotnet-ef` already rejects .NETFramework startup projects, so this path is unreachable in practice.Alternatives considered and rejected
MetadataLoadContext). Simpler and avoids the lock. Rejected. It cannot reproduce EF's real context selection (factories, DI registration, multi-context disambiguation). It cotarget a different context thandatabase updatewould.ef.dllcalls fromdotnet-ef. Avoids forwarding host args twice. Rejected. It pushes JSON parsing, error surfacing, and orchestration into the thin dual-target launcher, and special-cases one subcommand there. Bundle logic belongs in the bundle command.## Tests
Adds unit tests for the discovery arg builder, JSON/error parsing, application arg forwt-args round trip.
Two notes before you paste it:
- I asked you to keep sentences terse, so the body runs short and declarative. If a revtail trimmed, it's the most cuttable section.
Original description:
Fixes #25555
dotnet ef migrations bundleloads the target and startup assemblies into the tool process to discover the DbContext, then shells out todotnet publish, which rebuilds and overwrites those same bin...*.dll files. On Windows a loaded assembly file is locked, so the copy fails with MSB3027/MSB3021 ("being used by another process"). The assemblies were loaded into the non-collectible default AssemblyLoadContext, so they could never be released while the tool process was alive.Load them into a collectible AssemblyLoadContext instead and unload it when the executor is disposed (before publish runs), releasing the files.