Skip to content

Discover the migrations-bundle DbContext out-of-process so publish doesn't lock the assembly#38456

Draft
ajcvickers wants to merge 1 commit into
dotnet:mainfrom
ajcvickers:fix/25555-bundle-unload-assembly-context
Draft

Discover the migrations-bundle DbContext out-of-process so publish doesn't lock the assembly#38456
ajcvickers wants to merge 1 commit into
dotnet:mainfrom
ajcvickers:fix/25555-bundle-unload-assembly-context

Conversation

@ajcvickers

@ajcvickers ajcvickers commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

This is a new approach; see below for why the original approach didn't pan out.

Fixes #25555.

Problem

dotnet ef migrations bundle failed intermittently on Windows with MSB3027/MSB3021 ("being used by another process").

The tool discovered the DbContext type in-process. It loaded the startup assembly. It then ran dotnet publish in the same process. On Windows the loaded assembly stayed locked. Publish overwrote that file and failed.

The lock needs a narrow conjunction, all at once:

  • C1 — the loaded path equals the publish target. Bundle's publish always injects a RID. The pre-invoke build only uses a RID with --runtime. Publish also defaults to Release; the pre-invoke build
    defaults to Debug. So config and RID must both align.
  • C2 — the startup output is stale at publish time. Otherwise MSBuild's up-to-date check skips the recompile and nothing overwrites the file. --no-build is the deterministic trigger.
  • C3 — Windows. Unix does not lock loaded files.

This is why the bug is rare and went years unfixed.

Change of direction

An earlier version of this PR took a different approach. It loaded the design assembly oadContextand calledUnload()` before publish. That approach is reverted.

It was validated on real Windows and found ineffective. The file that locks is the targthe design assembly. dotnet-ef launches the tool with --depsfile <app>.deps.json. Sothe app assembly resolves onto the default ALC's TPA list. Assembly.Load loads it into the non-collectible default context before the collectible ALC ever sees it. Unloading the collectible ALC released nothing.

Fix

Discover the DbContext type in a separate, short-lived process:

dotnet exec <host args> ef.dll dbcontext info --json <tool args>

The child loads (and on Windows locks) the target assembly. It prints the context type parent runs dotnet publish. The assembly is no longer held when publish copies it.

Process exit releases all handles, managed and native. There is no async-unload timing

Mechanics

  • dotnet-ef (RootCommand) forwards the startup project's deps file, runtime config, to ef.dll as tool options. It pins the shared framework with--runtime-framework-version when the project has no runtimeconfig.json. This mirrors how it launches the parent host.
  • ef.dll (ProjectCommandBase) accepts those options. GetCommonProjectArgs rebuild child. MigrationsBundleCommand assembles the dotnet exec command line and parses the result.
  • The discovered type comes from GetContextInfo(...)["Type"] — the same source the int info --json` emits it; the parser reads it back. Behavior is unchanged.

Hardening (parity with the in-process path)

The child runs the user's application entry point. So it behaves like a normal process.

  1. Output isolation. The child runs with --prefix-output/--no-color. The parserlines. App output cannot corrupt the JSON.
  2. Error surfacing. Under --prefix-output the child's error: text goes to stdout. Failures are read from stdout, falling back to stderr. Actionable guidance (the multi-context --context hint, "No
    DbContext was found", "Unable to create a 'DbContext'") reaches the user without `--ver
  3. Framework roll-forward. --fx-version is forwarded. The child rolls forward to the app's shared framework, not ef.dll's own.
  4. Application args. Everything after -- is forwarded. An arg-dependent IDesignTontext(args) or CreateHostBuilder(args) selects the same context as before.

Scope

The .NET Framework (net472) build of ef.dll is build-only and never executed. The dis#if NETand throwsNotSupportedExceptionotherwise.dotnet-ef` already rejects .NETFramework startup projects, so this path is unreachable in practice.

Alternatives considered and rejected

  • Reflection-only (MetadataLoadContext). Simpler and avoids the lock. Rejected. It cannot reproduce EF's real context selection (factories, DI registration, multi-context disambiguation). It cotarget a different context than database update would.
  • Orchestrate two ef.dll calls from dotnet-ef. Avoids forwarding host args twice. Rejected. It pushes JSON parsing, error surfacing, and orchestration into the thin dual-target launcher, and special-cases one subcommand there. Bundle logic belongs in the bundle command.
    ## Tests
    Adds unit tests for the discovery arg builder, JSON/error parsing, application arg forwt-args round trip.
    Two notes before you paste it:
    - I asked you to keep sentences terse, so the body runs short and declarative. If a revtail trimmed, it's the most cuttable section.
  • The "Tests" section claims unit-test coverage. The full Windows RED→GREEN validation (per the kit) is still pending on your end — I left that out of the PR body rather than claim it as done. Add a checkbox or a line once you've run it if you want it on record.

Original description:

Fixes #25555

dotnet ef migrations bundle loads the target and startup assemblies into the tool process to discover the DbContext, then shells out to dotnet publish, which rebuilds and overwrites those same bin...*.dll files. On Windows a loaded assembly file is locked, so the copy fails with MSB3027/MSB3021 ("being used by another process"). The assemblies were loaded into the non-collectible default AssemblyLoadContext, so they could never be released while the tool process was alive.

Load them into a collectible AssemblyLoadContext instead and unload it when the executor is disposed (before publish runs), releasing the files.

@ajcvickers

ajcvickers commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

/cc @bricelam, @AndriySvyryd. For more information, see the analysis on #25555.

@ajcvickers

Copy link
Copy Markdown
Contributor Author

Note I'm going to do Windows validation on this before taking this out of draft. I don't dev on Windows anymore, so it may take a bit of time to setup.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a Windows-specific failure in dotnet ef migrations bundle where the tool process keeps target/startup assemblies loaded (and therefore file-locked) while subsequently running dotnet publish, which tries to overwrite those same assemblies. The intended fix is to load user assemblies into a collectible AssemblyLoadContext and unload it when the executor is disposed (before publish runs).

Changes:

  • Update ReflectionOperationExecutor to create a collectible AssemblyLoadContext and unload it during Dispose().
  • Add a regression test to verify that the target assembly is no longer loaded after disposing the executor.
  • Update ef.Tests project settings/references to support the new test scenario (preserved compilation context and assembly aliasing).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
src/ef/ReflectionOperationExecutor.cs Introduces collectible ALC creation and explicit unload logic in Dispose() to release user assemblies.
test/ef.Tests/ReflectionOperationExecutorTest.cs Adds a regression test asserting the target assembly is unloaded after executor disposal.
test/ef.Tests/ef.Tests.csproj Enables PreserveCompilationContext (needed by BuildSource/DependencyContext) and adds an alias for the ef tool project reference for the new test.

Comment on lines 156 to 160
protected override void Execute(string operationName, object resultHandler, IDictionary arguments)
=> Activator.CreateInstance(
_commandsAssembly.GetType(ExecutorTypeName + "+" + operationName, throwOnError: true, ignoreCase: true)!,
_executor,
resultHandler,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajcvickers I verified that on Windows the fix as it is works fine. I couldn't find a case that would still lock the assembly, but you can apply the Copilot's suggestion here and in CreateResultHandler as defense-in-depth in case something changes in the future.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AndriySvyryd I have been working with Claude on this in the background. Our understanding of what is happening has changed quite a bit. We think the fix is still correct, but we now have a better plan to verify on Windows. (Windows verification last weekend was inconclusive.)

Unfortunately, we also identified that there is still a race condition with this fix, so the symptom may still occur, but this should close the window significantly.

Give me a bit longer to go back to Windows this weekend and try things again. Although it was 95 degrees yesterday, so not the best weather for running a thread ripper with no AC!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: @copilot's suggestion. Good observation, and the diagnosis of where the loads bind is correct — but I checked it empirically and it doesn't resolve the issue, for two reasons:

  • With the loads resolved via the ALC's Resolving event, EnterContextualReflection() doesn't actually re-route them: resolution order is ALC Load() → Default/TPA → ALC Resolving, and since the app is on Default's TPA (we launch with the app's --depsfile), it still binds into Default. The collectible context ends up empty and "unloads" without releasing anything.
  • Even forcing the app into the collectible context (by overriding Load()) doesn't help: once discovery instantiates the DbContext, the context is rooted and the ALC no longer unloads.

See the notes in my other comment. The conclusion is that in-process unload can't release the handle here, so I'm moving to out-of-process discovery instead.

@ajcvickers

Copy link
Copy Markdown
Contributor Author

Status update

I validated this fix on a real Windows machine and it does not resolve the lock — the patched tool fails identically to the shipping tool (MSB3027/MSB3021, the target assembly locked by the ef host process). Here's what we now understand, and where I'm taking it.

What we now understand that wasn't clear originally

  1. The locking load isn't the one this PR moved. The design assembly is loaded explicitly into the collectible context, but the file that actually locks is the target/startup app assembly, which EFCore.Design's OperationExecutor loads via Assembly.Load(new AssemblyName(...)). Because dotnet-ef launches ef.dll with the app's --depsfile, that assembly is on the Default ALC's TPA list, so it resolves into the non-collectible Default context — which the unload never touches.

  2. Routing it into the collectible context isn't sufficient anyway. I built a probe that loads the app's DbContext into a collectible ALC and measures whether it unloads. It unloads fine with pure reflection, but the moment the DbContext is instantiated the ALC can no longer unload — discovery (GetContextInfo) instantiates the context, and EF/framework state pins the context type (and thus the ALC). This holds whether EF Core lives in Default or in the collectible context.

  3. So in-process unload can't work here. There's no configuration that both gets the app into an unloadable context and actually unloads after discovery has run. (This was effectively free on .NET Framework, which isolated via a separate, unloadable AppDomain.)

The repro recipe itself is now solid and reproduces reliably on Windows, so we can validate any future fix.

Where this is going

Switching to process isolation: run the assembly-loading discovery (GetContextInfo) in a short-lived child ef.dll process that exits before the bundle is published, so the publishing process never loads — and never locks — the target assembly. The version check stays in-process (it only touches the design assembly). Details TBD; I'll push the reworked approach.

…assembly

Fixes dotnet#25555.

`dotnet ef migrations bundle` discovered the DbContext type in-process by
loading the startup assembly, then ran `dotnet publish` in the same invocation.
On Windows the still-loaded assembly was locked, so publish intermittently
failed with MSB3027 ("could not be copied ... locked by ... dotnet").

An earlier attempt unloaded the assembly via a collectible AssemblyLoadContext,
but the app assembly resolves onto the default (TPA) context and so was never
actually unloaded; that approach is reverted here.

Instead, discover the DbContext type in a SEPARATE, short-lived process:

    dotnet exec <host args> ef.dll dbcontext info --json <tool args>

The child loads (and, on Windows, locks) the target assembly, emits the context
type as JSON, and exits before the parent runs `dotnet publish` — so the
assembly is no longer held when publish copies it.

Mechanics:
* dotnet-ef (RootCommand) forwards the startup project's deps file, runtime
  config, and additional probing paths to ef.dll as tool options, and pins the
  shared framework with --fx-version (as --runtime-framework-version) when the
  project has no runtimeconfig.json — mirroring how it launches the parent host.
* ef.dll (ProjectCommandBase) accepts those options and a GetCommonProjectArgs
  helper rebuilds the project arg set for the child; MigrationsBundleCommand
  assembles the `dotnet exec` command line and parses the result.

Hardening to keep parity with the old in-process path (the child runs the user's
application entry point, which behaves like a normal process):

1. Output isolation. The child runs with --prefix-output/--no-color and the
   parser reads only `data:`-prefixed lines, so arbitrary text the app writes to
   stdout during discovery cannot corrupt the JSON.
2. Error surfacing. Under --prefix-output the child's `error:` text goes to
   stdout, so failures are read from stdout (falling back to stderr); actionable
   guidance (e.g. the multi-context "--context" hint, "No DbContext was found",
   "Unable to create a 'DbContext'") reaches the user without needing --verbose.
3. Framework roll-forward. --fx-version is forwarded so the child rolls forward
   to the same shared framework the app targets, not ef.dll's own.
4. Application args. Everything after `--` is forwarded to the child so an
   arg-dependent IDesignTimeDbContextFactory.CreateDbContext(args) or
   CreateHostBuilder(args) selects the same context/connection it would in-process.

The .NET Framework (net472) build of ef.dll is build-only and never executed;
the discovery path is guarded with #if NET and throws NotSupportedException
otherwise.

Adds unit tests for the discovery arg builder, JSON/error parsing, application
arg forwarding, and the common-project-args round trip.
@ajcvickers ajcvickers force-pushed the fix/25555-bundle-unload-assembly-context branch from 2f9f549 to 71c6351 Compare June 29, 2026 09:18
@ajcvickers ajcvickers changed the title Unload target assemblies before publishing the migrations bundle Discover the migrations-bundle DbContext out-of-process so publish doesn't lock the assembly Jun 29, 2026
@bricelam

Copy link
Copy Markdown
Contributor

Curiouser and curiouser.

dotnet publish...tries to overwrite those same assemblies

This part surprises me. Why is publish trying to overwrite build assemblies. AFAIK, it just copies assemblies into a new directory. Is the build non-deterministic? Is something unintentionally triggering a re-build during publish?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bundle-Migration errors with "cannot access file...being used by another process"

4 participants