Skip to content

fix(mcp): prevent leaked semaphore warnings during MCPServerStdio cleanup#2802

Draft
4444J99 wants to merge 1 commit intoopenai:mainfrom
4444J99:fix/mcp-stdio-semaphore-leak
Draft

fix(mcp): prevent leaked semaphore warnings during MCPServerStdio cleanup#2802
4444J99 wants to merge 1 commit intoopenai:mainfrom
4444J99:fix/mcp-stdio-semaphore-leak

Conversation

@4444J99
Copy link
Copy Markdown

@4444J99 4444J99 commented Mar 28, 2026

Summary

Fixes the "leaked semaphore objects to clean up at shutdown" warning that occurs when using MCPServerStdio with MCP servers that use Python multiprocessing internally.

Root cause: During cleanup(), the exit_stack.aclose() unwinds contexts in LIFO order — ClientSession.__aexit__ runs first (cancelling its internal task group), then the transport's __aexit__ runs. Because the ClientSession cancellation kills the stdout_reader/stdin_writer tasks inside the transport, the transport's finally block — which closes stdin and waits for the subprocess — may never execute cleanly. The subprocess is killed without a chance to flush its multiprocessing resources (semaphores, shared memory, etc.), triggering Python's resource_tracker warning.

Fix: Before calling exit_stack.aclose(), explicitly close the session's write stream. This delivers EOF to the subprocess's stdin, giving it a window to release its resources and exit cleanly before the task groups are cancelled. Also reinitialise exit_stack and clear server_initialize_result during cleanup so the same server instance can reconnect.

Test plan

  • Added test_cleanup_resets_state_for_reconnection — verifies all session state is cleared and exit stack is replaced after cleanup
  • Added test_reconnect_after_cleanup — verifies a server instance can connect, use tools, cleanup, then reconnect and use tools again
  • Added test_cleanup_closes_write_stream_before_exit_stack — verifies the write stream close happens before exit stack unwind (the critical ordering invariant)
  • All 175 existing MCP tests pass
  • ruff check and ruff format pass

Issue number

Closes #618

Checks

  • I've added new tests (if relevant)
  • I've added/updated the relevant documentation
  • I've run make lint and make format
  • I've made sure tests pass

…ent semaphore leaks

When MCPServerStdio.cleanup() tears down via exit_stack.aclose(), the
ClientSession exits first (cancelling its task group), then the transport
exits.  The ClientSession cancellation kills the reader/writer tasks inside
the transport before the transport's finally block can close stdin and wait
for the subprocess to exit gracefully.  This race prevents the subprocess
from flushing multiprocessing resources, causing "leaked semaphore" warnings
from Python's resource_tracker.

Fix: explicitly close the session's write stream before the exit_stack
unwind.  This delivers EOF to the subprocess's stdin, giving it a window to
release its resources and exit cleanly before task groups are cancelled.

Also reinitialise the exit_stack and clear server_initialize_result during
cleanup so the same server instance can reconnect after cleanup.

Closes openai#618
@github-actions github-actions bot added bug Something isn't working feature:mcp labels Mar 28, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4468bf6f4d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1002 to +1004
# Reinitialize the exit stack so the same server instance can
# reconnect cleanly after cleanup.
self.exit_stack = AsyncExitStack()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve original exit stack when cleanup is interrupted

Reinitializing self.exit_stack in finally discards pending cleanup callbacks even when await self.exit_stack.aclose() is interrupted (for example by CancelledError, which this method explicitly re-raises). In that case the old stack may still hold open transport/process contexts, but after reassignment there is no way to retry closing them, so subprocesses can be orphaned and the semaphore leak warning can persist during shutdown. Only replacing the stack after a successful aclose() avoids this resource-leak path.

Useful? React with 👍 / 👎.

@seratch
Copy link
Copy Markdown
Member

seratch commented Mar 29, 2026

Thanks for sharing this. I also explored a viable solution for this, but I found that the sufficient fix requires substantial changes across this SDK. So, I don't think the suggested changes here is enough for the issue. See also: #2573 (comment)

@seratch seratch marked this pull request as draft March 29, 2026 00:01
@4444J99
Copy link
Copy Markdown
Author

4444J99 commented Mar 30, 2026

Understood — happy to contribute to the broader fix in #2573 if that's useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working feature:mcp

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Resource tracker warning (leaked semaphores) with MCPServerStdio

2 participants