Skip to content

fix: app.run() memory leak#9331

Draft
dmadisetti wants to merge 3 commits intomainfrom
dm/mem-leak
Draft

fix: app.run() memory leak#9331
dmadisetti wants to merge 3 commits intomainfrom
dm/mem-leak

Conversation

@dmadisetti
Copy link
Copy Markdown
Collaborator

📝 Summary

Closes #9198

Since module scopes are persistent, the continual use of app.run() progressively fills up memory even after user accessible variables are not longer present. This PR introduces reference tracking on the _Namespace mapping class which clears the module dictionary on cleanup.

cc @VishakBaddur: I just saw your comments on the initial issue. Good job on determining the root cause, but I think context manager is unneeded. The reference tracking here should silently cleanup without explicit management from the user.

@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
marimo-docs Ready Ready Preview, Comment Apr 22, 2026 7:49pm

Request Review

@akshayka
Copy link
Copy Markdown
Contributor

@dmadisetti this seems over-engineered? If there is a leak, it might be because if this reference cycle: https://github.com/marimo-team/marimo/blob/main/marimo/_ast/app.py#L621-L626

The _owner attribute appears unused ...

@dmadisetti
Copy link
Copy Markdown
Collaborator Author

@akshayka _Namespace is cleaned up, the module namespace outlives everything. My initial fix is maybe the easiest solution: c065bed

and we can go with that. We just have the scoping information, so we can leverage it (for relatively free).

@akshayka
Copy link
Copy Markdown
Contributor

akshayka commented Apr 23, 2026

@dmadisetti I looked at your commit and I don't understand. glbls is just a Python dict defined in run and not bound to any instance variables. Is there an underlying side-effect that needs to be eliminated? I don't see any "persistent" module.

EDIT: marimo pair brought me up to speed, this is more involved than I thought. https://molab.marimo.io/notebooks/nb_ULHALbwMsQzjB24qU1bCaC

@akshayka
Copy link
Copy Markdown
Contributor

Hmm. Not sure how we can fix #9198 while retaining Python semantics ... I think your first commit might have unintended side-effects, clearing globals when objects (such as values pulled out of the returned defs) still reference them seems unsafe. While this PR also looks brittle. It feels like a proper fix would have to change how we execute cells, to prevent the entire notebook's globals from being saved on its returned definitions.

@dmadisetti
Copy link
Copy Markdown
Collaborator Author

I think your first commit might have unintended side-effects

Absolutely agree. The follow-up fix cleans up everything it can (if defs gets dropped, everything not tracked is dropped). Then as tracked objects are cleaned up the dependency chain is cleaned up too.

It feels like a proper fix would have to change how we execute cells, to prevent the entire notebook's globals from being saved on its returned definitions.

It seems to me, we always want the full notebook to run (it may have side effects etc). Retaining all variables also make sense since we don't know what the user will want. But curious if you have other ideas around this?

@akshayka
Copy link
Copy Markdown
Contributor

The follow-up fix cleans up everything it can (if defs gets dropped, everything not tracked is dropped). Then as tracked objects are cleaned up the dependency chain is cleaned up too.

My concern is that it feels like there would be many edge cases that could get confusing, for example if user code mutated things in ways that was not trackable by static analysis / required refs. So I'm not sure if this can be done 100% correctly.

I was thinking we might just recommend to users that if they want reusability while controlling memory consumption, they opt for reusable functions instead of app.run(). app.run() is basically the same thing as import my_regular_python_module in that it executes the full notebook like you say. In other words, we may be able to close the original issue by making a docs change.

@dmadisetti
Copy link
Copy Markdown
Collaborator Author

My concern is that it feels like there would be many edge cases that could get confusing, for example if user code mutated things in ways that was not trackable by static analysis / required refs. So I'm not sure if this can be done 100% correctly.

Hmm the only edgecase I can think of is globals()["value"], which breaks marimo's reactivity model anyway. FFI isn't a concern here because we're just removing the ref count to the namespace dict, even then only when we're trying to clean up the object anyway. Let me know if I'm missing something. I agree on the docs change regardless

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Memory is not released when calling notebook as script

2 participants