-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-54753][SQL]fix memory leak of ArtifactManager #53591
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good finding. Should not use lamda for cleaner here. It will hold a reference to ArtifactManager.
|
Could you help review the code change. @hvanhovell @vicennial |
allisonwang-db
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @HyukjinKwon
|
cc @zhengruifeng @panbingkun would you help take a look? thanks. |
What changes were proposed in this pull request?
As stated in https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/ref/Cleaner.html
The cleaning action could be a lambda but all too easily will capture the object reference, by referring to fields of the object being cleaned, preventing the object from becoming phantom reachable. Using a static nested class, as above, will avoid accidentally retaining the object reference.
For more details, and the test and analysis are in https://issues.apache.org/jira/browse/SPARK-54753
After running with Spark 4.0.1, the ArtififactManager is leaked, its referenced SessionState/SparkSession is as well leaked.
Why are the changes needed?
use a separate class to ref the cleanup state
Does this PR introduce any user-facing change?
No
How was this patch tested?
with test program in https://issues.apache.org/jira/browse/SPARK-54753, and use Visual VM to monitor the memory usage
Was this patch authored or co-authored using generative AI tooling?
No