fix: avoid memory leak when decoding invalid nested arrays #671
KowalskiThomas wants to merge 2 commits into msgpack:main from
Seems like Windows GHA runners are having a rough day...
Don't mind, it's a known problem. Python 3.14t on Windows arm64 is broken at the moment.
Pull request overview
Fixes a ref-leak in the C-backed unpacker when decoding invalid data that errors out after creating nested container objects, ensuring intermediate stack containers get freed instead of being lost.
Changes:
- Update `unpack_clear()` to clear all live container objects on the unpacker stack (and any pending `map_key` reference when waiting for a map value).
- Ensure `Unpacker` releases any retained unpacking stack objects during destruction by calling `unpack_clear()` in `__dealloc__`.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| msgpack/unpack_template.h | Frees all live stack frames (and pending map key refs) in unpack_clear() to prevent leaks on invalid-input error paths. |
| msgpack/_unpacker.pyx | Calls unpack_clear() during Unpacker deallocation to release any partially-decoded objects retained in the context. |
    static inline void unpack_clear(unpack_context *ctx)
    {
        unsigned int i;
        for (i = 1; i < ctx->top; i++) {
How about using i = 0 here and removing Py_CLEAR(ctx->stack[0].obj) at the bottom?
Is map_key at stack[0] safe?
I obviously have less knowledge about this than you so let me know if I'm wrong, but my understanding is that unpack_init sets top to 0 but still pushes something into stack[0]:
msgpack-python/msgpack/unpack_template.h
Line 56 in 0d600a3
As a result, if we looped over [0, top) with top == 0, we wouldn't free stack[0].obj as far as I can tell.
Regarding map_key at stack[0]: I think whether we need to free it depends on the case. If top is 0, we never need to free it; if top >= 1, we would need to check for CT_MAP_KEY like we do in the loop.
So, bottom line: we can probably iterate from 0 to ctx->top > 0 ? ctx->top : 1 to capture both cases. On top of that, we would need to check whether ctx->top == 0 before the CT_MAP_KEY check: as far as I can tell, ctx->stack[0].ct would be uninitialised in that case, so if we're unlucky we could accidentally call Py_CLEAR(ctx->stack[0].map_key), which would not be safe.
But if we're adding all that logic, a mix of both approaches may actually be simpler:
    static inline void unpack_clear(unpack_context *ctx)
    {
        unsigned int i;
        // The loop captures the case where we did push at least one thing to the stack
        for (i = 0; i < ctx->top; i++) {
            Py_CLEAR(ctx->stack[i].obj);
            /* map_key holds a live reference only while waiting for the value */
            if (ctx->stack[i].ct == CT_MAP_VALUE) {
                Py_CLEAR(ctx->stack[i].map_key);
            }
        }
        // This captures the case where we did not push anything to the stack
        // Clear again at 0, which is safe (it just sets the pointer to NULL => no-op on second call)
        Py_CLEAR(ctx->stack[0].obj);
    }
What is this PR?
This PR fixes a memory leak that was detected (accidentally) through fuzzing.
The leak happens in some cases when trying to decode invalid data. When decoding an array, the unpacker uses a stack and pushes a new list onto that stack for every nested array element. If it eventually reaches an element it cannot decode, it returns `-2`, which results in `FormatError` being raised. However, `unpack_clear` lacks some of the freeing logic needed on that path (the outermost list is freed but not the others), so the intermediate objects are lost forever.

The leak can be observed directly by running the following reproducer, which pushes several nested lists and then one undefined format byte. Decoding returns an error without freeing the intermediate list objects. The logic is run under `tracemalloc`, with explicit GC calls to rule out transient still-alive objects that would be freed at some point. The deeper the nesting, the more objects are leaked.