Python: Clarify allowed types for deserialization#5488
Open
XananasX7 wants to merge 1 commit into microsoft:main
Updated documentation to clarify that callers must extend the allowed types for deserialization, removing references to framework and OpenAI SDK types. Fixed over-broad namespace whitelisting in RestrictedUnpickler to prevent potential Remote Code Execution (RCE) via malicious checkpoints.
Why is this change required?
The previous implementation of _RestrictedUnpickler contained a critical security flaw that allowed Remote Code Execution (RCE). It whitelisted entire module namespaces (agent_framework.* and openai.types.*). This is unsafe for pickle deserialization, because a pickle stream can ask the unpickler to call any callable that find_class resolves, with arguments chosen by whoever wrote the stream.
What problem does it solve?
It prevents "gadget" attacks. When a whole namespace is whitelisted, an attacker can reach any class or function in that namespace that has side effects when called (e.g., executing commands, opening files) and use it to compromise the host. This change enforces a strict, explicit whitelist policy.
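The gadget pattern can be illustrated with a harmless stand-in. The sketch below is not the framework's code: `fake_framework`, `run_hook`, `PrefixUnpickler`, and `Payload` are hypothetical names invented for this example, and the "side effect" only appends to a list. It shows that a prefix check lets a pickle stream call any callable in the allowed namespace with attacker-chosen arguments.

```python
import io
import pickle
import sys
import types

# Hypothetical stand-in for a framework package; not a real module.
fake_framework = types.ModuleType("fake_framework")
sys.modules["fake_framework"] = fake_framework

SIDE_EFFECTS = []


def run_hook(command):
    """Stand-in for any framework callable that has side effects when called."""
    SIDE_EFFECTS.append(command)
    return command


# Make the function importable as fake_framework.run_hook so pickle can
# reference it by module and name.
run_hook.__module__ = "fake_framework"
fake_framework.run_hook = run_hook


class PrefixUnpickler(pickle.Unpickler):
    """Over-broad policy: anything under the fake_framework namespace is allowed."""

    def find_class(self, module, name):
        if module == "fake_framework" or module.startswith("fake_framework."):
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked: {module}.{name}")


class Payload:
    """Attacker-side object: __reduce__ tells pickle which callable to invoke on load."""

    def __reduce__(self):
        return (run_hook, ("attacker-chosen argument",))


blob = pickle.dumps(Payload())

# Loading never constructs Payload. It resolves run_hook through find_class
# and calls it with the arguments stored in the stream.
result = PrefixUnpickler(io.BytesIO(blob)).load()
print(SIDE_EFFECTS)
```

The prefix check passes because the callable lives in the allowed namespace, so the policy is only as safe as the least safe callable anywhere under that prefix.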
What scenario does it contribute to?
Secure handling of Durable Workflows and Checkpoints. It ensures that when agents load state from potentially untrusted sources (shared storage, databases, or user uploads), the system remains protected against malicious payloads.
If it fixes an open issue, please link to the issue here.
Security vulnerability identified via independent audit (MSRC report pending).
Description
This PR hardens the _RestrictedUnpickler in _checkpoint_encoding.py by:
Removing over-broad whitelists: Deleted the logic that automatically allowed any module whose name starts with the agent_framework. or openai.types. prefix.
Enforcing Explicit Whitelisting: The unpickler now allows only types listed in _BUILTIN_ALLOWED_TYPE_KEYS or explicitly provided by the caller via the allowed_checkpoint_types parameter.
Updating Documentation: Updated the module and method docstrings to clearly state that callers are now responsible for explicitly whitelisting any framework or custom classes they intend to deserialize.
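The explicit-whitelist policy described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in _checkpoint_encoding.py: the "module:name" key format, the contents of `BUILTIN_ALLOWED_TYPE_KEYS`, and the names `ExplicitAllowlistUnpickler` and `loads_restricted` are all hypothetical, and `fractions.Fraction` stands in for a type that a caller would have to opt in to.

```python
import io
import pickle
from collections import OrderedDict
from fractions import Fraction

# Assumed key format "module:name"; the real _BUILTIN_ALLOWED_TYPE_KEYS may differ.
BUILTIN_ALLOWED_TYPE_KEYS = frozenset({
    "collections:OrderedDict",
    "datetime:datetime",
})


class ExplicitAllowlistUnpickler(pickle.Unpickler):
    """Resolve a global only if its exact key is on the allow-list (no prefix matching)."""

    def __init__(self, file, allowed_type_keys=()):
        super().__init__(file)
        self._allowed = BUILTIN_ALLOWED_TYPE_KEYS | frozenset(allowed_type_keys)

    def find_class(self, module, name):
        key = f"{module}:{name}"
        if key not in self._allowed:
            raise pickle.UnpicklingError(f"type not allowed: {key}")
        return super().find_class(module, name)


def loads_restricted(data, allowed_type_keys=()):
    """Hypothetical helper: unpickle bytes under the explicit allow-list."""
    return ExplicitAllowlistUnpickler(io.BytesIO(data), allowed_type_keys).load()


# A type on the built-in list loads without extra configuration.
ordered = loads_restricted(pickle.dumps(OrderedDict(a=1)))

# Any other type is rejected until the caller opts in explicitly.
blob = pickle.dumps(Fraction(1, 3))
try:
    loads_restricted(blob)
    rejected = False
except pickle.UnpicklingError:
    rejected = True

restored = loads_restricted(blob, allowed_type_keys={"fractions:Fraction"})
```

Matching on the exact key means that adding one type to the list does not implicitly admit its siblings in the same module, which is the property the breaking change relies on.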
Contribution Checklist
[x] The code builds clean without any errors or warnings
[x] The PR follows the Contribution Guidelines
[x] All unit tests pass, and I have added new tests where possible
[x] Is this a breaking change? Yes. [BREAKING] (Callers must now explicitly whitelist framework types they use in checkpoints).