Conversation
There was a problem hiding this comment.
Pull request overview
This PR removes unsafe eval usage in PyTorch config parsing and introduces SSRF-oriented safety checks for fetching media over HTTP(S) in the VL media loader.
Changes:
- Replace
eval(f"torch.{...}")withgetattr(torch, ...)when resolving dtypes inlmdeploy/pytorch/config.py. - Add
_is_safe_url()validation before HTTP(S) fetches and disallow redirects inlmdeploy/vl/media/connection.py. - Add unit tests for URL safety validation and redirect handling.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
lmdeploy/vl/media/connection.py |
Adds URL validation for HTTP(S) fetching and blocks redirects. |
lmdeploy/pytorch/config.py |
Removes eval and uses getattr for dtype resolution. |
tests/test_lmdeploy/test_vl/test_safe_url.py |
Adds tests for the new URL safety logic and redirect blocking behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
lmdeploy/vl/media/connection.py
Outdated
| response = client.get(url_spec.geturl(), headers=headers, timeout=fetch_timeout) | ||
| # do not follow redirects automatically | ||
| response = client.get(url_spec.geturl(), headers=headers, timeout=fetch_timeout, allow_redirects=False) | ||
| # block if server attempts to redirect to internal IP |
There was a problem hiding this comment.
The comment mentions blocking redirects specifically to internal IPs, but the code currently blocks all redirects unconditionally. Either update the comment to match the behavior, or implement destination validation if the intent is to only block unsafe redirect targets.
| # block if server attempts to redirect to internal IP | |
| # block all redirects for security reasons |
lmdeploy/pytorch/config.py
Outdated
| else: | ||
| torch_dtype = dtype | ||
| config.dtype = eval(f'torch.{torch_dtype}') | ||
| config.dtype = getattr(torch, torch_dtype) |
There was a problem hiding this comment.
getattr(torch, torch_dtype) can return a non-torch.dtype attribute if an unexpected string is provided (e.g. a function name), which would violate the ModelConfig.dtype: torch.dtype expectation and fail later in harder-to-debug ways. Consider validating the result is an instance of torch.dtype (and raising a clear ValueError on invalid inputs) to keep failures immediate and user-friendly.
| config.dtype = getattr(torch, torch_dtype) | |
| torch_dtype_attr = getattr(torch, torch_dtype, None) | |
| if not isinstance(torch_dtype_attr, torch.dtype): | |
| raise ValueError( | |
| f'Invalid torch dtype "{torch_dtype}" resolved from model config; ' | |
| 'expected a torch.dtype attribute on torch.' | |
| ) | |
| config.dtype = torch_dtype_attr |
lmdeploy/pytorch/config.py
Outdated
| quant_dtype = getattr(torch, quant_dtype) | ||
|
|
There was a problem hiding this comment.
Similar to dtype, quant_dtype is expected to be a torch.dtype, but getattr(torch, quant_dtype) can return arbitrary non-dtype attributes if the config contains an unexpected string. Consider validating the resolved value is a torch.dtype and raising a clear error (or mapping allowed strings) so misconfigured models fail fast with a helpful message.
| quant_dtype = getattr(torch, quant_dtype) | |
| try: | |
| resolved_quant_dtype = getattr(torch, quant_dtype) | |
| except AttributeError as e: | |
| raise TypeError( | |
| f'Unsupported quant dtype: {quant_dtype!r}. ' | |
| 'Expected the name of a torch.dtype (e.g., "int8", "float16").' | |
| ) from e | |
| if not isinstance(resolved_quant_dtype, torch.dtype): | |
| raise TypeError( | |
| f'quant_dtype must be the name of a torch.dtype, but torch.{quant_dtype} ' | |
| f'is of type {type(resolved_quant_dtype).__name__}.' | |
| ) | |
| quant_dtype = resolved_quant_dtype |
| if ip.is_loopback or ip.is_private or ip.is_link_local: | ||
| return False, f'Blocked IP detected: {ip}' |
There was a problem hiding this comment.
The IP safety check only blocks loopback/private/link-local ranges. Other non-global ranges (e.g., multicast, reserved, unspecified) can still pass this check, which undermines the SSRF protection. Consider blocking anything that is not globally routable (e.g., if not ip.is_global: ...) or explicitly checking is_reserved, is_multicast, is_unspecified, etc. in addition to the current checks.
| if ip.is_loopback or ip.is_private or ip.is_link_local: | |
| return False, f'Blocked IP detected: {ip}' | |
| # Block any IP that is not globally routable (covers private, loopback, | |
| # link-local, multicast, reserved, unspecified, etc.) | |
| if not ip.is_global: | |
| return False, f'Blocked non-global IP detected: {ip}' |
lmdeploy/vl/media/connection.py
Outdated
| @@ -31,7 +62,12 @@ def _load_http_url(url_spec: ParseResult, media_io: MediaIO[_M]) -> _M: | |||
| fetch_timeout = int(os.environ.get('LMDEPLOY_VIDEO_FETCH_TIMEOUT', 30)) | |||
|
|
|||
| client = requests.Session() | |||
| response = client.get(url_spec.geturl(), headers=headers, timeout=fetch_timeout) | |||
| # do not follow redirects automatically | |||
| response = client.get(url_spec.geturl(), headers=headers, timeout=fetch_timeout, allow_redirects=False) | |||
There was a problem hiding this comment.
There is a DNS rebinding / TOCTOU gap here: _is_safe_url() resolves the hostname, but the subsequent requests call resolves again when establishing the connection. An attacker-controlled hostname can return a public IP during validation and a private IP at request time. To make this robust, consider pinning the request to the validated IP(s) (or otherwise verifying the actual peer IP used by the connection) rather than validating DNS results separately from the fetch.
evalwithgetattr