⚡ Bolt: Optimize RequestMetrics.to_dict serialization speed #6965
base: develop
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| ## 2024-05-18 - Fast `to_dict` serialization for `RequestMetrics` | ||
| **Learning:** `dataclasses.asdict` is slow on hot paths because it recursively converts nested dataclasses, lists, and dicts and deep-copies every other value. By iterating over `__dataclass_fields__` directly, passing primitives through unchanged, and delegating to each nested object's own `to_dict()`, serialization overhead can be reduced significantly (~2x). This matters most on high-throughput hot paths such as request metrics reporting in API servers and scheduler instances. In addition, avoid inline imports (e.g., `import dataclasses` inside `to_dict`): each call repeats the `sys.modules` lookup, which adds overhead in tight loops. | ||
| **Action:** Consider custom `to_dict()` methods over `asdict()` for heavily instantiated dataclasses on hot execution paths, and always put imports at module scope. |
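The difference described in the learning note can be sketched as follows. This is a minimal illustration, not the PR's code: `Metrics` and `Timing` are hypothetical stand-ins for `RequestMetrics` and its nested fields. It shows that a hand-written `to_dict()` yields the same dictionary as `dataclasses.asdict()` while skipping the per-value copy, so container fields are shared with the source object rather than duplicated.

```python
import dataclasses
from dataclasses import dataclass, field

# Types that can be placed in the result without any conversion.
_PRIMITIVES = (int, float, str, bool, type(None))


@dataclass
class Timing:  # hypothetical nested dataclass
    start: float = 0.0
    end: float = 0.0


@dataclass
class Metrics:  # hypothetical stand-in for RequestMetrics
    request_id: str = ""
    tokens: int = 0
    timing: Timing = field(default_factory=Timing)
    tags: list = field(default_factory=list)

    def to_dict(self):
        res = {}
        for name in self.__dataclass_fields__:
            value = getattr(self, name)
            if type(value) in _PRIMITIVES:
                res[name] = value
            elif dataclasses.is_dataclass(value):
                res[name] = dataclasses.asdict(value)
            else:
                # Shallow: the container is shared, not copied.
                res[name] = value
        return res


m = Metrics(request_id="r1", tokens=3, tags=["a"])
fast = m.to_dict()
slow = dataclasses.asdict(m)

same_content = fast == slow                # both describe the same data
shares_list = fast["tags"] is m.tags       # custom path reuses the list
copies_list = slow["tags"] is not m.tags   # asdict builds a new list
```

The trade-off is that callers of the shallow version must not mutate the returned containers unless they intend to change the original object. Timing both paths with `timeit` on the real class is the way to confirm the speedup for a given workload.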
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -16,6 +16,7 @@ | |
|
|
||
| from __future__ import annotations | ||
|
|
||
| import dataclasses | ||
| import json | ||
| import time | ||
| import traceback | ||
|
|
@@ -897,7 +898,28 @@ def to_dict(self): | |
| """ | ||
| Convert the RequestMetrics object to a dictionary. | ||
| """ | ||
| return {k: v for k, v in asdict(self).items()} | ||
| # Custom serialization is significantly faster than dataclasses.asdict() | ||
| res = {} | ||
| for k in self.__dataclass_fields__: | ||
| v = getattr(self, k) | ||
| if type(v) in (int, float, str, bool, type(None)): | ||
| res[k] = v | ||
| else: | ||
| if dataclasses.is_dataclass(v): | ||
| res[k] = v.to_dict() if hasattr(v, "to_dict") else dataclasses.asdict(v) | ||
| elif isinstance(v, list): | ||
| res[k] = [ | ||
| (x.to_dict() if hasattr(x, "to_dict") else dataclasses.asdict(x) if dataclasses.is_dataclass(x) else x) | ||
| for x in v | ||
| ] | ||
| elif isinstance(v, dict): | ||
| res[k] = { | ||
| key: (val.to_dict() if hasattr(val, "to_dict") else dataclasses.asdict(val) if dataclasses.is_dataclass(val) else val) | ||
| for key, val in v.items() | ||
| } | ||
| else: | ||
|
Comment on lines +910 to +920
|
||
| res[k] = v | ||
| return res | ||
|
Comment on lines +901 to +922
|
||
|
|
||
| def record_recv_first_token(self): | ||
| cur_time = time.time() | ||
|
|
||
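This code checks `hasattr(x, "to_dict")` and then calls the attribute directly. If an object happens to have a non-callable attribute with the same name, the call raises a `TypeError` at runtime. I suggest using `callable(getattr(x, "to_dict", None))` instead, applied consistently to `v`, list elements, and dict values, so that attribute shadowing cannot cause serialization to fail.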
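A minimal sketch of the suggested guard, assuming a small helper is acceptable here. The name `_serialize` and the two example classes are hypothetical and do not appear in the PR. The helper centralizes the check so that `v`, list elements, and dict values all take the same path.

```python
import dataclasses


def _serialize(obj):
    """Hypothetical helper: convert one value for to_dict() output."""
    to_dict = getattr(obj, "to_dict", None)
    if callable(to_dict):
        # Only call to_dict when it is actually a method or function.
        return to_dict()
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        # Dataclass instance without a usable to_dict: fall back to asdict.
        return dataclasses.asdict(obj)
    return obj


@dataclasses.dataclass
class Shadowed:
    # A data field that happens to be named to_dict (not callable).
    to_dict: str = "not a method"
    value: int = 1


@dataclasses.dataclass
class WithMethod:
    value: int = 2

    def to_dict(self):
        return {"value": self.value}


shadowed_result = _serialize(Shadowed())
method_result = _serialize(WithMethod())
plain_result = _serialize(5)
list_result = [_serialize(x) for x in [WithMethod(), "s", Shadowed()]]
```

With the plain `hasattr` check, `Shadowed()` would fail with `TypeError: 'str' object is not callable`. With the `callable` guard it falls through to `dataclasses.asdict`.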