⚡ Bolt: optimize dataclass serialization for RequestMetrics and SpeculateMetrics #7067
base: develop
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| ## 2025-03-29 - Optimize dataclass serialization for metrics | ||
| **Learning:** `dataclasses.asdict()` recurses through every field and deep-copies leaf values, which adds significant overhead for objects that are created and serialized frequently on the hot path (such as `RequestMetrics`, once per request). | ||
| **Action:** Replace `asdict()` with manual `to_dict()` methods that iterate over `__dataclass_fields__` using `getattr()`. Assign primitives directly, shallow-copy lists and dicts, and call `.to_dict()` on nested dataclasses (such as `SpeculateMetrics`) to avoid the deepcopy overhead while keeping the same dictionary structure. | ||
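The note above can be illustrated with a small self-contained sketch. `ReqStats` and `SpecStats` are hypothetical stand-ins, not the classes changed in this PR; the sketch only checks that a manual `to_dict()` built on `__dataclass_fields__` yields the same dictionary as `dataclasses.asdict()` for flat fields, and then times both. Timings depend on the machine and Python version, so no numbers are quoted.

```python
import dataclasses
import timeit
from dataclasses import dataclass, field


@dataclass
class SpecStats:  # hypothetical stand-in for a nested metrics dataclass
    accepted: int = 0
    per_head: list = field(default_factory=list)

    def to_dict(self):
        # Hand-written dict; the list is shallow-copied.
        return {"accepted": self.accepted, "per_head": list(self.per_head)}


@dataclass
class ReqStats:  # hypothetical stand-in for a per-request metrics dataclass
    arrival_time: float = 0.0
    request_id: str = ""
    spec: SpecStats = field(default_factory=SpecStats)

    def to_dict(self):
        res = {}
        for name in self.__dataclass_fields__:
            value = getattr(self, name)
            if dataclasses.is_dataclass(value):
                res[name] = value.to_dict()   # nested dataclass
            elif isinstance(value, list):
                res[name] = list(value)       # shallow copy
            elif isinstance(value, dict):
                res[name] = dict(value)       # shallow copy
            else:
                res[name] = value             # primitives and anything else
        return res


stats = ReqStats(1.5, "req-0", SpecStats(3, [0.9, 0.8]))

# Same structure as asdict() for this flat example.
assert stats.to_dict() == dataclasses.asdict(stats)

# Rough comparison only; absolute values vary between environments.
asdict_s = timeit.timeit(lambda: dataclasses.asdict(stats), number=5_000)
manual_s = timeit.timeit(stats.to_dict, number=5_000)
print(f"asdict: {asdict_s:.4f}s  manual: {manual_s:.4f}s")
```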
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -16,6 +16,7 @@ | |
|
|
||
| from __future__ import annotations | ||
|
|
||
| import dataclasses | ||
| import json | ||
| import time | ||
| import traceback | ||
|
|
@@ -897,7 +898,23 @@ def to_dict(self): | |
|          """ | ||
|          Convert the RequestMetrics object to a dictionary. | ||
|          """ | ||
| -        return {k: v for k, v in asdict(self).items()} | ||
| +        res = {} | ||
| +        for k in self.__dataclass_fields__: | ||
| +            v = getattr(self, k) | ||
| +            if type(v) in (int, float, str, bool, type(None)): | ||
| +                res[k] = v | ||
| +            elif dataclasses.is_dataclass(v): | ||
| +                if hasattr(v, "to_dict"): | ||
| +                    res[k] = v.to_dict() | ||
| +                else: | ||
| +                    res[k] = dataclasses.asdict(v) | ||
| +            elif isinstance(v, list): | ||
| +                res[k] = list(v) | ||
| +            elif isinstance(v, dict): | ||
| +                res[k] = dict(v) | ||
| +            else: | ||
| +                res[k] = v | ||
| +        return res | ||
|
Comment on lines +901 to +917
|
||
|
|
||
| def record_recv_first_token(self): | ||
| cur_time = time.time() | ||
|
|
||
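One behavioral difference in the `to_dict()` change above is worth checking: `list(v)` and `dict(v)` are shallow copies, whereas `asdict()` recurses into containers. The sketch below uses hypothetical classes `Item` and `Holder`; whether `RequestMetrics` actually holds nested lists or lists of dataclasses is not shown in this diff, so treat this as an illustration of the general trade-off rather than a bug report.

```python
import dataclasses
from dataclasses import dataclass, field


@dataclass
class Item:  # hypothetical element type
    value: int = 0


@dataclass
class Holder:  # hypothetical container with nested data
    nested: list = field(default_factory=list)  # list of lists
    items: list = field(default_factory=list)   # list of dataclass instances

    def to_dict(self):
        # Same shallow-copy strategy as the manual to_dict() in the diff.
        return {"nested": list(self.nested), "items": list(self.items)}


h = Holder(nested=[[1, 2]], items=[Item(7)])
deep = dataclasses.asdict(h)
shallow = h.to_dict()

# Mutate an inner list after both dictionaries were built.
h.nested[0].append(99)

# asdict() rebuilt the inner list, so its copy is unaffected.
assert deep["nested"] == [[1, 2]]
# The shallow copy still shares the inner list with the live object.
assert shallow["nested"] == [[1, 2, 99]]

# asdict() converts dataclasses inside lists to dicts; the shallow copy does not.
assert deep["items"] == [{"value": 7}]
assert isinstance(shallow["items"][0], Item)
```

If the serialized dictionary is passed straight to `json.dumps`, the second difference matters more than the first, because a dataclass instance left inside a list is not JSON-serializable by default.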
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -164,6 +164,19 @@ class SpeculateMetrics: | |
|      """ | ||
|      accept_ratio_per_head: list[float] | ||
|
|
||
| +    def to_dict(self): | ||
| +        """ | ||
| +        convert SpeculateMetrics to a serialized dict | ||
| +        """ | ||
| +        return { | ||
| +            "accepted_tokens": self.accepted_tokens, | ||
| +            "rejected_tokens": self.rejected_tokens, | ||
| +            "accept_ratio": self.accept_ratio, | ||
| +            "average_accept_length": self.average_accept_length, | ||
| +            "accepted_tokens_per_head": list(self.accepted_tokens_per_head), | ||
| +            "accept_ratio_per_head": list(self.accept_ratio_per_head), | ||
| +        } | ||
|
Comment on lines +167 to +178
|
||
|
|
||
|
|
||
| @dataclass | ||
| class SamplerOutput: | ||
|
|
||
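Because `SpeculateMetrics.to_dict()` above spells out each key by hand, a field added to the dataclass later would be silently missing from the output. A possible guard is a small check that compares the keys of `to_dict()` with `dataclasses.fields()`. In the sketch below, `SpeculateMetricsSketch` is a local stand-in; only `accept_ratio_per_head: list[float]` is visible in the diff, so the other field types are assumptions, and `key_drift` is a hypothetical helper name.

```python
import dataclasses
from dataclasses import dataclass


@dataclass
class SpeculateMetricsSketch:  # stand-in; types other than the last are assumed
    accepted_tokens: int
    rejected_tokens: int
    accept_ratio: float
    average_accept_length: float
    accepted_tokens_per_head: list
    accept_ratio_per_head: list

    def to_dict(self):
        return {
            "accepted_tokens": self.accepted_tokens,
            "rejected_tokens": self.rejected_tokens,
            "accept_ratio": self.accept_ratio,
            "average_accept_length": self.average_accept_length,
            "accepted_tokens_per_head": list(self.accepted_tokens_per_head),
            "accept_ratio_per_head": list(self.accept_ratio_per_head),
        }


def key_drift(obj):
    """Return (missing, extra) key sets of obj.to_dict() vs. its dataclass fields."""
    expected = {f.name for f in dataclasses.fields(obj)}
    actual = set(obj.to_dict())
    return expected - actual, actual - expected


m = SpeculateMetricsSketch(8, 2, 0.8, 2.5, [5, 3], [0.9, 0.7])
missing, extra = key_drift(m)

assert not missing and not extra               # every field is serialized
assert m.to_dict() == dataclasses.asdict(m)    # same result as asdict() here
```

A check like this fits naturally in a unit test, so the hand-written dictionary and the field list cannot drift apart unnoticed.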
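The repository's PR template requires the title to include at least one tag (such as [Optimization] or [Engine]); the current PR title has no bracketed tag. Consider updating the title to follow the template convention, which makes later change classification and release notes easier.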