fix: harden backup import for duplicate platform stats #5594
zouyonghe merged 16 commits into AstrBotDevs:master from
Conversation
- Fix the issue where, in replace mode, the import continued even after clearing the main database failed.
- Aggregate duplicate platform_stats keys before import (accumulating count) and unify the timestamp format used for duplicate detection.
- Treat invalid count values as 0 with a rate-limited warning, and add corresponding tests.
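The aggregation step can be pictured with a minimal standalone sketch (the function name and row shape are illustrative; the actual importer additionally normalizes timestamps and guards against invalid counts):

```python
def merge_rows(rows):
    # Collapse rows sharing (timestamp, platform_id, platform_type) into one
    # row whose count is the sum, preserving first-seen order.
    merged = {}
    for row in rows:
        key = (row["timestamp"], row["platform_id"], row["platform_type"])
        if key in merged:
            merged[key]["count"] += row["count"]
        else:
            merged[key] = dict(row)
    return list(merged.values())

rows = [
    {"timestamp": "2025-12-13T20:00:00Z", "platform_id": "webchat",
     "platform_type": "unknown", "count": 14},
    {"timestamp": "2025-12-13T20:00:00Z", "platform_id": "webchat",
     "platform_type": "unknown", "count": 3},
]
merged = merge_rows(rows)
# merged holds a single row whose count is 14 + 3 = 17
```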
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request aims to significantly improve the reliability and data integrity of the backup import feature, particularly in
Highlights
Changelog
Activity
Code Review
This PR hardens the backup import, addressing two issues: 1) in replace mode, the import flow continued even after clearing the database failed; 2) the import failed on a unique-constraint violation when the platform_stats table contained duplicate keys.
Overall, the changes are effective:
- Raising an exception when database cleanup fails aborts the import flow and avoids leaving the data in an inconsistent state.
- The new `_merge_platform_stats_rows` method correctly aggregates duplicate platform_stats records, and `_normalize_platform_stats_timestamp` unifies the timestamp format, improving duplicate-detection accuracy.
- Invalid values in the `count` field are handled gracefully, with corresponding test cases added.
The code quality is high and the test coverage is solid. I've left some suggestions on refactoring and test robustness for your consideration.
astrbot/core/backup/importer.py
Outdated
existing_raw_count = existing.get("count", 0)
try:
    existing_count = int(existing_raw_count)
except (TypeError, ValueError):
    existing_count = 0
    if invalid_count_warned < invalid_count_warn_limit:
        logger.warning(
            "platform_stats count 非法,已按 0 处理: "
            f"value={existing_raw_count!r}, key={key}"
        )
        invalid_count_warned += 1

incoming_raw_count = row.get("count", 0)
try:
    incoming_count = int(incoming_raw_count)
except (TypeError, ValueError):
    incoming_count = 0
    if invalid_count_warned < invalid_count_warn_limit:
        logger.warning(
            "platform_stats count 非法,已按 0 处理: "
            f"value={incoming_raw_count!r}, key={key}"
        )
        invalid_count_warned += 1
existing["count"] = existing_count + incoming_count
The logic for handling existing_count and incoming_count here is almost identical, which is duplicated code. To follow the DRY (Don't Repeat Yourself) principle and improve readability and maintainability, consider extracting this logic into a separate helper method.
For example, a nested function _parse_count could be responsible for parsing the count value and handling the warning logic.
def _parse_count(raw_count: Any) -> int:
    nonlocal invalid_count_warned
    try:
        return int(raw_count)
    except (TypeError, ValueError):
        if invalid_count_warned < invalid_count_warn_limit:
            logger.warning(
                "platform_stats count 非法,已按 0 处理: "
                f"value={raw_count!r}, key={key}"
            )
            invalid_count_warned += 1
        return 0

existing_count = _parse_count(existing.get("count", 0))
incoming_count = _parse_count(row.get("count", 0))
existing["count"] = existing_count + incoming_count
tests/test_backup.py
Outdated
first = merged_rows[0]
assert first["timestamp"] == "2025-12-13T20:00:00Z"
assert first["platform_id"] == "webchat"
assert first["platform_type"] == "unknown"
assert first["count"] == 17
The current assertions depend on the order of elements in the merged_rows list. Although dicts preserve insertion order in Python 3.7+, consider changing the assertions so the test is more robust and independent of implementation details (such as dict iteration order).
Instead of indexing with [0] directly, look up the row with the specific platform_id. This keeps the test stable even if the implementation changes in the future.
- first = merged_rows[0]
- assert first["timestamp"] == "2025-12-13T20:00:00Z"
- assert first["platform_id"] == "webchat"
- assert first["platform_type"] == "unknown"
- assert first["count"] == 17
+ webchat_row = next(
+     (r for r in merged_rows if r.get("platform_id") == "webchat"), None
+ )
+ assert webchat_row is not None
+ assert webchat_row["timestamp"] == "2025-12-13T20:00:00Z"
+ assert webchat_row["platform_id"] == "webchat"
+ assert webchat_row["platform_type"] == "unknown"
+ assert webchat_row["count"] == 17
Hey - I've found 4 issues, and left some high level feedback:
- In `_merge_platform_stats_rows`, you normalize the timestamp only for the aggregation key but keep the original `timestamp` value in the stored row; consider updating the stored `timestamp` to the normalized value as well to avoid mixed formats in persisted data.
- The `invalid_count_warn_limit = 5` magic number inside `_merge_platform_stats_rows` could be extracted to a module-level constant or parameter to make the logging behavior easier to tune and understand.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_merge_platform_stats_rows`, you normalize the timestamp only for the aggregation key but keep the original `timestamp` value in the stored row; consider updating the stored `timestamp` to the normalized value as well to avoid mixed formats in persisted data.
- The `invalid_count_warn_limit = 5` magic number inside `_merge_platform_stats_rows` could be extracted to a module-level constant or parameter to make the logging behavior easier to tune and understand.
## Individual Comments
### Comment 1
<location path="tests/test_backup.py" line_range="311-320" />
<code_context>
+ def test_merge_platform_stats_rows(self):
</code_context>
<issue_to_address>
**suggestion (testing):** Add focused tests for `_normalize_platform_stats_timestamp` to cover timestamp edge cases independently of the merge logic.
Currently `_merge_platform_stats_rows` only exercises a simple `_normalize_platform_stats_timestamp` case (`Z` vs `+00:00`). To cover its branches and edge cases directly, please add targeted unit tests for:
- `datetime` values with and without tzinfo (verify UTC normalization)
- String timestamps with `Z`, with non-UTC offsets (e.g. `+08:00`), and naive values
- Empty and whitespace-only strings
- `None` (should return empty string)
- Completely invalid strings (should be returned as-is after `strip`)
This will clarify the normalization contract and prevent regressions independently of the merge behavior.
Suggested implementation:
```python
from datetime import datetime, timezone
```
```python
def test_normalize_platform_stats_timestamp_datetime_values(self):
    """datetime values are normalized to UTC whether or not they carry tzinfo"""
    importer = AstrBotImporter(main_db=MagicMock())
    aware = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    naive = datetime(2025, 1, 1, 12, 0, 0)
    aware_norm = importer._normalize_platform_stats_timestamp(aware)
    naive_norm = importer._normalize_platform_stats_timestamp(naive)
    # Both should normalize to UTC with identical results
    assert isinstance(aware_norm, str)
    assert isinstance(naive_norm, str)
    assert aware_norm == naive_norm
    assert aware_norm.endswith("+00:00")

def test_normalize_platform_stats_timestamp_string_variants(self):
    """String timestamps: Z suffix, non-UTC offsets, and naive values"""
    importer = AstrBotImporter(main_db=MagicMock())
    ts_z = "2025-01-01T12:00:00Z"
    ts_utc_offset = "2025-01-01T12:00:00+00:00"
    ts_non_utc_offset = "2025-01-01T20:00:00+08:00"
    ts_naive = "2025-01-01T12:00:00"
    z_norm = importer._normalize_platform_stats_timestamp(ts_z)
    utc_norm = importer._normalize_platform_stats_timestamp(ts_utc_offset)
    non_utc_norm = importer._normalize_platform_stats_timestamp(ts_non_utc_offset)
    naive_norm = importer._normalize_platform_stats_timestamp(ts_naive)
    # Z and +00:00 should normalize to the same representation
    # (the current merge test relies on this)
    assert z_norm == utc_norm
    # Non-UTC offsets should keep their offset information
    # (only strip / formal normalization, no forced conversion to UTC)
    assert "+08:00" in non_utc_norm
    # Naive strings should be preserved as-is (minus surrounding whitespace)
    assert naive_norm == ts_naive

def test_normalize_platform_stats_timestamp_empty_none_invalid(self):
    """Empty strings, None, and completely invalid strings"""
    importer = AstrBotImporter(main_db=MagicMock())
    assert importer._normalize_platform_stats_timestamp("") == ""
    assert importer._normalize_platform_stats_timestamp(" ") == ""
    assert importer._normalize_platform_stats_timestamp(None) == ""
    invalid = " not-a-date "
    invalid_norm = importer._normalize_platform_stats_timestamp(invalid)
    # Invalid strings: returned as-is after strip
    assert invalid_norm == "not-a-date"

def test_merge_platform_stats_rows(self):
    """Duplicate platform_stats keys are aggregated before import"""
    importer = AstrBotImporter(main_db=MagicMock())
```
If the current behavior of `_normalize_platform_stats_timestamp` does not exactly match the assertions above (for example, it converts all parseable times to UTC, or normalizes every UTC value to a `Z`-suffixed representation), adjust the assertions to the actual implementation:
1. If the implementation normalizes all UTC times to a `Z`-suffixed format, change the `endswith("+00:00")` assertion above to assert that the result ends with `"Z"`.
2. If the implementation also converts non-UTC offsets (such as `+08:00`) to UTC, change the `non_utc_norm` assertion to check that it equals the UTC representation (e.g. equal to `z_norm`).
3. If naive strings get extra parsing (e.g. assumed to be UTC with an offset appended), change the `naive_norm == ts_naive` assertion to match the actual normalized result.
Please check the real implementation of `AstrBotImporter._normalize_platform_stats_timestamp` and fine-tune the expected string forms in these tests, while keeping the edge cases they cover.
</issue_to_address>
### Comment 2
<location path="tests/test_backup.py" line_range="348-357" />
<code_context>
+ def test_merge_platform_stats_rows_warns_on_invalid_count(self):
</code_context>
<issue_to_address>
**suggestion (testing):** Extend coverage of invalid `count` handling, including existing-row invalid values and the warning rate limit.
The new test covers an invalid `count` in the incoming row with a valid existing row. The implementation also needs coverage for:
- An invalid `count` in the *existing* row, verifying that only the valid value contributes to the merged `count` and a warning is emitted.
- More than `invalid_count_warn_limit` rows with invalid counts, verifying that warnings are capped (e.g., `warning_mock.call_count == invalid_count_warn_limit`).
Adding these cases would exercise both invalid-count branches and the warning rate-limiting behavior.
Suggested implementation:
```python
def test_merge_platform_stats_rows_warns_on_invalid_count(self, mocker):
    """Invalid platform_stats counts warn and count as 0 (covers existing rows and the warn cap)"""
    importer = AstrBotImporter(main_db=MagicMock())
    # NOTE: replace with the actual import path of logger.warning
    warning_mock = mocker.patch("astrbot.backup.logger.warning")

    # 1) Incoming count invalid, existing row valid: only the valid count
    #    is accumulated, and exactly one warning is emitted
    rows = [
        {
            "timestamp": "2025-12-13T20:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": 5,
        },
        {
            "timestamp": "2025-12-13T20:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": "not-a-number",
        },
    ]
    merged_rows, duplicate_count = importer._merge_platform_stats_rows(rows)
    assert duplicate_count == 1
    assert len(merged_rows) == 1
    merged = merged_rows[0]
    assert merged["timestamp"] == "2025-12-13T20:00:00Z"
    assert merged["platform_id"] == "webchat"
    assert merged["platform_type"] == "unknown"
    # Only the valid 5 contributes to the merge; the invalid value counts as 0
    assert merged["count"] == 5
    assert warning_mock.call_count == 1
    warning_mock.reset_mock()

    # 2) Existing count invalid, incoming valid: likewise only the valid
    #    count is accumulated, and one warning is emitted
    rows = [
        {
            "timestamp": "2025-12-13T21:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": "not-a-number",
        },
        {
            "timestamp": "2025-12-13T21:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": 7,
        },
    ]
    merged_rows, duplicate_count = importer._merge_platform_stats_rows(rows)
    assert duplicate_count == 1
    assert len(merged_rows) == 1
    merged = merged_rows[0]
    assert merged["timestamp"] == "2025-12-13T21:00:00Z"
    assert merged["platform_id"] == "webchat"
    assert merged["platform_type"] == "unknown"
    # Only the valid 7 contributes to the merge
    assert merged["count"] == 7
    assert warning_mock.call_count == 1
    warning_mock.reset_mock()

    # 3) More invalid counts than invalid_count_warn_limit only warn up to the cap
    invalid_count_warn_limit = getattr(importer, "invalid_count_warn_limit", 10)
    many_invalid_rows = [
        {
            "timestamp": f"2025-12-13T22:{i:02d}:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": "not-a-number",
        }
        for i in range(invalid_count_warn_limit + 5)
    ]
    importer._merge_platform_stats_rows(many_invalid_rows)
    # The number of warnings is capped at invalid_count_warn_limit
    assert warning_mock.call_count == invalid_count_warn_limit
```
1. Replace the import path in `mocker.patch("astrbot.backup.logger.warning")` with the logger actually used to emit the invalid `count` warning — likely something like `tests.test_backup.logger.warning` or the module's own `astrbot.backup.importer.logger.warning`.
2. Confirm that the `AstrBotImporter` instance really has an `invalid_count_warn_limit` attribute; if the cap is defined as a module constant (e.g. `INVALID_COUNT_WARN_LIMIT`), read and assert against that constant instead.
3. If the existing test class/module does not use the `mocker` fixture from `pytest-mock`, inject and count warning calls via `monkeypatch` or `caplog` per project convention, and adjust the test signature accordingly.
</issue_to_address>
### Comment 3
<location path="tests/test_backup.py" line_range="374-376" />
<code_context>
+ assert merged_rows[0]["count"] == 5
+ assert warning_mock.called
+
@pytest.mark.asyncio
async def test_import_file_not_exists(self, mock_main_db, tmp_path):
"""测试导入不存在的文件"""
</code_context>
<issue_to_address>
**suggestion (testing):** Strengthen the replace-mode failure test by asserting that no further import steps are executed after `_clear_main_db` fails.
To better protect against partial imports in `replace` mode, also assert that no data-import routines run when `_clear_main_db` fails. For instance, patch `_import_main_database` (and `_import_knowledge_bases` if applicable) as `AsyncMock`s and assert they are not called after `_clear_main_db` raises. This will confirm the import aborts early and avoids a partially cleared/partially imported DB state.
Suggested implementation:
```python
@pytest.mark.asyncio
async def test_import_replace_fails_when_clear_main_db_fails(
    self, mock_main_db, tmp_path
):
    """In replace mode, a failed main-db clear aborts the import immediately"""
    zip_path = tmp_path / "valid_backup.zip"
    manifest = {
        "version": "1.0.0",
        "platform_stats": [],
        "knowledge_bases": [],
    }
    importer = BackupImporter(main_db=mock_main_db, mode="replace")
    clear_error = RuntimeError("clear main db failed")
    with (
        patch.object(
            importer,
            "_clear_main_db",
            new=AsyncMock(side_effect=clear_error),
        ) as clear_main_db_mock,
        patch.object(
            importer,
            "_import_main_database",
            new=AsyncMock(),
        ) as import_main_db_mock,
        patch.object(
            importer,
            "_import_knowledge_bases",
            new=AsyncMock(),
        ) as import_kb_mock,
    ):
        result = await importer.import_from_zip(zip_path, manifest)
        # Basic assertions: the import fails and the error comes from the failed clear
        assert result.success is False
        assert any("清空主库失败" in err or "clear main db failed" in err for err in result.errors)
        # Key assertions: after _clear_main_db fails, no further import steps
        # are executed, avoiding a partial import
        clear_main_db_mock.assert_awaited()
        import_main_db_mock.assert_not_awaited()
        import_kb_mock.assert_not_awaited()
```
1. Ensure `AsyncMock` is imported at the top of `tests/test_backup.py`, for example:
`from unittest.mock import AsyncMock, patch`. (`patch` seems already used; keep existing imports and just add `AsyncMock` if missing.)
2. Adjust the `BackupImporter` construction and `import_from_zip` call if the actual API differs (e.g., if the constructor or method signature is not `BackupImporter(main_db=..., mode="replace")` or `import_from_zip(zip_path, manifest)`).
3. If the implementation does not have `_import_knowledge_bases` (or uses a different name), either:
- Update the patched attribute name to match the real method, or
- Remove the corresponding patch/assert and only assert on `_import_main_database`.
4. If the error message for `_clear_main_db` failures differs from `"清空主库失败"` / `"clear main db failed"`, adjust the `any(...)` error assertion string to match the actual message used by the importer.
</issue_to_address>
### Comment 4
<location path="astrbot/core/backup/importer.py" line_range="523" />
<code_context>
return imported
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> tuple[list[dict[str, Any]], int]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting helpers for count parsing with throttled logging and type-specific timestamp normalization to keep `_merge_platform_stats_rows` focused on aggregation logic.
You can keep the current behavior but reduce branching and duplication by extracting small helpers for count parsing (with throttled logging) and timestamp normalization. This keeps `_merge_platform_stats_rows` focused on aggregation.
### 1. Extract throttled count parsing
Right now, the `existing` and `incoming` branches duplicate `try/except` and warning logic. A tiny helper that closes over the warning counter keeps behavior identical but simplifies the main loop:
```python
def _make_platform_stats_count_parser(self):
invalid_count_warned = 0
invalid_count_warn_limit = 5
def parse_count(raw: Any, key: tuple[str, str, str]) -> int:
nonlocal invalid_count_warned
try:
return int(raw)
except (TypeError, ValueError):
if invalid_count_warned < invalid_count_warn_limit:
logger.warning(
"platform_stats count 非法,已按 0 处理: "
f"value={raw!r}, key={key}"
)
invalid_count_warned += 1
return 0
return parse_count
```
Use it in `_merge_platform_stats_rows`:
```python
def _merge_platform_stats_rows(
self, rows: list[dict[str, Any]]
) -> tuple[list[dict[str, Any]], int]:
merged: dict[tuple[str, str, str], dict[str, Any]] = {}
timestamp_cache: dict[str, str] = {}
duplicate_count = 0
parse_count = self._make_platform_stats_count_parser()
for row in rows:
raw_timestamp = row.get("timestamp")
if isinstance(raw_timestamp, str):
normalized_timestamp = timestamp_cache.get(raw_timestamp)
if normalized_timestamp is None:
normalized_timestamp = self._normalize_platform_stats_timestamp(
raw_timestamp
)
timestamp_cache[raw_timestamp] = normalized_timestamp
else:
normalized_timestamp = self._normalize_platform_stats_timestamp(
raw_timestamp
)
key = (
normalized_timestamp,
str(row.get("platform_id")),
str(row.get("platform_type")),
)
existing = merged.get(key)
if existing is None:
merged[key] = dict(row)
continue
duplicate_count += 1
existing_count = parse_count(existing.get("count", 0), key)
incoming_count = parse_count(row.get("count", 0), key)
existing["count"] = existing_count + incoming_count
return list(merged.values()), duplicate_count
```
Behavior stays the same (same warning text, same limit, same default to `0`) but the core loop is much easier to scan.
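To see the throttling closure in action outside the importer, here is a tiny standalone sketch (the names and the limit value are illustrative, not the module's actual constants):

```python
import logging

logger = logging.getLogger("demo")

def make_count_parser(limit=5):
    # Closure-based parser: invalid values parse as 0, but only the first
    # `limit` failures emit a warning.
    warned = 0
    def parse(raw):
        nonlocal warned
        try:
            return int(raw)
        except (TypeError, ValueError):
            if warned < limit:
                logger.warning("invalid count %r treated as 0", raw)
                warned += 1
            return 0
    return parse

parse = make_count_parser(limit=2)
values = [parse(v) for v in ["3", "bad", None, "oops", 7]]
# values == [3, 0, 0, 0, 7]; only the first two failures are logged
```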
### 2. Split timestamp normalization by type
`_normalize_platform_stats_timestamp` is correct but dense. You can keep the same return values while separating datetime and string handling into small helpers, which reduces nesting:
```python
def _normalize_platform_stats_timestamp(self, value: Any) -> str:
if isinstance(value, datetime):
return self._normalize_platform_stats_dt(value)
if isinstance(value, str):
return self._normalize_platform_stats_str(value)
if value is None:
return ""
return str(value)
def _normalize_platform_stats_dt(self, dt: datetime) -> str:
if dt.tzinfo is not None:
dt = dt.astimezone(timezone.utc)
return dt.isoformat()
def _normalize_platform_stats_str(self, value: str) -> str:
timestamp = value.strip()
if not timestamp:
return ""
if timestamp.endswith("Z"):
timestamp = f"{timestamp[:-1]}+00:00"
try:
dt = datetime.fromisoformat(timestamp)
if dt.tzinfo is not None:
dt = dt.astimezone(timezone.utc)
return dt.isoformat()
except ValueError:
return value.strip()
```
This keeps all existing semantics (empty vs `None` vs invalid string) but makes both the normalization function and the merge loop easier to reason about.
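The string path can also be exercised in isolation. This standalone sketch mirrors the same normalization rules (the helper name is illustrative, not the importer's method) and shows why the `Z` suffix is rewritten first — `datetime.fromisoformat()` only accepts a trailing `Z` from Python 3.11 onward:

```python
from datetime import datetime, timezone

def normalize(value: str) -> str:
    # Map a trailing "Z" to "+00:00" (fromisoformat() on Python < 3.11
    # rejects the "Z" suffix), then convert aware values to UTC.
    ts = value.strip()
    if not ts:
        return ""
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    try:
        dt = datetime.fromisoformat(ts)
    except ValueError:
        # Invalid strings come back as-is after strip
        return value.strip()
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc)
    return dt.isoformat()

print(normalize("2025-12-13T20:00:00Z"))       # 2025-12-13T20:00:00+00:00
print(normalize("2025-12-14T04:00:00+08:00"))  # 2025-12-13T20:00:00+00:00
print(normalize("not-a-date"))                 # not-a-date
```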
</issue_to_address>

Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
def test_merge_platform_stats_rows(self):
    """Duplicate platform_stats keys are aggregated before import"""
    importer = AstrBotImporter(main_db=MagicMock())
    rows = [
        {
            "id": 1,
            "timestamp": "2025-12-13T20:00:00Z",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": 14,
suggestion (testing): Add focused tests for _normalize_platform_stats_timestamp to cover timestamp edge cases independently of the merge logic.
Currently _merge_platform_stats_rows only exercises a simple _normalize_platform_stats_timestamp case (Z vs +00:00). To cover its branches and edge cases directly, please add targeted unit tests for:
- `datetime` values with and without tzinfo (verify UTC normalization)
- String timestamps with `Z`, with non-UTC offsets (e.g. `+08:00`), and naive values
- Empty and whitespace-only strings
- `None` (should return empty string)
- Completely invalid strings (should be returned as-is after `strip`)
This will clarify the normalization contract and prevent regressions independently of the merge behavior.
Suggested implementation:
```python
from datetime import datetime, timezone

def test_normalize_platform_stats_timestamp_datetime_values(self):
    """datetime values with and without tzinfo are normalized to UTC"""
    importer = AstrBotImporter(main_db=MagicMock())
    aware = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    naive = datetime(2025, 1, 1, 12, 0, 0)
    aware_norm = importer._normalize_platform_stats_timestamp(aware)
    naive_norm = importer._normalize_platform_stats_timestamp(naive)
    # Both should normalize to UTC and produce identical results
    assert isinstance(aware_norm, str)
    assert isinstance(naive_norm, str)
    assert aware_norm == naive_norm
    assert aware_norm.endswith("+00:00")

def test_normalize_platform_stats_timestamp_string_variants(self):
    """String timestamps: Z suffix, non-UTC offsets, and naive values"""
    importer = AstrBotImporter(main_db=MagicMock())
    ts_z = "2025-01-01T12:00:00Z"
    ts_utc_offset = "2025-01-01T12:00:00+00:00"
    ts_non_utc_offset = "2025-01-01T20:00:00+08:00"
    ts_naive = "2025-01-01T12:00:00"
    z_norm = importer._normalize_platform_stats_timestamp(ts_z)
    utc_norm = importer._normalize_platform_stats_timestamp(ts_utc_offset)
    non_utc_norm = importer._normalize_platform_stats_timestamp(ts_non_utc_offset)
    naive_norm = importer._normalize_platform_stats_timestamp(ts_naive)
    # Z and +00:00 should normalize to the same representation
    # (the merge test currently relies on this)
    assert z_norm == utc_norm
    # A non-UTC offset should keep its offset information
    # (only strip/format normalization, no forced conversion to UTC)
    assert "+08:00" in non_utc_norm
    # A naive string should be kept as-is (minus surrounding whitespace)
    assert naive_norm == ts_naive

def test_normalize_platform_stats_timestamp_empty_none_invalid(self):
    """Empty strings, None, and completely invalid strings"""
    importer = AstrBotImporter(main_db=MagicMock())
    assert importer._normalize_platform_stats_timestamp("") == ""
    assert importer._normalize_platform_stats_timestamp(" ") == ""
    assert importer._normalize_platform_stats_timestamp(None) == ""
    invalid = " not-a-date "
    invalid_norm = importer._normalize_platform_stats_timestamp(invalid)
    # Invalid strings are returned as-is after strip
    assert invalid_norm == "not-a-date"

def test_merge_platform_stats_rows(self):
    """测试 platform_stats 重复键会在导入前聚合"""
    importer = AstrBotImporter(main_db=MagicMock())
```

If the current behavior of `_normalize_platform_stats_timestamp` does not exactly match the assertions above (for example, if it converts every parseable time to UTC, or renders all UTC values with a trailing `Z`), adjust the assertions to the actual implementation:

- If the implementation renders all UTC times with a `Z` suffix, replace the `endswith("+00:00")` assertion with one checking that the result ends with `"Z"`.
- If the implementation also converts non-UTC offsets (such as `+08:00`) to UTC, change the `non_utc_norm` assertion to check equality with the UTC form (e.g. equal to `z_norm`).
- If naive strings get extra parsing (e.g. assumed to be UTC with an offset appended), replace the `naive_norm == ts_naive` assertion with the actual normalized result.

Please check the expected string forms in these tests against the real implementation of `AstrBotImporter._normalize_platform_stats_timestamp`, but keep the edge cases they cover.
```python
def test_merge_platform_stats_rows_warns_on_invalid_count(self):
    """测试 platform_stats count 非法时会告警并按 0 处理"""
    importer = AstrBotImporter(main_db=MagicMock())
    rows = [
        {
            "timestamp": "2025-12-13T20:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": 5,
        },
```
suggestion (testing): Extend coverage of invalid count handling, including existing-row invalid values and the warning rate limit.
The new test covers an invalid count in the incoming row with a valid existing row. The implementation also needs coverage for:
- An invalid `count` in the existing row, verifying that only the valid value contributes to the merged `count` and a warning is emitted.
- More than `invalid_count_warn_limit` rows with invalid counts, verifying that warnings are capped (e.g., `warning_mock.call_count == invalid_count_warn_limit`).
Adding these cases would exercise both invalid-count branches and the warning rate-limiting behavior.
Suggested implementation:
```python
def test_merge_platform_stats_rows_warns_on_invalid_count(self, mocker):
    """platform_stats: invalid count warns and is treated as 0 (existing rows and warning cap included)"""
    importer = AstrBotImporter(main_db=MagicMock())
    # NOTE: replace this with the actual import path of logger.warning
    warning_mock = mocker.patch("astrbot.backup.logger.warning")

    # 1) Invalid count in the incoming row, valid existing row:
    #    only the valid count is summed, and one warning is emitted
    rows = [
        {
            "timestamp": "2025-12-13T20:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": 5,
        },
        {
            "timestamp": "2025-12-13T20:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": "not-a-number",
        },
    ]
    merged_rows, duplicate_count = importer._merge_platform_stats_rows(rows)
    assert duplicate_count == 1
    assert len(merged_rows) == 1
    merged = merged_rows[0]
    assert merged["timestamp"] == "2025-12-13T20:00:00Z"
    assert merged["platform_id"] == "webchat"
    assert merged["platform_type"] == "unknown"
    # Only the valid 5 contributes; the invalid value is treated as 0
    assert merged["count"] == 5
    assert warning_mock.call_count == 1
    warning_mock.reset_mock()

    # 2) Invalid count in the existing row, valid incoming row:
    #    likewise only the valid count is summed, with one warning
    rows = [
        {
            "timestamp": "2025-12-13T21:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": "not-a-number",
        },
        {
            "timestamp": "2025-12-13T21:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": 7,
        },
    ]
    merged_rows, duplicate_count = importer._merge_platform_stats_rows(rows)
    assert duplicate_count == 1
    assert len(merged_rows) == 1
    merged = merged_rows[0]
    assert merged["timestamp"] == "2025-12-13T21:00:00Z"
    assert merged["platform_id"] == "webchat"
    assert merged["platform_type"] == "unknown"
    # Only the valid 7 contributes
    assert merged["count"] == 7
    assert warning_mock.call_count == 1
    warning_mock.reset_mock()

    # 3) More invalid counts than invalid_count_warn_limit only warn up to the limit
    invalid_count_warn_limit = getattr(importer, "invalid_count_warn_limit", 10)
    many_invalid_rows = [
        {
            "timestamp": f"2025-12-13T22:{i:02d}:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": "not-a-number",
        }
        for i in range(invalid_count_warn_limit + 5)
    ]
    importer._merge_platform_stats_rows(many_invalid_rows)
    # The number of warnings is capped at invalid_count_warn_limit
    assert warning_mock.call_count == invalid_count_warn_limit
```

- Replace the import path in `mocker.patch("astrbot.backup.logger.warning")` with the actual logger path used for the invalid `count` warning — it may be, for example, `tests.test_backup.logger.warning` or the module's `astrbot.backup.importer.logger.warning`.
- Confirm that an `invalid_count_warn_limit` attribute actually exists on `AstrBotImporter`; if the limit is defined as a module constant (such as `INVALID_COUNT_WARN_LIMIT`), read it from that constant and use it in the assertion instead.
- If the existing test class/module does not use pytest-mock's `mocker` fixture, switch to `monkeypatch` or `caplog` (per project convention) to intercept and count the warning calls, and adjust the test signature accordingly.
```python
@pytest.mark.asyncio
async def test_import_file_not_exists(self, mock_main_db, tmp_path):
    """测试导入不存在的文件"""
```
suggestion (testing): Strengthen the replace-mode failure test by asserting that no further import steps are executed after _clear_main_db fails.
To better protect against partial imports in replace mode, also assert that no data-import routines run when _clear_main_db fails. For instance, patch _import_main_database (and _import_knowledge_bases if applicable) as AsyncMocks and assert they are not called after _clear_main_db raises. This will confirm the import aborts early and avoids a partially cleared/partially imported DB state.
Suggested implementation:
```python
@pytest.mark.asyncio
async def test_import_replace_fails_when_clear_main_db_fails(
    self, mock_main_db, tmp_path
):
    """replace mode: a failure while clearing the main DB aborts the import"""
    zip_path = tmp_path / "valid_backup.zip"
    manifest = {
        "version": "1.0.0",
        "platform_stats": [],
        "knowledge_bases": [],
    }
    importer = BackupImporter(main_db=mock_main_db, mode="replace")
    clear_error = RuntimeError("clear main db failed")
    with (
        patch.object(
            importer,
            "_clear_main_db",
            new=AsyncMock(side_effect=clear_error),
        ) as clear_main_db_mock,
        patch.object(
            importer,
            "_import_main_database",
            new=AsyncMock(),
        ) as import_main_db_mock,
        patch.object(
            importer,
            "_import_knowledge_bases",
            new=AsyncMock(),
        ) as import_kb_mock,
    ):
        result = await importer.import_from_zip(zip_path, manifest)

    # Basic assertions: the import fails and the error comes from the failed DB clear
    assert result.success is False
    assert any("清空主库失败" in err or "clear main db failed" in err for err in result.errors)
    # Key assertions: after _clear_main_db fails, no further import steps run,
    # avoiding a partially cleared / partially imported database
    clear_main_db_mock.assert_awaited()
    import_main_db_mock.assert_not_awaited()
    import_kb_mock.assert_not_awaited()
```

- Ensure `AsyncMock` is imported at the top of `tests/test_backup.py`, for example: `from unittest.mock import AsyncMock, patch`. (`patch` seems already used; keep existing imports and just add `AsyncMock` if missing.)
- Adjust the `BackupImporter` construction and `import_from_zip` call if the actual API differs (e.g., if the constructor or method signature is not `BackupImporter(main_db=..., mode="replace")` or `import_from_zip(zip_path, manifest)`).
- If the implementation does not have `_import_knowledge_bases` (or uses a different name), either:
  - Update the patched attribute name to match the real method, or
  - Remove the corresponding patch/assert and only assert on `_import_main_database`.
- If the error message for `_clear_main_db` failures differs from "清空主库失败"/"clear main db failed", adjust the `any(...)` error assertion string to match the actual message used by the importer.
- Extracted the warning-cap magic number into the module constant PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT
- Extracted an inline parse_count function, removing the duplicated try/except branches
- Stored rows now persist the normalized timestamp, avoiding mixed formats in the database
- Added tests: invalid count on an existing row, warning rate limiting, replace-mode abort assertion
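The capped-warning count parsing described above can be sketched in isolation like this; the constant's value and the logger name here are illustrative placeholders, not the project's actual settings:

```python
import logging

logger = logging.getLogger("backup.importer")

# Cap on emitted warnings; the real limit value lives in the importer module.
PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT = 20

def merge_counts(raw_counts: list) -> int:
    """Sum counts, treating unparsable values as 0 with capped warnings."""
    warned = 0
    total = 0
    for raw in raw_counts:
        try:
            total += int(raw)
        except (TypeError, ValueError):
            # An invalid value contributes 0; warn only up to the limit so a
            # corrupted backup cannot flood the log.
            if warned < PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
                logger.warning("invalid platform_stats count: %r", raw)
                warned += 1
    return total

print(merge_counts([5, "x", 7, None]))  # 12
```

The counter is local to one call, which matches the per-merge scoping discussed in the review below the commit notes.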
@sourcery-ai review
Hey - I've reviewed your changes and they look great!
@sourcery-ai review
Hey - I've left some high level feedback:
- In `_normalize_platform_stats_timestamp`, empty/invalid timestamps are normalized to an empty string and then used as part of the deduplication key, which will aggregate all such rows together; consider whether these should instead be left distinct (e.g., by skipping normalization or using the original value) or explicitly skipped to avoid unintended merging.
- The `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT` is a global constant and the counter is scoped to a single `_merge_platform_stats_rows` call; if you expect this function to be used in multiple stages of one import or across different imports, you may want to clarify whether the limit should be per-import, per-table, or global and adjust the implementation accordingly.

Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_normalize_platform_stats_timestamp`, empty/invalid timestamps are normalized to an empty string and then used as part of the deduplication key, which will aggregate all such rows together; consider whether these should instead be left distinct (e.g., by skipping normalization or using the original value) or explicitly skipped to avoid unintended merging.
- The `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT` is a global constant and the counter is scoped to a single `_merge_platform_stats_rows` call; if you expect this function to be used in multiple stages of one import or across different imports, you may want to clarify whether the limit should be per-import, per-table, or global and adjust the implementation accordingly.
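The first point can be reproduced with a toy version of the dedup key: once empty or invalid timestamps normalize to an empty string, all such rows share one key and collapse together (a hypothetical simplification, not the importer's actual code):

```python
rows = [
    {"timestamp": None, "platform_id": "webchat", "platform_type": "unknown", "count": 1},
    {"timestamp": "", "platform_id": "webchat", "platform_type": "unknown", "count": 2},
]

merged: dict = {}
for row in rows:
    # Both None and "" normalize to "" here, so the rows share one key
    # and their counts are summed even though they may be unrelated.
    ts = row["timestamp"] or ""
    key = (ts, str(row["platform_id"]), str(row["platform_type"]))
    merged[key] = merged.get(key, 0) + row["count"]

print(len(merged), merged[("", "webchat", "unknown")])  # 1 3
```

This is the unintended aggregation the reviewer flags; the fix adopted later keys invalid timestamps by row index instead.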
Addressed both review comments in follow-up commit f091b85:
Also added test coverage for invalid timestamp behavior:
Local verification:

@sourcery-ai review
Hey - I've found 1 issue, and left some high level feedback:
- In `_normalize_platform_stats_timestamp`, naive timestamps that successfully parse are returned as-is without an explicit UTC offset, which means you can end up with a mix of `Z`/`+00:00` and naive values for logically equivalent instants; consider treating naive timestamps as UTC and always normalizing to an explicit UTC offset (e.g., `dt.replace(tzinfo=timezone.utc).isoformat()`) to ensure consistent keys and serialized values.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_normalize_platform_stats_timestamp`, naive timestamps that successfully parse are returned as-is without an explicit UTC offset, which means you can end up with a mix of `Z/+00:00` and naive values for logically equivalent instants; consider treating naive timestamps as UTC and always normalizing to an explicit UTC offset (e.g., `dt.replace(tzinfo=timezone.utc).isoformat()`) to ensure consistent keys and serialized values.
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="525" />
<code_context>
return imported
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> tuple[list[dict[str, Any]], int]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring `_merge_platform_stats_rows` into smaller helper methods for count parsing, timestamp normalization/caching, and invalid-key handling to make the logic flatter and easier to follow without changing behavior.
You can reduce the cognitive load without changing behavior by splitting `_merge_platform_stats_rows` into small helpers and removing the nested function / nonlocal state.
### 1. Extract a pure helper for count parsing (no `nonlocal`)
Move the warning + parsing into a dedicated method that keeps the rate limit but is self-contained:
```python
def _parse_platform_stats_count(
self,
raw_count: Any,
key: tuple[str, str, str],
warned_count: int,
) -> tuple[int, int]:
if warned_count >= PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
try:
return int(raw_count), warned_count
except (TypeError, ValueError):
return 0, warned_count
try:
return int(raw_count), warned_count
except (TypeError, ValueError):
logger.warning(
"platform_stats count 非法,已按 0 处理: "
f"value={raw_count!r}, key={key}"
)
return 0, warned_count + 1
```
Usage in `_merge_platform_stats_rows` becomes explicit and testable:
```python
invalid_count_warned = 0
# ...
normalized_row = dict(row)
normalized_row["timestamp"] = normalized_timestamp
normalized_row["count"], invalid_count_warned = self._parse_platform_stats_count(
normalized_row.get("count", 0),
key,
invalid_count_warned,
)
```
and
```python
existing_count, invalid_count_warned = self._parse_platform_stats_count(
existing.get("count", 0),
key,
invalid_count_warned,
)
incoming_count, invalid_count_warned = self._parse_platform_stats_count(
row.get("count", 0),
key,
invalid_count_warned,
)
existing["count"] = existing_count + incoming_count
```
This drops the inner function + `nonlocal`, while keeping the warning limit behavior identical.
### 2. Extract timestamp normalization + caching into a helper
Isolate the cache and validity logic to make the main loop more linear:
```python
def _normalize_platform_stats_timestamp_cached(
self,
raw_timestamp: Any,
cache: dict[str, tuple[str, bool]],
) -> tuple[str, bool]:
if isinstance(raw_timestamp, str):
cached = cache.get(raw_timestamp)
if cached is not None:
return cached
result = self._normalize_platform_stats_timestamp(raw_timestamp)
cache[raw_timestamp] = result
return result
return self._normalize_platform_stats_timestamp(raw_timestamp)
```
Then `_merge_platform_stats_rows` loop becomes easier to scan:
```python
for row_index, row in enumerate(rows):
raw_timestamp = row.get("timestamp")
normalized_timestamp, is_timestamp_valid = (
self._normalize_platform_stats_timestamp_cached(
raw_timestamp, timestamp_cache
)
)
timestamp_for_key = (
normalized_timestamp
if is_timestamp_valid
else f"__invalid_timestamp_row_{row_index}"
)
key = (
timestamp_for_key,
str(row.get("platform_id")),
str(row.get("platform_type")),
)
# ... rest unchanged
```
### 3. Optional: clarify the “invalid timestamp” key shaping
You can make the intent of the synthetic key explicit by wrapping it:
```python
def _platform_stats_key_timestamp(
self,
normalized_timestamp: str,
is_valid: bool,
row_index: int,
) -> str:
if is_valid:
return normalized_timestamp
# 保证无效时间戳不被合并:使用行号加前缀作为唯一 key
return f"__invalid_timestamp_row_{row_index}"
```
Then:
```python
timestamp_for_key = self._platform_stats_key_timestamp(
normalized_timestamp,
is_timestamp_valid,
row_index,
)
```
This keeps the exact behavior (row-index-based uniqueness) but makes *why* it’s done more obvious.
Together, these changes keep all semantics (caching, warning rate-limiting, invalid row handling) but flatten `_merge_platform_stats_rows` into a straightforward read: normalize timestamp, build key, parse counts, merge.
</issue_to_address>
Addressed latest review feedback in commit 7a590f2:
Added/updated tests:
Local verification:

@sourcery-ai review
Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="525" />
<code_context>
return imported
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> tuple[list[dict[str, Any]], int]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new platform_stats merge logic by using local closures inside `_merge_platform_stats_rows` instead of multiple helper methods and state variables passed around.
You can keep all of the new behavior while "flattening" these helpers into local closures, reducing state passing and indirection.
### 1. Keep `_merge_platform_stats_rows` as the main dispatcher, using closures
Instead of threading `invalid_count_warned` and the cache through several helpers, keep them in the local scope of `_merge_platform_stats_rows` and use small inner functions:
```python
def _merge_platform_stats_rows(
    self, rows: list[dict[str, Any]]
) -> tuple[list[dict[str, Any]], int]:
    merged: dict[tuple[str, str, str], dict[str, Any]] = {}
    timestamp_cache: dict[str, tuple[str, bool]] = {}
    invalid_count_warned = 0
    duplicate_count = 0

    def normalize_ts(raw_timestamp: Any) -> tuple[str, bool]:
        if isinstance(raw_timestamp, str):
            cached = timestamp_cache.get(raw_timestamp)
            if cached is not None:
                return cached
        result = self._normalize_platform_stats_timestamp(raw_timestamp)
        if isinstance(raw_timestamp, str):
            timestamp_cache[raw_timestamp] = result
        return result

    def parse_count(raw_count: Any, key: tuple[str, str, str]) -> int:
        nonlocal invalid_count_warned
        try:
            return int(raw_count)
        except (TypeError, ValueError):
            if invalid_count_warned < PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
                logger.warning(
                    "platform_stats count 非法,已按 0 处理: "
                    f"value={raw_count!r}, key={key}"
                )
                invalid_count_warned += 1
            return 0

    for row_index, row in enumerate(rows):
        normalized_timestamp, is_timestamp_valid = normalize_ts(row.get("timestamp"))
        # keep existing key format, just build inline
        if is_timestamp_valid:
            timestamp_for_key = normalized_timestamp
        else:
            timestamp_for_key = f"__invalid_timestamp_row_{row_index}"
        key = (
            timestamp_for_key,
            str(row.get("platform_id")),
            str(row.get("platform_type")),
        )
        existing = merged.get(key)
        if existing is None:
            normalized_row = dict(row)
            normalized_row["timestamp"] = normalized_timestamp
            normalized_row["count"] = parse_count(normalized_row.get("count", 0), key)
            merged[key] = normalized_row
            continue
        duplicate_count += 1
        existing["count"] = (
            parse_count(existing.get("count", 0), key)
            + parse_count(row.get("count", 0), key)
        )
    return list(merged.values()), duplicate_count
```
This keeps:
- the same merge semantics
- the same `__invalid_timestamp_row_{row_index}` key format
- the same rate-limited warning behavior, without threading a counter around
### 2. Collapse the timestamp helpers into a single exposed method
With the closure above, `_normalize_platform_stats_timestamp_cached` and `_platform_stats_key_timestamp` are no longer needed. You can keep just one normalization helper:
```python
def _normalize_platform_stats_timestamp(self, value: Any) -> tuple[str, bool]:
    if isinstance(value, datetime):
        dt = value
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)
        else:
            dt = dt.astimezone(timezone.utc)
        return dt.isoformat(), True
    if isinstance(value, str):
        timestamp = value.strip()
        if not timestamp:
            return "", False
        if timestamp.endswith("Z"):
            timestamp = f"{timestamp[:-1]}+00:00"
        try:
            dt = datetime.fromisoformat(timestamp)
            if dt.tzinfo is None:
                dt = dt.replace(tzinfo=timezone.utc)
            else:
                dt = dt.astimezone(timezone.utc)
            return dt.isoformat(), True
        except ValueError:
            return value.strip(), False
    if value is None:
        return "", False
    return str(value), False
```
You can then safely delete:
- `_normalize_platform_stats_timestamp_cached`
- `_platform_stats_key_timestamp`
- `_parse_platform_stats_count`
All functionality (merging, normalization, rate-limited warnings, invalid-timestamp separation) is preserved, but the logic becomes more cohesive and easier to follow.
</issue_to_address>帮我变得更有用!请在每条评论上点 👍 或 👎,我会根据你的反馈改进后续的代码审查。
Original comment in English
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="525" />
<code_context>
return imported
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> tuple[list[dict[str, Any]], int]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new platform_stats merging logic by using local closures inside `_merge_platform_stats_rows` instead of multiple helper methods and threaded state variables.
You can keep all the new behavior but flatten the helpers to reduce state‑threading and indirection.
### 1. Keep `_merge_platform_stats_rows` as the main orchestration, use closures
Instead of threading `invalid_count_warned` and the cache through multiple helpers, keep them local to `_merge_platform_stats_rows` and use small inner functions:
```python
def _merge_platform_stats_rows(
self, rows: list[dict[str, Any]]
) -> tuple[list[dict[str, Any]], int]:
merged: dict[tuple[str, str, str], dict[str, Any]] = {}
timestamp_cache: dict[str, tuple[str, bool]] = {}
invalid_count_warned = 0
duplicate_count = 0
def normalize_ts(raw_timestamp: Any) -> tuple[str, bool]:
if isinstance(raw_timestamp, str):
cached = timestamp_cache.get(raw_timestamp)
if cached is not None:
return cached
result = self._normalize_platform_stats_timestamp(raw_timestamp)
if isinstance(raw_timestamp, str):
timestamp_cache[raw_timestamp] = result
return result
def parse_count(raw_count: Any, key: tuple[str, str, str]) -> int:
nonlocal invalid_count_warned
try:
return int(raw_count)
except (TypeError, ValueError):
if invalid_count_warned < PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
logger.warning(
"platform_stats count 非法,已按 0 处理: "
f"value={raw_count!r}, key={key}"
)
invalid_count_warned += 1
return 0
for row_index, row in enumerate(rows):
normalized_timestamp, is_timestamp_valid = normalize_ts(row.get("timestamp"))
# keep existing key format, just build inline
if is_timestamp_valid:
timestamp_for_key = normalized_timestamp
else:
timestamp_for_key = f"__invalid_timestamp_row_{row_index}"
key = (
timestamp_for_key,
str(row.get("platform_id")),
str(row.get("platform_type")),
)
existing = merged.get(key)
if existing is None:
normalized_row = dict(row)
normalized_row["timestamp"] = normalized_timestamp
normalized_row["count"] = parse_count(normalized_row.get("count", 0), key)
merged[key] = normalized_row
continue
duplicate_count += 1
existing["count"] = (
parse_count(existing.get("count", 0), key)
+ parse_count(row.get("count", 0), key)
)
return list(merged.values()), duplicate_count
```
This keeps:
- Same merging semantics
- Same `__invalid_timestamp_row_{row_index}` key shape
- Same rate‑limited warnings, but without passing counters around
### 2. Collapse timestamp helpers into one public method
With the closure above, `_normalize_platform_stats_timestamp_cached` and `_platform_stats_key_timestamp` become unnecessary. You can keep a single normalization helper:
```python
def _normalize_platform_stats_timestamp(self, value: Any) -> tuple[str, bool]:
if isinstance(value, datetime):
dt = value
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
else:
dt = dt.astimezone(timezone.utc)
return dt.isoformat(), True
if isinstance(value, str):
timestamp = value.strip()
if not timestamp:
return "", False
if timestamp.endswith("Z"):
timestamp = f"{timestamp[:-1]}+00:00"
try:
dt = datetime.fromisoformat(timestamp)
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
else:
dt = dt.astimezone(timezone.utc)
return dt.isoformat(), True
except ValueError:
return value.strip(), False
if value is None:
return "", False
return str(value), False
```
Then you can safely delete:
- `_normalize_platform_stats_timestamp_cached`
- `_platform_stats_key_timestamp`
- `_parse_platform_stats_count`
All functionality (merging, normalization, warning rate‑limiting, invalid‑timestamp separation) is preserved, but the logic is more localized and easier to follow.
</issue_to_address>
Addressed this complexity review in commit 9df6a66:
Validation run:
@sourcery-ai review
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="525" />
<code_context>
return imported
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> tuple[list[dict[str, Any]], int]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring `_merge_platform_stats_rows` into a flatter helper-based implementation that avoids inner functions, synthetic keys, and extra counters while preserving existing behavior.
You can simplify `_merge_platform_stats_rows` and keep all behavior by:
* Removing inner functions and closure state.
* Dropping the per-call `timestamp_cache` (normalize directly; re-parsing a few timestamps is cheap).
* Avoiding synthetic keys with row indices for invalid timestamps by simply “never merging” invalid timestamps.
* Deriving `duplicate_count` from the length difference instead of threading an extra counter through the logic.
* Isolating “safe int + rate-limited logging” as a small helper that returns both the parsed value and updated warning counter.
A concrete sketch:
```python
# keep the existing constant
PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT = 5
def _parse_platform_stats_count(
self,
raw_count: Any,
invalid_count_warned: int,
key: tuple[str, str, str],
) -> tuple[int, int]:
"""Safe int parse with per-call rate limited warning."""
try:
return int(raw_count), invalid_count_warned
except (TypeError, ValueError):
if invalid_count_warned < PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
logger.warning(
"platform_stats count 非法,已按 0 处理: "
f"value={raw_count!r}, key={key}"
)
invalid_count_warned += 1
return 0, invalid_count_warned
```
Then `_merge_platform_stats_rows` becomes a flat loop without inner functions or synthetic keys:
```python
def _merge_platform_stats_rows(
self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
merged: dict[tuple[str, str, str], dict[str, Any]] = {}
result: list[dict[str, Any]] = []
invalid_count_warned = 0
for row in rows:
# copy so we don't mutate caller's data
normalized_row = dict(row)
raw_ts = normalized_row.get("timestamp")
normalized_ts, ts_valid = self._normalize_platform_stats_timestamp(raw_ts)
normalized_row["timestamp"] = normalized_ts
key_for_log = (
normalized_ts if ts_valid else "<invalid_timestamp>",
str(normalized_row.get("platform_id")),
str(normalized_row.get("platform_type")),
)
count, invalid_count_warned = self._parse_platform_stats_count(
normalized_row.get("count", 0),
invalid_count_warned,
key_for_log,
)
normalized_row["count"] = count
# Invalid timestamps: keep as distinct rows (no merging)
if not ts_valid:
result.append(normalized_row)
continue
key = (
normalized_ts,
key_for_log[1],
key_for_log[2],
)
existing = merged.get(key)
if existing is None:
merged[key] = normalized_row
result.append(normalized_row)
else:
existing["count"] += count
return result
```
With that, you can avoid returning `duplicate_count` and compute it when logging:
```python
if table_name == "platform_stats":
normalized_rows = self._merge_platform_stats_rows(rows)
duplicate_count = len(rows) - len(normalized_rows)
if duplicate_count > 0:
logger.warning(
f"检测到 platform_stats 重复键 {duplicate_count} 条,已在导入前聚合"
)
else:
normalized_rows = rows
```
This keeps:
* Aggregation by `(normalized_timestamp, platform_id, platform_type)`.
* Invalid timestamps preserved as non-merged rows.
* Per-function-call rate-limited logging for invalid `count`.
* Duplicate logging based on the number of merged rows.
But it removes the inner functions, closure state, timestamp cache, and row-index-based synthetic keys, making the logic much easier to follow.
</issue_to_address>
Addressed this review in commit 5bd71e4:
Behavior preserved:
Local verification:
@sourcery-ai review
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="203" />
<code_context>
self.kb_manager = kb_manager
self.config_path = config_path
self.kb_root_dir = kb_root_dir
+ self._main_table_preprocessors: dict[
+ str, Callable[[list[dict[str, Any]]], list[dict[str, Any]]]
+ ] = {
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring the new preprocessing and merging logic to inline the single-table special case, extract small helpers for count and timestamp handling, and optionally drop the env-based limit to simplify control flow and state management.
You can keep all the new behavior while reducing complexity with a few targeted refactors.
### 1. Drop the registry for main table preprocessors
Since there is only one special-case table, the dict-based registry doesn’t buy much yet and adds indirection.
Instead of storing `self._main_table_preprocessors` in `__init__` and doing a dict lookup, you can inline the decision in `_preprocess_main_table_rows` and drop the attribute entirely:
```python
def _preprocess_main_table_rows(
self, table_name: str, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
if table_name == "platform_stats":
normalized_rows = self._merge_platform_stats_rows(rows)
duplicate_count = len(rows) - len(normalized_rows)
if duplicate_count > 0:
logger.warning(
f"检测到 {table_name} 重复键 {duplicate_count} 条,已在导入前聚合"
)
return normalized_rows
return rows
```
This preserves behavior but removes the registry abstraction and a layer of indirection.
### 2. Remove `nonlocal` and split `_merge_platform_stats_rows` responsibilities
You can factor out count parsing and row normalization into small helpers and avoid `nonlocal` state.
**a) Extract count parsing without closures:**
```python
def _parse_platform_stats_count(
self,
raw_count: Any,
key_for_log: tuple[str, str, str],
warned_count: int,
) -> tuple[int, int]:
if warned_count >= PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
try:
return int(raw_count), warned_count
except (TypeError, ValueError):
return 0, warned_count
try:
return int(raw_count), warned_count
except (TypeError, ValueError):
logger.warning(
"platform_stats count 非法,已按 0 处理: value=%r, key=%s",
raw_count,
key_for_log,
)
return 0, warned_count + 1
```
Then `_merge_platform_stats_rows` becomes simpler and uses a plain `invalid_count_warned` int:
```python
def _merge_platform_stats_rows(
self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
merged: dict[tuple[str, str, str], dict[str, Any]] = {}
non_mergeable: list[dict[str, Any]] = []
invalid_count_warned = 0
for row in rows:
normalized_row, timestamp, is_timestamp_valid = (
self._normalize_platform_stats_row(row)
)
platform_id = normalized_row.get("platform_id")
platform_type = normalized_row.get("platform_type")
key_for_log = (
timestamp if is_timestamp_valid else "<invalid_timestamp>",
repr(platform_id),
repr(platform_type),
)
count, invalid_count_warned = self._parse_platform_stats_count(
normalized_row.get("count", 0),
key_for_log,
invalid_count_warned,
)
normalized_row["count"] = count
if not is_timestamp_valid:
non_mergeable.append(normalized_row)
continue
if not isinstance(platform_id, str) or not isinstance(platform_type, str):
non_mergeable.append(normalized_row)
continue
key = (timestamp, platform_id, platform_type)
existing = merged.get(key)
if existing is None:
merged[key] = normalized_row
else:
existing["count"] += count
return [*non_mergeable, *merged.values()]
```
**b) Extract row normalization (timestamp + cloning) into a helper:**
```python
def _normalize_platform_stats_row(
self, row: dict[str, Any]
) -> tuple[dict[str, Any], str, bool]:
normalized_row = dict(row)
normalized_timestamp, is_timestamp_valid = self._normalize_platform_stats_timestamp(
normalized_row.get("timestamp")
)
normalized_row["timestamp"] = normalized_timestamp
return normalized_row, normalized_timestamp, is_timestamp_valid
```
This separates concerns and makes `_merge_platform_stats_rows` easier to follow.
### 3. Simplify timestamp normalization to return `str | None`
Instead of `(value, bool)`, you can return a normalized timestamp string or `None`. The call site then directly treats `None` as non-mergeable, avoiding separate boolean tracking.
```python
from typing import Optional
def _normalize_platform_stats_timestamp(self, value: Any) -> Optional[str]:
if isinstance(value, datetime):
return self._to_utc_iso(value)
if isinstance(value, str):
timestamp = value.strip()
if not timestamp:
return None
if timestamp.endswith("Z"):
timestamp = f"{timestamp[:-1]}+00:00"
try:
return self._to_utc_iso(datetime.fromisoformat(timestamp))
except ValueError:
# keep original string but mark as invalid at caller if desired
return None
if value is None:
return None
# non-string unexpected type; treat as invalid
return None
```
Then in `_merge_platform_stats_rows`:
```python
timestamp = self._normalize_platform_stats_timestamp(row.get("timestamp"))
normalized_row = dict(row)
normalized_row["timestamp"] = timestamp or ""
is_timestamp_valid = timestamp is not None
```
You can preserve the existing behavior (keeping invalid timestamps as non-mergeable rows with the original/empty value) by setting the stored value appropriately when `timestamp is None`.
### 4. Consider simplifying the env-driven warning limit
If configurability is not strictly required, you can keep the rate-limiting concept but drop the env loader and module-level plumbing:
```python
PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT = 5 # simple fixed constant
```
All other logic (per-invocation counting) remains the same, but you remove `_load_platform_stats_invalid_count_warn_limit` and the env variable handling, which reduces cognitive load in this module. If you do still need the env, you could at least inline the parsing into a short helper and keep it near the constant for readability:
```python
def _get_invalid_count_warn_limit() -> int:
raw = os.getenv("ASTRBOT_PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT")
if not raw:
return 5
try:
value = int(raw)
return value if value >= 0 else 5
except ValueError:
return 5
PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT = _get_invalid_count_warn_limit()
```
These changes keep the same behavior (duplicate merging, timestamp normalization, and warning throttling) but reduce indirection, branching, and shared mutable state.
</issue_to_address>
Addressed this review in commit ec57b98:
All existing behavior around dedup, invalid timestamp non-merge, and warning rate limiting is preserved. Local verification:
@sourcery-ai review
Hi @zouyonghe! 👋 Only authors and team members can run @sourcery-ai commands on public repos. If you are a team member, install the @sourcery-ai bot to get access ✨
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="538" />
<code_context>
+ return normalized_rows
+ return rows
+
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> list[dict[str, Any]]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider inlining timestamp normalization and count parsing as small local helpers inside `_merge_platform_stats_rows` so the `platform_stats` import path becomes a single linear flow without extra helper methods or tuple plumbing.
You can simplify the `platform_stats` path by collapsing most of the logic into `_merge_platform_stats_rows` and using a small local `build_key` and `parse_count` helper. This keeps behavior (normalization, non‑mergeable handling, warn limit) but removes tuple threading and class‑level helpers.
### 1. Flatten normalization + key building
Instead of `_normalize_platform_stats_row` + `_normalize_platform_stats_timestamp` + `_to_utc_iso` returning `(row, ts, bool)`, inline this in `_merge_platform_stats_rows` with a single `build_key(row)` that returns `None` when non‑mergeable:
```python
def _merge_platform_stats_rows(
self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
merged: dict[tuple[str, str, str], dict[str, Any]] = {}
non_mergeable: list[dict[str, Any]] = []
invalid_count_warned = 0
def normalize_timestamp(value: Any) -> str | None:
if isinstance(value, datetime):
return self._to_utc_iso(value)
if isinstance(value, str):
ts = value.strip()
if not ts:
return None
if ts.endswith("Z"):
ts = f"{ts[:-1]}+00:00"
try:
return self._to_utc_iso(datetime.fromisoformat(ts))
except ValueError:
return None
return None
def build_key(row: dict[str, Any]) -> tuple[str, str, str] | None:
ts = normalize_timestamp(row.get("timestamp"))
pid = row.get("platform_id")
ptype = row.get("platform_type")
if ts is None or not isinstance(pid, str) or not isinstance(ptype, str):
return None
# keep normalized timestamp on row so later code sees it
row["timestamp"] = ts
return ts, pid, ptype
```
With this, you no longer need `_normalize_platform_stats_row` or the `is_timestamp_valid` flag; non‑mergeable rows are simply those with `build_key(row) is None` and are left as‑is:
```python
for raw in rows:
row = dict(raw)
key = build_key(row)
key_for_log = (
row.get("timestamp"),
repr(row.get("platform_id")),
repr(row.get("platform_type")),
)
count, invalid_count_warned = parse_count(
row.get("count", 0),
key_for_log,
invalid_count_warned,
)
row["count"] = count
if key is None:
non_mergeable.append(row)
continue
existing = merged.get(key)
if existing is None:
merged[key] = row
else:
existing["count"] += count
return [*non_mergeable, *merged.values()]
```
### 2. Simplify count parsing + warning limit
You can keep the same warning semantics but simplify `_parse_platform_stats_count`’s API and usage by making it a small local helper that only deals with `raw_count` and a simple integer counter:
```python
    def parse_count(
        raw_count: Any,
        key_for_log: tuple[Any, Any, Any],
        warned_count: int,
    ) -> tuple[int, int]:
        if warned_count >= PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
            try:
                return int(raw_count), warned_count
            except (TypeError, ValueError):
                return 0, warned_count
        try:
            return int(raw_count), warned_count
        except (TypeError, ValueError):
            logger.warning(
                "platform_stats count 非法,已按 0 处理: value=%r, key=%s",
                raw_count,
                key_for_log,
            )
            return 0, warned_count + 1
```
If you adopt these local helpers:
- You can delete `_normalize_platform_stats_row`, `_normalize_platform_stats_timestamp`, and `_parse_platform_stats_count`.
- Control flow in `_merge_platform_stats_rows` becomes linear: `row -> key or None -> parse_count -> aggregate or non_mergeable`.
- Timestamp rewriting for invalid values disappears; invalid timestamps just result in `key=None`, preserving the original `row` for pass‑through.
</issue_to_address>
Addressed this comment in commit 628defa:
Local verification:
@sourcery-ai review
Hey - I've found 2 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="64-67" />
<code_context>
CMD_CONFIG_FILE_PATH = os.path.join(get_astrbot_data_path(), "cmd_config.json")
KB_PATH = get_astrbot_knowledge_base_path()
+PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT = 5
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Rate limiting of invalid `count` warnings silently coerces later invalid values to 0, which may hide data issues.
Once `invalid_count_warned` hits `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT`, later invalid `count` values are silently coerced to `0`. This hides ongoing data issues (e.g., broken exports or schema changes). Please either emit a one-time summary warning when the limit is reached (e.g., “further invalid counts suppressed”) or continue logging later invalid values at a lower level (such as `debug`) so persistent issues remain observable while keeping log volume bounded.
</issue_to_address>
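As a sketch of the suggested alternative (a hypothetical standalone helper, not the PR's actual code), the first few invalid values could stay at `warning` while later ones drop to `debug`, keeping persistent data problems observable:

```python
import logging

logger = logging.getLogger("backup.importer")

class InvalidCountReporter:
    """Log the first `limit` invalid counts at WARNING, then demote to DEBUG."""

    def __init__(self, limit: int = 5) -> None:
        self.limit = limit
        self.seen = 0

    def report(self, raw_count: object, key: tuple) -> None:
        self.seen += 1
        if self.seen <= self.limit:
            logger.warning("invalid count %r for key %s, coerced to 0", raw_count, key)
            if self.seen == self.limit:
                logger.warning("further invalid-count warnings demoted to DEBUG")
        else:
            # still observable at DEBUG, but WARNING volume stays bounded
            logger.debug("invalid count %r for key %s, coerced to 0", raw_count, key)

reporter = InvalidCountReporter(limit=2)
for bad in ["x", None, [], {}]:
    reporter.report(bad, ("2024-01-01", "qq", "aiocqhttp"))
```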
### Comment 2
<location path="astrbot/core/backup/importer.py" line_range="538" />
<code_context>
+ return normalized_rows
+ return rows
+
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> list[dict[str, Any]]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the nested helpers in `_merge_platform_stats_rows` into private methods, avoiding hidden mutation, and splitting parsing from logging to make the function flatter and easier to understand and extend.
You can keep the current behaviour but reduce complexity by extracting the inner helpers, removing hidden mutation, and separating “parse vs log” concerns.
### 1. Move nested helpers to private methods
This makes `_merge_platform_stats_rows` easier to scan and allows reuse/testing.
```python
def _normalize_platform_stats_timestamp(self, value: Any) -> str | None:
    if isinstance(value, datetime):
        return self._to_utc_iso(value)
    if isinstance(value, str):
        timestamp = value.strip()
        if not timestamp:
            return None
        if timestamp.endswith("Z"):
            timestamp = f"{timestamp[:-1]}+00:00"
        try:
            return self._to_utc_iso(datetime.fromisoformat(timestamp))
        except ValueError:
            return None
    return None

def _build_platform_stats_key(
    self, row: dict[str, Any]
) -> tuple[tuple[str, str, str], str] | None:
    normalized_timestamp = self._normalize_platform_stats_timestamp(row.get("timestamp"))
    platform_id = row.get("platform_id")
    platform_type = row.get("platform_type")
    if (
        normalized_timestamp is None
        or not isinstance(platform_id, str)
        or not isinstance(platform_type, str)
    ):
        return None
    return (normalized_timestamp, platform_id, platform_type), normalized_timestamp
```
Then `_merge_platform_stats_rows` becomes mostly orchestration.
### 2. Avoid hidden mutation in key building
Currently `build_key` mutates `row["timestamp"]` as a side effect. Instead, return the normalized timestamp and let the caller update the row explicitly:
```python
key_and_ts = self._build_platform_stats_key(normalized_row)
if key_and_ts is None:
    non_mergeable.append(normalized_row)
    continue
key, normalized_timestamp = key_and_ts
normalized_row["timestamp"] = normalized_timestamp
```
This keeps the “what is being changed” visible at the call site.
### 3. Separate parsing from rate‑limited logging
You can remove the `warned_count` threading by splitting parsing and logging:
```python
def _parse_platform_stats_count_value(self, raw_count: Any) -> int:
    try:
        return int(raw_count)
    except (TypeError, ValueError):
        return 0

def _log_invalid_platform_stats_count(
    self,
    raw_count: Any,
    key_for_log: tuple[Any, Any, Any],
    warned_count: int,
) -> int:
    if warned_count < PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
        logger.warning(
            "platform_stats count 非法,已按 0 处理: value=%r, key=%s",
            raw_count,
            key_for_log,
        )
        return warned_count + 1
    return warned_count
```
Usage in `_merge_platform_stats_rows`:
```python
warned = invalid_count_warned
raw_count = normalized_row.get("count", 0)
count = self._parse_platform_stats_count_value(raw_count)
if count == 0 and raw_count not in (0, "0"):
    warned = self._log_invalid_platform_stats_count(raw_count, key_for_log, warned)
invalid_count_warned = warned
normalized_row["count"] = count
```
This keeps `_merge_platform_stats_rows` state local (`invalid_count_warned`) but removes the need to pass it in/out of a nested helper and keeps each function single‑purpose.
These changes should preserve all behaviour (including rate‑limited logging and merging semantics) while making the code flatter, less surprising, and easier to extend.
</issue_to_address>
@sourcery-ai review
Addressed the two review comments in commit 960c4db:
Tests updated:
Local verification:
Hey - I've found 1 issue, and left some high level feedback:
- The `_preprocess_main_table_rows` function currently special-cases the hard-coded table name `platform_stats`; consider centralizing this mapping (e.g., a dict of table_name → preprocessor) so additional table-specific preprocessing can be added without further branching in this method.
- In `_merge_platform_stats_rows`, the returned list is `[*non_mergeable, *merged.values()]`, which changes row ordering relative to the input; if import order is significant for this table, you may want to preserve the original order while still aggregating counts.
- The invalid-count warning rate limiting in `_log_invalid_platform_stats_count` is fairly intricate; extracting a small helper object/state or simplifying the two branches that emit the suppression log could make the behavior easier to reason about and less error-prone if the logic needs to change later.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `_preprocess_main_table_rows` function currently special-cases the hard-coded table name `platform_stats`; consider centralizing this mapping (e.g., a dict of table_name → preprocessor) so additional table-specific preprocessing can be added without further branching in this method.
- In `_merge_platform_stats_rows`, the returned list is `[ *non_mergeable, *merged.values() ]`, which changes row ordering relative to the input; if import order is significant for this table, you may want to preserve the original order while still aggregating counts.
- The invalid-count warning rate limiting in `_log_invalid_platform_stats_count` is fairly intricate; extracting a small helper object/state or simplifying the two branches that emit the suppression log could make the behavior easier to reason about and less error-prone if the logic needs to change later.
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="538" />
<code_context>
+ return normalized_rows
+ return rows
+
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> list[dict[str, Any]]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying `_merge_platform_stats_rows` by using a local closure for invalid-count warnings and inlining the key-building logic to reduce helper functions and threaded state.
You can keep the current behavior but reduce complexity and state‑threading by:
### 1. Encapsulate invalid‑count warning state in a local helper
You don’t need to pass `invalid_count_warned` and `suppression_warned` through multiple calls. A simple closure in `_merge_platform_stats_rows` keeps state local and uses a single counter:
```python
def _merge_platform_stats_rows(
    self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    merged: dict[tuple[str, str, str], dict[str, Any]] = {}
    non_mergeable: list[dict[str, Any]] = []
    invalid_count_warnings = 0

    def log_invalid_count(raw_count: Any, key_for_log: tuple[Any, Any, Any]) -> None:
        nonlocal invalid_count_warnings
        limit = PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT
        if invalid_count_warnings < limit:
            logger.warning(
                "platform_stats count 非法,已按 0 处理: value=%r, key=%s",
                raw_count,
                key_for_log,
            )
            invalid_count_warnings += 1
            if invalid_count_warnings == limit:
                logger.warning(
                    "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制", limit
                )

    for row in rows:
        normalized_row = dict(row)
        raw_timestamp = normalized_row.get("timestamp")
        platform_id = normalized_row.get("platform_id")
        platform_type = normalized_row.get("platform_type")
        # ... timestamp normalization + key building (see below) ...
        count, is_valid_count = self._parse_platform_stats_count_value(
            normalized_row.get("count", 0)
        )
        if not is_valid_count:
            key_for_log = (
                normalized_row.get("timestamp"),
                repr(platform_id),
                repr(platform_type),
            )
            log_invalid_count(normalized_row.get("count", 0), key_for_log)
        normalized_row["count"] = count
        # ... merge logic ...
```
This preserves the warning limit and suppression behavior but removes the need for `_log_invalid_platform_stats_count`, `warned_count`, and `suppression_warned` as threaded arguments.
### 2. Inline key building to avoid tiny cross‑calling helpers
`_build_platform_stats_key` is small and only used once. Inlining makes control flow easier to follow and removes the need to compute `key_for_log` separately:
```python
for row in rows:
    normalized_row = dict(row)
    raw_timestamp = normalized_row.get("timestamp")
    platform_id = normalized_row.get("platform_id")
    platform_type = normalized_row.get("platform_type")
    normalized_timestamp = self._normalize_platform_stats_timestamp(raw_timestamp)
    if normalized_timestamp is not None:
        normalized_row["timestamp"] = normalized_timestamp
    elif isinstance(raw_timestamp, str):
        normalized_row["timestamp"] = raw_timestamp.strip()
    elif raw_timestamp is None:
        normalized_row["timestamp"] = ""
    else:
        normalized_row["timestamp"] = str(raw_timestamp)
    if (
        normalized_timestamp is None
        or not isinstance(platform_id, str)
        or not isinstance(platform_type, str)
    ):
        key: tuple[str, str, str] | None = None
    else:
        key = (normalized_timestamp, platform_id, platform_type)
    # rest of loop unchanged: count parsing, invalid count logging, merge/non-merge paths
```
With this, you can safely delete `_build_platform_stats_key`. The behavior (merge key conditions and “non‑mergeable” treatment) stays identical but the number of tiny helpers and back‑and‑forth arguments is reduced.
### 3. Optional: keep timestamp normalization helper, but use it more linearly
`_normalize_platform_stats_timestamp` / `_to_utc_iso` are fine as shared utilities; the main readability win comes from (1) and (2). If you want to reduce indirection further while preserving behavior, consider placing the timestamp normalization logic as a clearly commented block in `_merge_platform_stats_rows` instead of two separate helpers, but that’s less critical than simplifying the warning and key‑building paths above.
</issue_to_address>
@sourcery-ai review
Hey - I've found 1 issue, and left some high level feedback:
- The new `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT` behavior in `log_invalid_count` is a bit unintuitive when the limit is 0 (it still logs a '已达到上限 (0)' message on the first invalid value); consider simplifying this branch so that 0 cleanly means 'no warnings' or documenting the special-case behavior.
- The warning in `_preprocess_main_table_rows` for every table with duplicate `platform_stats` keys could get noisy on large imports; you might want to downgrade this to `info` or add a threshold before logging to avoid log spam in normal operation.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT` behavior in `log_invalid_count` is a bit unintuitive when the limit is 0 (it still logs a '已达到上限 (0)' message on the first invalid value); consider simplifying this branch so that 0 cleanly means 'no warnings' or documenting the special-case behavior.
- The warning in `_preprocess_main_table_rows` for every table with duplicate `platform_stats` keys could get noisy on large imports; you might want to downgrade this to `info` or add a threshold before logging to avoid log spam in normal operation.
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="525" />
<code_context>
return imported
+ def _preprocess_main_table_rows(
+ self, table_name: str, rows: list[dict[str, Any]]
+ ) -> list[dict[str, Any]]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring the new preprocessing and merge logic for `platform_stats` rows into clearer, flatter helpers and simpler conditionals to make the flow easier to follow and maintain without changing behavior.
You can keep all the new behavior but simplify a few hotspots to make the code easier to follow.
### 1. Replace single‑case dispatch dict
`_preprocess_main_table_rows` currently uses a dict with a single entry. A direct conditional is simpler and makes the flow clearer:
```python
def _preprocess_main_table_rows(
    self, table_name: str, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    if table_name == "platform_stats":
        normalized_rows = self._merge_platform_stats_rows(rows)
        duplicate_count = len(rows) - len(normalized_rows)
        if duplicate_count > 0:
            logger.warning(
                "检测到 %s 重复键 %d 条,已在导入前聚合",
                table_name,
                duplicate_count,
            )
        return normalized_rows
    return rows
```
You can always reintroduce a dispatch map when there are multiple tables.
---
### 2. Simplify invalid count rate limiting (while preserving semantics)
You can keep the current behavior (including the `limit <= 0` “suppress all, but log once” case) but make the control flow clearer and move the helpers out of the nested scope.
Current behavior recap (to preserve exactly):
- `limit <= 0`: on the first invalid value, log only the “reached limit / suppress” message once; no per‑value warnings.
- `limit > 0`: log per invalid value up to `limit`, then log a single “reached limit / suppress” message, then stay silent.
A clearer implementation:
```python
def _log_platform_stats_invalid_count(
    self,
    raw_count: Any,
    key_for_log: tuple[Any, Any, Any],
    invalid_count_warnings: int,
) -> int:
    limit = PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT
    # limit <= 0: only log the suppression message once
    if limit <= 0:
        if invalid_count_warnings == 0:
            logger.warning(
                "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制",
                limit,
            )
            return 1
        return invalid_count_warnings
    # limit > 0: log up to `limit` invalids, then one suppression message
    if invalid_count_warnings < limit:
        logger.warning(
            "platform_stats count 非法,已按 0 处理: value=%r, key=%s",
            raw_count,
            key_for_log,
        )
        invalid_count_warnings += 1
        if invalid_count_warnings == limit:
            logger.warning(
                "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制",
                limit,
            )
    else:
        # keep counting above limit to avoid repeated suppression logs
        invalid_count_warnings += 1
    return invalid_count_warnings

def _parse_platform_stats_count(
    self,
    raw_count: Any,
    key_for_log: tuple[Any, Any, Any],
    invalid_count_warnings: int,
) -> tuple[int, int]:
    try:
        return int(raw_count), invalid_count_warnings
    except (TypeError, ValueError):
        invalid_count_warnings = self._log_platform_stats_invalid_count(
            raw_count, key_for_log, invalid_count_warnings
        )
        return 0, invalid_count_warnings
```
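The `limit > 0` semantics above can be sanity-checked with a standalone simulation (a hypothetical `step` helper mirroring that branch with a recorded event list instead of a logger):

```python
WARN_LIMIT = 3

def step(warnings: int, emitted: list[str]) -> int:
    # Mirrors the limit > 0 branch: per-value warnings up to the limit,
    # one suppression notice exactly when the limit is reached, then silence.
    if warnings < WARN_LIMIT:
        emitted.append("invalid-count")
        warnings += 1
        if warnings == WARN_LIMIT:
            emitted.append("suppression-notice")
    else:
        warnings += 1  # keep counting so the notice is never re-emitted
    return warnings

emitted: list[str] = []
warnings = 0
for _ in range(6):  # six invalid counts in a row
    warnings = step(warnings, emitted)
```

Three per-value warnings and exactly one suppression notice come out, regardless of how many further invalid values follow.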
Then `_merge_platform_stats_rows` becomes flatter and avoids inner functions:
```python
invalid_count_warnings = 0
for row in rows:
    # ... build normalized_row, key_for_log ...
    count, invalid_count_warnings = self._parse_platform_stats_count(
        normalized_row.get("count", 0),
        key_for_log,
        invalid_count_warnings,
    )
    normalized_row["count"] = count
    # ... merge logic ...
```
This removes closure semantics and makes the rate‑limiting behavior explicit.
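As a sanity check, the counter semantics above can be modeled standalone (a hypothetical minimal re-implementation that records messages in a list instead of calling `logger`; the function name and message strings are illustrative only):

```python
# Minimal standalone model of the rate-limited warning counter described above.
def bump_warnings(invalid_count_warnings: int, limit: int, log: list[str]) -> int:
    # limit <= 0: emit only a single suppression notice, ever
    if limit <= 0:
        if invalid_count_warnings == 0:
            log.append("suppressed")
            return 1
        return invalid_count_warnings
    # limit > 0: per-value warnings up to the limit, then one suppression notice
    if invalid_count_warnings < limit:
        log.append("per-value warning")
        invalid_count_warnings += 1
        if invalid_count_warnings == limit:
            log.append("suppressed")
            # jump past the limit so the suppression notice never repeats
            invalid_count_warnings += 1
    return invalid_count_warnings

log: list[str] = []
n = 0
for _ in range(5):  # five invalid values with limit=2
    n = bump_warnings(n, 2, log)
print(log)  # two per-value warnings, then one suppression notice
```

Running the loop shows the warnings capping out after the configured limit, which is the behavior the recap above says must be preserved.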
---
### 3. Extract row normalization/merge‑key computation
The main loop in `_merge_platform_stats_rows` mixes normalization, logging, and merging. Extracting a helper that returns `(merge_key | None, normalized_row, key_for_log)` makes the control flow linear:
```python
def _normalize_platform_stats_row(
    self, row: dict[str, Any]
) -> tuple[tuple[str, str, str] | None, dict[str, Any], tuple[Any, Any, Any]]:
    normalized_row = dict(row)
    raw_timestamp = normalized_row.get("timestamp")
    normalized_timestamp = self._normalize_platform_stats_timestamp(raw_timestamp)
    platform_id = normalized_row.get("platform_id")
    platform_type = normalized_row.get("platform_type")
    # normalize timestamp field for storage/logging
    if normalized_timestamp is not None:
        normalized_row["timestamp"] = normalized_timestamp
    elif isinstance(raw_timestamp, str):
        normalized_row["timestamp"] = raw_timestamp.strip()
    elif raw_timestamp is None:
        normalized_row["timestamp"] = ""
    else:
        normalized_row["timestamp"] = str(raw_timestamp)
    key_for_log = (
        normalized_row.get("timestamp"),
        repr(platform_id),
        repr(platform_type),
    )
    # decide whether this row is mergeable
    if (
        normalized_timestamp is None
        or not isinstance(platform_id, str)
        or not isinstance(platform_type, str)
    ):
        merge_key: tuple[str, str, str] | None = None
    else:
        merge_key = (normalized_timestamp, platform_id, platform_type)
    return merge_key, normalized_row, key_for_log
```
Then `_merge_platform_stats_rows` simplifies to:
```python
def _merge_platform_stats_rows(
    self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    merged: dict[tuple[str, str, str], dict[str, Any]] = {}
    result: list[dict[str, Any]] = []
    invalid_count_warnings = 0
    for row in rows:
        merge_key, normalized_row, key_for_log = self._normalize_platform_stats_row(row)
        count, invalid_count_warnings = self._parse_platform_stats_count(
            normalized_row.get("count", 0),
            key_for_log,
            invalid_count_warnings,
        )
        normalized_row["count"] = count
        if merge_key is None:
            result.append(normalized_row)
            continue
        existing = merged.get(merge_key)
        if existing is None:
            merged[merge_key] = normalized_row
            result.append(normalized_row)
        else:
            existing["count"] += count
    return result
```
This keeps all existing semantics (including when rows are mergeable vs kept separate) but makes the per‑row logic and merge decisions easier to scan and maintain.
</issue_to_address>

Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
@sourcery-ai review
Hey - I've found 1 issue, and left some high level feedback:
- The new `platform_stats` aggregation logs use `logger.info` while the sample logs and description refer to a warning level; consider switching this to `logger.warning` (or adjusting the description) so the behavior and expectations stay consistent.
- The hard-coded `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT` could be made configurable (e.g., via settings or environment) to allow operators to tune log volume in different environments without code changes.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `platform_stats` aggregation logs use `logger.info` while the sample logs and description refer to a warning level; consider switching this to `logger.warning` (or adjusting the description) so the behavior and expectations stay consistent.
- The hard-coded `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT` could be made configurable (e.g., via settings or environment) to allow operators to tune log volume in different environments without code changes.
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="578" />
<code_context>
+
+ return result
+
+ def _normalize_platform_stats_row(
+ self, row: dict[str, Any]
+ ) -> tuple[tuple[str, str, str] | None, dict[str, Any], tuple[Any, Any, Any]]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the platform_stats helpers by collapsing tightly coupled functions, returning simpler values, and localizing parsing and logging logic to reduce indirection and tuple-based protocols.
You can keep all behavior but reduce indirection by collapsing some of the tightly‑coupled helpers and simplifying the “protocols” between them.
### 1. Simplify `_normalize_platform_stats_row`’s return type
Right now you return a triple `(merge_key | None, normalized_row, key_for_log)`. That’s the main source of “tuple protocol” complexity.
You can instead:
- Return just the normalized row.
- Let `_merge_platform_stats_rows` compute both the merge key and the log key from the normalized row.
This keeps normalization reusable and localizes the “what’s a key?” logic to one place.
```python
def _normalize_platform_stats_row(
    self, row: dict[str, Any]
) -> dict[str, Any]:
    normalized_row = dict(row)
    raw_timestamp = normalized_row.get("timestamp")
    normalized_timestamp = self._normalize_platform_stats_timestamp(raw_timestamp)
    if normalized_timestamp is not None:
        normalized_row["timestamp"] = normalized_timestamp
    elif isinstance(raw_timestamp, str):
        normalized_row["timestamp"] = raw_timestamp.strip()
    elif raw_timestamp is None:
        normalized_row["timestamp"] = ""
    else:
        normalized_row["timestamp"] = str(raw_timestamp)
    return normalized_row
```
Then `_merge_platform_stats_rows` builds the keys directly and contains the merging rules:
```python
def _merge_platform_stats_rows(
    self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    merged: dict[tuple[str, str, str], dict[str, Any]] = {}
    result: list[dict[str, Any]] = []
    invalid_count_warnings = 0
    for row in rows:
        normalized_row = self._normalize_platform_stats_row(row)
        timestamp = normalized_row.get("timestamp")
        platform_id = normalized_row.get("platform_id")
        platform_type = normalized_row.get("platform_type")
        key_for_log = (timestamp, repr(platform_id), repr(platform_type))
        count, invalid_count_warnings = self._parse_platform_stats_count(
            normalized_row.get("count", 0),
            key_for_log,
            invalid_count_warnings,
        )
        normalized_row["count"] = count
        if (
            not isinstance(timestamp, str)
            or not isinstance(platform_id, str)
            or not isinstance(platform_type, str)
        ):
            result.append(normalized_row)
            continue
        merge_key = (timestamp, platform_id, platform_type)
        existing = merged.get(merge_key)
        if existing is None:
            merged[merge_key] = normalized_row
            result.append(normalized_row)
        else:
            existing["count"] += count
    return result
```
This preserves all current semantics (merging rules and logging key) but eliminates the triple return and “key” construction inside the normalizer.
### 2. Localize invalid count warning logic
The current `invalid_count_warnings` protocol (threading through `_parse_platform_stats_count` and `_log_platform_stats_invalid_count`, with the `+1` bump) is hard to follow.
You can keep rate limiting exactly the same but:
- Make `_parse_platform_stats_count` only parse.
- Do all counting and rate limiting in `_merge_platform_stats_rows`, in one place.
For example:
```python
def _parse_platform_stats_count(self, raw_count: Any) -> int | None:
    try:
        return int(raw_count)
    except (TypeError, ValueError):
        return None
```
Then:
```python
def _merge_platform_stats_rows(...):
    ...
    invalid_count_warnings = 0
    limit = PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT
    suppression_logged = False
    for row in rows:
        ...
        parsed = self._parse_platform_stats_count(normalized_row.get("count", 0))
        if parsed is None:
            if limit > 0:
                if invalid_count_warnings < limit:
                    logger.warning(
                        "platform_stats count 非法,已按 0 处理: value=%r, key=%s",
                        normalized_row.get("count", 0),
                        key_for_log,
                    )
                    invalid_count_warnings += 1
                    if invalid_count_warnings == limit and not suppression_logged:
                        logger.warning(
                            "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制",
                            limit,
                        )
                        suppression_logged = True
            elif not suppression_logged:
                logger.warning(
                    "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制",
                    limit,
                )
                suppression_logged = True
            count = 0
        else:
            count = parsed
        normalized_row["count"] = count
        ...
```
This keeps the behavior (limit, one suppression log, `limit <= 0` semantics) but removes the need to pass a counter in/out of helpers and to understand `_log_platform_stats_invalid_count`’s state machine.
### 3. Inline `_to_utc_iso` into `_normalize_platform_stats_timestamp`
These two are tightly coupled and only used here. Inlining removes one more hop without changing behavior:
```python
def _normalize_platform_stats_timestamp(self, value: Any) -> str | None:
    def _to_utc_iso(dt: datetime) -> str:
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)
        else:
            dt = dt.astimezone(timezone.utc)
        return dt.isoformat()

    if isinstance(value, datetime):
        return _to_utc_iso(value)
    if isinstance(value, str):
        timestamp = value.strip()
        if not timestamp:
            return None
        if timestamp.endswith("Z"):
            timestamp = f"{timestamp[:-1]}+00:00"
        try:
            return _to_utc_iso(datetime.fromisoformat(timestamp))
        except ValueError:
            return None
    return None
```
This keeps timestamp handling self-contained and reduces cross-method jumping when reading the code.
</issue_to_address>
@sourcery-ai review
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="568" />
<code_context>
+ return normalized_rows
+ return rows
+
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> list[dict[str, Any]]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the invalid-count warning throttling and row normalization/count parsing into helpers so `_merge_platform_stats_rows` focuses only on grouping and merging logic.
You can reduce the localized complexity in the new `platform_stats` handling without changing behavior by:
1. Extracting the warning throttling out of `_merge_platform_stats_rows`
2. Collapsing normalization + count parsing into a single helper so `_merge_platform_stats_rows` is mostly about grouping/merging
### 1. Extract rate-limited warning into a helper
Right now `_merge_platform_stats_rows` is juggling `invalid_count_warnings`, `suppression_logged`, and the env-driven limit. That can be pushed into a small stateful helper so the merge loop only “asks to warn” and doesn’t care about the mechanics:
```python
class _InvalidCountWarnLimiter:
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._count = 0
        self._suppression_logged = False

    def warn_invalid_count(self, value: Any, key_for_log: tuple[Any, ...]) -> None:
        if self.limit > 0:
            if self._count < self.limit:
                logger.warning(
                    "platform_stats count 非法,已按 0 处理: value=%r, key=%s",
                    value,
                    key_for_log,
                )
                self._count += 1
                if self._count == self.limit and not self._suppression_logged:
                    logger.warning(
                        "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制",
                        self.limit,
                    )
                    self._suppression_logged = True
        elif not self._suppression_logged:
            logger.warning(
                "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制",
                self.limit,
            )
            self._suppression_logged = True
```
Then `_merge_platform_stats_rows` becomes simpler:
```python
def _merge_platform_stats_rows(
    self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    merged: dict[tuple[str, str, str], dict[str, Any]] = {}
    result: list[dict[str, Any]] = []
    warn_limiter = _InvalidCountWarnLimiter(PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT)
    for row in rows:
        normalized_row, normalized_timestamp, count = self._normalize_platform_stats_entry(row, warn_limiter)
        platform_id = normalized_row.get("platform_id")
        platform_type = normalized_row.get("platform_type")
        if (
            normalized_timestamp is None
            or not isinstance(platform_id, str)
            or not isinstance(platform_type, str)
        ):
            result.append(normalized_row)
            continue
        merge_key = (normalized_timestamp, platform_id, platform_type)
        existing = merged.get(merge_key)
        if existing is None:
            merged[merge_key] = normalized_row
            result.append(normalized_row)
        else:
            existing["count"] += count
    return result
```
### 2. Consolidate row normalization and count parsing
You already have `_normalize_platform_stats_row`, `_normalize_platform_stats_timestamp`, and `_parse_platform_stats_count`, with normalization and parsing split across multiple calls and some re-normalization. You can keep behavior but present a clearer contract by having a single helper that:
- Returns a fully normalized row
- Returns the normalized timestamp (or `None`)
- Handles count parsing and warning (via the limiter)
Example:
```python
def _normalize_platform_stats_entry(
    self,
    row: dict[str, Any],
    warn_limiter: _InvalidCountWarnLimiter,
) -> tuple[dict[str, Any], str | None, int]:
    normalized_row = dict(row)
    # timestamp normalization
    raw_timestamp = normalized_row.get("timestamp")
    normalized_timestamp = self._normalize_platform_stats_timestamp(raw_timestamp)
    if normalized_timestamp is not None:
        normalized_row["timestamp"] = normalized_timestamp
    elif isinstance(raw_timestamp, str):
        normalized_row["timestamp"] = raw_timestamp.strip()
    elif raw_timestamp is None:
        normalized_row["timestamp"] = ""
    else:
        normalized_row["timestamp"] = str(raw_timestamp)
    # count parsing with warning
    raw_count = normalized_row.get("count", 0)
    try:
        count = int(raw_count)
    except (TypeError, ValueError):
        key_for_log = (
            normalized_row.get("timestamp"),
            repr(normalized_row.get("platform_id")),
            repr(normalized_row.get("platform_type")),
        )
        warn_limiter.warn_invalid_count(raw_count, key_for_log)
        count = 0
    normalized_row["count"] = count
    return normalized_row, normalized_timestamp, count
```
This keeps all existing behavior (same timestamp fallback, same default `0` for invalid counts, same rate limiting) but:
- `_merge_platform_stats_rows` only cares about: “give me a normalized row, its normalized timestamp, and a numeric count”
- Count warning throttling is isolated and reusable
- Timestamp normalization is done once per row with a clear return contract (`normalized_timestamp` used for merge key, `"timestamp"` field used for storage/logging)
You can keep `_normalize_platform_stats_timestamp` as-is; it’s already self-contained and readable.
</issue_to_address>
@sourcery-ai review
Hey - I've reviewed your changes and they look great!
No docs changes were generated in this run (docs repo had no updates). Docs repo: AstrBotDevs/AstrBot-docs

AI change summary (not committed):

Experimental bot notice:
Closes #5584
Changed Files

- astrbot/core/backup/importer.py
- tests/test_backup.py

Motivation
Fix two real issues with backup import in `replace` mode: a failed wipe of the main database still let the import continue, and duplicate keys in `platform_stats` triggered a unique-constraint error that aborted the import. The goal of this change: an import either succeeds completely or fails explicitly, leaving no half-finished state.
Modifications

- Fixed the issue where a failure to clear the main database in `replace` mode still allowed the import to continue.
- Aggregate duplicate `platform_stats` keys before import (summing `count`), and normalize timestamps to a single format for deduplication.
- Treat invalid `count` values as 0 with a rate-limited warning, with corresponding tests added.

This is NOT a breaking change.
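The deduplication step can be sketched as a standalone function (a minimal model of the behavior described above; `merge_stats` and the sample rows are illustrative, not the importer's actual API):

```python
from typing import Any

def merge_stats(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # Aggregate rows sharing (timestamp, platform_id, platform_type),
    # summing their counts; invalid counts fall back to 0.
    merged: dict[tuple[Any, Any, Any], dict[str, Any]] = {}
    out: list[dict[str, Any]] = []
    for row in rows:
        try:
            count = int(row.get("count", 0))
        except (TypeError, ValueError):
            count = 0  # the real importer also logs a rate-limited warning here
        row = {**row, "count": count}
        key = (row.get("timestamp"), row.get("platform_id"), row.get("platform_type"))
        if key in merged:
            merged[key]["count"] += count
        else:
            merged[key] = row
            out.append(row)
    return out

rows = [
    {"timestamp": "2024-01-01T00:00:00+00:00", "platform_id": "qq", "platform_type": "aiocqhttp", "count": 3},
    {"timestamp": "2024-01-01T00:00:00+00:00", "platform_id": "qq", "platform_type": "aiocqhttp", "count": "4"},
    {"timestamp": "2024-01-01T00:00:00+00:00", "platform_id": "qq", "platform_type": "aiocqhttp", "count": "bad"},
]
print(merge_stats(rows))  # one row remains, with count == 7
```

Three rows sharing one key collapse into a single row whose `count` is the sum of the valid counts, which is what prevents the unique-constraint violation on insert.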
Screenshots or Test Results

Chunked upload and import verification were completed in a server environment, with the following results:

- Chunked upload completed: 69/69
- Import completed: success=True, errors=[]
- Duplicate-key handling took effect: logs show that duplicate platform_stats keys were aggregated and the import completed normally (see the test log for details)
Checklist

- I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.

Summary by Sourcery

Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.

Bug Fixes:

- Abort the import when clearing the main database fails in replace mode, instead of continuing with a partial import.
- Aggregate duplicate platform_stats rows (summing count) and normalize timestamps before import, preventing duplicate-key and unique-constraint conflicts.
- Treat invalid platform_stats count values as 0 and log a rate-limited warning, so malformed data no longer fails the import.

Tests:

- Add tests covering platform_stats row aggregation, timestamp normalization, invalid count handling, and warning rate limiting.
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录进行聚合,以防止唯一约束冲突。platform_stats计数值视为 0,并以限速方式记录警告,从而避免坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录,以防止在恢复过程中触发唯一约束冲突。platform_stats计数值视为 0,并以限频方式记录警告日志,从而不再因为这些问题导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录进行聚合,以防止唯一约束冲突。platform_stats计数值视为 0,并以限速方式记录警告,从而避免坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录,以避免在恢复过程中触发唯一约束错误。platform_stats计数值视为 0,并通过限流的警告日志进行提示,以防止导入失败。测试:
platform_stats行聚合逻辑以及对无效计数值的处理。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录进行聚合,以防止唯一约束冲突。platform_stats计数值视为 0,并以限速方式记录警告,从而避免坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录,以防止在恢复过程中触发唯一约束冲突。platform_stats计数值视为 0,并以限频方式记录警告日志,从而不再因为这些问题导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录进行聚合,以防止唯一约束冲突。platform_stats计数值视为 0,并以限速方式记录警告,从而避免坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录进行聚合,以防止唯一约束冲突。platform_stats计数值视为 0,并以限速方式记录警告,从而避免坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录,以防止在恢复过程中触发唯一约束冲突。platform_stats计数值视为 0,并以限频方式记录警告日志,从而不再因为这些问题导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录进行聚合,以防止唯一约束冲突。platform_stats计数值视为 0,并以限速方式记录警告,从而避免坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests: