fix: harden backup import for duplicate platform stats #5594
zouyonghe merged 16 commits into AstrBotDevs:master from
Conversation
- Fix the issue where, in replace mode, the import continued even after clearing the main database failed.
- Aggregate duplicate platform_stats keys before import (accumulating count) and unify the timestamp format used for duplicate detection.
- Treat invalid count values as 0 with a rate-limited warning, and add corresponding tests.
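The aggregation step can be pictured with a minimal standalone sketch (the function name and row shape are illustrative; the actual importer additionally normalizes timestamps and guards against invalid counts):

```python
def merge_rows(rows):
    # Collapse rows sharing (timestamp, platform_id, platform_type) into one
    # row whose count is the sum, preserving first-seen order.
    merged = {}
    for row in rows:
        key = (row["timestamp"], row["platform_id"], row["platform_type"])
        if key in merged:
            merged[key]["count"] += row["count"]
        else:
            merged[key] = dict(row)
    return list(merged.values())

rows = [
    {"timestamp": "2025-12-13T20:00:00Z", "platform_id": "webchat",
     "platform_type": "unknown", "count": 14},
    {"timestamp": "2025-12-13T20:00:00Z", "platform_id": "webchat",
     "platform_type": "unknown", "count": 3},
]
merged = merge_rows(rows)
# merged holds a single row whose count is 14 + 3 = 17
```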
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request aims to significantly improve the reliability and data integrity of the backup import feature, particularly in
Highlights
Changelog
Activity
Code Review
This PR hardens the backup import, addressing two issues: 1) in replace mode, the import flow continued even after clearing the database failed; 2) the import failed on a unique-constraint violation when the platform_stats table contained duplicate keys.
Overall, the changes are effective:
- Raising an exception when database cleanup fails aborts the import flow and avoids leaving the data in an inconsistent state.
- The new `_merge_platform_stats_rows` method correctly aggregates duplicate platform_stats records, and `_normalize_platform_stats_timestamp` unifies the timestamp format, improving duplicate-detection accuracy.
- Invalid values in the `count` field are handled gracefully, with corresponding test cases added.
The code quality is high and the test coverage is solid. I've left some suggestions on refactoring and test robustness for your consideration.
astrbot/core/backup/importer.py
Outdated
existing_raw_count = existing.get("count", 0)
try:
    existing_count = int(existing_raw_count)
except (TypeError, ValueError):
    existing_count = 0
    if invalid_count_warned < invalid_count_warn_limit:
        logger.warning(
            "platform_stats count 非法,已按 0 处理: "
            f"value={existing_raw_count!r}, key={key}"
        )
        invalid_count_warned += 1

incoming_raw_count = row.get("count", 0)
try:
    incoming_count = int(incoming_raw_count)
except (TypeError, ValueError):
    incoming_count = 0
    if invalid_count_warned < invalid_count_warn_limit:
        logger.warning(
            "platform_stats count 非法,已按 0 处理: "
            f"value={incoming_raw_count!r}, key={key}"
        )
        invalid_count_warned += 1
existing["count"] = existing_count + incoming_count
The logic for handling existing_count and incoming_count here is almost identical, which is duplicated code. To follow the DRY (Don't Repeat Yourself) principle and improve readability and maintainability, consider extracting this logic into a separate helper method.
For example, a nested function _parse_count could be responsible for parsing the count value and handling the warning logic.
def _parse_count(raw_count: Any) -> int:
    nonlocal invalid_count_warned
    try:
        return int(raw_count)
    except (TypeError, ValueError):
        if invalid_count_warned < invalid_count_warn_limit:
            logger.warning(
                "platform_stats count 非法,已按 0 处理: "
                f"value={raw_count!r}, key={key}"
            )
            invalid_count_warned += 1
        return 0

existing_count = _parse_count(existing.get("count", 0))
incoming_count = _parse_count(row.get("count", 0))
existing["count"] = existing_count + incoming_count
tests/test_backup.py
Outdated
first = merged_rows[0]
assert first["timestamp"] == "2025-12-13T20:00:00Z"
assert first["platform_id"] == "webchat"
assert first["platform_type"] == "unknown"
assert first["count"] == 17
The current assertions depend on the order of elements in the merged_rows list. Although dicts preserve insertion order in Python 3.7+, consider changing the assertions so the test is more robust and independent of implementation details (such as dict iteration order).
Instead of indexing with [0] directly, look up the row with the specific platform_id. This keeps the test stable even if the implementation changes in the future.
- first = merged_rows[0]
- assert first["timestamp"] == "2025-12-13T20:00:00Z"
- assert first["platform_id"] == "webchat"
- assert first["platform_type"] == "unknown"
- assert first["count"] == 17
+ webchat_row = next(
+     (r for r in merged_rows if r.get("platform_id") == "webchat"), None
+ )
+ assert webchat_row is not None
+ assert webchat_row["timestamp"] == "2025-12-13T20:00:00Z"
+ assert webchat_row["platform_id"] == "webchat"
+ assert webchat_row["platform_type"] == "unknown"
+ assert webchat_row["count"] == 17
Hey - I've found 4 issues, and left some high level feedback:
- In `_merge_platform_stats_rows`, you normalize the timestamp only for the aggregation key but keep the original `timestamp` value in the stored row; consider updating the stored `timestamp` to the normalized value as well to avoid mixed formats in persisted data.
- The `invalid_count_warn_limit = 5` magic number inside `_merge_platform_stats_rows` could be extracted to a module-level constant or parameter to make the logging behavior easier to tune and understand.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_merge_platform_stats_rows`, you normalize the timestamp only for the aggregation key but keep the original `timestamp` value in the stored row; consider updating the stored `timestamp` to the normalized value as well to avoid mixed formats in persisted data.
- The `invalid_count_warn_limit = 5` magic number inside `_merge_platform_stats_rows` could be extracted to a module-level constant or parameter to make the logging behavior easier to tune and understand.
## Individual Comments
### Comment 1
<location path="tests/test_backup.py" line_range="311-320" />
<code_context>
+ def test_merge_platform_stats_rows(self):
</code_context>
<issue_to_address>
**suggestion (testing):** Add focused tests for `_normalize_platform_stats_timestamp` to cover timestamp edge cases independently of the merge logic.
Currently `_merge_platform_stats_rows` only exercises a simple `_normalize_platform_stats_timestamp` case (`Z` vs `+00:00`). To cover its branches and edge cases directly, please add targeted unit tests for:
- `datetime` values with and without tzinfo (verify UTC normalization)
- String timestamps with `Z`, with non-UTC offsets (e.g. `+08:00`), and naive values
- Empty and whitespace-only strings
- `None` (should return empty string)
- Completely invalid strings (should be returned as-is after `strip`)
This will clarify the normalization contract and prevent regressions independently of the merge behavior.
Suggested implementation:
```python
from datetime import datetime, timezone
```
```python
def test_normalize_platform_stats_timestamp_datetime_values(self):
    """datetime values are normalized to UTC whether or not they carry tzinfo"""
    importer = AstrBotImporter(main_db=MagicMock())
    aware = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    naive = datetime(2025, 1, 1, 12, 0, 0)
    aware_norm = importer._normalize_platform_stats_timestamp(aware)
    naive_norm = importer._normalize_platform_stats_timestamp(naive)
    # Both should normalize to UTC with identical results
    assert isinstance(aware_norm, str)
    assert isinstance(naive_norm, str)
    assert aware_norm == naive_norm
    assert aware_norm.endswith("+00:00")

def test_normalize_platform_stats_timestamp_string_variants(self):
    """String timestamps: Z suffix, non-UTC offsets, and naive values"""
    importer = AstrBotImporter(main_db=MagicMock())
    ts_z = "2025-01-01T12:00:00Z"
    ts_utc_offset = "2025-01-01T12:00:00+00:00"
    ts_non_utc_offset = "2025-01-01T20:00:00+08:00"
    ts_naive = "2025-01-01T12:00:00"
    z_norm = importer._normalize_platform_stats_timestamp(ts_z)
    utc_norm = importer._normalize_platform_stats_timestamp(ts_utc_offset)
    non_utc_norm = importer._normalize_platform_stats_timestamp(ts_non_utc_offset)
    naive_norm = importer._normalize_platform_stats_timestamp(ts_naive)
    # Z and +00:00 should normalize to the same representation
    # (the current merge test relies on this)
    assert z_norm == utc_norm
    # Non-UTC offsets should keep their offset information
    # (only strip / formal normalization, no forced conversion to UTC)
    assert "+08:00" in non_utc_norm
    # Naive strings should be preserved as-is (minus surrounding whitespace)
    assert naive_norm == ts_naive

def test_normalize_platform_stats_timestamp_empty_none_invalid(self):
    """Empty strings, None, and completely invalid strings"""
    importer = AstrBotImporter(main_db=MagicMock())
    assert importer._normalize_platform_stats_timestamp("") == ""
    assert importer._normalize_platform_stats_timestamp(" ") == ""
    assert importer._normalize_platform_stats_timestamp(None) == ""
    invalid = " not-a-date "
    invalid_norm = importer._normalize_platform_stats_timestamp(invalid)
    # Invalid strings: returned as-is after strip
    assert invalid_norm == "not-a-date"

def test_merge_platform_stats_rows(self):
    """Duplicate platform_stats keys are aggregated before import"""
    importer = AstrBotImporter(main_db=MagicMock())
```
If the current behavior of `_normalize_platform_stats_timestamp` does not exactly match the assertions above (for example, it converts all parseable times to UTC, or normalizes every UTC value to a `Z`-suffixed representation), adjust the assertions to the actual implementation:
1. If the implementation normalizes all UTC times to a `Z`-suffixed format, change the `endswith("+00:00")` assertion above to assert that the result ends with `"Z"`.
2. If the implementation also converts non-UTC offsets (such as `+08:00`) to UTC, change the `non_utc_norm` assertion to check that it equals the UTC representation (e.g. equal to `z_norm`).
3. If naive strings get extra parsing (e.g. assumed to be UTC with an offset appended), change the `naive_norm == ts_naive` assertion to match the actual normalized result.
Please check the real implementation of `AstrBotImporter._normalize_platform_stats_timestamp` and fine-tune the expected string forms in these tests, while keeping the edge cases they cover.
</issue_to_address>
### Comment 2
<location path="tests/test_backup.py" line_range="348-357" />
<code_context>
+ def test_merge_platform_stats_rows_warns_on_invalid_count(self):
</code_context>
<issue_to_address>
**suggestion (testing):** Extend coverage of invalid `count` handling, including existing-row invalid values and the warning rate limit.
The new test covers an invalid `count` in the incoming row with a valid existing row. The implementation also needs coverage for:
- An invalid `count` in the *existing* row, verifying that only the valid value contributes to the merged `count` and a warning is emitted.
- More than `invalid_count_warn_limit` rows with invalid counts, verifying that warnings are capped (e.g., `warning_mock.call_count == invalid_count_warn_limit`).
Adding these cases would exercise both invalid-count branches and the warning rate-limiting behavior.
Suggested implementation:
```python
def test_merge_platform_stats_rows_warns_on_invalid_count(self, mocker):
    """Invalid platform_stats counts warn and count as 0 (covers existing rows and the warn cap)"""
    importer = AstrBotImporter(main_db=MagicMock())
    # NOTE: replace with the actual import path of logger.warning
    warning_mock = mocker.patch("astrbot.backup.logger.warning")

    # 1) Incoming count invalid, existing row valid: only the valid count
    #    is accumulated, and exactly one warning is emitted
    rows = [
        {
            "timestamp": "2025-12-13T20:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": 5,
        },
        {
            "timestamp": "2025-12-13T20:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": "not-a-number",
        },
    ]
    merged_rows, duplicate_count = importer._merge_platform_stats_rows(rows)
    assert duplicate_count == 1
    assert len(merged_rows) == 1
    merged = merged_rows[0]
    assert merged["timestamp"] == "2025-12-13T20:00:00Z"
    assert merged["platform_id"] == "webchat"
    assert merged["platform_type"] == "unknown"
    # Only the valid 5 contributes to the merge; the invalid value counts as 0
    assert merged["count"] == 5
    assert warning_mock.call_count == 1
    warning_mock.reset_mock()

    # 2) Existing count invalid, incoming valid: likewise only the valid
    #    count is accumulated, and one warning is emitted
    rows = [
        {
            "timestamp": "2025-12-13T21:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": "not-a-number",
        },
        {
            "timestamp": "2025-12-13T21:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": 7,
        },
    ]
    merged_rows, duplicate_count = importer._merge_platform_stats_rows(rows)
    assert duplicate_count == 1
    assert len(merged_rows) == 1
    merged = merged_rows[0]
    assert merged["timestamp"] == "2025-12-13T21:00:00Z"
    assert merged["platform_id"] == "webchat"
    assert merged["platform_type"] == "unknown"
    # Only the valid 7 contributes to the merge
    assert merged["count"] == 7
    assert warning_mock.call_count == 1
    warning_mock.reset_mock()

    # 3) More invalid counts than invalid_count_warn_limit only warn up to the cap
    invalid_count_warn_limit = getattr(importer, "invalid_count_warn_limit", 10)
    many_invalid_rows = [
        {
            "timestamp": f"2025-12-13T22:{i:02d}:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": "not-a-number",
        }
        for i in range(invalid_count_warn_limit + 5)
    ]
    importer._merge_platform_stats_rows(many_invalid_rows)
    # The number of warnings is capped at invalid_count_warn_limit
    assert warning_mock.call_count == invalid_count_warn_limit
```
1. Replace the import path in `mocker.patch("astrbot.backup.logger.warning")` with the logger actually used to emit the invalid `count` warning — likely something like `tests.test_backup.logger.warning` or the module's own `astrbot.backup.importer.logger.warning`.
2. Confirm that the `AstrBotImporter` instance really has an `invalid_count_warn_limit` attribute; if the cap is defined as a module constant (e.g. `INVALID_COUNT_WARN_LIMIT`), read and assert against that constant instead.
3. If the existing test class/module does not use the `mocker` fixture from `pytest-mock`, inject and count warning calls via `monkeypatch` or `caplog` per project convention, and adjust the test signature accordingly.
</issue_to_address>
### Comment 3
<location path="tests/test_backup.py" line_range="374-376" />
<code_context>
+ assert merged_rows[0]["count"] == 5
+ assert warning_mock.called
+
@pytest.mark.asyncio
async def test_import_file_not_exists(self, mock_main_db, tmp_path):
"""测试导入不存在的文件"""
</code_context>
<issue_to_address>
**suggestion (testing):** Strengthen the replace-mode failure test by asserting that no further import steps are executed after `_clear_main_db` fails.
To better protect against partial imports in `replace` mode, also assert that no data-import routines run when `_clear_main_db` fails. For instance, patch `_import_main_database` (and `_import_knowledge_bases` if applicable) as `AsyncMock`s and assert they are not called after `_clear_main_db` raises. This will confirm the import aborts early and avoids a partially cleared/partially imported DB state.
Suggested implementation:
```python
@pytest.mark.asyncio
async def test_import_replace_fails_when_clear_main_db_fails(
    self, mock_main_db, tmp_path
):
    """In replace mode, a failed main-db clear aborts the import immediately"""
    zip_path = tmp_path / "valid_backup.zip"
    manifest = {
        "version": "1.0.0",
        "platform_stats": [],
        "knowledge_bases": [],
    }
    importer = BackupImporter(main_db=mock_main_db, mode="replace")
    clear_error = RuntimeError("clear main db failed")
    with (
        patch.object(
            importer,
            "_clear_main_db",
            new=AsyncMock(side_effect=clear_error),
        ) as clear_main_db_mock,
        patch.object(
            importer,
            "_import_main_database",
            new=AsyncMock(),
        ) as import_main_db_mock,
        patch.object(
            importer,
            "_import_knowledge_bases",
            new=AsyncMock(),
        ) as import_kb_mock,
    ):
        result = await importer.import_from_zip(zip_path, manifest)
        # Basic assertions: the import fails and the error comes from the failed clear
        assert result.success is False
        assert any("清空主库失败" in err or "clear main db failed" in err for err in result.errors)
        # Key assertions: after _clear_main_db fails, no further import steps
        # are executed, avoiding a partial import
        clear_main_db_mock.assert_awaited()
        import_main_db_mock.assert_not_awaited()
        import_kb_mock.assert_not_awaited()
```
1. Ensure `AsyncMock` is imported at the top of `tests/test_backup.py`, for example:
`from unittest.mock import AsyncMock, patch`. (`patch` seems already used; keep existing imports and just add `AsyncMock` if missing.)
2. Adjust the `BackupImporter` construction and `import_from_zip` call if the actual API differs (e.g., if the constructor or method signature is not `BackupImporter(main_db=..., mode="replace")` or `import_from_zip(zip_path, manifest)`).
3. If the implementation does not have `_import_knowledge_bases` (or uses a different name), either:
- Update the patched attribute name to match the real method, or
- Remove the corresponding patch/assert and only assert on `_import_main_database`.
4. If the error message for `_clear_main_db` failures differs from `"清空主库失败"` / `"clear main db failed"`, adjust the `any(...)` error assertion string to match the actual message used by the importer.
</issue_to_address>
### Comment 4
<location path="astrbot/core/backup/importer.py" line_range="523" />
<code_context>
return imported
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> tuple[list[dict[str, Any]], int]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting helpers for count parsing with throttled logging and type-specific timestamp normalization to keep `_merge_platform_stats_rows` focused on aggregation logic.
You can keep the current behavior but reduce branching and duplication by extracting small helpers for count parsing (with throttled logging) and timestamp normalization. This keeps `_merge_platform_stats_rows` focused on aggregation.
### 1. Extract throttled count parsing
Right now, the `existing` and `incoming` branches duplicate `try/except` and warning logic. A tiny helper that closes over the warning counter keeps behavior identical but simplifies the main loop:
```python
def _make_platform_stats_count_parser(self):
invalid_count_warned = 0
invalid_count_warn_limit = 5
def parse_count(raw: Any, key: tuple[str, str, str]) -> int:
nonlocal invalid_count_warned
try:
return int(raw)
except (TypeError, ValueError):
if invalid_count_warned < invalid_count_warn_limit:
logger.warning(
"platform_stats count 非法,已按 0 处理: "
f"value={raw!r}, key={key}"
)
invalid_count_warned += 1
return 0
return parse_count
```
Use it in `_merge_platform_stats_rows`:
```python
def _merge_platform_stats_rows(
self, rows: list[dict[str, Any]]
) -> tuple[list[dict[str, Any]], int]:
merged: dict[tuple[str, str, str], dict[str, Any]] = {}
timestamp_cache: dict[str, str] = {}
duplicate_count = 0
parse_count = self._make_platform_stats_count_parser()
for row in rows:
raw_timestamp = row.get("timestamp")
if isinstance(raw_timestamp, str):
normalized_timestamp = timestamp_cache.get(raw_timestamp)
if normalized_timestamp is None:
normalized_timestamp = self._normalize_platform_stats_timestamp(
raw_timestamp
)
timestamp_cache[raw_timestamp] = normalized_timestamp
else:
normalized_timestamp = self._normalize_platform_stats_timestamp(
raw_timestamp
)
key = (
normalized_timestamp,
str(row.get("platform_id")),
str(row.get("platform_type")),
)
existing = merged.get(key)
if existing is None:
merged[key] = dict(row)
continue
duplicate_count += 1
existing_count = parse_count(existing.get("count", 0), key)
incoming_count = parse_count(row.get("count", 0), key)
existing["count"] = existing_count + incoming_count
return list(merged.values()), duplicate_count
```
Behavior stays the same (same warning text, same limit, same default to `0`) but the core loop is much easier to scan.
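To see the throttling closure in action outside the importer, here is a tiny standalone sketch (the names and the limit value are illustrative, not the module's actual constants):

```python
import logging

logger = logging.getLogger("demo")

def make_count_parser(limit=5):
    # Closure-based parser: invalid values parse as 0, but only the first
    # `limit` failures emit a warning.
    warned = 0
    def parse(raw):
        nonlocal warned
        try:
            return int(raw)
        except (TypeError, ValueError):
            if warned < limit:
                logger.warning("invalid count %r treated as 0", raw)
                warned += 1
            return 0
    return parse

parse = make_count_parser(limit=2)
values = [parse(v) for v in ["3", "bad", None, "oops", 7]]
# values == [3, 0, 0, 0, 7]; only the first two failures are logged
```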
### 2. Split timestamp normalization by type
`_normalize_platform_stats_timestamp` is correct but dense. You can keep the same return values while separating datetime and string handling into small helpers, which reduces nesting:
```python
def _normalize_platform_stats_timestamp(self, value: Any) -> str:
if isinstance(value, datetime):
return self._normalize_platform_stats_dt(value)
if isinstance(value, str):
return self._normalize_platform_stats_str(value)
if value is None:
return ""
return str(value)
def _normalize_platform_stats_dt(self, dt: datetime) -> str:
if dt.tzinfo is not None:
dt = dt.astimezone(timezone.utc)
return dt.isoformat()
def _normalize_platform_stats_str(self, value: str) -> str:
timestamp = value.strip()
if not timestamp:
return ""
if timestamp.endswith("Z"):
timestamp = f"{timestamp[:-1]}+00:00"
try:
dt = datetime.fromisoformat(timestamp)
if dt.tzinfo is not None:
dt = dt.astimezone(timezone.utc)
return dt.isoformat()
except ValueError:
return value.strip()
```
This keeps all existing semantics (empty vs `None` vs invalid string) but makes both the normalization function and the merge loop easier to reason about.
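The string path can also be exercised in isolation. This standalone sketch mirrors the same normalization rules (the helper name is illustrative, not the importer's method) and shows why the `Z` suffix is rewritten first — `datetime.fromisoformat()` only accepts a trailing `Z` from Python 3.11 onward:

```python
from datetime import datetime, timezone

def normalize(value: str) -> str:
    # Map a trailing "Z" to "+00:00" (fromisoformat() on Python < 3.11
    # rejects the "Z" suffix), then convert aware values to UTC.
    ts = value.strip()
    if not ts:
        return ""
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    try:
        dt = datetime.fromisoformat(ts)
    except ValueError:
        # Invalid strings come back as-is after strip
        return value.strip()
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc)
    return dt.isoformat()

print(normalize("2025-12-13T20:00:00Z"))       # 2025-12-13T20:00:00+00:00
print(normalize("2025-12-14T04:00:00+08:00"))  # 2025-12-13T20:00:00+00:00
print(normalize("not-a-date"))                 # not-a-date
```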
</issue_to_address>

Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
def test_merge_platform_stats_rows(self):
    """Duplicate platform_stats keys are aggregated before import"""
    importer = AstrBotImporter(main_db=MagicMock())
    rows = [
        {
            "id": 1,
            "timestamp": "2025-12-13T20:00:00Z",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": 14,
suggestion (testing): Add focused tests for _normalize_platform_stats_timestamp to cover timestamp edge cases independently of the merge logic.
Currently _merge_platform_stats_rows only exercises a simple _normalize_platform_stats_timestamp case (Z vs +00:00). To cover its branches and edge cases directly, please add targeted unit tests for:
- `datetime` values with and without tzinfo (verify UTC normalization)
- String timestamps with `Z`, with non-UTC offsets (e.g. `+08:00`), and naive values
- Empty and whitespace-only strings
- `None` (should return empty string)
- Completely invalid strings (should be returned as-is after `strip`)
This will clarify the normalization contract and prevent regressions independently of the merge behavior.
Suggested implementation:
```python
from datetime import datetime, timezone

def test_normalize_platform_stats_timestamp_datetime_values(self):
    """datetime values with and without tzinfo are normalized to UTC"""
    importer = AstrBotImporter(main_db=MagicMock())
    aware = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    naive = datetime(2025, 1, 1, 12, 0, 0)
    aware_norm = importer._normalize_platform_stats_timestamp(aware)
    naive_norm = importer._normalize_platform_stats_timestamp(naive)
    # Both should normalize to UTC and produce identical results
    assert isinstance(aware_norm, str)
    assert isinstance(naive_norm, str)
    assert aware_norm == naive_norm
    assert aware_norm.endswith("+00:00")

def test_normalize_platform_stats_timestamp_string_variants(self):
    """String timestamps: Z suffix, non-UTC offsets, and naive values"""
    importer = AstrBotImporter(main_db=MagicMock())
    ts_z = "2025-01-01T12:00:00Z"
    ts_utc_offset = "2025-01-01T12:00:00+00:00"
    ts_non_utc_offset = "2025-01-01T20:00:00+08:00"
    ts_naive = "2025-01-01T12:00:00"
    z_norm = importer._normalize_platform_stats_timestamp(ts_z)
    utc_norm = importer._normalize_platform_stats_timestamp(ts_utc_offset)
    non_utc_norm = importer._normalize_platform_stats_timestamp(ts_non_utc_offset)
    naive_norm = importer._normalize_platform_stats_timestamp(ts_naive)
    # Z and +00:00 should normalize to the same representation
    # (the merge test currently relies on this)
    assert z_norm == utc_norm
    # A non-UTC offset should keep its offset information
    # (only strip/format normalization, no forced conversion to UTC)
    assert "+08:00" in non_utc_norm
    # A naive string should be kept as-is (minus surrounding whitespace)
    assert naive_norm == ts_naive

def test_normalize_platform_stats_timestamp_empty_none_invalid(self):
    """Empty strings, None, and completely invalid strings"""
    importer = AstrBotImporter(main_db=MagicMock())
    assert importer._normalize_platform_stats_timestamp("") == ""
    assert importer._normalize_platform_stats_timestamp(" ") == ""
    assert importer._normalize_platform_stats_timestamp(None) == ""
    invalid = " not-a-date "
    invalid_norm = importer._normalize_platform_stats_timestamp(invalid)
    # Invalid strings are returned as-is after strip
    assert invalid_norm == "not-a-date"

def test_merge_platform_stats_rows(self):
    """测试 platform_stats 重复键会在导入前聚合"""
    importer = AstrBotImporter(main_db=MagicMock())
```

If the current behavior of `_normalize_platform_stats_timestamp` does not exactly match the assertions above (for example, if it converts every parseable time to UTC, or renders all UTC values with a trailing `Z`), adjust the assertions to the actual implementation:

- If the implementation renders all UTC times with a `Z` suffix, replace the `endswith("+00:00")` assertion with one checking that the result ends with `"Z"`.
- If the implementation also converts non-UTC offsets (such as `+08:00`) to UTC, change the `non_utc_norm` assertion to check equality with the UTC form (e.g. equal to `z_norm`).
- If naive strings get extra parsing (e.g. assumed to be UTC with an offset appended), replace the `naive_norm == ts_naive` assertion with the actual normalized result.

Please check the expected string forms in these tests against the real implementation of `AstrBotImporter._normalize_platform_stats_timestamp`, but keep the edge cases they cover.
```python
def test_merge_platform_stats_rows_warns_on_invalid_count(self):
    """测试 platform_stats count 非法时会告警并按 0 处理"""
    importer = AstrBotImporter(main_db=MagicMock())
    rows = [
        {
            "timestamp": "2025-12-13T20:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": 5,
        },
```
suggestion (testing): Extend coverage of invalid count handling, including existing-row invalid values and the warning rate limit.
The new test covers an invalid count in the incoming row with a valid existing row. The implementation also needs coverage for:
- An invalid `count` in the existing row, verifying that only the valid value contributes to the merged `count` and a warning is emitted.
- More than `invalid_count_warn_limit` rows with invalid counts, verifying that warnings are capped (e.g., `warning_mock.call_count == invalid_count_warn_limit`).
Adding these cases would exercise both invalid-count branches and the warning rate-limiting behavior.
Suggested implementation:
```python
def test_merge_platform_stats_rows_warns_on_invalid_count(self, mocker):
    """platform_stats: invalid count warns and is treated as 0 (existing rows and warning cap included)"""
    importer = AstrBotImporter(main_db=MagicMock())
    # NOTE: replace this with the actual import path of logger.warning
    warning_mock = mocker.patch("astrbot.backup.logger.warning")

    # 1) Invalid count in the incoming row, valid existing row:
    #    only the valid count is summed, and one warning is emitted
    rows = [
        {
            "timestamp": "2025-12-13T20:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": 5,
        },
        {
            "timestamp": "2025-12-13T20:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": "not-a-number",
        },
    ]
    merged_rows, duplicate_count = importer._merge_platform_stats_rows(rows)
    assert duplicate_count == 1
    assert len(merged_rows) == 1
    merged = merged_rows[0]
    assert merged["timestamp"] == "2025-12-13T20:00:00Z"
    assert merged["platform_id"] == "webchat"
    assert merged["platform_type"] == "unknown"
    # Only the valid 5 contributes; the invalid value is treated as 0
    assert merged["count"] == 5
    assert warning_mock.call_count == 1
    warning_mock.reset_mock()

    # 2) Invalid count in the existing row, valid incoming row:
    #    likewise only the valid count is summed, with one warning
    rows = [
        {
            "timestamp": "2025-12-13T21:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": "not-a-number",
        },
        {
            "timestamp": "2025-12-13T21:00:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": 7,
        },
    ]
    merged_rows, duplicate_count = importer._merge_platform_stats_rows(rows)
    assert duplicate_count == 1
    assert len(merged_rows) == 1
    merged = merged_rows[0]
    assert merged["timestamp"] == "2025-12-13T21:00:00Z"
    assert merged["platform_id"] == "webchat"
    assert merged["platform_type"] == "unknown"
    # Only the valid 7 contributes
    assert merged["count"] == 7
    assert warning_mock.call_count == 1
    warning_mock.reset_mock()

    # 3) More invalid counts than invalid_count_warn_limit only warn up to the limit
    invalid_count_warn_limit = getattr(importer, "invalid_count_warn_limit", 10)
    many_invalid_rows = [
        {
            "timestamp": f"2025-12-13T22:{i:02d}:00+00:00",
            "platform_id": "webchat",
            "platform_type": "unknown",
            "count": "not-a-number",
        }
        for i in range(invalid_count_warn_limit + 5)
    ]
    importer._merge_platform_stats_rows(many_invalid_rows)
    # The number of warnings is capped at invalid_count_warn_limit
    assert warning_mock.call_count == invalid_count_warn_limit
```

- Replace the import path in `mocker.patch("astrbot.backup.logger.warning")` with the actual logger path used for the invalid `count` warning — it may be, for example, `tests.test_backup.logger.warning` or the module's `astrbot.backup.importer.logger.warning`.
- Confirm that an `invalid_count_warn_limit` attribute actually exists on `AstrBotImporter`; if the limit is defined as a module constant (such as `INVALID_COUNT_WARN_LIMIT`), read it from that constant and use it in the assertion instead.
- If the existing test class/module does not use pytest-mock's `mocker` fixture, switch to `monkeypatch` or `caplog` (per project convention) to intercept and count the warning calls, and adjust the test signature accordingly.
```python
@pytest.mark.asyncio
async def test_import_file_not_exists(self, mock_main_db, tmp_path):
    """测试导入不存在的文件"""
```
suggestion (testing): Strengthen the replace-mode failure test by asserting that no further import steps are executed after _clear_main_db fails.
To better protect against partial imports in replace mode, also assert that no data-import routines run when _clear_main_db fails. For instance, patch _import_main_database (and _import_knowledge_bases if applicable) as AsyncMocks and assert they are not called after _clear_main_db raises. This will confirm the import aborts early and avoids a partially cleared/partially imported DB state.
Suggested implementation:
```python
@pytest.mark.asyncio
async def test_import_replace_fails_when_clear_main_db_fails(
    self, mock_main_db, tmp_path
):
    """replace mode: a failure while clearing the main DB aborts the import"""
    zip_path = tmp_path / "valid_backup.zip"
    manifest = {
        "version": "1.0.0",
        "platform_stats": [],
        "knowledge_bases": [],
    }
    importer = BackupImporter(main_db=mock_main_db, mode="replace")
    clear_error = RuntimeError("clear main db failed")
    with (
        patch.object(
            importer,
            "_clear_main_db",
            new=AsyncMock(side_effect=clear_error),
        ) as clear_main_db_mock,
        patch.object(
            importer,
            "_import_main_database",
            new=AsyncMock(),
        ) as import_main_db_mock,
        patch.object(
            importer,
            "_import_knowledge_bases",
            new=AsyncMock(),
        ) as import_kb_mock,
    ):
        result = await importer.import_from_zip(zip_path, manifest)

    # Basic assertions: the import fails and the error comes from the failed DB clear
    assert result.success is False
    assert any("清空主库失败" in err or "clear main db failed" in err for err in result.errors)
    # Key assertions: after _clear_main_db fails, no further import steps run,
    # avoiding a partially cleared / partially imported database
    clear_main_db_mock.assert_awaited()
    import_main_db_mock.assert_not_awaited()
    import_kb_mock.assert_not_awaited()
```

- Ensure `AsyncMock` is imported at the top of `tests/test_backup.py`, for example: `from unittest.mock import AsyncMock, patch`. (`patch` seems already used; keep existing imports and just add `AsyncMock` if missing.)
- Adjust the `BackupImporter` construction and `import_from_zip` call if the actual API differs (e.g., if the constructor or method signature is not `BackupImporter(main_db=..., mode="replace")` or `import_from_zip(zip_path, manifest)`).
- If the implementation does not have `_import_knowledge_bases` (or uses a different name), either:
  - Update the patched attribute name to match the real method, or
  - Remove the corresponding patch/assert and only assert on `_import_main_database`.
- If the error message for `_clear_main_db` failures differs from "清空主库失败"/"clear main db failed", adjust the `any(...)` error assertion string to match the actual message used by the importer.
- Extracted the warning-cap magic number into the module constant PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT
- Extracted an inline parse_count function, removing the duplicated try/except branches
- Stored rows now persist the normalized timestamp, avoiding mixed formats in the database
- Added tests: invalid count on an existing row, warning rate limiting, replace-mode abort assertion
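The capped-warning count parsing described above can be sketched in isolation like this; the constant's value and the logger name here are illustrative placeholders, not the project's actual settings:

```python
import logging

logger = logging.getLogger("backup.importer")

# Cap on emitted warnings; the real limit value lives in the importer module.
PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT = 20

def merge_counts(raw_counts: list) -> int:
    """Sum counts, treating unparsable values as 0 with capped warnings."""
    warned = 0
    total = 0
    for raw in raw_counts:
        try:
            total += int(raw)
        except (TypeError, ValueError):
            # An invalid value contributes 0; warn only up to the limit so a
            # corrupted backup cannot flood the log.
            if warned < PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
                logger.warning("invalid platform_stats count: %r", raw)
                warned += 1
    return total

print(merge_counts([5, "x", 7, None]))  # 12
```

The counter is local to one call, which matches the per-merge scoping discussed in the review below the commit notes.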
@sourcery-ai review
Hey - I've reviewed your changes and they look great!
@sourcery-ai review
Hey - I've left some high level feedback:
- In `_normalize_platform_stats_timestamp`, empty/invalid timestamps are normalized to an empty string and then used as part of the deduplication key, which will aggregate all such rows together; consider whether these should instead be left distinct (e.g., by skipping normalization or using the original value) or explicitly skipped to avoid unintended merging.
- The `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT` is a global constant and the counter is scoped to a single `_merge_platform_stats_rows` call; if you expect this function to be used in multiple stages of one import or across different imports, you may want to clarify whether the limit should be per-import, per-table, or global and adjust the implementation accordingly.

Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_normalize_platform_stats_timestamp`, empty/invalid timestamps are normalized to an empty string and then used as part of the deduplication key, which will aggregate all such rows together; consider whether these should instead be left distinct (e.g., by skipping normalization or using the original value) or explicitly skipped to avoid unintended merging.
- The `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT` is a global constant and the counter is scoped to a single `_merge_platform_stats_rows` call; if you expect this function to be used in multiple stages of one import or across different imports, you may want to clarify whether the limit should be per-import, per-table, or global and adjust the implementation accordingly.
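The first point can be reproduced with a toy version of the dedup key: once empty or invalid timestamps normalize to an empty string, all such rows share one key and collapse together (a hypothetical simplification, not the importer's actual code):

```python
rows = [
    {"timestamp": None, "platform_id": "webchat", "platform_type": "unknown", "count": 1},
    {"timestamp": "", "platform_id": "webchat", "platform_type": "unknown", "count": 2},
]

merged: dict = {}
for row in rows:
    # Both None and "" normalize to "" here, so the rows share one key
    # and their counts are summed even though they may be unrelated.
    ts = row["timestamp"] or ""
    key = (ts, str(row["platform_id"]), str(row["platform_type"]))
    merged[key] = merged.get(key, 0) + row["count"]

print(len(merged), merged[("", "webchat", "unknown")])  # 1 3
```

This is the unintended aggregation the reviewer flags; the fix adopted later keys invalid timestamps by row index instead.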
Addressed both review comments in follow-up commit f091b85:
Also added test coverage for invalid timestamp behavior:
Local verification:

@sourcery-ai review
Hey - I've found 1 issue, and left some high level feedback:
- In `_normalize_platform_stats_timestamp`, naive timestamps that successfully parse are returned as-is without an explicit UTC offset, which means you can end up with a mix of `Z`/`+00:00` and naive values for logically equivalent instants; consider treating naive timestamps as UTC and always normalizing to an explicit UTC offset (e.g., `dt.replace(tzinfo=timezone.utc).isoformat()`) to ensure consistent keys and serialized values.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_normalize_platform_stats_timestamp`, naive timestamps that successfully parse are returned as-is without an explicit UTC offset, which means you can end up with a mix of `Z/+00:00` and naive values for logically equivalent instants; consider treating naive timestamps as UTC and always normalizing to an explicit UTC offset (e.g., `dt.replace(tzinfo=timezone.utc).isoformat()`) to ensure consistent keys and serialized values.
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="525" />
<code_context>
return imported
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> tuple[list[dict[str, Any]], int]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring `_merge_platform_stats_rows` into smaller helper methods for count parsing, timestamp normalization/caching, and invalid-key handling to make the logic flatter and easier to follow without changing behavior.
You can reduce the cognitive load without changing behavior by splitting `_merge_platform_stats_rows` into small helpers and removing the nested function / nonlocal state.
### 1. Extract a pure helper for count parsing (no `nonlocal`)
Move the warning + parsing into a dedicated method that keeps the rate limit but is self-contained:
```python
def _parse_platform_stats_count(
self,
raw_count: Any,
key: tuple[str, str, str],
warned_count: int,
) -> tuple[int, int]:
if warned_count >= PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
try:
return int(raw_count), warned_count
except (TypeError, ValueError):
return 0, warned_count
try:
return int(raw_count), warned_count
except (TypeError, ValueError):
logger.warning(
"platform_stats count 非法,已按 0 处理: "
f"value={raw_count!r}, key={key}"
)
return 0, warned_count + 1
```
Usage in `_merge_platform_stats_rows` becomes explicit and testable:
```python
invalid_count_warned = 0
# ...
normalized_row = dict(row)
normalized_row["timestamp"] = normalized_timestamp
normalized_row["count"], invalid_count_warned = self._parse_platform_stats_count(
normalized_row.get("count", 0),
key,
invalid_count_warned,
)
```
and
```python
existing_count, invalid_count_warned = self._parse_platform_stats_count(
existing.get("count", 0),
key,
invalid_count_warned,
)
incoming_count, invalid_count_warned = self._parse_platform_stats_count(
row.get("count", 0),
key,
invalid_count_warned,
)
existing["count"] = existing_count + incoming_count
```
This drops the inner function + `nonlocal`, while keeping the warning limit behavior identical.
### 2. Extract timestamp normalization + caching into a helper
Isolate the cache and validity logic to make the main loop more linear:
```python
def _normalize_platform_stats_timestamp_cached(
self,
raw_timestamp: Any,
cache: dict[str, tuple[str, bool]],
) -> tuple[str, bool]:
if isinstance(raw_timestamp, str):
cached = cache.get(raw_timestamp)
if cached is not None:
return cached
result = self._normalize_platform_stats_timestamp(raw_timestamp)
cache[raw_timestamp] = result
return result
return self._normalize_platform_stats_timestamp(raw_timestamp)
```
Then `_merge_platform_stats_rows` loop becomes easier to scan:
```python
for row_index, row in enumerate(rows):
raw_timestamp = row.get("timestamp")
normalized_timestamp, is_timestamp_valid = (
self._normalize_platform_stats_timestamp_cached(
raw_timestamp, timestamp_cache
)
)
timestamp_for_key = (
normalized_timestamp
if is_timestamp_valid
else f"__invalid_timestamp_row_{row_index}"
)
key = (
timestamp_for_key,
str(row.get("platform_id")),
str(row.get("platform_type")),
)
# ... rest unchanged
```
### 3. Optional: clarify the “invalid timestamp” key shaping
You can make the intent of the synthetic key explicit by wrapping it:
```python
def _platform_stats_key_timestamp(
self,
normalized_timestamp: str,
is_valid: bool,
row_index: int,
) -> str:
if is_valid:
return normalized_timestamp
# 保证无效时间戳不被合并:使用行号加前缀作为唯一 key
return f"__invalid_timestamp_row_{row_index}"
```
Then:
```python
timestamp_for_key = self._platform_stats_key_timestamp(
normalized_timestamp,
is_timestamp_valid,
row_index,
)
```
This keeps the exact behavior (row-index-based uniqueness) but makes *why* it’s done more obvious.
Together, these changes keep all semantics (caching, warning rate-limiting, invalid row handling) but flatten `_merge_platform_stats_rows` into a straightforward read: normalize timestamp, build key, parse counts, merge.
</issue_to_address>
Addressed latest review feedback in commit 7a590f2:
Added/updated tests:
Local verification:

@sourcery-ai review
Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="525" />
<code_context>
return imported
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> tuple[list[dict[str, Any]], int]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new platform_stats merge logic by using local closures inside `_merge_platform_stats_rows` instead of multiple helper methods and state variables passed around.
You can keep all of the new behavior while "flattening" these helpers into local closures, reducing state passing and indirection.
### 1. Keep `_merge_platform_stats_rows` as the main dispatcher, using closures
Instead of threading `invalid_count_warned` and the cache through several helpers, keep them in the local scope of `_merge_platform_stats_rows` and use small inner functions:
```python
def _merge_platform_stats_rows(
    self, rows: list[dict[str, Any]]
) -> tuple[list[dict[str, Any]], int]:
    merged: dict[tuple[str, str, str], dict[str, Any]] = {}
    timestamp_cache: dict[str, tuple[str, bool]] = {}
    invalid_count_warned = 0
    duplicate_count = 0

    def normalize_ts(raw_timestamp: Any) -> tuple[str, bool]:
        if isinstance(raw_timestamp, str):
            cached = timestamp_cache.get(raw_timestamp)
            if cached is not None:
                return cached
        result = self._normalize_platform_stats_timestamp(raw_timestamp)
        if isinstance(raw_timestamp, str):
            timestamp_cache[raw_timestamp] = result
        return result

    def parse_count(raw_count: Any, key: tuple[str, str, str]) -> int:
        nonlocal invalid_count_warned
        try:
            return int(raw_count)
        except (TypeError, ValueError):
            if invalid_count_warned < PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
                logger.warning(
                    "platform_stats count 非法,已按 0 处理: "
                    f"value={raw_count!r}, key={key}"
                )
                invalid_count_warned += 1
            return 0

    for row_index, row in enumerate(rows):
        normalized_timestamp, is_timestamp_valid = normalize_ts(row.get("timestamp"))
        # keep existing key format, just build inline
        if is_timestamp_valid:
            timestamp_for_key = normalized_timestamp
        else:
            timestamp_for_key = f"__invalid_timestamp_row_{row_index}"
        key = (
            timestamp_for_key,
            str(row.get("platform_id")),
            str(row.get("platform_type")),
        )
        existing = merged.get(key)
        if existing is None:
            normalized_row = dict(row)
            normalized_row["timestamp"] = normalized_timestamp
            normalized_row["count"] = parse_count(normalized_row.get("count", 0), key)
            merged[key] = normalized_row
            continue
        duplicate_count += 1
        existing["count"] = (
            parse_count(existing.get("count", 0), key)
            + parse_count(row.get("count", 0), key)
        )
    return list(merged.values()), duplicate_count
```
This keeps:
- the same merge semantics
- the same `__invalid_timestamp_row_{row_index}` key format
- the same rate-limited warning behavior, without threading a counter around
### 2. Collapse the timestamp helpers into a single exposed method
With the closure above, `_normalize_platform_stats_timestamp_cached` and `_platform_stats_key_timestamp` are no longer needed. You can keep just one normalization helper:
```python
def _normalize_platform_stats_timestamp(self, value: Any) -> tuple[str, bool]:
    if isinstance(value, datetime):
        dt = value
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)
        else:
            dt = dt.astimezone(timezone.utc)
        return dt.isoformat(), True
    if isinstance(value, str):
        timestamp = value.strip()
        if not timestamp:
            return "", False
        if timestamp.endswith("Z"):
            timestamp = f"{timestamp[:-1]}+00:00"
        try:
            dt = datetime.fromisoformat(timestamp)
            if dt.tzinfo is None:
                dt = dt.replace(tzinfo=timezone.utc)
            else:
                dt = dt.astimezone(timezone.utc)
            return dt.isoformat(), True
        except ValueError:
            return value.strip(), False
    if value is None:
        return "", False
    return str(value), False
```
You can then safely delete:
- `_normalize_platform_stats_timestamp_cached`
- `_platform_stats_key_timestamp`
- `_parse_platform_stats_count`
All functionality (merging, normalization, rate-limited warnings, invalid-timestamp separation) is preserved, but the logic becomes more cohesive and easier to follow.
</issue_to_address>帮我变得更有用!请在每条评论上点 👍 或 👎,我会根据你的反馈改进后续的代码审查。
Original comment in English
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="525" />
<code_context>
return imported
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> tuple[list[dict[str, Any]], int]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new platform_stats merging logic by using local closures inside `_merge_platform_stats_rows` instead of multiple helper methods and threaded state variables.
You can keep all the new behavior but flatten the helpers to reduce state‑threading and indirection.
### 1. Keep `_merge_platform_stats_rows` as the main orchestration, use closures
Instead of threading `invalid_count_warned` and the cache through multiple helpers, keep them local to `_merge_platform_stats_rows` and use small inner functions:
```python
def _merge_platform_stats_rows(
self, rows: list[dict[str, Any]]
) -> tuple[list[dict[str, Any]], int]:
merged: dict[tuple[str, str, str], dict[str, Any]] = {}
timestamp_cache: dict[str, tuple[str, bool]] = {}
invalid_count_warned = 0
duplicate_count = 0
def normalize_ts(raw_timestamp: Any) -> tuple[str, bool]:
if isinstance(raw_timestamp, str):
cached = timestamp_cache.get(raw_timestamp)
if cached is not None:
return cached
result = self._normalize_platform_stats_timestamp(raw_timestamp)
if isinstance(raw_timestamp, str):
timestamp_cache[raw_timestamp] = result
return result
def parse_count(raw_count: Any, key: tuple[str, str, str]) -> int:
nonlocal invalid_count_warned
try:
return int(raw_count)
except (TypeError, ValueError):
if invalid_count_warned < PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
logger.warning(
"platform_stats count 非法,已按 0 处理: "
f"value={raw_count!r}, key={key}"
)
invalid_count_warned += 1
return 0
for row_index, row in enumerate(rows):
normalized_timestamp, is_timestamp_valid = normalize_ts(row.get("timestamp"))
# keep existing key format, just build inline
if is_timestamp_valid:
timestamp_for_key = normalized_timestamp
else:
timestamp_for_key = f"__invalid_timestamp_row_{row_index}"
key = (
timestamp_for_key,
str(row.get("platform_id")),
str(row.get("platform_type")),
)
existing = merged.get(key)
if existing is None:
normalized_row = dict(row)
normalized_row["timestamp"] = normalized_timestamp
normalized_row["count"] = parse_count(normalized_row.get("count", 0), key)
merged[key] = normalized_row
continue
duplicate_count += 1
existing["count"] = (
parse_count(existing.get("count", 0), key)
+ parse_count(row.get("count", 0), key)
)
return list(merged.values()), duplicate_count
```
This keeps:
- Same merging semantics
- Same `__invalid_timestamp_row_{row_index}` key shape
- Same rate‑limited warnings, but without passing counters around
### 2. Collapse timestamp helpers into one public method
With the closure above, `_normalize_platform_stats_timestamp_cached` and `_platform_stats_key_timestamp` become unnecessary. You can keep a single normalization helper:
```python
def _normalize_platform_stats_timestamp(self, value: Any) -> tuple[str, bool]:
if isinstance(value, datetime):
dt = value
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
else:
dt = dt.astimezone(timezone.utc)
return dt.isoformat(), True
if isinstance(value, str):
timestamp = value.strip()
if not timestamp:
return "", False
if timestamp.endswith("Z"):
timestamp = f"{timestamp[:-1]}+00:00"
try:
dt = datetime.fromisoformat(timestamp)
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
else:
dt = dt.astimezone(timezone.utc)
return dt.isoformat(), True
except ValueError:
return value.strip(), False
if value is None:
return "", False
return str(value), False
```
Then you can safely delete:
- `_normalize_platform_stats_timestamp_cached`
- `_platform_stats_key_timestamp`
- `_parse_platform_stats_count`
All functionality (merging, normalization, warning rate‑limiting, invalid‑timestamp separation) is preserved, but the logic is more localized and easier to follow.
</issue_to_address>
Addressed this complexity review in commit 9df6a66:
Validation run:
@sourcery-ai review
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="525" />
<code_context>
return imported
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> tuple[list[dict[str, Any]], int]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring `_merge_platform_stats_rows` into a flatter helper-based implementation that avoids inner functions, synthetic keys, and extra counters while preserving existing behavior.
You can simplify `_merge_platform_stats_rows` and keep all behavior by:
* Removing inner functions and closure state.
* Dropping the per-call `timestamp_cache` (normalize directly; re-parsing a few timestamps is cheap).
* Avoiding synthetic keys with row indices for invalid timestamps by simply “never merging” invalid timestamps.
* Deriving `duplicate_count` from the length difference instead of threading an extra counter through the logic.
* Isolating “safe int + rate-limited logging” as a small helper that returns both the parsed value and updated warning counter.
A concrete sketch:
```python
# keep the existing constant
PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT = 5
def _parse_platform_stats_count(
self,
raw_count: Any,
invalid_count_warned: int,
key: tuple[str, str, str],
) -> tuple[int, int]:
"""Safe int parse with per-call rate limited warning."""
try:
return int(raw_count), invalid_count_warned
except (TypeError, ValueError):
if invalid_count_warned < PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
logger.warning(
"platform_stats count 非法,已按 0 处理: "
f"value={raw_count!r}, key={key}"
)
invalid_count_warned += 1
return 0, invalid_count_warned
```
Then `_merge_platform_stats_rows` becomes a flat loop without inner functions or synthetic keys:
```python
def _merge_platform_stats_rows(
self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
merged: dict[tuple[str, str, str], dict[str, Any]] = {}
result: list[dict[str, Any]] = []
invalid_count_warned = 0
for row in rows:
# copy so we don't mutate caller's data
normalized_row = dict(row)
raw_ts = normalized_row.get("timestamp")
normalized_ts, ts_valid = self._normalize_platform_stats_timestamp(raw_ts)
normalized_row["timestamp"] = normalized_ts
key_for_log = (
normalized_ts if ts_valid else "<invalid_timestamp>",
str(normalized_row.get("platform_id")),
str(normalized_row.get("platform_type")),
)
count, invalid_count_warned = self._parse_platform_stats_count(
normalized_row.get("count", 0),
invalid_count_warned,
key_for_log,
)
normalized_row["count"] = count
# Invalid timestamps: keep as distinct rows (no merging)
if not ts_valid:
result.append(normalized_row)
continue
key = (
normalized_ts,
key_for_log[1],
key_for_log[2],
)
existing = merged.get(key)
if existing is None:
merged[key] = normalized_row
result.append(normalized_row)
else:
existing["count"] += count
return result
```
With that, you can avoid returning `duplicate_count` and compute it when logging:
```python
if table_name == "platform_stats":
normalized_rows = self._merge_platform_stats_rows(rows)
duplicate_count = len(rows) - len(normalized_rows)
if duplicate_count > 0:
logger.warning(
f"检测到 platform_stats 重复键 {duplicate_count} 条,已在导入前聚合"
)
else:
normalized_rows = rows
```
This keeps:
* Aggregation by `(normalized_timestamp, platform_id, platform_type)`.
* Invalid timestamps preserved as non-merged rows.
* Per-function-call rate-limited logging for invalid `count`.
* Duplicate logging based on the number of merged rows.
But it removes the inner functions, closure state, timestamp cache, and row-index-based synthetic keys, making the logic much easier to follow.
</issue_to_address>
Addressed this review in commit 5bd71e4:
Behavior preserved:
Local verification:
@sourcery-ai review
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="203" />
<code_context>
self.kb_manager = kb_manager
self.config_path = config_path
self.kb_root_dir = kb_root_dir
+ self._main_table_preprocessors: dict[
+ str, Callable[[list[dict[str, Any]]], list[dict[str, Any]]]
+ ] = {
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring the new preprocessing and merging logic to inline the single-table special case, extract small helpers for count and timestamp handling, and optionally drop the env-based limit to simplify control flow and state management.
You can keep all the new behavior while reducing complexity with a few targeted refactors.
### 1. Drop the registry for main table preprocessors
Since there is only one special-case table, the dict-based registry doesn’t buy much yet and adds indirection.
Instead of storing `self._main_table_preprocessors` in `__init__` and doing a dict lookup, you can inline the decision in `_preprocess_main_table_rows` and drop the attribute entirely:
```python
def _preprocess_main_table_rows(
self, table_name: str, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
if table_name == "platform_stats":
normalized_rows = self._merge_platform_stats_rows(rows)
duplicate_count = len(rows) - len(normalized_rows)
if duplicate_count > 0:
logger.warning(
f"检测到 {table_name} 重复键 {duplicate_count} 条,已在导入前聚合"
)
return normalized_rows
return rows
```
This preserves behavior but removes the registry abstraction and a layer of indirection.
### 2. Remove `nonlocal` and split `_merge_platform_stats_rows` responsibilities
You can factor out count parsing and row normalization into small helpers and avoid `nonlocal` state.
**a) Extract count parsing without closures:**
```python
def _parse_platform_stats_count(
self,
raw_count: Any,
key_for_log: tuple[str, str, str],
warned_count: int,
) -> tuple[int, int]:
if warned_count >= PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
try:
return int(raw_count), warned_count
except (TypeError, ValueError):
return 0, warned_count
try:
return int(raw_count), warned_count
except (TypeError, ValueError):
logger.warning(
"platform_stats count 非法,已按 0 处理: value=%r, key=%s",
raw_count,
key_for_log,
)
return 0, warned_count + 1
```
Then `_merge_platform_stats_rows` becomes simpler and uses a plain `invalid_count_warned` int:
```python
def _merge_platform_stats_rows(
self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
merged: dict[tuple[str, str, str], dict[str, Any]] = {}
non_mergeable: list[dict[str, Any]] = []
invalid_count_warned = 0
for row in rows:
normalized_row, timestamp, is_timestamp_valid = (
self._normalize_platform_stats_row(row)
)
platform_id = normalized_row.get("platform_id")
platform_type = normalized_row.get("platform_type")
key_for_log = (
timestamp if is_timestamp_valid else "<invalid_timestamp>",
repr(platform_id),
repr(platform_type),
)
count, invalid_count_warned = self._parse_platform_stats_count(
normalized_row.get("count", 0),
key_for_log,
invalid_count_warned,
)
normalized_row["count"] = count
if not is_timestamp_valid:
non_mergeable.append(normalized_row)
continue
if not isinstance(platform_id, str) or not isinstance(platform_type, str):
non_mergeable.append(normalized_row)
continue
key = (timestamp, platform_id, platform_type)
existing = merged.get(key)
if existing is None:
merged[key] = normalized_row
else:
existing["count"] += count
return [*non_mergeable, *merged.values()]
```
**b) Extract row normalization (timestamp + cloning) into a helper:**
```python
def _normalize_platform_stats_row(
self, row: dict[str, Any]
) -> tuple[dict[str, Any], str, bool]:
normalized_row = dict(row)
normalized_timestamp, is_timestamp_valid = self._normalize_platform_stats_timestamp(
normalized_row.get("timestamp")
)
normalized_row["timestamp"] = normalized_timestamp
return normalized_row, normalized_timestamp, is_timestamp_valid
```
This separates concerns and makes `_merge_platform_stats_rows` easier to follow.
### 3. Simplify timestamp normalization to return `str | None`
Instead of `(value, bool)`, you can return a normalized timestamp string or `None`. The call site then directly treats `None` as non-mergeable, avoiding separate boolean tracking.
```python
from typing import Optional
def _normalize_platform_stats_timestamp(self, value: Any) -> Optional[str]:
if isinstance(value, datetime):
return self._to_utc_iso(value)
if isinstance(value, str):
timestamp = value.strip()
if not timestamp:
return None
if timestamp.endswith("Z"):
timestamp = f"{timestamp[:-1]}+00:00"
try:
return self._to_utc_iso(datetime.fromisoformat(timestamp))
except ValueError:
# keep original string but mark as invalid at caller if desired
return None
if value is None:
return None
# non-string unexpected type; treat as invalid
return None
```
Then in `_merge_platform_stats_rows`:
```python
timestamp = self._normalize_platform_stats_timestamp(row.get("timestamp"))
normalized_row = dict(row)
normalized_row["timestamp"] = timestamp or ""
is_timestamp_valid = timestamp is not None
```
You can preserve the existing behavior (keeping invalid timestamps as non-mergeable rows with the original/empty value) by setting the stored value appropriately when `timestamp is None`.
### 4. Consider simplifying the env-driven warning limit
If configurability is not strictly required, you can keep the rate-limiting concept but drop the env loader and module-level plumbing:
```python
PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT = 5 # simple fixed constant
```
All other logic (per-invocation counting) remains the same, but you remove `_load_platform_stats_invalid_count_warn_limit` and the env variable handling, which reduces cognitive load in this module. If you do still need the env, you could at least inline the parsing into a short helper and keep it near the constant for readability:
```python
def _get_invalid_count_warn_limit() -> int:
raw = os.getenv("ASTRBOT_PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT")
if not raw:
return 5
try:
value = int(raw)
return value if value >= 0 else 5
except ValueError:
return 5
PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT = _get_invalid_count_warn_limit()
```
These changes keep the same behavior (duplicate merging, timestamp normalization, and warning throttling) but reduce indirection, branching, and shared mutable state.
</issue_to_address>
Addressed this review in commit ec57b98:
All existing behavior around dedup, invalid timestamp non-merge, and warning rate limiting is preserved. Local verification:
@sourcery-ai review
Hi @zouyonghe! 👋 Only authors and team members can run @sourcery-ai commands on public repos. If you are a team member, install the @sourcery-ai bot to get access ✨
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="538" />
<code_context>
+ return normalized_rows
+ return rows
+
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> list[dict[str, Any]]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider inlining timestamp normalization and count parsing as small local helpers inside `_merge_platform_stats_rows` so the `platform_stats` import path becomes a single linear flow without extra helper methods or tuple plumbing.
You can simplify the `platform_stats` path by collapsing most of the logic into `_merge_platform_stats_rows` and using a small local `build_key` and `parse_count` helper. This keeps behavior (normalization, non‑mergeable handling, warn limit) but removes tuple threading and class‑level helpers.
### 1. Flatten normalization + key building
Instead of `_normalize_platform_stats_row` + `_normalize_platform_stats_timestamp` + `_to_utc_iso` returning `(row, ts, bool)`, inline this in `_merge_platform_stats_rows` with a single `build_key(row)` that returns `None` when non‑mergeable:
```python
def _merge_platform_stats_rows(
self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
merged: dict[tuple[str, str, str], dict[str, Any]] = {}
non_mergeable: list[dict[str, Any]] = []
invalid_count_warned = 0
def normalize_timestamp(value: Any) -> str | None:
if isinstance(value, datetime):
return self._to_utc_iso(value)
if isinstance(value, str):
ts = value.strip()
if not ts:
return None
if ts.endswith("Z"):
ts = f"{ts[:-1]}+00:00"
try:
return self._to_utc_iso(datetime.fromisoformat(ts))
except ValueError:
return None
return None
def build_key(row: dict[str, Any]) -> tuple[str, str, str] | None:
ts = normalize_timestamp(row.get("timestamp"))
pid = row.get("platform_id")
ptype = row.get("platform_type")
if ts is None or not isinstance(pid, str) or not isinstance(ptype, str):
return None
# keep normalized timestamp on row so later code sees it
row["timestamp"] = ts
return ts, pid, ptype
```
With this, you no longer need `_normalize_platform_stats_row` or the `is_timestamp_valid` flag; non‑mergeable rows are simply those with `build_key(row) is None` and are left as‑is:
```python
for raw in rows:
row = dict(raw)
key = build_key(row)
key_for_log = (
row.get("timestamp"),
repr(row.get("platform_id")),
repr(row.get("platform_type")),
)
count, invalid_count_warned = parse_count(
row.get("count", 0),
key_for_log,
invalid_count_warned,
)
row["count"] = count
if key is None:
non_mergeable.append(row)
continue
existing = merged.get(key)
if existing is None:
merged[key] = row
else:
existing["count"] += count
return [*non_mergeable, *merged.values()]
```
### 2. Simplify count parsing + warning limit
You can keep the same warning semantics but simplify `_parse_platform_stats_count`’s API and usage by making it a small local helper that only deals with `raw_count` and a simple integer counter:
```python
    def parse_count(
        raw_count: Any,
        key_for_log: tuple[Any, Any, Any],
        warned_count: int,
    ) -> tuple[int, int]:
        if warned_count >= PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
            try:
                return int(raw_count), warned_count
            except (TypeError, ValueError):
                return 0, warned_count
        try:
            return int(raw_count), warned_count
        except (TypeError, ValueError):
            logger.warning(
                "platform_stats count 非法,已按 0 处理: value=%r, key=%s",
                raw_count,
                key_for_log,
            )
            return 0, warned_count + 1
```
If you adopt these local helpers:
- You can delete `_normalize_platform_stats_row`, `_normalize_platform_stats_timestamp`, and `_parse_platform_stats_count`.
- Control flow in `_merge_platform_stats_rows` becomes linear: `row -> key or None -> parse_count -> aggregate or non_mergeable`.
- Timestamp rewriting for invalid values disappears; invalid timestamps just result in `key=None`, preserving the original `row` for pass‑through.
</issue_to_address>
Addressed this comment in commit 628defa:
Local verification:
@sourcery-ai review
Hey - I've found 2 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="64-67" />
<code_context>
CMD_CONFIG_FILE_PATH = os.path.join(get_astrbot_data_path(), "cmd_config.json")
KB_PATH = get_astrbot_knowledge_base_path()
+PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT = 5
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Rate limiting of invalid `count` warnings silently coerces later invalid values to 0, which may hide data issues.
Once `invalid_count_warned` hits `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT`, later invalid `count` values are silently coerced to `0`. This hides ongoing data issues (e.g., broken exports or schema changes). Please either emit a one-time summary warning when the limit is reached (e.g., “further invalid counts suppressed”) or continue logging later invalid values at a lower level (such as `debug`) so persistent issues remain observable while keeping log volume bounded.
</issue_to_address>
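As a sketch of the suggested alternative (a hypothetical standalone helper, not the PR's actual code), the first few invalid values could stay at `warning` while later ones drop to `debug`, keeping persistent data problems observable:

```python
import logging

logger = logging.getLogger("backup.importer")

class InvalidCountReporter:
    """Log the first `limit` invalid counts at WARNING, then demote to DEBUG."""

    def __init__(self, limit: int = 5) -> None:
        self.limit = limit
        self.seen = 0

    def report(self, raw_count: object, key: tuple) -> None:
        self.seen += 1
        if self.seen <= self.limit:
            logger.warning("invalid count %r for key %s, coerced to 0", raw_count, key)
            if self.seen == self.limit:
                logger.warning("further invalid-count warnings demoted to DEBUG")
        else:
            # still observable at DEBUG, but WARNING volume stays bounded
            logger.debug("invalid count %r for key %s, coerced to 0", raw_count, key)

reporter = InvalidCountReporter(limit=2)
for bad in ["x", None, [], {}]:
    reporter.report(bad, ("2024-01-01", "qq", "aiocqhttp"))
```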
### Comment 2
<location path="astrbot/core/backup/importer.py" line_range="538" />
<code_context>
+ return normalized_rows
+ return rows
+
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> list[dict[str, Any]]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the nested helpers in `_merge_platform_stats_rows` into private methods, avoiding hidden mutation, and splitting parsing from logging to make the function flatter and easier to understand and extend.
You can keep the current behaviour but reduce complexity by extracting the inner helpers, removing hidden mutation, and separating “parse vs log” concerns.
### 1. Move nested helpers to private methods
This makes `_merge_platform_stats_rows` easier to scan and allows reuse/testing.
```python
def _normalize_platform_stats_timestamp(self, value: Any) -> str | None:
    if isinstance(value, datetime):
        return self._to_utc_iso(value)
    if isinstance(value, str):
        timestamp = value.strip()
        if not timestamp:
            return None
        if timestamp.endswith("Z"):
            timestamp = f"{timestamp[:-1]}+00:00"
        try:
            return self._to_utc_iso(datetime.fromisoformat(timestamp))
        except ValueError:
            return None
    return None

def _build_platform_stats_key(
    self, row: dict[str, Any]
) -> tuple[tuple[str, str, str], str] | None:
    normalized_timestamp = self._normalize_platform_stats_timestamp(row.get("timestamp"))
    platform_id = row.get("platform_id")
    platform_type = row.get("platform_type")
    if (
        normalized_timestamp is None
        or not isinstance(platform_id, str)
        or not isinstance(platform_type, str)
    ):
        return None
    return (normalized_timestamp, platform_id, platform_type), normalized_timestamp
```
Then `_merge_platform_stats_rows` becomes mostly orchestration.
### 2. Avoid hidden mutation in key building
Currently `build_key` mutates `row["timestamp"]` as a side effect. Instead, return the normalized timestamp and let the caller update the row explicitly:
```python
key_and_ts = self._build_platform_stats_key(normalized_row)
if key_and_ts is None:
    non_mergeable.append(normalized_row)
    continue
key, normalized_timestamp = key_and_ts
normalized_row["timestamp"] = normalized_timestamp
```
This keeps the “what is being changed” visible at the call site.
### 3. Separate parsing from rate‑limited logging
You can remove the `warned_count` threading by splitting parsing and logging:
```python
def _parse_platform_stats_count_value(self, raw_count: Any) -> int:
    try:
        return int(raw_count)
    except (TypeError, ValueError):
        return 0

def _log_invalid_platform_stats_count(
    self,
    raw_count: Any,
    key_for_log: tuple[Any, Any, Any],
    warned_count: int,
) -> int:
    if warned_count < PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT:
        logger.warning(
            "platform_stats count 非法,已按 0 处理: value=%r, key=%s",
            raw_count,
            key_for_log,
        )
        return warned_count + 1
    return warned_count
```
Usage in `_merge_platform_stats_rows`:
```python
warned = invalid_count_warned
raw_count = normalized_row.get("count", 0)
count = self._parse_platform_stats_count_value(raw_count)
if count == 0 and raw_count not in (0, "0"):
    warned = self._log_invalid_platform_stats_count(raw_count, key_for_log, warned)
invalid_count_warned = warned
normalized_row["count"] = count
```
This keeps `_merge_platform_stats_rows` state local (`invalid_count_warned`) but removes the need to pass it in/out of a nested helper and keeps each function single‑purpose.
These changes should preserve all behaviour (including rate‑limited logging and merging semantics) while making the code flatter, less surprising, and easier to extend.
</issue_to_address>
@sourcery-ai review
Addressed the two review comments in commit 960c4db:
Tests updated:
Local verification:
Hey - I've found 1 issue, and left some high level feedback:
- The `_preprocess_main_table_rows` function currently special-cases the hard-coded table name `platform_stats`; consider centralizing this mapping (e.g., a dict of table_name → preprocessor) so additional table-specific preprocessing can be added without further branching in this method.
- In `_merge_platform_stats_rows`, the returned list is `[*non_mergeable, *merged.values()]`, which changes row ordering relative to the input; if import order is significant for this table, you may want to preserve the original order while still aggregating counts.
- The invalid-count warning rate limiting in `_log_invalid_platform_stats_count` is fairly intricate; extracting a small helper object/state or simplifying the two branches that emit the suppression log could make the behavior easier to reason about and less error-prone if the logic needs to change later.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `_preprocess_main_table_rows` function currently special-cases the hard-coded table name `platform_stats`; consider centralizing this mapping (e.g., a dict of table_name → preprocessor) so additional table-specific preprocessing can be added without further branching in this method.
- In `_merge_platform_stats_rows`, the returned list is `[ *non_mergeable, *merged.values() ]`, which changes row ordering relative to the input; if import order is significant for this table, you may want to preserve the original order while still aggregating counts.
- The invalid-count warning rate limiting in `_log_invalid_platform_stats_count` is fairly intricate; extracting a small helper object/state or simplifying the two branches that emit the suppression log could make the behavior easier to reason about and less error-prone if the logic needs to change later.
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="538" />
<code_context>
+ return normalized_rows
+ return rows
+
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> list[dict[str, Any]]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying `_merge_platform_stats_rows` by using a local closure for invalid-count warnings and inlining the key-building logic to reduce helper functions and threaded state.
You can keep the current behavior but reduce complexity and state‑threading by:
### 1. Encapsulate invalid‑count warning state in a local helper
You don’t need to pass `invalid_count_warned` and `suppression_warned` through multiple calls. A simple closure in `_merge_platform_stats_rows` keeps state local and uses a single counter:
```python
def _merge_platform_stats_rows(
    self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    merged: dict[tuple[str, str, str], dict[str, Any]] = {}
    non_mergeable: list[dict[str, Any]] = []
    invalid_count_warnings = 0

    def log_invalid_count(raw_count: Any, key_for_log: tuple[Any, Any, Any]) -> None:
        nonlocal invalid_count_warnings
        limit = PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT
        if invalid_count_warnings < limit:
            logger.warning(
                "platform_stats count 非法,已按 0 处理: value=%r, key=%s",
                raw_count,
                key_for_log,
            )
            invalid_count_warnings += 1
            if invalid_count_warnings == limit:
                logger.warning(
                    "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制", limit
                )

    for row in rows:
        normalized_row = dict(row)
        raw_timestamp = normalized_row.get("timestamp")
        platform_id = normalized_row.get("platform_id")
        platform_type = normalized_row.get("platform_type")
        # ... timestamp normalization + key building (see below) ...
        count, is_valid_count = self._parse_platform_stats_count_value(
            normalized_row.get("count", 0)
        )
        if not is_valid_count:
            key_for_log = (
                normalized_row.get("timestamp"),
                repr(platform_id),
                repr(platform_type),
            )
            log_invalid_count(normalized_row.get("count", 0), key_for_log)
        normalized_row["count"] = count
        # ... merge logic ...
```
This preserves the warning limit and suppression behavior but removes the need for `_log_invalid_platform_stats_count`, `warned_count`, and `suppression_warned` as threaded arguments.
### 2. Inline key building to avoid tiny cross‑calling helpers
`_build_platform_stats_key` is small and only used once. Inlining makes control flow easier to follow and removes the need to compute `key_for_log` separately:
```python
for row in rows:
    normalized_row = dict(row)
    raw_timestamp = normalized_row.get("timestamp")
    platform_id = normalized_row.get("platform_id")
    platform_type = normalized_row.get("platform_type")
    normalized_timestamp = self._normalize_platform_stats_timestamp(raw_timestamp)
    if normalized_timestamp is not None:
        normalized_row["timestamp"] = normalized_timestamp
    elif isinstance(raw_timestamp, str):
        normalized_row["timestamp"] = raw_timestamp.strip()
    elif raw_timestamp is None:
        normalized_row["timestamp"] = ""
    else:
        normalized_row["timestamp"] = str(raw_timestamp)
    if (
        normalized_timestamp is None
        or not isinstance(platform_id, str)
        or not isinstance(platform_type, str)
    ):
        key: tuple[str, str, str] | None = None
    else:
        key = (normalized_timestamp, platform_id, platform_type)
    # rest of loop unchanged: count parsing, invalid count logging, merge/non-merge paths
```
With this, you can safely delete `_build_platform_stats_key`. The behavior (merge key conditions and “non‑mergeable” treatment) stays identical but the number of tiny helpers and back‑and‑forth arguments is reduced.
### 3. Optional: keep timestamp normalization helper, but use it more linearly
`_normalize_platform_stats_timestamp` / `_to_utc_iso` are fine as shared utilities; the main readability win comes from (1) and (2). If you want to reduce indirection further while preserving behavior, consider placing the timestamp normalization logic as a clearly commented block in `_merge_platform_stats_rows` instead of two separate helpers, but that’s less critical than simplifying the warning and key‑building paths above.
</issue_to_address>
@sourcery-ai review
Hey - I've found 1 issue, and left some high level feedback:
- The new `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT` behavior in `log_invalid_count` is a bit unintuitive when the limit is 0 (it still logs a '已达到上限 (0)' message on the first invalid value); consider simplifying this branch so that 0 cleanly means 'no warnings' or documenting the special-case behavior.
- The warning in `_preprocess_main_table_rows` for every table with duplicate `platform_stats` keys could get noisy on large imports; you might want to downgrade this to `info` or add a threshold before logging to avoid log spam in normal operation.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT` behavior in `log_invalid_count` is a bit unintuitive when the limit is 0 (it still logs a '已达到上限 (0)' message on the first invalid value); consider simplifying this branch so that 0 cleanly means 'no warnings' or documenting the special-case behavior.
- The warning in `_preprocess_main_table_rows` for every table with duplicate `platform_stats` keys could get noisy on large imports; you might want to downgrade this to `info` or add a threshold before logging to avoid log spam in normal operation.
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="525" />
<code_context>
return imported
+ def _preprocess_main_table_rows(
+ self, table_name: str, rows: list[dict[str, Any]]
+ ) -> list[dict[str, Any]]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring the new preprocessing and merge logic for `platform_stats` rows into clearer, flatter helpers and simpler conditionals to make the flow easier to follow and maintain without changing behavior.
You can keep all the new behavior but simplify a few hotspots to make the code easier to follow.
### 1. Replace single‑case dispatch dict
`_preprocess_main_table_rows` currently uses a dict with a single entry. A direct conditional is simpler and makes the flow clearer:
```python
def _preprocess_main_table_rows(
    self, table_name: str, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    if table_name == "platform_stats":
        normalized_rows = self._merge_platform_stats_rows(rows)
        duplicate_count = len(rows) - len(normalized_rows)
        if duplicate_count > 0:
            logger.warning(
                "检测到 %s 重复键 %d 条,已在导入前聚合",
                table_name,
                duplicate_count,
            )
        return normalized_rows
    return rows
```
You can always reintroduce a dispatch map when there are multiple tables.
---
### 2. Simplify invalid count rate limiting (while preserving semantics)
You can keep the current behavior (including the `limit <= 0` “suppress all, but log once” case) but make the control flow clearer and move the helpers out of the nested scope.
Current behavior recap (to preserve exactly):
- `limit <= 0`: on the first invalid value, log only the “reached limit / suppress” message once; no per‑value warnings.
- `limit > 0`: log per invalid value up to `limit`, then log a single “reached limit / suppress” message, then stay silent.
A clearer implementation:
```python
def _log_platform_stats_invalid_count(
    self,
    raw_count: Any,
    key_for_log: tuple[Any, Any, Any],
    invalid_count_warnings: int,
) -> int:
    limit = PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT
    # limit <= 0: only log the suppression message once
    if limit <= 0:
        if invalid_count_warnings == 0:
            logger.warning(
                "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制",
                limit,
            )
            return 1
        return invalid_count_warnings
    # limit > 0: log up to `limit` invalids, then one suppression message
    if invalid_count_warnings < limit:
        logger.warning(
            "platform_stats count 非法,已按 0 处理: value=%r, key=%s",
            raw_count,
            key_for_log,
        )
        invalid_count_warnings += 1
        if invalid_count_warnings == limit:
            logger.warning(
                "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制",
                limit,
            )
    else:
        # keep counting above limit to avoid repeated suppression logs
        invalid_count_warnings += 1
    return invalid_count_warnings

def _parse_platform_stats_count(
    self,
    raw_count: Any,
    key_for_log: tuple[Any, Any, Any],
    invalid_count_warnings: int,
) -> tuple[int, int]:
    try:
        return int(raw_count), invalid_count_warnings
    except (TypeError, ValueError):
        invalid_count_warnings = self._log_platform_stats_invalid_count(
            raw_count, key_for_log, invalid_count_warnings
        )
        return 0, invalid_count_warnings
```
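The `limit > 0` semantics above can be sanity-checked with a standalone simulation (a hypothetical `step` helper mirroring that branch with a recorded event list instead of a logger):

```python
WARN_LIMIT = 3

def step(warnings: int, emitted: list[str]) -> int:
    # Mirrors the limit > 0 branch: per-value warnings up to the limit,
    # one suppression notice exactly when the limit is reached, then silence.
    if warnings < WARN_LIMIT:
        emitted.append("invalid-count")
        warnings += 1
        if warnings == WARN_LIMIT:
            emitted.append("suppression-notice")
    else:
        warnings += 1  # keep counting so the notice is never re-emitted
    return warnings

emitted: list[str] = []
warnings = 0
for _ in range(6):  # six invalid counts in a row
    warnings = step(warnings, emitted)
```

Three per-value warnings and exactly one suppression notice come out, regardless of how many further invalid values follow.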
Then `_merge_platform_stats_rows` becomes flatter and avoids inner functions:
```python
invalid_count_warnings = 0
for row in rows:
    # ... build normalized_row, key_for_log ...
    count, invalid_count_warnings = self._parse_platform_stats_count(
        normalized_row.get("count", 0),
        key_for_log,
        invalid_count_warnings,
    )
    normalized_row["count"] = count
    # ... merge logic ...
```
This removes closure semantics and makes the rate‑limiting behavior explicit.
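As a sanity check, the counter semantics above can be modeled standalone (a hypothetical minimal re-implementation that records messages in a list instead of calling `logger`; the function name and message strings are illustrative only):

```python
# Minimal standalone model of the rate-limited warning counter described above.
def bump_warnings(invalid_count_warnings: int, limit: int, log: list[str]) -> int:
    # limit <= 0: emit only a single suppression notice, ever
    if limit <= 0:
        if invalid_count_warnings == 0:
            log.append("suppressed")
            return 1
        return invalid_count_warnings
    # limit > 0: per-value warnings up to the limit, then one suppression notice
    if invalid_count_warnings < limit:
        log.append("per-value warning")
        invalid_count_warnings += 1
        if invalid_count_warnings == limit:
            log.append("suppressed")
            # jump past the limit so the suppression notice never repeats
            invalid_count_warnings += 1
    return invalid_count_warnings

log: list[str] = []
n = 0
for _ in range(5):  # five invalid values with limit=2
    n = bump_warnings(n, 2, log)
print(log)  # two per-value warnings, then one suppression notice
```

Running the loop shows the warnings capping out after the configured limit, which is the behavior the recap above says must be preserved.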
---
### 3. Extract row normalization/merge‑key computation
The main loop in `_merge_platform_stats_rows` mixes normalization, logging, and merging. Extracting a helper that returns `(merge_key | None, normalized_row, key_for_log)` makes the control flow linear:
```python
def _normalize_platform_stats_row(
    self, row: dict[str, Any]
) -> tuple[tuple[str, str, str] | None, dict[str, Any], tuple[Any, Any, Any]]:
    normalized_row = dict(row)
    raw_timestamp = normalized_row.get("timestamp")
    normalized_timestamp = self._normalize_platform_stats_timestamp(raw_timestamp)
    platform_id = normalized_row.get("platform_id")
    platform_type = normalized_row.get("platform_type")
    # normalize timestamp field for storage/logging
    if normalized_timestamp is not None:
        normalized_row["timestamp"] = normalized_timestamp
    elif isinstance(raw_timestamp, str):
        normalized_row["timestamp"] = raw_timestamp.strip()
    elif raw_timestamp is None:
        normalized_row["timestamp"] = ""
    else:
        normalized_row["timestamp"] = str(raw_timestamp)
    key_for_log = (
        normalized_row.get("timestamp"),
        repr(platform_id),
        repr(platform_type),
    )
    # decide whether this row is mergeable
    if (
        normalized_timestamp is None
        or not isinstance(platform_id, str)
        or not isinstance(platform_type, str)
    ):
        merge_key: tuple[str, str, str] | None = None
    else:
        merge_key = (normalized_timestamp, platform_id, platform_type)
    return merge_key, normalized_row, key_for_log
```
Then `_merge_platform_stats_rows` simplifies to:
```python
def _merge_platform_stats_rows(
    self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    merged: dict[tuple[str, str, str], dict[str, Any]] = {}
    result: list[dict[str, Any]] = []
    invalid_count_warnings = 0
    for row in rows:
        merge_key, normalized_row, key_for_log = self._normalize_platform_stats_row(row)
        count, invalid_count_warnings = self._parse_platform_stats_count(
            normalized_row.get("count", 0),
            key_for_log,
            invalid_count_warnings,
        )
        normalized_row["count"] = count
        if merge_key is None:
            result.append(normalized_row)
            continue
        existing = merged.get(merge_key)
        if existing is None:
            merged[merge_key] = normalized_row
            result.append(normalized_row)
        else:
            existing["count"] += count
    return result
```
This keeps all existing semantics (including when rows are mergeable vs kept separate) but makes the per‑row logic and merge decisions easier to scan and maintain.
</issue_to_address>

Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
@sourcery-ai review
Hey - I've found 1 issue, and left some high level feedback:
- The new `platform_stats` aggregation logs use `logger.info` while the sample logs and description refer to a warning level; consider switching this to `logger.warning` (or adjusting the description) so the behavior and expectations stay consistent.
- The hard-coded `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT` could be made configurable (e.g., via settings or environment) to allow operators to tune log volume in different environments without code changes.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `platform_stats` aggregation logs use `logger.info` while the sample logs and description refer to a warning level; consider switching this to `logger.warning` (or adjusting the description) so the behavior and expectations stay consistent.
- The hard-coded `PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT` could be made configurable (e.g., via settings or environment) to allow operators to tune log volume in different environments without code changes.
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="578" />
<code_context>
+
+ return result
+
+ def _normalize_platform_stats_row(
+ self, row: dict[str, Any]
+ ) -> tuple[tuple[str, str, str] | None, dict[str, Any], tuple[Any, Any, Any]]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the platform_stats helpers by collapsing tightly coupled functions, returning simpler values, and localizing parsing and logging logic to reduce indirection and tuple-based protocols.
You can keep all behavior but reduce indirection by collapsing some of the tightly‑coupled helpers and simplifying the “protocols” between them.
### 1. Simplify `_normalize_platform_stats_row`’s return type
Right now you return a triple `(merge_key | None, normalized_row, key_for_log)`. That’s the main source of “tuple protocol” complexity.
You can instead:
- Return just the normalized row.
- Let `_merge_platform_stats_rows` compute both the merge key and the log key from the normalized row.
This keeps normalization reusable and localizes the “what’s a key?” logic to one place.
```python
def _normalize_platform_stats_row(
    self, row: dict[str, Any]
) -> dict[str, Any]:
    normalized_row = dict(row)
    raw_timestamp = normalized_row.get("timestamp")
    normalized_timestamp = self._normalize_platform_stats_timestamp(raw_timestamp)
    if normalized_timestamp is not None:
        normalized_row["timestamp"] = normalized_timestamp
    elif isinstance(raw_timestamp, str):
        normalized_row["timestamp"] = raw_timestamp.strip()
    elif raw_timestamp is None:
        normalized_row["timestamp"] = ""
    else:
        normalized_row["timestamp"] = str(raw_timestamp)
    return normalized_row
```
Then `_merge_platform_stats_rows` builds the keys directly and contains the merging rules:
```python
def _merge_platform_stats_rows(
    self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    merged: dict[tuple[str, str, str], dict[str, Any]] = {}
    result: list[dict[str, Any]] = []
    invalid_count_warnings = 0
    for row in rows:
        normalized_row = self._normalize_platform_stats_row(row)
        timestamp = normalized_row.get("timestamp")
        platform_id = normalized_row.get("platform_id")
        platform_type = normalized_row.get("platform_type")
        key_for_log = (timestamp, repr(platform_id), repr(platform_type))
        count, invalid_count_warnings = self._parse_platform_stats_count(
            normalized_row.get("count", 0),
            key_for_log,
            invalid_count_warnings,
        )
        normalized_row["count"] = count
        if (
            not isinstance(timestamp, str)
            or not isinstance(platform_id, str)
            or not isinstance(platform_type, str)
        ):
            result.append(normalized_row)
            continue
        merge_key = (timestamp, platform_id, platform_type)
        existing = merged.get(merge_key)
        if existing is None:
            merged[merge_key] = normalized_row
            result.append(normalized_row)
        else:
            existing["count"] += count
    return result
```
This preserves all current semantics (merging rules and logging key) but eliminates the triple return and “key” construction inside the normalizer.
### 2. Localize invalid count warning logic
The current `invalid_count_warnings` protocol (threading through `_parse_platform_stats_count` and `_log_platform_stats_invalid_count`, with the `+1` bump) is hard to follow.
You can keep rate limiting exactly the same but:
- Make `_parse_platform_stats_count` only parse.
- Do all counting and rate limiting in `_merge_platform_stats_rows`, in one place.
For example:
```python
def _parse_platform_stats_count(self, raw_count: Any) -> int | None:
    try:
        return int(raw_count)
    except (TypeError, ValueError):
        return None
```
Then:
```python
def _merge_platform_stats_rows(...):
    ...
    invalid_count_warnings = 0
    limit = PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT
    suppression_logged = False
    for row in rows:
        ...
        parsed = self._parse_platform_stats_count(normalized_row.get("count", 0))
        if parsed is None:
            if limit > 0:
                if invalid_count_warnings < limit:
                    logger.warning(
                        "platform_stats count 非法,已按 0 处理: value=%r, key=%s",
                        normalized_row.get("count", 0),
                        key_for_log,
                    )
                    invalid_count_warnings += 1
                    if invalid_count_warnings == limit and not suppression_logged:
                        logger.warning(
                            "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制",
                            limit,
                        )
                        suppression_logged = True
            elif not suppression_logged:
                logger.warning(
                    "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制",
                    limit,
                )
                suppression_logged = True
            count = 0
        else:
            count = parsed
        normalized_row["count"] = count
        ...
```
This keeps the behavior (limit, one suppression log, `limit <= 0` semantics) but removes the need to pass a counter in/out of helpers and to understand `_log_platform_stats_invalid_count`’s state machine.
### 3. Inline `_to_utc_iso` into `_normalize_platform_stats_timestamp`
These two are tightly coupled and only used here. Inlining removes one more hop without changing behavior:
```python
def _normalize_platform_stats_timestamp(self, value: Any) -> str | None:
    def _to_utc_iso(dt: datetime) -> str:
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)
        else:
            dt = dt.astimezone(timezone.utc)
        return dt.isoformat()

    if isinstance(value, datetime):
        return _to_utc_iso(value)
    if isinstance(value, str):
        timestamp = value.strip()
        if not timestamp:
            return None
        if timestamp.endswith("Z"):
            timestamp = f"{timestamp[:-1]}+00:00"
        try:
            return _to_utc_iso(datetime.fromisoformat(timestamp))
        except ValueError:
            return None
    return None
```
This keeps timestamp handling self-contained and reduces cross-method jumping when reading the code.
</issue_to_address>
@sourcery-ai review
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/backup/importer.py" line_range="568" />
<code_context>
+ return normalized_rows
+ return rows
+
+ def _merge_platform_stats_rows(
+ self, rows: list[dict[str, Any]]
+ ) -> list[dict[str, Any]]:
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the invalid-count warning throttling and row normalization/count parsing into helpers so `_merge_platform_stats_rows` focuses only on grouping and merging logic.
You can reduce the localized complexity in the new `platform_stats` handling without changing behavior by:
1. Extracting the warning throttling out of `_merge_platform_stats_rows`
2. Collapsing normalization + count parsing into a single helper so `_merge_platform_stats_rows` is mostly about grouping/merging
### 1. Extract rate-limited warning into a helper
Right now `_merge_platform_stats_rows` is juggling `invalid_count_warnings`, `suppression_logged`, and the env-driven limit. That can be pushed into a small stateful helper so the merge loop only “asks to warn” and doesn’t care about the mechanics:
```python
class _InvalidCountWarnLimiter:
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._count = 0
        self._suppression_logged = False

    def warn_invalid_count(self, value: Any, key_for_log: tuple[Any, ...]) -> None:
        if self.limit > 0:
            if self._count < self.limit:
                logger.warning(
                    "platform_stats count 非法,已按 0 处理: value=%r, key=%s",
                    value,
                    key_for_log,
                )
                self._count += 1
                if self._count == self.limit and not self._suppression_logged:
                    logger.warning(
                        "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制",
                        self.limit,
                    )
                    self._suppression_logged = True
        elif not self._suppression_logged:
            logger.warning(
                "platform_stats 非法 count 告警已达到上限 (%d),后续将抑制",
                self.limit,
            )
            self._suppression_logged = True
```
Then `_merge_platform_stats_rows` becomes simpler:
```python
def _merge_platform_stats_rows(
    self, rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    merged: dict[tuple[str, str, str], dict[str, Any]] = {}
    result: list[dict[str, Any]] = []
    warn_limiter = _InvalidCountWarnLimiter(PLATFORM_STATS_INVALID_COUNT_WARN_LIMIT)
    for row in rows:
        normalized_row, normalized_timestamp, count = self._normalize_platform_stats_entry(row, warn_limiter)
        platform_id = normalized_row.get("platform_id")
        platform_type = normalized_row.get("platform_type")
        if (
            normalized_timestamp is None
            or not isinstance(platform_id, str)
            or not isinstance(platform_type, str)
        ):
            result.append(normalized_row)
            continue
        merge_key = (normalized_timestamp, platform_id, platform_type)
        existing = merged.get(merge_key)
        if existing is None:
            merged[merge_key] = normalized_row
            result.append(normalized_row)
        else:
            existing["count"] += count
    return result
```
### 2. Consolidate row normalization and count parsing
You already have `_normalize_platform_stats_row`, `_normalize_platform_stats_timestamp`, and `_parse_platform_stats_count`, with normalization and parsing split across multiple calls and some re-normalization. You can keep behavior but present a clearer contract by having a single helper that:
- Returns a fully normalized row
- Returns the normalized timestamp (or `None`)
- Handles count parsing and warning (via the limiter)
Example:
```python
def _normalize_platform_stats_entry(
    self,
    row: dict[str, Any],
    warn_limiter: _InvalidCountWarnLimiter,
) -> tuple[dict[str, Any], str | None, int]:
    normalized_row = dict(row)
    # timestamp normalization
    raw_timestamp = normalized_row.get("timestamp")
    normalized_timestamp = self._normalize_platform_stats_timestamp(raw_timestamp)
    if normalized_timestamp is not None:
        normalized_row["timestamp"] = normalized_timestamp
    elif isinstance(raw_timestamp, str):
        normalized_row["timestamp"] = raw_timestamp.strip()
    elif raw_timestamp is None:
        normalized_row["timestamp"] = ""
    else:
        normalized_row["timestamp"] = str(raw_timestamp)
    # count parsing with warning
    raw_count = normalized_row.get("count", 0)
    try:
        count = int(raw_count)
    except (TypeError, ValueError):
        key_for_log = (
            normalized_row.get("timestamp"),
            repr(normalized_row.get("platform_id")),
            repr(normalized_row.get("platform_type")),
        )
        warn_limiter.warn_invalid_count(raw_count, key_for_log)
        count = 0
    normalized_row["count"] = count
    return normalized_row, normalized_timestamp, count
```
This keeps all existing behavior (same timestamp fallback, same default `0` for invalid counts, same rate limiting) but:
- `_merge_platform_stats_rows` only cares about: “give me a normalized row, its normalized timestamp, and a numeric count”
- Count warning throttling is isolated and reusable
- Timestamp normalization is done once per row with a clear return contract (`normalized_timestamp` used for merge key, `"timestamp"` field used for storage/logging)
You can keep `_normalize_platform_stats_timestamp` as-is; it’s already self-contained and readable.
</issue_to_address>
@sourcery-ai review
Hey - I've reviewed your changes and they look great!
No docs changes were generated in this run (docs repo had no updates). Docs repo: AstrBotDevs/AstrBot-docs

AI change summary (not committed):

Experimental bot notice:
Closes #5584
Changed Files

- astrbot/core/backup/importer.py
- tests/test_backup.py

Motivation
Fix two real issues with backup import in `replace` mode: a failed wipe of the main database still let the import continue, and duplicate keys in `platform_stats` triggered a unique-constraint error that aborted the import. The goal of this change: an import either succeeds completely or fails explicitly, leaving no half-finished state.
Modifications

- Fixed the issue where a failure to clear the main database in `replace` mode still allowed the import to continue.
- Aggregate duplicate `platform_stats` keys before import (summing `count`), and normalize timestamps to a single format for deduplication.
- Treat invalid `count` values as 0 with a rate-limited warning, with corresponding tests added.

This is NOT a breaking change.
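The deduplication step can be sketched as a standalone function (a minimal model of the behavior described above; `merge_stats` and the sample rows are illustrative, not the importer's actual API):

```python
from typing import Any

def merge_stats(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # Aggregate rows sharing (timestamp, platform_id, platform_type),
    # summing their counts; invalid counts fall back to 0.
    merged: dict[tuple[Any, Any, Any], dict[str, Any]] = {}
    out: list[dict[str, Any]] = []
    for row in rows:
        try:
            count = int(row.get("count", 0))
        except (TypeError, ValueError):
            count = 0  # the real importer also logs a rate-limited warning here
        row = {**row, "count": count}
        key = (row.get("timestamp"), row.get("platform_id"), row.get("platform_type"))
        if key in merged:
            merged[key]["count"] += count
        else:
            merged[key] = row
            out.append(row)
    return out

rows = [
    {"timestamp": "2024-01-01T00:00:00+00:00", "platform_id": "qq", "platform_type": "aiocqhttp", "count": 3},
    {"timestamp": "2024-01-01T00:00:00+00:00", "platform_id": "qq", "platform_type": "aiocqhttp", "count": "4"},
    {"timestamp": "2024-01-01T00:00:00+00:00", "platform_id": "qq", "platform_type": "aiocqhttp", "count": "bad"},
]
print(merge_stats(rows))  # one row remains, with count == 7
```

Three rows sharing one key collapse into a single row whose `count` is the sum of the valid counts, which is what prevents the unique-constraint violation on insert.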
Screenshots or Test Results

Chunked upload and import verification were completed in a server environment, with the following results:

- Chunked upload completed: 69/69
- Import completed: success=True, errors=[]
- Duplicate-key handling took effect: logs show that duplicate platform_stats keys were aggregated and the import completed normally (see the test log for details)
Checklist

- I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.

Summary by Sourcery

Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.

Bug Fixes:

- Abort the import when clearing the main database fails in replace mode, instead of continuing with a partial import.
- Aggregate duplicate platform_stats rows (summing count) and normalize timestamps before import, preventing duplicate-key and unique-constraint conflicts.
- Treat invalid platform_stats count values as 0 and log a rate-limited warning, so malformed data no longer fails the import.

Tests:

- Add tests covering platform_stats row aggregation, timestamp normalization, invalid count handling, and warning rate limiting.
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录进行聚合,以防止唯一约束冲突。platform_stats计数值视为 0,并以限速方式记录警告,从而避免坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录,以防止在恢复过程中触发唯一约束冲突。platform_stats计数值视为 0,并以限频方式记录警告日志,从而不再因为这些问题导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录进行聚合,以防止唯一约束冲突。platform_stats计数值视为 0,并以限速方式记录警告,从而避免坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录,以避免在恢复过程中触发唯一约束错误。platform_stats计数值视为 0,并通过限流的警告日志进行提示,以防止导入失败。测试:
platform_stats行聚合逻辑以及对无效计数值的处理。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录进行聚合,以防止唯一约束冲突。platform_stats计数值视为 0,并以限速方式记录警告,从而避免坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录,以防止在恢复过程中触发唯一约束冲突。platform_stats计数值视为 0,并以限频方式记录警告日志,从而不再因为这些问题导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录进行聚合,以防止唯一约束冲突。platform_stats计数值视为 0,并以限速方式记录警告,从而避免坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录进行聚合,以防止唯一约束冲突。platform_stats计数值视为 0,并以限速方式记录警告,从而避免坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录,以防止在恢复过程中触发唯一约束冲突。platform_stats计数值视为 0,并以限频方式记录警告日志,从而不再因为这些问题导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats记录进行聚合,以防止唯一约束冲突。platform_stats计数值视为 0,并以限速方式记录警告,从而避免坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以避免唯一约束冲突。platform_stats计数值视为 0,并以限速告警的方式记录日志,从而防止坏数据导致导入失败。测试:
platform_stats行聚合、无效计数处理以及时间戳规范化行为。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests:
Bug 修复:
platform_stats行,以消除导入过程中重复键冲突。platform_stats计数值视为 0,并记录速率受限的警告日志,从而避免因为坏数据导致导入中止。测试:
platform_stats行聚合、时间戳规范化以及无效计数处理,包括对警告日志的速率限制测试。Original summary in English
Summary by Sourcery
在替换模式下强化备份导入行为,并规范化平台统计数据,以避免部分导入以及重复键导致的失败。
Bug 修复:
platform_stats记录,以防止导入过程中出现重复键和唯一性约束冲突。platform_stats计数值视为 0,并使用限速的告警日志记录,从而避免因数据格式错误导致导入失败。测试:
platform_stats行聚合、时间戳规范化、无效计数处理以及告警限速机制。Original summary in English
Summary by Sourcery
Harden backup import behavior in replace mode and normalize platform statistics data to avoid partial imports and duplicate key failures.
Bug Fixes:
Tests: