fix(primaryStorage): rollback persisted records on controller build failure#3868
fix(primaryStorage): rollback persisted records on controller build failure#3868zstack-robot-1 wants to merge 1 commit into5.5.6from
Conversation
… failure When AddExternalPrimaryStorage receives an invalid JSON config, ExternalPrimaryStorageFactory.createPrimaryStorage persisted ExternalPrimaryStorageVO/PrimaryStorageVO/PrimaryStorageOutputProtocolRefVO before invoking saveControllerIfNeed -> buildControllerSvc. An exception thrown during JSON parsing left those rows in the DB. The dirty VO then made buildPsController fail every time, breaking the PS service so QueryPrimaryStorage returned 503 permanently. Wrap saveControllerIfNeed in try/catch and remove the persisted records before rethrowing. Resolves: ZSTAC-84817 Change-Id: Icf6c648133d7866edf35940d56a28f74f4c64817
概览在外部主存储工厂的控制器初始化过程中添加了错误处理机制。当控制器构建失败时,捕获异常、清理已持久化的数据库记录、清除内存缓存,然后重新抛出异常。同时添加了集成测试以验证此失败场景的正确行为和数据库回滚。 变更
预估代码审查工作量🎯 2 (简单) | ⏱️ ~12 分钟 诗篇
Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 1 warning)
✅ Passed checks (3 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@test/src/test/groovy/org/zstack/test/integration/storage/primary/addon/zbs/ZbsPrimaryStorageCase.groovy`:
- Around line 729-753: Replace the Chinese comment block above the test method
testAddExternalPrimaryStorageWithMalformedJsonShouldRollback with an English
comment that explains: this test verifies AddExternalPrimaryStorage rolls back
when config contains malformed JSON (JSONObjectUtil.toObject throws), and no
ExternalPrimaryStorageVO/PrimaryStorageVO/PrimaryStorageOutputProtocolRefVO
records remain to avoid dirty VO causing buildPsController failures and
QueryPrimaryStorage 503s; update any inline Chinese in the same comment region
to concise, correct English while keeping references to ExternalPrimaryStorageVO
and the failure expectations intact.
- Around line 733-755: The test only checks ExternalPrimaryStorageVO but misses
asserting rollback for PrimaryStorageVO and PrimaryStorageOutputProtocolRefVO;
capture counts for PrimaryStorageVO and PrimaryStorageOutputProtocolRefVO (using
Q.New(PrimaryStorageVO.class).count() and
Q.New(PrimaryStorageOutputProtocolRefVO.class).count()) before the failing
addExternalPrimaryStorage call, then after the expect(AssertionError.class)
block assert those counts are unchanged and also assert no rows exist with name
"zbs-bad-json" in PrimaryStorageVO and no protocol refs tied to that primary
storage in PrimaryStorageOutputProtocolRefVO, so failures don't leave residual
rows across those related tables.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)
Review profile: CHILL
Plan: Pro
Run ID: 1e13331a-5bab-4deb-bbc0-e3ac9ea06519
📒 Files selected for processing (2)
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.javatest/src/test/groovy/org/zstack/test/integration/storage/primary/addon/zbs/ZbsPrimaryStorageCase.groovy
| try { | ||
| saveControllerIfNeed(lvo); | ||
| } catch (Throwable t) { | ||
| logger.warn(String.format("failed to build controller for primary storage[uuid:%s, identity:%s], rolling back persisted records", | ||
| lvo.getUuid(), identity), t); | ||
| try { | ||
| dbf.remove(ref); | ||
| dbf.remove(lvo); | ||
| } catch (Throwable cleanupEx) { | ||
| logger.warn(String.format("failed to roll back persisted records for primary storage[uuid:%s]", lvo.getUuid()), cleanupEx); | ||
| } | ||
| controllers.remove(lvo.getUuid()); | ||
| nodes.remove(lvo.getUuid()); | ||
| throw t; | ||
| } |
There was a problem hiding this comment.
这次补偿只覆盖“当前新增”的脏记录,修复前已经落库的坏数据仍会继续打挂服务。
Line 324 失败时这里只回滚本次 createPrimaryStorage() 刚写入的记录,但 buildPsController()(Line 180-187)和 nodeLeft() 触发的重建路径(Line 1077)仍会直接消费库里已有的 ExternalPrimaryStorageVO。如果现场在修复前已经因为同类 malformed JSON 留下过脏记录,升级后启动/切主时还是会在这些路径上反复抛异常,QueryPrimaryStorage 依旧可能不可用。建议补一个启动期的隔离/跳过坏记录,或者增加显式的数据修复逻辑。
| // AddExternalPrimaryStorage 收到非法 JSON config 导致 buildControllerSvc 抛异常时, | ||
| // ExternalPrimaryStorageVO/PrimaryStorageVO/PrimaryStorageOutputProtocolRefVO 不应残留在 DB 中。 | ||
| // 否则 buildPsController 会被脏 VO 持续打挂,QueryPrimaryStorage 永久 503。 | ||
| void testAddExternalPrimaryStorageWithMalformedJsonShouldRollback() { | ||
| long psCountBefore = Q.New(ExternalPrimaryStorageVO.class).count() | ||
|
|
||
| // malformed JSON — JSONObjectUtil.toObject(config, Config.class) 会抛 RuntimeException | ||
| expect(AssertionError.class) { | ||
| addExternalPrimaryStorage { | ||
| zoneUuid = zone.uuid | ||
| name = "zbs-bad-json" | ||
| identity = "zbs" | ||
| defaultOutputProtocol = "CBD" | ||
| config = "{this is not valid json" | ||
| url = "" | ||
| } | ||
| } | ||
|
|
||
| // 失败后不应在 DB 中遗留任何 zbs-bad-json 相关记录 | ||
| assert Q.New(ExternalPrimaryStorageVO.class).count() == psCountBefore | ||
| assert !Q.New(ExternalPrimaryStorageVO.class) | ||
| .eq(ExternalPrimaryStorageVO_.name, "zbs-bad-json") | ||
| .isExists() | ||
|
|
||
| // 失败之后 QueryPrimaryStorage 仍可正常返回(不被脏数据打挂) |
There was a problem hiding this comment.
请把新增注释改成英文。
Line 729-753 新增的是中文注释,不符合仓库规范。
可直接替换的英文注释
- // AddExternalPrimaryStorage 收到非法 JSON config 导致 buildControllerSvc 抛异常时,
- // ExternalPrimaryStorageVO/PrimaryStorageVO/PrimaryStorageOutputProtocolRefVO 不应残留在 DB 中。
- // 否则 buildPsController 会被脏 VO 持续打挂,QueryPrimaryStorage 永久 503。
+ // When AddExternalPrimaryStorage receives malformed JSON config and buildControllerSvc throws,
+ // ExternalPrimaryStorageVO / PrimaryStorageVO / PrimaryStorageOutputProtocolRefVO must not remain in the database.
+ // Otherwise buildPsController keeps failing on the dirty VO and QueryPrimaryStorage can return 503 permanently.
@@
- // malformed JSON — JSONObjectUtil.toObject(config, Config.class) 会抛 RuntimeException
+ // Malformed JSON: JSONObjectUtil.toObject(config, Config.class) should throw RuntimeException.
@@
- // 失败后不应在 DB 中遗留任何 zbs-bad-json 相关记录
+ // No zbs-bad-json related records should remain in the database after the failure.
@@
- // 失败之后 QueryPrimaryStorage 仍可正常返回(不被脏数据打挂)
+ // QueryPrimaryStorage should still return successfully after the failure.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // AddExternalPrimaryStorage 收到非法 JSON config 导致 buildControllerSvc 抛异常时, | |
| // ExternalPrimaryStorageVO/PrimaryStorageVO/PrimaryStorageOutputProtocolRefVO 不应残留在 DB 中。 | |
| // 否则 buildPsController 会被脏 VO 持续打挂,QueryPrimaryStorage 永久 503。 | |
| void testAddExternalPrimaryStorageWithMalformedJsonShouldRollback() { | |
| long psCountBefore = Q.New(ExternalPrimaryStorageVO.class).count() | |
| // malformed JSON — JSONObjectUtil.toObject(config, Config.class) 会抛 RuntimeException | |
| expect(AssertionError.class) { | |
| addExternalPrimaryStorage { | |
| zoneUuid = zone.uuid | |
| name = "zbs-bad-json" | |
| identity = "zbs" | |
| defaultOutputProtocol = "CBD" | |
| config = "{this is not valid json" | |
| url = "" | |
| } | |
| } | |
| // 失败后不应在 DB 中遗留任何 zbs-bad-json 相关记录 | |
| assert Q.New(ExternalPrimaryStorageVO.class).count() == psCountBefore | |
| assert !Q.New(ExternalPrimaryStorageVO.class) | |
| .eq(ExternalPrimaryStorageVO_.name, "zbs-bad-json") | |
| .isExists() | |
| // 失败之后 QueryPrimaryStorage 仍可正常返回(不被脏数据打挂) | |
| // When AddExternalPrimaryStorage receives malformed JSON config and buildControllerSvc throws, | |
| // ExternalPrimaryStorageVO / PrimaryStorageVO / PrimaryStorageOutputProtocolRefVO must not remain in the database. | |
| // Otherwise buildPsController keeps failing on the dirty VO and QueryPrimaryStorage can return 503 permanently. | |
| void testAddExternalPrimaryStorageWithMalformedJsonShouldRollback() { | |
| long psCountBefore = Q.New(ExternalPrimaryStorageVO.class).count() | |
| // Malformed JSON: JSONObjectUtil.toObject(config, Config.class) should throw RuntimeException. | |
| expect(AssertionError.class) { | |
| addExternalPrimaryStorage { | |
| zoneUuid = zone.uuid | |
| name = "zbs-bad-json" | |
| identity = "zbs" | |
| defaultOutputProtocol = "CBD" | |
| config = "{this is not valid json" | |
| url = "" | |
| } | |
| } | |
| // No zbs-bad-json related records should remain in the database after the failure. | |
| assert Q.New(ExternalPrimaryStorageVO.class).count() == psCountBefore | |
| assert !Q.New(ExternalPrimaryStorageVO.class) | |
| .eq(ExternalPrimaryStorageVO_.name, "zbs-bad-json") | |
| .isExists() | |
| // QueryPrimaryStorage should still return successfully after the failure. |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@test/src/test/groovy/org/zstack/test/integration/storage/primary/addon/zbs/ZbsPrimaryStorageCase.groovy`
around lines 729 - 753, Replace the Chinese comment block above the test method
testAddExternalPrimaryStorageWithMalformedJsonShouldRollback with an English
comment that explains: this test verifies AddExternalPrimaryStorage rolls back
when config contains malformed JSON (JSONObjectUtil.toObject throws), and no
ExternalPrimaryStorageVO/PrimaryStorageVO/PrimaryStorageOutputProtocolRefVO
records remain to avoid dirty VO causing buildPsController failures and
QueryPrimaryStorage 503s; update any inline Chinese in the same comment region
to concise, correct English while keeping references to ExternalPrimaryStorageVO
and the failure expectations intact.
| long psCountBefore = Q.New(ExternalPrimaryStorageVO.class).count() | ||
|
|
||
| // malformed JSON — JSONObjectUtil.toObject(config, Config.class) 会抛 RuntimeException | ||
| expect(AssertionError.class) { | ||
| addExternalPrimaryStorage { | ||
| zoneUuid = zone.uuid | ||
| name = "zbs-bad-json" | ||
| identity = "zbs" | ||
| defaultOutputProtocol = "CBD" | ||
| config = "{this is not valid json" | ||
| url = "" | ||
| } | ||
| } | ||
|
|
||
| // 失败后不应在 DB 中遗留任何 zbs-bad-json 相关记录 | ||
| assert Q.New(ExternalPrimaryStorageVO.class).count() == psCountBefore | ||
| assert !Q.New(ExternalPrimaryStorageVO.class) | ||
| .eq(ExternalPrimaryStorageVO_.name, "zbs-bad-json") | ||
| .isExists() | ||
|
|
||
| // 失败之后 QueryPrimaryStorage 仍可正常返回(不被脏数据打挂) | ||
| def psList = queryPrimaryStorage {} as List | ||
| assert psList != null |
There was a problem hiding this comment.
这个回归用例还没有覆盖基表和协议映射表的回滚。
现在只校验了 ExternalPrimaryStorageVO。如果后面有人只删掉子表、遗漏 PrimaryStorageVO 或 PrimaryStorageOutputProtocolRefVO,这个用例仍然会通过,但数据库里还是会残留脏数据。既然本次修复显式补偿了这些记录,建议把相关表的 count 前后也一起断言。
可参考的补充断言
void testAddExternalPrimaryStorageWithMalformedJsonShouldRollback() {
long psCountBefore = Q.New(ExternalPrimaryStorageVO.class).count()
+ long primaryStorageCountBefore = Q.New(org.zstack.header.storage.primary.PrimaryStorageVO.class).count()
+ long protocolRefCountBefore = Q.New(org.zstack.header.storage.primary.PrimaryStorageOutputProtocolRefVO.class).count()
expect(AssertionError.class) {
addExternalPrimaryStorage {
zoneUuid = zone.uuid
name = "zbs-bad-json"
@@
}
assert Q.New(ExternalPrimaryStorageVO.class).count() == psCountBefore
+ assert Q.New(org.zstack.header.storage.primary.PrimaryStorageVO.class).count() == primaryStorageCountBefore
+ assert Q.New(org.zstack.header.storage.primary.PrimaryStorageOutputProtocolRefVO.class).count() == protocolRefCountBefore
assert !Q.New(ExternalPrimaryStorageVO.class)
.eq(ExternalPrimaryStorageVO_.name, "zbs-bad-json")
.isExists()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@test/src/test/groovy/org/zstack/test/integration/storage/primary/addon/zbs/ZbsPrimaryStorageCase.groovy`
around lines 733 - 755, The test only checks ExternalPrimaryStorageVO but misses
asserting rollback for PrimaryStorageVO and PrimaryStorageOutputProtocolRefVO;
capture counts for PrimaryStorageVO and PrimaryStorageOutputProtocolRefVO (using
Q.New(PrimaryStorageVO.class).count() and
Q.New(PrimaryStorageOutputProtocolRefVO.class).count()) before the failing
addExternalPrimaryStorage call, then after the expect(AssertionError.class)
block assert those counts are unchanged and also assert no rows exist with name
"zbs-bad-json" in PrimaryStorageVO and no protocol refs tied to that primary
storage in PrimaryStorageOutputProtocolRefVO, so failures don't leave residual
rows across those related tables.
Summary
AddExternalPrimaryStorage with malformed JSON config left dirty rows in DB, breaking PrimaryStorage service permanently (QueryPrimaryStorage 503).
Root Cause
ExternalPrimaryStorageFactory.createPrimaryStoragepersistsExternalPrimaryStorageVO/PrimaryStorageVO/PrimaryStorageOutputProtocolRefVObefore invokingsaveControllerIfNeed → buildControllerSvc → ZbsStorageController.reloadDbInfo → JSONObjectUtil.toObject. ARuntimeExceptionfrom JSON parsing left those rows persisted with no rollback. The dirty VO then made every subsequentbuildPsController()throw the same parse error, so the PrimaryStorage service stayed unhealthy andQueryPrimaryStoragekept returning 503.Changes
storage/ExternalPrimaryStorageFactory.java: wrapsaveControllerIfNeedin try/catch; onThrowable,dbf.remove(ref) + dbf.remove(lvo), clearcontrollers/nodesmap entries, then rethrow.test/.../ZbsPrimaryStorageCase.groovy: new SubCasetestAddExternalPrimaryStorageWithMalformedJsonShouldRollbackasserts no leftover rows after malformed-JSON Add and that QueryPrimaryStorage still works.Testing
mvn compile -pl storage -am -Dmaven.test.skippasses locally.ZbsPrimaryStorageCaseruns the new SubCase.Resolves: ZSTAC-84817
sync from gitlab !9743