Skip to content

[ZSTAC-83890] Ceph HA: sibling-fence VM before HA recreate@@3#3896

Closed
MatheMatrix wants to merge 1 commit into5.4.8from
sync/yingzhe.hu/fix/ZSTAC-83890-ceph-ha-sibling-fence@@3
Closed

[ZSTAC-83890] Ceph HA: sibling-fence VM before HA recreate@@3#3896
MatheMatrix wants to merge 1 commit into5.4.8from
sync/yingzhe.hu/fix/ZSTAC-83890-ceph-ha-sibling-fence@@3

Conversation

@MatheMatrix
Copy link
Copy Markdown
Owner

Summary

Fixes Ceph HA empty-watcher race that allows VM split-brain (TIC-5513). Adds sibling-host SSH fence in PreVmInstantiate hook before HA starts a new VM on the failed host's primary storage.

Changes (zstack main)

  • header/: new SiblingFenceVmOnHostMsg + SiblingFenceVmOnHostReply
  • header/: new SPI CephSiblingFenceExtensionPoint (implementation in premium)
  • plugin/ceph/CephPrimaryStorageFactory.preInstantiateVmResource: when getVolumeWatchers returns empty, dispatch sibling-fence via SPI before allowing VM start
  • Audit emitted via structured log/SystemEvent: ceph.ha.sibling-fence with outcome (success-killed / success-already-dead / success-ssh-unreachable / failed)

Cross-repo dependencies

  • premium MR: implements CephSiblingFencer SPI
  • zstack-utility MR: adds /ha/sibling-fence-vm kvmagent endpoint

Test plan

  • CI pipeline pass (compile + smoke)
  • T7 libvirtd-hang regression (will follow up)
  • T8 E2E fault injection (will follow up)
  • Manual dry-run on staging cluster

Resolves: ZSTAC-83890
Customer: TIC-5513

(Supersedes !9777, branch renamed to @@3 with squashed commit)

sync from gitlab !9778

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

Review Change Stack

Walkthrough

新增 Ceph sibling-fence SPI、请求/回复消息,并在 CephPrimaryStorageFactory.preInstantiateVmResource 的 watcher 为空路径中调用该 SPI,依据异步回复决定是否允许 VM 启动。

变更详情

Ceph Sibling Fence 功能

层级 / 文件 说明
消息与扩展点契约
header/src/main/java/org/zstack/header/storage/ceph/CephSiblingFenceExtensionPoint.java, header/src/main/java/org/zstack/header/storage/ceph/SiblingFenceVmOnHostMsg.java, header/src/main/java/org/zstack/header/storage/ceph/SiblingFenceVmOnHostReply.java
新增扩展点接口 CephSiblingFenceExtensionPoint 定义 fenceVmOnFailedHost(...) 回调。消息类携带 failedHostUuidvmUuidclusterUuidhaTargetHostUuid。回复类包含状态标志(alive/killed/sshReachable/qemuFound)及执行器字段(executorHostUuid、executorRole、reason)。
预实例化流程集成
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
preInstantiateVmResource 的根卷 watcher 为空分支中,检测并调用扩展点。根据 SiblingFenceVmOnHostReply 的状态决定调用 completion.success()(VM 不存在或已被杀死)或 completion.fail(...)(VM 仍存活并返回原因)。若扩展点不可用则保持原有行为(直接 success)。

Sequence Diagram(s)

sequenceDiagram
  participant Factory as CephPrimaryStorageFactory
  participant Plugin as CephSiblingFenceExtensionPoint
  participant Completion as completion
  Factory->>Factory: 检查 root volume watchers
  alt 观察者为空且扩展点存在
    Factory->>Plugin: fenceVmOnFailedHost(failedHostUuid, vmUuid, clusterUuid, haTargetHostUuid, completion)
    Plugin-->>Factory: SiblingFenceVmOnHostReply
    alt reply.alive == false or reply.killed == true
      Factory->>Completion: success()
    else
      Factory->>Completion: fail(ErrorCode with reason)
    end
  else 扩展点不存在
    Factory->>Completion: success()
  end
Loading

评估代码审查工作量

🎯 3 (Moderate) | ⏱️ ~20 分钟

诗歌

🐰 新增契约写清楚,
插件呼唤去查路,
若残影仍在跑,先去斩,
启动前夕莫糊涂,
Ceph HA 步步安如故。


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)

Check name Status Explanation Resolution
Title check ❌ Error The PR title does not follow the required format of [scope]: ; it uses [JIRA-KEY] prefix instead and lacks a type prefix. Reformat the title to match the required format, e.g., 'feat[ceph]: Sibling-fence VM before HA recreate (ZSTAC-83890)' ensuring it is 72 characters or less.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed The PR description is comprehensive and directly related to the changeset, covering the purpose, changes, dependencies, and test plan.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/yingzhe.hu/fix/ZSTAC-83890-ceph-ha-sibling-fence@@3

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
header/src/main/java/org/zstack/header/storage/ceph/CephSiblingFenceExtensionPoint.java (1)

5-11: 💤 Low value

建议为接口方法添加 Javadoc 注释。

根据编码规范,接口方法必须配有有效的 Javadoc 注释。建议说明各参数的含义,特别是 failedHostUuidhaTargetHostUuid 的区别,以及 completion 回调的预期行为。

📝 建议的 Javadoc
 public interface CephSiblingFenceExtensionPoint {
+    /**
+     * Fence a VM on a failed host via sibling host SSH.
+     *
+     * `@param` failedHostUuid    the UUID of the host that failed
+     * `@param` vmUuid            the UUID of the VM to fence
+     * `@param` clusterUuid       the UUID of the cluster
+     * `@param` haTargetHostUuid  the UUID of the target host for HA migration
+     * `@param` completion        callback with SiblingFenceVmOnHostReply indicating fence result
+     */
     void fenceVmOnFailedHost(String failedHostUuid,
                              String vmUuid,
                              String clusterUuid,
                              String haTargetHostUuid,
                              ReturnValueCompletion<SiblingFenceVmOnHostReply> completion);
 }

As per coding guidelines: "接口方法不应有多余的修饰符(例如 public),且必须配有有效的 Javadoc 注释"

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@header/src/main/java/org/zstack/header/storage/ceph/CephSiblingFenceExtensionPoint.java`
around lines 5 - 11, Add a Javadoc block to the interface method
fenceVmOnFailedHost in CephSiblingFenceExtensionPoint describing the method
purpose and each parameter (clarify that failedHostUuid is the host that failed,
haTargetHostUuid is the HA target host to migrate/recover the VM to, and explain
vmUuid and clusterUuid), and document the expected behavior of the completion
callback (ReturnValueCompletion<SiblingFenceVmOnHostReply> completion) including
when to call success vs. failure and what fields in SiblingFenceVmOnHostReply
should be populated; also remove the redundant public modifier from the
interface method signature so it uses the default visibility.
header/src/test/java/org/zstack/header/storage/ceph/SiblingFenceVmOnHostMsgTest.java (1)

20-32: 💤 Low value

测试覆盖不完整:设置的字段未全部断言验证。

testReplyFields() 设置了多个字段但仅验证了 isKilled()getExecutorRole()。建议补充对其他字段的断言以确保完整的测试覆盖。

💚 建议补充的断言
         r.setReason("");
         assertTrue(r.isKilled());
         assertEquals("sibling", r.getExecutorRole());
+        assertTrue(r.isAlive());
+        assertTrue(r.isSshReachable());
+        assertTrue(r.isQemuFound());
+        assertEquals("h-sibling", r.getExecutorHostUuid());
+        assertEquals("", r.getReason());
     }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@header/src/test/java/org/zstack/header/storage/ceph/SiblingFenceVmOnHostMsgTest.java`
around lines 20 - 32, The test testReplyFields() in SiblingFenceVmOnHostMsgTest
sets multiple fields on SiblingFenceVmOnHostReply but only asserts two; update
the test method to assert all other set fields: add assertions for r.isAlive(),
r.getExecutorHostUuid(), r.isSshReachable(), r.isQemuFound(), and r.getReason()
(e.g., assertTrue/assertEquals as appropriate) so every setter in the test is
validated.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java`:
- Around line 1264-1285: The outcome label "success-other" is misleading because
the branch where sshReachable=true, alive=true, killed=false leads to
completion.fail(); update the outcome string in the branch that currently sets
"success-other" to a failure-oriented name (e.g., "failed-still-alive" or
"failed-fence-incomplete") so the value passed to
fireSiblingFenceAuditEvent(vmUuid, failedHostUuid, clusterUuid,
haTargetHostUuid, outcome, r.getReason()) correctly reflects the failure,
keeping the rest of the logic (checks on r.isKilled(), r.isSshReachable(),
r.isAlive(), completion.success()/completion.fail()) unchanged; ensure you only
change the outcome assignment in CephPrimaryStorageFactory around the
sibling-fence handling block.

---

Nitpick comments:
In
`@header/src/main/java/org/zstack/header/storage/ceph/CephSiblingFenceExtensionPoint.java`:
- Around line 5-11: Add a Javadoc block to the interface method
fenceVmOnFailedHost in CephSiblingFenceExtensionPoint describing the method
purpose and each parameter (clarify that failedHostUuid is the host that failed,
haTargetHostUuid is the HA target host to migrate/recover the VM to, and explain
vmUuid and clusterUuid), and document the expected behavior of the completion
callback (ReturnValueCompletion<SiblingFenceVmOnHostReply> completion) including
when to call success vs. failure and what fields in SiblingFenceVmOnHostReply
should be populated; also remove the redundant public modifier from the
interface method signature so it uses the default visibility.

In
`@header/src/test/java/org/zstack/header/storage/ceph/SiblingFenceVmOnHostMsgTest.java`:
- Around line 20-32: The test testReplyFields() in SiblingFenceVmOnHostMsgTest
sets multiple fields on SiblingFenceVmOnHostReply but only asserts two; update
the test method to assert all other set fields: add assertions for r.isAlive(),
r.getExecutorHostUuid(), r.isSshReachable(), r.isQemuFound(), and r.getReason()
(e.g., assertTrue/assertEquals as appropriate) so every setter in the test is
validated.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: 7852302d-20d8-4541-8fde-7f148df3fa64

📥 Commits

Reviewing files that changed from the base of the PR and between 6de8314 and 12eb41f.

📒 Files selected for processing (6)
  • header/src/main/java/org/zstack/header/storage/ceph/CephSiblingFenceExtensionPoint.java
  • header/src/main/java/org/zstack/header/storage/ceph/SiblingFenceVmOnHostMsg.java
  • header/src/main/java/org/zstack/header/storage/ceph/SiblingFenceVmOnHostReply.java
  • header/src/test/java/org/zstack/header/storage/ceph/CephSiblingFenceExtensionPointTest.java
  • header/src/test/java/org/zstack/header/storage/ceph/SiblingFenceVmOnHostMsgTest.java
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java

@MatheMatrix MatheMatrix force-pushed the sync/yingzhe.hu/fix/ZSTAC-83890-ceph-ha-sibling-fence@@3 branch from 12eb41f to b9569aa Compare May 7, 2026 08:38
Resolves: ZSTAC-83890

Change-Id: I652cbc9cc303f788657a1eddf7b706b38ba7adf0
@MatheMatrix MatheMatrix force-pushed the sync/yingzhe.hu/fix/ZSTAC-83890-ceph-ha-sibling-fence@@3 branch from ee30adc to a12f22c Compare May 8, 2026 07:18
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java`:
- Around line 1261-1269: The success handler for SiblingFenceVmOnHostReply
currently treats only !r.isAlive() || r.isKilled() as success and may
mis-classify ssh-unreachable cases; update the success(SiblingFenceVmOnHostReply
r) branch in CephPrimaryStorageFactory so that r.isSshReachable() is checked
explicitly as part of the decision: treat ssh-unreachable (i.e.,
!r.isSshReachable()) as a success case that calls completion.success(), while
preserving the existing checks for r.isAlive() and r.isKilled() and using
completion.fail(...) only when the reply indicates an actual fence failure
(e.g., r.isAlive() && r.isSshReachable() && !r.isKilled()), referencing the
success(...) method, SiblingFenceVmOnHostReply, r.isSshReachable(), r.isAlive(),
r.isKilled(), completion.success(), and completion.fail().
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: 80110b15-e175-464e-9f33-680138ad8571

📥 Commits

Reviewing files that changed from the base of the PR and between ee30adc and a12f22c.

📒 Files selected for processing (4)
  • header/src/main/java/org/zstack/header/storage/ceph/CephSiblingFenceExtensionPoint.java
  • header/src/main/java/org/zstack/header/storage/ceph/SiblingFenceVmOnHostMsg.java
  • header/src/main/java/org/zstack/header/storage/ceph/SiblingFenceVmOnHostReply.java
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
✅ Files skipped from review due to trivial changes (3)
  • header/src/main/java/org/zstack/header/storage/ceph/CephSiblingFenceExtensionPoint.java
  • header/src/main/java/org/zstack/header/storage/ceph/SiblingFenceVmOnHostMsg.java
  • header/src/main/java/org/zstack/header/storage/ceph/SiblingFenceVmOnHostReply.java

Comment on lines +1261 to +1269
public void success(SiblingFenceVmOnHostReply r) {
if (!r.isAlive() || r.isKilled()) {
logger.debug(String.format("ceph sibling-fence succeeded for vm[%s] on failedHost[%s]: killed=%s sshReachable=%s",
vmUuid, failedHostUuid, r.isKilled(), r.isSshReachable()));
completion.success();
} else {
logger.warn(String.format("ceph sibling-fence failed for vm[%s] on failedHost[%s]: %s — blocking VM start to prevent split-brain",
vmUuid, failedHostUuid, r.getReason()));
completion.fail(operr("sibling fence failed: %s", r.getReason()));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

未显式放行 SSH 不可达分支,可能误阻断 HA 拉起

Line [1262] 仅判断 !r.isAlive() || r.isKilled()。按本 PR 目标存在 success-ssh-unreachable 成功态;若扩展实现返回 sshReachable=falsealive=true(或后续实现调整),这里会错误进入 completion.fail(),导致可恢复场景被阻断。建议把 !r.isSshReachable() 作为明确成功条件,避免依赖实现细节。

🔧 建议修复
-                            if (!r.isAlive() || r.isKilled()) {
+                            if (!r.isSshReachable() || !r.isAlive() || r.isKilled()) {
                                 logger.debug(String.format("ceph sibling-fence succeeded for vm[%s] on failedHost[%s]: killed=%s sshReachable=%s",
                                         vmUuid, failedHostUuid, r.isKilled(), r.isSshReachable()));
                                 completion.success();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java`
around lines 1261 - 1269, The success handler for SiblingFenceVmOnHostReply
currently treats only !r.isAlive() || r.isKilled() as success and may
mis-classify ssh-unreachable cases; update the success(SiblingFenceVmOnHostReply
r) branch in CephPrimaryStorageFactory so that r.isSshReachable() is checked
explicitly as part of the decision: treat ssh-unreachable (i.e.,
!r.isSshReachable()) as a success case that calls completion.success(), while
preserving the existing checks for r.isAlive() and r.isKilled() and using
completion.fail(...) only when the reply indicates an actual fence failure
(e.g., r.isAlive() && r.isSshReachable() && !r.isKilled()), referencing the
success(...) method, SiblingFenceVmOnHostReply, r.isSshReachable(), r.isAlive(),
r.isKilled(), completion.success(), and completion.fail().

@ZStack-Robot ZStack-Robot deleted the sync/yingzhe.hu/fix/ZSTAC-83890-ceph-ha-sibling-fence@@3 branch May 8, 2026 08:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants