Skip to content

<feature>[utils]: Network group high availability strategy#3897

Open
ZStack-Robot wants to merge 1 commit into5.5.22from
sync/haoyu.ding/fv-81413@@3
Open

<feature>[utils]: Network group high availability strategy#3897
ZStack-Robot wants to merge 1 commit into5.5.22from
sync/haoyu.ding/fv-81413@@3

Conversation

@ZStack-Robot
Copy link
Copy Markdown
Collaborator

DBImpact

Resolves: ZSTAC-81413

Change-Id: I6a7574656467636e686377756f716d67707a7063

sync from gitlab !9779

DBImpact

Resolves: ZSTAC-81413

Change-Id: I6a7574656467636e686377756f716d67707a7063
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

Review Change Stack

概览

该PR为ZStack平台的HA网络组功能添加基础支持。变更包括四个新数据库表、两个KVM代理命令、三个模拟器REST端点、22个HA错误代码常量以及五个API测试辅助方法的增强。

变更

HA网络组支持

层 / 文件 摘要
数据库模式
conf/db/upgrade/V5.5.22__schema.sql
创建HaNetworkGroupVO(网络组定义)、HaNetworkGroupL3NetworkRefVO(L3网络映射)、HostHaNetworkGroupStatusVO(主机状态)、HaNetworkGroupGlobalConfigVersionVO(版本跟踪)四张表,并初始化全局配置版本。
代理命令定义
plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java
添加SetupVmHaEnabledMetadataLiveCmdReconcileVmHaEnabledMetadataLiveCmd命令类(@GrayVersion("5.5.6")),扩展StartVmCmd添加enableHa属性。
端点常量和错误码
plugin/kvm/src/main/java/org/zstack/kvm/KVMConstant.java, utils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.java
定义三条HA元数据和网络组同步路由常量,新增22个HA相关错误代码(ORG_ZSTACK_HA_1001910040)。
模拟器集成
simulator/simulatorImpl/src/main/java/org/zstack/simulator/kvm/KVMSimulatorConfig.java, simulator/simulatorImpl/src/main/java/org/zstack/simulator/kvm/KVMSimulatorController.java, testlib/src/main/java/org/zstack/testlib/KVMSimulator.groovy
添加命令追踪列表、三个REST端点处理器(同步锁保护命令列表)、模拟器路由响应。
测试支持
testlib/src/main/java/org/zstack/testlib/ApiHelper.groovy, testlib/src/main/java/org/zstack/testlib/EnvSpec.groovy
五个HA网络组API方法添加闭包委托和API路径追踪,环境清理列表新增HaNetworkGroupGlobalConfigVersionVO

序列图

sequenceDiagram
  participant Agent as KVM Agent
  participant Simulator as Simulator Controller
  participant Config as Simulator Config
  Agent->>Simulator: setupHaEnabledMetadata/live
  Simulator->>Config: append SetupVmHaEnabledMetadataLiveCmd
  Config->>Config: synchronized list update
  Simulator->>Agent: AgentResponse
  Agent->>Simulator: reconcileHaEnabledMetadata/live
  Simulator->>Config: append ReconcileVmHaEnabledMetadataLiveCmd
  Simulator->>Agent: AgentResponse
  Agent->>Simulator: ha/networkgroup/sync
  Simulator->>Config: append sync command
  Simulator->>Agent: AgentResponse
Loading

🎯 3 (中等) | ⏱️ ~20 分钟

🐰 一批新表跳过舞蹈,代理命令列队排成行,
模拟器端点齐声唱,网络组从此有新装!
错误代码铺展开,HA之梦今朝共享享。 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed 标题遵循了正确的格式 [scope]: ,且长度为58个字符,未超过72个字符限制。标题清晰反映了PR的主要内容——引入网络组高可用性功能。
Description check ✅ Passed PR描述相关联的变更内容,包含了问题跟踪ID(ZSTAC-81413)、变更ID和来源说明,虽然内容简洁但与变更集相关。
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/haoyu.ding/fv-81413@@3

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ast-grep (0.42.1)
utils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.java
🔧 Microsoft Presidio Analyzer (2.2.362)
testlib/src/main/java/org/zstack/testlib/ApiHelper.groovy

Microsoft Presidio Analyzer failed to scan this file


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
testlib/src/main/java/org/zstack/testlib/ApiHelper.groovy (1)

5663-5687: 🏗️ Heavy lift

抽取通用 apipath 调用模板,避免 5 处复制逻辑后续漂移

当前 apiId 补齐、ApiPathTracker 创建、a.call()Test.apiPaths 写入完全重复;后续修改容易漏改一处。建议提取成统一 helper。

♻️ 可参考的重构方向
+    private def callWithApiPathTracking(def a) {
+        if (System.getProperty("apipath") != null) {
+            if (a.apiId == null) {
+                a.apiId = Platform.uuid
+            }
+            def tracker = new ApiPathTracker(a.apiId)
+            def out = errorOut(a.call())
+            def path = tracker.getApiPath()
+            if (!path.isEmpty()) {
+                Test.apiPaths[a.class.name] = path.join(" --->\n")
+            }
+            return out
+        }
+        return errorOut(a.call())
+    }

然后每个方法收敛为:

-        if (System.getProperty("apipath") != null) {
-            ...
-        } else {
-            return errorOut(a.call())
-        }
+        return callWithApiPathTracking(a)

Also applies to: 9686-9710, 14870-14894, 31780-31795, 45201-45216

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@testlib/src/main/java/org/zstack/testlib/ApiHelper.groovy` around lines 5663
- 5687, Extract the repeated apipath handling into a single helper (e.g., a
private method like handleApiCall(Action a) or wrapApiCall) and replace the
duplicated blocks in methods such as changeHaNetworkGroupState by calling that
helper; the helper should: ensure a.apiId is set (Platform.uuid when null),
instantiate ApiPathTracker with a.apiId, invoke a.call() through errorOut,
record the path into Test.apiPaths keyed by a.class.name when non-empty, and
return the errorOut result—replace the duplicated logic in other locations (the
other listed ranges) to call this new helper and remove the inline duplication.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@conf/db/upgrade/V5.5.22__schema.sql`:
- Around line 8-9: Replace the invalid zero-datetime defaults with
CURRENT_TIMESTAMP: update the column definitions that use `DEFAULT '0000-00-00
00:00:00'` (e.g., `lastOpDate`, `createDate` and the other two occurrences) to
use `DEFAULT CURRENT_TIMESTAMP` (and keep `ON UPDATE CURRENT_TIMESTAMP` only on
columns that should auto-update, e.g., `lastOpDate`), ensuring the NOT NULL
semantics remain intact so the DDL will succeed in strict SQL modes.

In
`@utils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.java`:
- Around line 1875-1918: The new constants ORG_ZSTACK_HA_10019 through
ORG_ZSTACK_HA_10040 in CloudOperationsErrorCode are declared but lack message
definitions and are not referenced; either add message entries for each code in
the errorCode resource files and wire them into the ErrorCode/ResourceLoader
mechanism, then replace placeholder or create usage sites in business logic to
throw or return these codes (search for usages patterns like
ErrorCode/CLOUD_OPERATION_* or methods that create ErrorCode instances) or
remove the unused constants; if they are intended as reserved, add a clear
Javadoc comment on the constants explaining their planned purpose and why they
are unused.

---

Nitpick comments:
In `@testlib/src/main/java/org/zstack/testlib/ApiHelper.groovy`:
- Around line 5663-5687: Extract the repeated apipath handling into a single
helper (e.g., a private method like handleApiCall(Action a) or wrapApiCall) and
replace the duplicated blocks in methods such as changeHaNetworkGroupState by
calling that helper; the helper should: ensure a.apiId is set (Platform.uuid
when null), instantiate ApiPathTracker with a.apiId, invoke a.call() through
errorOut, record the path into Test.apiPaths keyed by a.class.name when
non-empty, and return the errorOut result—replace the duplicated logic in other
locations (the other listed ranges) to call this new helper and remove the
inline duplication.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: 1451bf12-1c77-4c28-b4dc-25391e49b304

📥 Commits

Reviewing files that changed from the base of the PR and between 18b0d53 and db04ee4.

⛔ Files ignored due to path filters (12)
  • sdk/src/main/java/SourceClassMap.java is excluded by !sdk/**
  • sdk/src/main/java/org/zstack/sdk/ChangeHaNetworkGroupStateAction.java is excluded by !sdk/**
  • sdk/src/main/java/org/zstack/sdk/ChangeHaNetworkGroupStateResult.java is excluded by !sdk/**
  • sdk/src/main/java/org/zstack/sdk/CreateHaNetworkGroupAction.java is excluded by !sdk/**
  • sdk/src/main/java/org/zstack/sdk/CreateHaNetworkGroupResult.java is excluded by !sdk/**
  • sdk/src/main/java/org/zstack/sdk/DeleteHaNetworkGroupAction.java is excluded by !sdk/**
  • sdk/src/main/java/org/zstack/sdk/DeleteHaNetworkGroupResult.java is excluded by !sdk/**
  • sdk/src/main/java/org/zstack/sdk/HaNetworkGroupInventory.java is excluded by !sdk/**
  • sdk/src/main/java/org/zstack/sdk/QueryHaNetworkGroupAction.java is excluded by !sdk/**
  • sdk/src/main/java/org/zstack/sdk/QueryHaNetworkGroupResult.java is excluded by !sdk/**
  • sdk/src/main/java/org/zstack/sdk/UpdateHaNetworkGroupAction.java is excluded by !sdk/**
  • sdk/src/main/java/org/zstack/sdk/UpdateHaNetworkGroupResult.java is excluded by !sdk/**
📒 Files selected for processing (9)
  • conf/db/upgrade/V5.5.22__schema.sql
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMConstant.java
  • simulator/simulatorImpl/src/main/java/org/zstack/simulator/kvm/KVMSimulatorConfig.java
  • simulator/simulatorImpl/src/main/java/org/zstack/simulator/kvm/KVMSimulatorController.java
  • testlib/src/main/java/org/zstack/testlib/ApiHelper.groovy
  • testlib/src/main/java/org/zstack/testlib/EnvSpec.groovy
  • testlib/src/main/java/org/zstack/testlib/KVMSimulator.groovy
  • utils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.java

Comment on lines +8 to +9
`lastOpDate` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00' ON UPDATE CURRENT_TIMESTAMP,
`createDate` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

修复非法零时间默认值,避免升级脚本在严格模式下失败

Line 8、Line 9、Line 17、Line 18、Line 31、Line 32 使用了 DEFAULT '0000-00-00 00:00:00',这在 MySQL 8/GreatSQL 常见严格 SQL 模式下会导致 DDL 执行失败,属于升级阻断问题。

🛠️ 建议修复 diff
 CREATE TABLE IF NOT EXISTS `zstack`.`HaNetworkGroupVO` (
@@
-    `lastOpDate` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00' ON UPDATE CURRENT_TIMESTAMP,
-    `createDate` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
+    `lastOpDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
+    `createDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
@@
 CREATE TABLE IF NOT EXISTS `zstack`.`HaNetworkGroupL3NetworkRefVO` (
@@
-    `lastOpDate` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00' ON UPDATE CURRENT_TIMESTAMP,
-    `createDate` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
+    `lastOpDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
+    `createDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
@@
 CREATE TABLE IF NOT EXISTS `zstack`.`HostHaNetworkGroupStatusVO` (
@@
-    `lastOpDate` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00' ON UPDATE CURRENT_TIMESTAMP,
-    `createDate` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
+    `lastOpDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
+    `createDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,

As per coding guidelines **/*.sql: Do not use DEFAULT 0000-00-00 00:00:00, use DEFAULT CURRENT_TIMESTAMP instead.

Also applies to: 17-18, 31-32

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@conf/db/upgrade/V5.5.22__schema.sql` around lines 8 - 9, Replace the invalid
zero-datetime defaults with CURRENT_TIMESTAMP: update the column definitions
that use `DEFAULT '0000-00-00 00:00:00'` (e.g., `lastOpDate`, `createDate` and
the other two occurrences) to use `DEFAULT CURRENT_TIMESTAMP` (and keep `ON
UPDATE CURRENT_TIMESTAMP` only on columns that should auto-update, e.g.,
`lastOpDate`), ensuring the NOT NULL semantics remain intact so the DDL will
succeed in strict SQL modes.

Comment on lines +1875 to +1918
public static final String ORG_ZSTACK_HA_10019 = "ORG_ZSTACK_HA_10019";

public static final String ORG_ZSTACK_HA_10020 = "ORG_ZSTACK_HA_10020";

public static final String ORG_ZSTACK_HA_10021 = "ORG_ZSTACK_HA_10021";

public static final String ORG_ZSTACK_HA_10022 = "ORG_ZSTACK_HA_10022";

public static final String ORG_ZSTACK_HA_10023 = "ORG_ZSTACK_HA_10023";

public static final String ORG_ZSTACK_HA_10024 = "ORG_ZSTACK_HA_10024";

public static final String ORG_ZSTACK_HA_10025 = "ORG_ZSTACK_HA_10025";

public static final String ORG_ZSTACK_HA_10026 = "ORG_ZSTACK_HA_10026";

public static final String ORG_ZSTACK_HA_10027 = "ORG_ZSTACK_HA_10027";

public static final String ORG_ZSTACK_HA_10028 = "ORG_ZSTACK_HA_10028";

public static final String ORG_ZSTACK_HA_10029 = "ORG_ZSTACK_HA_10029";

public static final String ORG_ZSTACK_HA_10030 = "ORG_ZSTACK_HA_10030";

public static final String ORG_ZSTACK_HA_10031 = "ORG_ZSTACK_HA_10031";

public static final String ORG_ZSTACK_HA_10032 = "ORG_ZSTACK_HA_10032";

public static final String ORG_ZSTACK_HA_10033 = "ORG_ZSTACK_HA_10033";

public static final String ORG_ZSTACK_HA_10034 = "ORG_ZSTACK_HA_10034";

public static final String ORG_ZSTACK_HA_10035 = "ORG_ZSTACK_HA_10035";

public static final String ORG_ZSTACK_HA_10036 = "ORG_ZSTACK_HA_10036";

public static final String ORG_ZSTACK_HA_10037 = "ORG_ZSTACK_HA_10037";

public static final String ORG_ZSTACK_HA_10038 = "ORG_ZSTACK_HA_10038";

public static final String ORG_ZSTACK_HA_10039 = "ORG_ZSTACK_HA_10039";

public static final String ORG_ZSTACK_HA_10040 = "ORG_ZSTACK_HA_10040";

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# 1. Check which of the new constants have message definitions in error-code resource files
echo "=== Searching for message definitions ==="
for code in $(seq 10019 10040); do
    key="ORG_ZSTACK_HA_${code}"
    count=$(rg -l "$key" --glob '*.properties' --glob '*.json' --glob '*.yaml' --glob '*.yml' 2>/dev/null | wc -l)
    echo "$key: $count file(s) with definition"
done

# 2. Check which of the new constants are referenced in Java/Groovy source code
echo ""
echo "=== Searching for usages in source code ==="
for code in $(seq 10019 10040); do
    key="ORG_ZSTACK_HA_${code}"
    count=$(rg -l "$key" --glob '*.java' --glob '*.groovy' 2>/dev/null | wc -l)
    echo "$key: $count file(s) referencing this constant"
done

Repository: MatheMatrix/zstack

Length of output: 252


🏁 Script executed:

#!/bin/bash
# Find error code message definitions and usages for ORG_ZSTACK_HA_10019-10040

echo "=== Searching for message definitions in properties/JSON/YAML files ==="
python3 << 'PYTHON_EOF'
import subprocess
import os

codes_without_defs = []
codes_with_defs = []

for code in range(10019, 10041):  # 10019 to 10040 inclusive
    key = f"ORG_ZSTACK_HA_{code}"
    result = subprocess.run(['rg', key, '-g', '*.properties', '-g', '*.json', '-g', '*.yaml', '-g', '*.yml'], 
                          capture_output=True, text=True)
    matches = result.stdout.strip()
    if matches:
        codes_with_defs.append((key, len(matches.split('\n'))))
    else:
        codes_without_defs.append(key)

if codes_with_defs:
    print(f"Codes WITH message definitions ({len(codes_with_defs)}):")
    for key, count in codes_with_defs:
        print(f"  {key}")
else:
    print("No message definitions found for any of the new codes")

if codes_without_defs:
    print(f"\nCodes WITHOUT message definitions ({len(codes_without_defs)}):")
    for key in codes_without_defs:
        print(f"  {key}")
PYTHON_EOF

echo ""
echo "=== Searching for usages in Java/Groovy source code ==="
python3 << 'PYTHON_EOF'
import subprocess

codes_with_usage = []
codes_without_usage = []

for code in range(10019, 10041):  # 10019 to 10040 inclusive
    key = f"ORG_ZSTACK_HA_{code}"
    result = subprocess.run(['rg', key, '-g', '*.java', '-g', '*.groovy'], 
                          capture_output=True, text=True)
    matches = result.stdout.strip()
    if matches:
        codes_with_usage.append((key, len(matches.split('\n'))))
    else:
        codes_without_usage.append(key)

if codes_with_usage:
    print(f"Codes WITH usages in source ({len(codes_with_usage)}):")
    for key, count in codes_with_usage[:10]:  # Show first 10
        print(f"  {key}: {count} match(es)")
    if len(codes_with_usage) > 10:
        print(f"  ... and {len(codes_with_usage) - 10} more")
else:
    print("No usages found for any of the new codes in source files")

if codes_without_usage:
    print(f"\nCodes NOT used in source ({len(codes_without_usage)}):")
    if len(codes_without_usage) <= 22:
        for key in codes_without_usage:
            print(f"  {key}")
PYTHON_EOF

# Also check the structure of error code message files
echo ""
echo "=== Finding error code message resource files ==="
find . -type f \( -name "*errorCode*" -o -name "*error*code*" \) \( -name "*.properties" -o -name "*.json" -o -name "*.yaml" -o -name "*.yml" \) | head -20

Repository: MatheMatrix/zstack

Length of output: 1394


新增的 22 个错误码缺少消息定义且未被使用

验证结果表明:

  • 所有 22 个新错误码(ORG_ZSTACK_HA_10019 ~ ORG_ZSTACK_HA_10040)均未在任何资源文件中定义消息
  • 所有 22 个新错误码在源代码中均没有被引用

这是一个关键问题:

  1. 错误码缺少对应的消息定义,运行时将无法获取到错误描述信息,影响用户体验
  2. 所有新增的错误码当前都是死代码,需要:
    • 补充消息定义(在 errorCode 相关资源文件中),且
    • 确保在实际业务代码中被引用使用

建议:如果这些错误码是预留性的,应在代码注释中明确说明其预期用途;否则应补充完整的定义和使用,或移除这些未使用的常量。

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@utils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.java`
around lines 1875 - 1918, The new constants ORG_ZSTACK_HA_10019 through
ORG_ZSTACK_HA_10040 in CloudOperationsErrorCode are declared but lack message
definitions and are not referenced; either add message entries for each code in
the errorCode resource files and wire them into the ErrorCode/ResourceLoader
mechanism, then replace placeholder or create usage sites in business logic to
throw or return these codes (search for usages patterns like
ErrorCode/CLOUD_OPERATION_* or methods that create ErrorCode instances) or
remove the unused constants; if they are intended as reserved, add a clear
Javadoc comment on the constants explaining their planned purpose and why they
are unused.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants