Skip to content

<fix>[eip]: expose attaching EIP to release#4127

Open
ZStack-Robot wants to merge 2 commits into
5.5.22from
sync/dejing.liu/codex/ZSTAC-83309-eip-attach-race
Open

<fix>[eip]: expose attaching EIP to release#4127
ZStack-Robot wants to merge 2 commits into
5.5.22from
sync/dejing.liu/codex/ZSTAC-83309-eip-attach-race

Conversation

@ZStack-Robot
Copy link
Copy Markdown
Collaborator

Persist vmNicUuid and guestIp before applying an EIP on the backend. This lets VM stop and migration release paths see the in-flight EIP and enqueue cleanup while the backend apply is still running.

Roll back the binding only when the same EIP, NIC, and guest IP are still present if backend apply fails.

Related: ZSTAC-83309

Change-Id: Ia9b75acc82004755b905a02acca1a6a82c207332

sync from gitlab !10029

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 28, 2026

Review Change Stack

走查

调整 EIP 附加处理:成功时缓存并刷新绑定字段以生成回调 inventory;失败时使用原始绑定值通过 SQL 清空记录中的绑定字段并进入失败通知。新增集成测试覆盖并发释放和失败回滚场景。

变更

EIP 附加状态管理 与 测试

Layer / File(s) Summary
附加流程成功和失败的状态管理
plugin/eip/src/main/java/org/zstack/network/service/eip/EipManagerImpl.java
在 attach 成功分支缓存 vmNicUuidguestIp,通过 dbf.updateAndRefresh() 刷新并生成用于成功回调的 EipInventory;失败分支使用原始绑定值作为 SQL 定位条件,清空对应 EIP 记录的 vmNicUuidguestIp
并发与失败回滚集成测试
test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/eip/AttachEipRaceCase.groovy
新增 AttachEipRaceCase,构建测试环境并包含两项验证:并发 attach 场景中停止 VM 应触发 EIP 释放(拦截并断言 DELETE_EIP_PATH),以及 attach 失败时后端返回 500 应回滚 EipVO 中的 vmNicUuid/guestIpnull

序列图

sequenceDiagram
  participant Client
  participant EipManagerImpl
  participant Database
  participant FlatEipBackend
  participant Callback

  Client->>EipManagerImpl: APIAttachEipMsg
  EipManagerImpl->>EipManagerImpl: 缓存 attachedVmNicUuid/attachedGuestIp
  EipManagerImpl->>Database: dbf.updateAndRefresh(vo)
  Database-->>EipManagerImpl: updated EipVO
  EipManagerImpl->>FlatEipBackend: APPLY_EIP 请求
  alt 附加成功
    FlatEipBackend-->>EipManagerImpl: 200 OK
    EipManagerImpl->>Callback: 发布成功事件(使用刷新后的 EipInventory)
  else 附加失败
    FlatEipBackend-->>EipManagerImpl: 500 错误
    EipManagerImpl->>Database: SQL 按 eipUuid + 原 vmNicUuid + 原 guestIp 清空绑定字段
    Database-->>EipManagerImpl: 清空完成
    EipManagerImpl->>Callback: 进入失败通知流程
  end
Loading

评估代码审查工作量

🎯 4 (复杂) | ⏱️ ~45 分钟

🐰 我在草丛里跳跃,代码也跟着跑,
成功时我把绑定刷新好,
失败时悄悄回滚不留毛,
测试并发把边界守牢,
小小一行,稳如老窖。

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed PR标题遵循了指定格式 '[scope]: ',长度为43个字符,未超过72字符限制,且标题与changeset直接相关。
Description check ✅ Passed PR描述与changeset相关,清晰说明了在EIP绑定前持久化vmNicUuid和guestIp、以及失败时的回滚逻辑。
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/dejing.liu/codex/ZSTAC-83309-eip-attach-race

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
plugin/eip/src/main/java/org/zstack/network/service/eip/EipManagerImpl.java (1)

610-621: ⚡ Quick win

在回滚条件更新后检查受影响行数并告警,防止与 vmIpChanged 竞态造成状态不一致。

  • vmIpChanged(...) 会在找到 EipVO_.vmNicUuid == nic.uuidEipVO_.guestIp == oldIp 时,直接 eip.setGuestIp(newIp)dbf.update(eip)
  • 回滚里使用条件更新同时匹配 EipVO_.vmNicUuid == attachedVmNicUuidEipVO_.guestIp == attachedGuestIp;若回滚前已发生 vmIpChanged 改写了 guestIp,则该回滚可能更新 0 行,绑定不会被清空。
  • UpdateQuery.update() 返回 int,建议在回滚处接收返回值:affectedRows == 0 时记录 warning 日志(带 eipUuid/attachedVmNicUuid/attachedGuestIp),便于排查该边界竞态。
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugin/eip/src/main/java/org/zstack/network/service/eip/EipManagerImpl.java`
around lines 610 - 621, The rollback in the anonymous fail(...) inside
EipManagerImpl uses a conditional UpdateQuery but ignores the returned affected
row count; modify the fail handler to capture the int returned by
SQL.New(...).set(...).update() and when affectedRows == 0 emit a warning log
(include msg.getEipUuid(), attachedVmNicUuid and attachedGuestIp) to surface the
vmIpChanged race where guestIp was changed and the conditional update cleared 0
rows; keep existing evt.setError(...) and bus.publish(evt) behavior after
logging.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@plugin/eip/src/main/java/org/zstack/network/service/eip/EipManagerImpl.java`:
- Around line 610-621: The rollback in the anonymous fail(...) inside
EipManagerImpl uses a conditional UpdateQuery but ignores the returned affected
row count; modify the fail handler to capture the int returned by
SQL.New(...).set(...).update() and when affectedRows == 0 emit a warning log
(include msg.getEipUuid(), attachedVmNicUuid and attachedGuestIp) to surface the
vmIpChanged race where guestIp was changed and the conditional update cleared 0
rows; keep existing evt.setError(...) and bus.publish(evt) behavior after
logging.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: 7442b00d-6e9c-4be2-9da2-4d7be0a506d2

📥 Commits

Reviewing files that changed from the base of the PR and between 113a77a and aa07605.

📒 Files selected for processing (1)
  • plugin/eip/src/main/java/org/zstack/network/service/eip/EipManagerImpl.java

dejing.liu added 2 commits June 2, 2026 10:07
Persist vmNicUuid and guestIp before applying an EIP on the backend. This lets VM stop and migration release paths see the in-flight EIP and enqueue cleanup while the backend apply is still running.

Roll back the binding only when the same EIP, NIC, and guest IP are still present if backend apply fails.

Related: ZSTAC-83309

Change-Id: Ia9b75acc82004755b905a02acca1a6a82c207332
Add an integration subcase that blocks backend EIP apply.

Verify the EIP DB binding before the backend call returns.

Cover backend apply failure too.

Ensure the early binding is rolled back after attach fails.

Related: ZSTAC-83309

Change-Id: Ifadfc85f14412579abb6c7b5f1706bc614db5edd
@MatheMatrix MatheMatrix force-pushed the sync/dejing.liu/codex/ZSTAC-83309-eip-attach-race branch from aa07605 to 8ac6855 Compare June 2, 2026 02:12
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/eip/AttachEipRaceCase.groovy`:
- Around line 174-180: The test currently asserts deleteCmd != null immediately
after StopVmInstanceAction.call(), but deleteCmd is set asynchronously in the
env.afterSimulator callback for FlatEipBackend.DELETE_EIP_PATH; update the test
to wait for the simulator callback to run before asserting (e.g. poll/retry with
timeout or use a CountDownLatch/AtomicReference) by waiting until deleteCmd is
non-null (and optionally verifying cmd.eip.eipUuid == eip.uuid) after calling
StopVmInstanceAction.call(); reference the
env.afterSimulator(FlatEipBackend.DELETE_EIP_PATH) callback, the deleteCmd
variable, and the StopVmInstanceAction.call() invocation when making the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: daff05d3-149d-48e5-8831-b2c376e7c800

📥 Commits

Reviewing files that changed from the base of the PR and between aa07605 and 8ac6855.

📒 Files selected for processing (2)
  • plugin/eip/src/main/java/org/zstack/network/service/eip/EipManagerImpl.java
  • test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/eip/AttachEipRaceCase.groovy
🚧 Files skipped from review as they are similar to previous changes (1)
  • plugin/eip/src/main/java/org/zstack/network/service/eip/EipManagerImpl.java

Comment on lines +174 to +180
env.afterSimulator(FlatEipBackend.DELETE_EIP_PATH) { rsp, HttpEntity<String> entity ->
FlatEipBackend.DeleteEipCmd cmd = json(entity.getBody(), FlatEipBackend.DeleteEipCmd.class)
if (cmd.eip.eipUuid == eip.uuid) {
deleteCmd = cmd
}
return rsp
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

等待 DELETE_EIP_PATH 回调后再断言。

Line 221 直接断言 deleteCmd != null,但这个值是在模拟器回调里异步赋值的。StopVmInstanceAction.call() 返回时,删除请求不一定已经到达这里,测试会变成偶发失败。

建议修改
         CountDownLatch applyEntered = new CountDownLatch(1)
         CountDownLatch releaseApply = new CountDownLatch(1)
+        CountDownLatch deleteEntered = new CountDownLatch(1)
         FlatEipBackend.ApplyEipCmd applyCmd = null
         FlatEipBackend.DeleteEipCmd deleteCmd = null
@@
         env.afterSimulator(FlatEipBackend.DELETE_EIP_PATH) { rsp, HttpEntity<String> entity ->
             FlatEipBackend.DeleteEipCmd cmd = json(entity.getBody(), FlatEipBackend.DeleteEipCmd.class)
             if (cmd.eip.eipUuid == eip.uuid) {
                 deleteCmd = cmd
+                deleteEntered.countDown()
             }
             return rsp
         }
@@
         def result = action.call()
 
         assert result.error == null
+        assert deleteEntered.await(30, TimeUnit.SECONDS)
         assert deleteCmd != null
         assert deleteCmd.eip.eipUuid == eip.uuid

Also applies to: 215-223

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@test/src/test/groovy/org/zstack/test/integration/networkservice/provider/flat/eip/AttachEipRaceCase.groovy`
around lines 174 - 180, The test currently asserts deleteCmd != null immediately
after StopVmInstanceAction.call(), but deleteCmd is set asynchronously in the
env.afterSimulator callback for FlatEipBackend.DELETE_EIP_PATH; update the test
to wait for the simulator callback to run before asserting (e.g. poll/retry with
timeout or use a CountDownLatch/AtomicReference) by waiting until deleteCmd is
non-null (and optionally verifying cmd.eip.eipUuid == eip.uuid) after calling
StopVmInstanceAction.call(); reference the
env.afterSimulator(FlatEipBackend.DELETE_EIP_PATH) callback, the deleteCmd
variable, and the StopVmInstanceAction.call() invocation when making the change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant