fix: exclude status-only API v2 resources from readiness gating #2745
Open
yangkaa wants to merge 2 commits into apache:master
Contributor
Hi @yangkaa, please fix the failed CI.
Author
I fixed the missing license header in the new test file. The remaining failed CI jobs look unrelated to this PR. The E2E jobs fail before the test suite starts, in the existing workflow step
with:
So the current failures seem to come from the CI workflow / ADC dev image layout rather than from the readiness-gating change in this PR.
Baoyuantop approved these changes on Apr 15, 2026
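Background
This change fixes a problem in the startup readiness wait logic of apisix-ingress-controller. In the current implementation, the controller does not begin its first full sync immediately after startup. It first waits for a set of resources to finish their initial reconcile. Only after all of these resources have been marked ready does the provider enter:
Then it proceeds to: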
Symptoms
The current API v2 readiness registration set contains:
- Ingress
- ApisixRoute
- ApisixGlobalRule
- ApisixPluginConfig
- ApisixTls
- ApisixConsumer
- ApisixUpstream

However, in v2.0.1:
- ApisixUpstreamReconciler
- ApisixPluginConfigReconciler

are both status-update-only controllers and never call Readier.Done(...) themselves.
This means:
Root cause analysis
The problem is not an intermittent "lost done". The readiness registration set itself contains resources that should not take part in startup gating.
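In terms of responsibilities:
- ApisixUpstream / ApisixPluginConfig do not directly drive dataplane configuration delivery in the current version.
- ApisixRoute controller.
- ApisixRouteReconciler already watches changes to both kinds of objects, and it calls Readier.Done(...) itself.

Therefore, treating ApisixUpstream and ApisixPluginConfig as hard wait conditions for startup readiness is unnecessary and also blocks startup.
Fix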
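The fix in this PR is to:
- Remove ApisixUpstream from the API v2 readiness registration set
- Remove ApisixPluginConfig from the API v2 readiness registration set

The resources that actually take part in provider sync and already call Done() are kept:
- Ingress
- ApisixRoute
- ApisixGlobalRule
- ApisixTls
- ApisixConsumer

In addition, a unit test was added to constrain the readiness registration list and prevent future regressions.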
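A regression check of this kind might look roughly like the sketch below. The names `registeredKinds`, `statusOnlyKinds`, and `checkRegistration` are hypothetical and chosen for illustration. The test actually added in this PR is not shown here and may be structured differently (for example as a `go test` case using the `testing` package).

```go
package main

import "fmt"

// registeredKinds is a hypothetical helper that returns the API v2 kinds
// taking part in startup readiness gating after the fix.
func registeredKinds() []string {
	return []string{"Ingress", "ApisixRoute", "ApisixGlobalRule", "ApisixTls", "ApisixConsumer"}
}

// statusOnlyKinds lists the kinds whose reconcilers only update status and
// never call Done, so they must not be gated on.
var statusOnlyKinds = []string{"ApisixUpstream", "ApisixPluginConfig"}

// checkRegistration returns an error if any status-only kind is gated.
func checkRegistration(kinds []string) error {
	gated := make(map[string]bool, len(kinds))
	for _, k := range kinds {
		gated[k] = true
	}
	for _, k := range statusOnlyKinds {
		if gated[k] {
			return fmt.Errorf("%s must not take part in readiness gating", k)
		}
	}
	return nil
}

func main() {
	if err := checkRegistration(registeredKinds()); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

Checking against an explicit deny-list keeps the sketch short. A stricter variant could compare the registration list against the exact expected set, so that any newly gated kind has to be reviewed.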
Verification results
Package-level tests
Run:
env GOPROXY=https://goproxy.cn,direct GOSUMDB=sum.golang.google.cn \
  go test ./internal/manager/... ./internal/controller/... ./internal/provider/apisix/...
The tests pass.
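Field verification
The following scenario was verified in a real environment:
- buildpack-upstream already existed before the controller started
- With 2.0.1:
- ApisixUpstream
- waiting for readiness
- ApisixUpstream
- Ready detected, starting sync loop appeared within seconds of startup
- 7070) no longer showed a 5-minute unavailability window

Why this fix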
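This fix deliberately does not force a Done() call into ApisixUpstreamReconciler or ApisixPluginConfigReconciler, because:
- ApisixRoute controller, and the latter already has a complete readiness lifecycle.

Semantically, narrowing the readiness registration scope is therefore safer than adding extra sync semantics to status-only controllers.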
Related issues
This PR relates to the following issues:
- #2725
- #2726