fix: /healthz and /readyz wait for initial device connection (SOFIE-4325)#1730
fix: /healthz and /readyz wait for initial device connection (SOFIE-4325)#1730jstarpl wants to merge 2 commits into
/healthz and /readyz wait for initial device connection (SOFIE-4325)#1730Conversation
Walkthrough
Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
packages/mos-gateway/src/mosHandler.ts (2)
83-83: Non-null assertion makes existing_coreHandlerundefined-checks dead code.With
_coreHandler!: CoreHandlerthe field can no longer be typedundefined, so theif (!this._coreHandler) throw …branches at lines 156-158, 294, 453-455, and 599 are unreachable per the type system. They're harmless, but worth removing in a follow-up to keep the runtime invariant aligned with the type. Not blocking.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/mos-gateway/src/mosHandler.ts` at line 83, The field _coreHandler is declared with a non-null assertion but code contains runtime undefined checks, so make the type reflect that by changing the declaration from "_coreHandler!: CoreHandler" to "private _coreHandler?: CoreHandler" so existing guards (the if (!this._coreHandler) throw ...) remain meaningful; update any call sites that currently rely on non-null typing to either keep the guards or use explicit non-null assertions where you truly know it is initialized (e.g., inside methods that run after initialization) so the type and runtime behavior stay aligned.
209-218: Clear_triggerUpdateDevicesTimeoutondispose().The deferred update is now scheduled via
setTimeoutand stored on the instance, butdispose()(lines 141-148) never clears it. The_disposedcheck inside_updateDevices()does prevent work from happening, but the pending timer still keeps the event loop alive and holds a closure reference tothis. Cheap to fix:♻️ Proposed cleanup
async dispose(): Promise<void> { this._disposed = true + if (this._triggerUpdateDevicesTimeout) { + clearTimeout(this._triggerUpdateDevicesTimeout) + this._triggerUpdateDevicesTimeout = null + } if (this.mos) { return this.mos.dispose() } else { return Promise.resolve() } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/mos-gateway/src/mosHandler.ts` around lines 209 - 218, The instance schedules a deferred timer in triggerUpdateDevices (stores it in _triggerUpdateDevicesTimeout) but dispose() does not clear it, leaving the timer alive and holding a reference to this; update dispose() to clearTimeout(this._triggerUpdateDevicesTimeout) and then null/undefined the property (or otherwise reset it) so any pending timer is cancelled, and ensure triggerUpdateDevices still overwrites the field when scheduling a new timer; reference functions/fields: triggerUpdateDevices, _triggerUpdateDevicesTimeout, dispose, and _updateDevices.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/mos-gateway/src/mosHandler.ts`:
- Around line 132-139: The ordering regression causes setupObservers() to run
before _initMosConnection() and can NPE when _deviceOptionsChanged() calls
this.mos.setDebug; fix by guarding _deviceOptionsChanged() to tolerate a missing
mos (e.g. early-return or defer calling this.mos.setDebug if this.mos is
undefined) or by moving setupObservers() to run only after
_initMosConnection()/_updateDevices() has initialized this.mos; update the code
paths in setupObservers, triggerUpdateDevices, _deviceOptionsChanged,
_updateDevices and _initMosConnection so observers never call this.mos methods
when this.mos is undefined (prefer adding a simple if (!this.mos) return or
queue the change until after _initMosConnection completes).
---
Nitpick comments:
In `@packages/mos-gateway/src/mosHandler.ts`:
- Line 83: The field _coreHandler is declared with a non-null assertion but code
contains runtime undefined checks, so make the type reflect that by changing the
declaration from "_coreHandler!: CoreHandler" to "private _coreHandler?:
CoreHandler" so existing guards (the if (!this._coreHandler) throw ...) remain
meaningful; update any call sites that currently rely on non-null typing to
either keep the guards or use explicit non-null assertions where you truly know
it is initialized (e.g., inside methods that run after initialization) so the
type and runtime behavior stay aligned.
- Around line 209-218: The instance schedules a deferred timer in
triggerUpdateDevices (stores it in _triggerUpdateDevicesTimeout) but dispose()
does not clear it, leaving the timer alive and holding a reference to this;
update dispose() to clearTimeout(this._triggerUpdateDevicesTimeout) and then
null/undefined the property (or otherwise reset it) so any pending timer is
cancelled, and ensure triggerUpdateDevices still overwrites the field when
scheduling a new timer; reference functions/fields: triggerUpdateDevices,
_triggerUpdateDevicesTimeout, dispose, and _updateDevices.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7ad57a02-ca62-4d6c-bc26-bb7fb90c58f3
📒 Files selected for processing (1)
packages/mos-gateway/src/mosHandler.ts
| coreHandler.onConnected(() => { | ||
| // This is called whenever a connection to Core has been (re-)established | ||
| this.setupObservers() | ||
| this.sendStatusOfAllMosDevices() | ||
| }) | ||
| this.setupObservers() | ||
|
|
||
| this.triggerUpdateDevices() |
There was a problem hiding this comment.
Ordering regression: setupObservers() now runs before _initMosConnection() and can NPE on this.mos.
Previously init() awaited _updateDevices() (which awaits _initMosConnection()), so this.mos was always defined before any observer/_deviceOptionsChanged could run. After this change, the call order in init() is:
setupObservers()(line 137) — synchronously invokes_deviceOptionsChanged()at line 172.triggerUpdateDevices()(line 139) — schedules_updateDevices()(and therefore_initMosConnection()) ~20 ms later.
If the peripheral device's deviceSettings.debugLogging differs from the default this.debugLogging = false, _deviceOptionsChanged() enters the branch at line 186 and reaches line 191:
if (!this.mos) {
throw Error('mos is undefined!')
}
this.mos.setDebug(this.debugLogging)this.mos is still undefined here, so init() will throw and MosHandler.create() will reject — which is exactly the boot-failure mode this PR is trying to remove for /healthz and /readyz. The same race exists for any peripheralDevice observer event that fires inside the 20 ms window before _updateDevices() initializes this.mos.
Either initialize the MOS connection eagerly (without awaiting device connections) before observers run, or guard _deviceOptionsChanged() so it tolerates this.mos being undefined at this stage. Sketch:
🛡️ Suggested guard in _deviceOptionsChanged
if (this.debugLogging !== settings.debugLogging) {
this._logger.info('Changing debugLogging to ' + settings.debugLogging)
this.debugLogging = !!settings.debugLogging
- if (!this.mos) {
- throw Error('mos is undefined!')
- }
- this.mos.setDebug(this.debugLogging)
+ // `this.mos` may not yet exist if observers fire before _initMosConnection completes;
+ // the value will be applied on next change once mos is initialized.
+ this.mos?.setDebug(this.debugLogging)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/mos-gateway/src/mosHandler.ts` around lines 132 - 139, The ordering
regression causes setupObservers() to run before _initMosConnection() and can
NPE when _deviceOptionsChanged() calls this.mos.setDebug; fix by guarding
_deviceOptionsChanged() to tolerate a missing mos (e.g. early-return or defer
calling this.mos.setDebug if this.mos is undefined) or by moving
setupObservers() to run only after _initMosConnection()/_updateDevices() has
initialized this.mos; update the code paths in setupObservers,
triggerUpdateDevices, _deviceOptionsChanged, _updateDevices and
_initMosConnection so observers never call this.mos methods when this.mos is
undefined (prefer adding a simple if (!this.mos) return or queue the change
until after _initMosConnection completes).
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/mos-gateway/src/mosHandler.ts`:
- Around line 140-141: init() currently returns before the first
_updateDevices() runs and swallows _initMosConnection() failures, so
MosHandler.create() can appear successful while MOS is broken; update init() to
await the initial device update by returning/awaiting
this.triggerUpdateDevices() (or directly await this._updateDevices()) so init()
only resolves after the first successful update, and change the
_initMosConnection() error handling to rethrow or propagate errors instead of
only logging (so callers of init()/MosHandler.create() receive the failure).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1af185af-b884-4d8c-9a2b-ae15055f4973
📒 Files selected for processing (1)
packages/mos-gateway/src/mosHandler.ts
| this.triggerUpdateDevices() | ||
| } |
There was a problem hiding this comment.
Startup now hides fatal MOS init/config errors
init() returns before the first _updateDevices() runs, and Line 215–217 only logs failures. That means _initMosConnection() errors (eg. invalid mosId/ports) no longer fail MosHandler.create(), so the process can report initialized while MOS is broken.
Suggested fix
private async init(config: MosConfig, coreHandler: CoreHandler): Promise<void> {
this.mosOptions = config
this._coreHandler = coreHandler
...
this.mosTypes = getMosTypes(this.strict)
+ // Fail fast on invalid MOS configuration, but still avoid waiting for device connections
+ await this._initMosConnection()
+
coreHandler.onConnected(() => {
this.setupObservers()
this.sendStatusOfAllMosDevices()
})
this.setupObservers()
this.triggerUpdateDevices()
}Also applies to: 214-217
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/mos-gateway/src/mosHandler.ts` around lines 140 - 141, init()
currently returns before the first _updateDevices() runs and swallows
_initMosConnection() failures, so MosHandler.create() can appear successful
while MOS is broken; update init() to await the initial device update by
returning/awaiting this.triggerUpdateDevices() (or directly await
this._updateDevices()) so init() only resolves after the first successful
update, and change the _initMosConnection() error handling to rethrow or
propagate errors instead of only logging (so callers of
init()/MosHandler.create() receive the failure).
About the Contributor
This pull request is posted on behalf of the NRK.
Type of Contribution
This is a:
Bug fix
Current Behavior
The Kubernetes
/healthzand/readyzdon't return an OK signal until devices configured at boot-up have connected.New Behavior
The device connection is detached from boot-up, since it's not really necessary for the boot-up process. This allows the instance to stay up instead of being restarted.
Testing
Affected areas
This PR affects the MOS Gateway bootup process.
Time Frame
Not urgent, but we would like to get this merged into the in-development release.
Other Information
Status