Fix crash on connection timeout#1729
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
WalkthroughWinston logger initialization in the module was updated to set Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/playout-gateway/src/index.ts (1)
89-96: Optional: consider scoping which errors are swallowed, and add observability.Catching every
uncaughtExceptionindefinitely is a known Node.js footgun — the process may be left in an inconsistent state after, e.g., aTypeErrordeep in a callback, and subsequent failures can be hard to diagnose. The PR description already calls this out as a workaround pendingsofie-timeline-state-resolver#457. Two low-cost mitigations worth considering:
- Emit at a higher severity / include the
originargument fromuncaughtExceptionso operators can distinguish "device timeout" noise from genuine bugs:process.on('uncaughtException', (e: Error, origin: string) => { logger.error(`Unhandled exception (${origin}): ${stringifyError(e)}`) })- A metric/counter on these handlers would make it easy to alert if the gateway starts silently absorbing a flood of exceptions instead of crash-looping (which is currently the only signal).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/playout-gateway/src/index.ts` around lines 89 - 96, Update the process error handlers (the process.on('uncaughtException', ...) and process.on('unhandledRejection', ...) blocks) to record the exception origin and increase observability: change the uncaughtException handler signature to accept (err: Error, origin: string) and include origin in the logger.error message along with stringifyError(err), and add a metric/counter increment (e.g. call a metrics.increment or similar telemetry function) inside both handlers to track counts of swallowed uncaught exceptions and unhandled rejections; keep the current non-crashing behavior but ensure you reference stringifyError and logger in the new messages so operators can distinguish device/noise errors from real bugs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/playout-gateway/src/index.ts`:
- Around line 89-96: Remove the explicit process.on listeners for
'uncaughtException' and 'unhandledRejection' (the two handlers that call
logger.error with stringifyError) and rely on Winston's configured transports +
exitOnError: false to handle and log exceptions/rejections; this avoids
duplicate logging from both Winston's internal handlers and the manual
process.on handlers (alternatively, if you prefer keeping manual handlers,
disable handleExceptions/handleRejections on the Winston transports instead).
---
Nitpick comments:
In `@packages/playout-gateway/src/index.ts`:
- Around line 89-96: Update the process error handlers (the
process.on('uncaughtException', ...) and process.on('unhandledRejection', ...)
blocks) to record the exception origin and increase observability: change the
uncaughtException handler signature to accept (err: Error, origin: string) and
include origin in the logger.error message along with stringifyError(err), and
add a metric/counter increment (e.g. call a metrics.increment or similar
telemetry function) inside both handlers to track counts of swallowed uncaught
exceptions and unhandled rejections; keep the current non-crashing behavior but
ensure you reference stringifyError and logger in the new messages so operators
can distinguish device/noise errors from real bugs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 78d36f7d-409f-4189-99b1-0f59477675b3
📒 Files selected for processing (1)
packages/playout-gateway/src/index.ts
| // Log uncaught exceptions and unhandled rejections without crashing the process. | ||
| // Device connection errors (e.g. ETIMEDOUT) must not bring down the gateway. | ||
| process.on('uncaughtException', (e: Error) => { | ||
| logger.error(`Unhandled exception: ${stringifyError(e)}`) | ||
| }) | ||
| process.on('unhandledRejection', (reason: unknown) => { | ||
| logger.error(`Unhandled promise rejection: ${stringifyError(reason)}`) | ||
| }) |
There was a problem hiding this comment.
Likely duplicate logging — Winston already handles these.
Both transports above are configured with handleExceptions: true and handleRejections: true (lines 35–41 and 65–66). Winston's exception/rejection handlers register their own process.on('uncaughtException') / process.on('unhandledRejection') listeners when those transports are attached, so each uncaught exception/rejection will now be logged twice — once by Winston's internal handler and once by the explicit handler added here.
Pick one approach. The smaller change (and consistent with packages/mos-gateway/src/index.ts, which only relies on Winston) is to drop the explicit listeners and let exitOnError: false keep the process alive:
♻️ Option A — rely on Winston only (matches mos-gateway)
-// Log uncaught exceptions and unhandled rejections without crashing the process.
-// Device connection errors (e.g. ETIMEDOUT) must not bring down the gateway.
-process.on('uncaughtException', (e: Error) => {
- logger.error(`Unhandled exception: ${stringifyError(e)}`)
-})
-process.on('unhandledRejection', (reason: unknown) => {
- logger.error(`Unhandled promise rejection: ${stringifyError(reason)}`)
-})♻️ Option B — keep these handlers, disable Winston's built-in handling
const transportConsole = new Winston.transports.Console({
level: logLevel || 'silly',
- handleExceptions: true,
- handleRejections: true,
})
const transportFile = new Winston.transports.File({
level: logLevel || 'silly',
- handleExceptions: true,
- handleRejections: true,
filename: logPath,
format: myFormat,
})…and the same for the console-only transport in the else branch.
Please confirm Winston 3.19.0's behavior here (does attaching a transport with handleExceptions: true install its own process.on('uncaughtException') listener?):
winston 3.19 handleExceptions handleRejections process.on uncaughtException listener
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/playout-gateway/src/index.ts` around lines 89 - 96, Remove the
explicit process.on listeners for 'uncaughtException' and 'unhandledRejection'
(the two handlers that call logger.error with stringifyError) and rely on
Winston's configured transports + exitOnError: false to handle and log
exceptions/rejections; this avoids duplicate logging from both Winston's
internal handlers and the manual process.on handlers (alternatively, if you
prefer keeping manual handlers, disable handleExceptions/handleRejections on the
Winston transports instead).
The update to TSR 4eed5ee seems to have changed the behaviour when a timeout occurs so that some sort of exception occurs. Winston in playout gateway sees this exception and exits. This configures Winston to not exit in those circumstances.
053f85a to
ce0c3bb
Compare
About the Contributor
This pull request is posted on behalf of the BBC.
Type of Contribution
This is a Bug fix
Current Behavior
The update to TSR 4eed5ee seems to have changed the behaviour when a timeout occurs so that some sort of exception occurs. Winston in playout gateway sees this exception and exits.
New Behavior
Change Winston config to not exit on errors.
Testing
Affected areas
This affects how playout gateway handles errors.
Time Frame
Other Information
This PR is not immediately needed if we prevent TSR passing the errors to Sofie core, as in this PR: Sofie-Automation/sofie-timeline-state-resolver#457 but it might be a good idea to do anyway.
Status