Probe cycle discrimination + scheduler health snapshot by henry0816191 · Pull Request #50 · cppalliance/paperscout

henry0816191 · 2026-05-25T16:35:48Z

Summary

run_cycle() returns CycleResult (success / empty / failed) instead of a list.
/health reads scheduler state from a locked, immutable snapshot (last_updated, last_cycle_status, etc.).

Why

Operators could not tell a real empty probe from a failed one, and the health thread read scheduler fields without synchronization.
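
A minimal sketch of the locked-snapshot idea, assuming one scheduler thread that publishes and one health thread that reads. `DemoScheduler`, `publish()` and the three fields shown are hypothetical stand-ins; only the names `SchedulerSnapshot` and `health_snapshot()` come from this PR.

```python
import threading
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class SchedulerSnapshot:
    """Immutable view of scheduler state (illustrative field subset)."""
    last_updated: Optional[str] = None
    poll_count: int = 0
    last_cycle_status: Optional[str] = None


class DemoScheduler:
    """Hypothetical stand-in for the real scheduler."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._snapshot = SchedulerSnapshot()
        self._poll_count = 0

    def publish(self, status: str) -> None:
        # Build the new snapshot first, then swap the reference under the lock.
        self._poll_count += 1
        snap = SchedulerSnapshot(
            last_updated=datetime.now(timezone.utc).isoformat(),
            poll_count=self._poll_count,
            last_cycle_status=status,
        )
        with self._lock:
            self._snapshot = snap

    def health_snapshot(self) -> dict:
        # Readers hold the lock only long enough to grab the reference.
        with self._lock:
            snap = self._snapshot
        return asdict(snap)
```

Because the reader always receives a dict built from one complete snapshot, it cannot observe a new `poll_count` paired with a stale `last_cycle_status`.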

Main changes

models.py — CycleStatus, CycleResult
sources.py — discriminated run_cycle(), new log lines
monitor.py — handle all cycle outcomes; do not advance staleness on FAILED; SchedulerSnapshot + health_snapshot()
__main__.py — health extras from scheduler.health_snapshot()

Tests

pytest tests/test_sources.py tests/test_monitor.py tests/test_health.py -q

138 passed (includes empty/failed cycle, FAILED staleness, concurrent health snapshot).

Related issues

close #46
close #49

Summary by CodeRabbit

New Features
- /health now publishes an atomic, lock-protected scheduler snapshot (last-updated, poll count, probe stats, probe success rate, last cycle status/error) and merges allow-listed queue and DB metrics
- Probe cycles now return discriminated outcomes (SUCCESS / EMPTY / FAILED) with clearer probe-cycle logging
Behavior
- /health sanitizes extra fields so extras cannot overwrite core health keys
Documentation
- Clarified concurrency and immutable health-snapshot semantics
Tests
- Expanded tests for health merging, concurrency, probe-cycle outcomes, and queue health metrics
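
The sanitization behavior listed above (extras cannot overwrite core health keys) might be sketched as follows. `build_health_response` is a hypothetical name and the base keys in the usage note are invented for illustration; the actual handler in `health.py` may differ.

```python
from typing import Any, Callable


def build_health_response(
    base: dict[str, Any],
    extra_fields_fn: Callable[[], dict[str, Any]],
) -> dict[str, Any]:
    """Merge extras into the base payload without letting extras win."""
    try:
        extra = extra_fields_fn()
    except Exception:
        # A failing extras provider should not take down /health.
        extra = {}
    if not isinstance(extra, dict):
        extra = {}
    # Drop any extra key that collides with a core health key.
    safe_extra = {k: v for k, v in extra.items() if k not in base}
    return {**base, **safe_extra}
```

With `base = {"status": "ok"}` and extras returning `{"status": "x", "mq_depth": 2}`, the response keeps `status == "ok"` and gains only `mq_depth`.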

coderabbitai · 2026-05-25T16:36:00Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

📝 Walkthrough

Implements discriminated ISO probe cycle outcomes (CycleResult with SUCCESS/EMPTY/FAILED), publishes an immutable SchedulerSnapshot under a lock, records cycle status/error and probe stats, merges scheduler snapshot + allow-listed MQ + db_pool for /health extras (scheduler wins on conflicts), and expands tests/docs for safety and concurrency.

Changes

Discriminated Cycles and Health Snapshots

Layer / File(s)	Summary
Docs and changelog updates `CHANGELOG.md`, `docs/architecture.md`	Document CycleResult/CycleStatus and SchedulerSnapshot + health_snapshot() behavior.
Cycle Result Types `src/paperscout/models.py`	`CycleStatus` enum and `CycleResult` frozen dataclass define discriminated probe outcomes and conditional `hits` property.
ISOProber Cycle Outcome Refactor `src/paperscout/sources.py`	`run_cycle()` now returns `CycleResult`, wraps cycle in try/except to return `FAILED` on cycle-level exceptions, runs concurrent probes, classifies `SUCCESS`/`EMPTY`, logs outcomes, and includes `cycle_status` in PROBE-CYCLE-SUMMARY.
Scheduler snapshot & health state `src/paperscout/monitor.py`	Adds probe success-rate computation, frozen `SchedulerSnapshot` and module defaults, thread-safe health fields and helpers, and snapshot publish under a lock integrated into `seed()`/`poll_once()`.
Main /health merge helpers `src/paperscout/__main__.py`	Adds `_mq_health_fields()` and `_merge_extra_health_fields()` to fetch MQ metrics (with fallbacks), allow-list MQ keys, merge with scheduler snapshot (scheduler-first), and append `db_pool`.
Health handler sanitization `src/paperscout/health.py`	Sanitize `extra_fields_fn()` output to prevent overwriting base health fields; build response from `{base, safe_extra}`.
MessageQueue metrics & config `src/paperscout/scout.py`, `src/paperscout/config.py`	Add `MessageQueue.health_fields()` reporting `mq_depth`, `mq_max_size`, `mq_utilization`, `mq_circuit_state`, and new MQ-related `Settings` fields used by health merging and tests.
ISOProber Unit Tests `tests/test_sources.py`	Update tests to assert `cycle.status` and `cycle.hits`; add `test_run_cycle_empty` and `test_run_cycle_failed`.
Scheduler Tests and Helpers `tests/test_monitor.py`	Add `CycleResult` helpers, update scheduler mocks to use `CycleResult` objects, assert FAILED cycles don't advance `last_successful_poll`, and validate `health_snapshot()` immutability and `poll_count`/`last_updated`.
Health Endpoint Tests and Concurrency `tests/test_health.py`	Expand `extra_fields_fn` fixture to include snapshot fields; add safety tests preventing overwrite/failure propagation; add concurrent snapshot publisher test verifying `/health` consistency under concurrent updates.
Health Merge Unit Tests `tests/test_main_health_merge.py`	Add tests for `_merge_extra_health_fields` and `_mq_health_fields`: scheduler-wins precedence, dropping unknown MQ keys with debug logs, fallback on exceptions, and allow-listed MQ key inclusion.
MessageQueue & CycleResult Tests `tests/test_message_queue.py`, `tests/test_models_cycle_result.py`	Add unit tests for `MessageQueue.health_fields()` outputs and `CycleResult` invariants.
Test fixtures `tests/conftest.py`	Add MQ-related test defaults used by the new tests.

Sequence Diagram

sequenceDiagram
  participant Scheduler
  participant ISOProber
  participant HealthEndpoint
  participant MessageQueue
  Scheduler->>ISOProber: run_cycle()
  alt Probe Success
    ISOProber->>ISOProber: perform concurrent HEAD probes, collect hits
    ISOProber-->>Scheduler: CycleResult(SUCCESS, results=hits)
  else Probe Empty
    ISOProber->>ISOProber: probes return no hits
    ISOProber-->>Scheduler: CycleResult(EMPTY)
  else Probe Failed
    ISOProber->>ISOProber: exception during cycle
    ISOProber-->>Scheduler: CycleResult(FAILED, error=msg)
  end
  Scheduler->>Scheduler: _probe_hits_from_cycle() (record status/error)
  Scheduler->>Scheduler: _record_probe_cycle_completion() (store probe_stats)
  alt non-FAILED cycle
    Scheduler->>Scheduler: _mark_poll_successful_if_probe_ok()
  end
  Scheduler->>Scheduler: _publish_health_snapshot()
  HealthEndpoint->>Scheduler: health_snapshot()
  Scheduler-->>HealthEndpoint: SchedulerSnapshot dict
  HealthEndpoint->>MessageQueue: health_fields() / depth()
  MessageQueue-->>HealthEndpoint: MQ metrics dict
  HealthEndpoint->>HealthEndpoint: merge snapshot + MQ + db_pool (scheduler wins)
  HealthEndpoint-->>Client: /health response
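
The scheduler-side branch of the diagram (record status and error always, advance the staleness clock only on a non-FAILED cycle) could be sketched like this. `PollState` and `record_cycle` are hypothetical names, and the enum values are assumed from the strings that appear in the health tests.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class CycleStatus(Enum):
    SUCCESS = "success"
    EMPTY = "empty"
    FAILED = "failed"


@dataclass
class PollState:
    """Hypothetical container for the fields the scheduler tracks."""
    last_cycle_status: Optional[CycleStatus] = None
    last_cycle_error: Optional[str] = None
    last_successful_poll: Optional[datetime] = None


def record_cycle(state: PollState, status: CycleStatus,
                 error: Optional[str] = None) -> None:
    # Status and error are always recorded, whatever the outcome.
    state.last_cycle_status = status
    state.last_cycle_error = error
    # Only SUCCESS and EMPTY count as a completed poll; FAILED leaves the
    # staleness clock where it was, so a broken prober still looks stale.
    if status is not CycleStatus.FAILED:
        state.last_successful_poll = datetime.now(timezone.utc)
```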

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

cppalliance/paperscout#43: Both PRs modify the ISOProber probing flow in src/paperscout/sources.py—changing run_cycle/_probe_one behavior and the PROBE-CYCLE-SUMMARY/cycle-metrics output.
cppalliance/paperscout#44: Overlaps on probe success rate computation and health metric plumbing.
cppalliance/paperscout#29: Related changes to /health extra fields and snapshot wiring.

Suggested labels

bug

Suggested reviewers

wpak-ai

Poem

🐰 I hopped through cycles, brave and spry,

Found success, empty, or why it died;
I locked a snapshot, stamped the time,
Merged queues and DB in tidy rhyme;
Now health reports a steady sign.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Probe cycle discrimination + scheduler health snapshot' clearly summarizes the two main changes: discriminating cycle outcomes and adding a health snapshot mechanism.
Description check	✅ Passed	The description covers summary, rationale, main changes, and test results with clear references to related issues (`#46`, `#49`), though test plan checkboxes are incomplete.
Linked Issues check	✅ Passed	The PR successfully implements both `#46` (discriminated CycleResult with SUCCESS/EMPTY/FAILED) and `#49` (synchronized health snapshots via SchedulerSnapshot), with comprehensive test coverage including concurrency scenarios.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to issues `#46` and `#49`; MessageQueue config fields and health_fields() support the health snapshot feature; no unrelated changes detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/paperscout/monitor.py`:
- Around line 128-136: The health snapshot fallback uses
dict(_HEALTH_SNAPSHOT_DEFAULTS) which only performs a shallow copy and ends up
sharing the nested probe_stats dict; change the construction in the code that
creates the fallback (where dict(_HEALTH_SNAPSHOT_DEFAULTS) is used) to produce
a deep/independent copy of nested structures—either use copy.deepcopy on
_HEALTH_SNAPSHOT_DEFAULTS or rebuild the dict while explicitly creating a new
empty dict for "probe_stats" (e.g., {**_HEALTH_SNAPSHOT_DEFAULTS, "probe_stats":
{}}) so each consumer gets its own probe_stats instead of a shared mutable
object.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f831e1c1-21d8-46f5-b3dd-da55b443e852

📥 Commits

Reviewing files that changed from the base of the PR and between f7b0bd5 and fc58957.

📒 Files selected for processing (9)

CHANGELOG.md
docs/architecture.md
src/paperscout/__main__.py
src/paperscout/models.py
src/paperscout/monitor.py
src/paperscout/sources.py
tests/test_health.py
tests/test_monitor.py
tests/test_sources.py

henry0816191 · 2026-05-25T16:58:45Z

@coderabbitai don't skip. full review

coderabbitai · 2026-05-25T17:01:32Z

@henry0816191 Here is the full review of PR #50.

PR `#50` — Full Review: Probe cycle discrimination + scheduler health snapshot

Summary

The implementation is well-structured, correctly addresses issues #46 and #49, and the test coverage is solid (happy path, empty, failed, staleness, concurrent snapshot reads). A few issues are worth addressing before merging.

`src/paperscout/models.py`

CycleResult lacks invariant validation

CycleResult is frozen but its fields are not validated. A FAILED result with error=None, or a SUCCESS result with an empty results tuple, is silently constructable.

@dataclass(frozen=True, slots=True)
class CycleResult:
    status: CycleStatus
    results: tuple[ProbeHit, ...] = ()
    error: str | None = None

    def __post_init__(self) -> None:
        if self.status == CycleStatus.FAILED and not self.error:
            raise ValueError("CycleResult FAILED must carry a non-empty error string")
        if self.status == CycleStatus.SUCCESS and not self.results:
            raise ValueError("CycleResult SUCCESS must carry at least one ProbeHit")
        if self.status != CycleStatus.FAILED and self.error is not None:
            raise ValueError("CycleResult error is only valid for FAILED status")
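
A self-contained way to exercise the proposed invariants, as a sketch only: `ProbeHit` here is a hypothetical one-field stand-in, `is_constructible` is an illustrative helper, and `slots=True` is dropped so the example does not depend on Python 3.10.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CycleStatus(Enum):
    SUCCESS = "success"
    EMPTY = "empty"
    FAILED = "failed"


@dataclass(frozen=True)
class ProbeHit:
    """Hypothetical stand-in; the real ProbeHit has its own fields."""
    url: str


@dataclass(frozen=True)
class CycleResult:
    status: CycleStatus
    results: tuple = ()
    error: Optional[str] = None

    def __post_init__(self) -> None:
        if self.status is CycleStatus.FAILED and not self.error:
            raise ValueError("FAILED must carry a non-empty error string")
        if self.status is CycleStatus.SUCCESS and not self.results:
            raise ValueError("SUCCESS must carry at least one ProbeHit")
        if self.status is not CycleStatus.FAILED and self.error is not None:
            raise ValueError("error is only valid for FAILED status")


def is_constructible(**kwargs) -> bool:
    """Return True if CycleResult(**kwargs) passes validation."""
    try:
        CycleResult(**kwargs)
    except ValueError:
        return False
    return True
```

Under these rules `is_constructible(status=CycleStatus.EMPTY)` holds, while a FAILED result without an error, or a SUCCESS result without hits, is rejected at construction time.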

`src/paperscout/sources.py`

Dead code: unreachable return None in _probe_one

The async with sem: block always returns (a ProbeHit | None) or raises. The bare return None after it is unreachable and should be removed.

    async with sem:
        ...
        return ProbeHit(...)
    return None  # ← unreachable; remove this line

`src/paperscout/monitor.py`

1. Missing wildcard case in _probe_hits_from_cycle

The match statement covers all three current CycleStatus values, but there is no case _: wildcard. If a new status is added later the method falls through and implicitly returns None, silently breaking the -> list[ProbeHit] contract. Add a defensive fallback:

def _probe_hits_from_cycle(self, cycle: CycleResult) -> list[ProbeHit]:
    self._last_cycle_status = cycle.status
    self._last_cycle_error = cycle.error
    match cycle.status:
        case CycleStatus.SUCCESS:
            return cycle.hits
        case CycleStatus.EMPTY:
            log.info("POLL  probe cycle empty")
            return []
        case CycleStatus.FAILED:
            log.error("POLL  probe cycle failed: %s", cycle.error)
            return []
        case _:
            log.error("POLL  unknown cycle status: %s", cycle.status)
            return []

2. _record_probe_cycle_completion() is called twice in the seed paths

seed() already calls _record_probe_cycle_completion() when enable_iso_probe=True. Both early-return branches in poll_once() call it again unconditionally:

Cold-start path (not had_prior_state, enable_iso_probe=True): seed() → _record_probe_cycle_completion(), then poll_once() → _record_probe_cycle_completion() again.
Restart-with-prior-state path: same double-call.

The second call is harmless (snapshots the same stats), but it's misleading. Consider removing the redundant calls from poll_once() for the seed sub-paths, or moving all stat recording out of seed().

3. match/case requires Python ≥ 3.10

_probe_hits_from_cycle uses structural pattern matching. Please verify that requires-python in pyproject.toml is >=3.10, or replace the match statement with an if/elif/else chain if the project targets 3.8/3.9.
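
If older interpreters do need to be supported, an if/elif equivalent is straightforward. The sketch below is a free function with a simplified, hypothetical signature (status, hits, error) rather than the real method, and the logger name is invented.

```python
import logging
from enum import Enum
from typing import Optional

log = logging.getLogger("paperscout.demo")  # hypothetical logger name


class CycleStatus(Enum):
    SUCCESS = "success"
    EMPTY = "empty"
    FAILED = "failed"


def probe_hits_from_cycle(status, hits: list, error: Optional[str]) -> list:
    """if/elif equivalent of the match statement; no 3.10 syntax needed."""
    if status is CycleStatus.SUCCESS:
        return list(hits)
    if status is CycleStatus.EMPTY:
        log.info("POLL  probe cycle empty")
        return []
    if status is CycleStatus.FAILED:
        log.error("POLL  probe cycle failed: %s", error)
        return []
    # Defensive fallback, mirroring the proposed `case _:` branch.
    log.error("POLL  unknown cycle status: %s", status)
    return []
```

The trailing fallback plays the same role as the wildcard case: an unrecognized status still yields a list, so the return-type contract holds.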

4. SchedulerSnapshot.probe_stats — mutable dict in a frozen dataclass

frozen=True prevents reassigning the probe_stats field but doesn't prevent in-place mutation of the dict. Since _publish_health_snapshot copies stats (dict(self._last_probe_stats)) and health_snapshot calls dataclasses.asdict() (which deep-copies nested structures), this is safe in practice. However, for true immutability, consider wrapping the copy in types.MappingProxyType(stats), or document the invariant explicitly.
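
A sketch of the read-only mapping option, assuming a reduced two-field snapshot; `make_snapshot` is a hypothetical helper, not something in the PR.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Any, Mapping


@dataclass(frozen=True)
class SchedulerSnapshot:
    """Illustrative subset of the snapshot fields."""
    poll_count: int = 0
    probe_stats: Mapping[str, Any] = field(
        default_factory=lambda: MappingProxyType({})
    )


def make_snapshot(poll_count: int, stats: dict) -> SchedulerSnapshot:
    # Copy first so later changes to `stats` cannot show through the proxy.
    return SchedulerSnapshot(poll_count, MappingProxyType(dict(stats)))
```

Item assignment on `snap.probe_stats` then raises `TypeError`. One thing to check before adopting this: a mappingproxy does not support `copy.deepcopy`, so a `health_snapshot()` built on `dataclasses.asdict()` would likely need to convert the field with `dict(snap.probe_stats)` itself.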

`src/paperscout/__main__.py`

Key merge order in _extra_health_fields

return {**scheduler.health_snapshot(), **mq_extra, "db_pool": _pool_status(pool)}

mq_extra can silently overwrite any key from scheduler.health_snapshot() (e.g., last_updated if health_fields() ever returns it). Consider an explicit allow-list or at least a debug-level log/assertion when conflicts occur.
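
One possible shape for the allow-list suggestion, where scheduler fields are applied last so they win on conflict. The function name, logger name, and the exact allow-list are assumptions; the four keys are the MQ fields named elsewhere in this PR.

```python
import logging
from typing import Any

log = logging.getLogger("paperscout.demo")  # hypothetical logger name

# Assumed allow-list, taken from the MQ keys named in this PR.
MQ_ALLOWED_KEYS = frozenset(
    {"mq_depth", "mq_max_size", "mq_utilization", "mq_circuit_state"}
)


def merge_extra_health_fields(
    scheduler_fields: dict[str, Any],
    mq_extra: dict[str, Any],
    db_pool: Any,
) -> dict[str, Any]:
    merged: dict[str, Any] = {}
    for key, value in mq_extra.items():
        if key not in MQ_ALLOWED_KEYS:
            log.debug("health: dropping unknown MQ key %r", key)
            continue
        merged[key] = value
    # Scheduler fields are applied last, so they win on any conflict.
    merged.update(scheduler_fields)
    merged["db_pool"] = db_pool
    return merged
```

With this ordering, an MQ implementation that starts returning `last_updated` can neither overwrite the scheduler's value nor add the key, since it is not allow-listed.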

`tests/test_monitor.py`

Missing test: health_snapshot() defaults before any snapshot is published

There is a test for health_snapshot() after a poll, but no test verifies the default dict returned when _health_snapshot is None (i.e., before the first _publish_health_snapshot() call):

async def test_health_snapshot_defaults_before_first_poll(self, fake_pool):
    scheduler, _, _, _, _ = _make_scheduler(fake_pool)
    snap = scheduler.health_snapshot()
    assert snap["last_updated"] is None
    assert snap["poll_count"] == 0
    assert snap["last_cycle_status"] is None
    assert snap["last_successful_poll"] is None

`tests/test_health.py`

TestHealthSnapshotConcurrency test could assert field-type consistency more strictly

The loop asserts last_cycle_status in ("success", "empty", None) — good. Consider also asserting that poll_count is an int and last_updated (when not None) is a parseable ISO datetime string, to catch any serialization issues:

if data["last_updated"] is not None:
    datetime.fromisoformat(data["last_updated"])  # would raise if malformed
assert isinstance(data["poll_count"], int)

Minor nits

File	Location	Note
`sources.py`	`run_cycle()`	The `PROBE-CYCLE-SUMMARY` JSON log is emitted after the `PROBE-SUCCESS` / `PROBE-EMPTY` log. Reversing the order would make it easier to correlate the summary line with the outcome line when grepping logs.
`monitor.py`	`health_snapshot()`	`copy.deepcopy(_HEALTH_SNAPSHOT_DEFAULTS)` is called on every request when `_health_snapshot is None`. Since defaults are constant, a shallow `dict(_HEALTH_SNAPSHOT_DEFAULTS)` is sufficient (no nested mutable values that need deep-copying).
`models.py`	`CycleResult.hits` docstring	`"otherwise empty"` — consider clarifying it also returns empty for `EMPTY` status, not just `FAILED`.

Verdict

The approach is sound. The staleness-clock fix (not advancing _last_successful_poll on FAILED) and the lock-protected snapshot are the right patterns. The main actionable items are:

Unreachable return None in _probe_one (trivial).
Missing wildcard case in _probe_hits_from_cycle (low-risk but important for forward safety).
Python version check for match/case (must verify).
Redundant _record_probe_cycle_completion() calls in seed paths (cleanup).
CycleResult.__post_init__ validation (defensive programming).

coderabbitai

🧹 Nitpick comments (3)

src/paperscout/scout.py (1)

58-67: ⚡ Quick win

Keep mq_utilization normalized to [0.0, 1.0].

MessageQueue is unbounded today, so depth / mq_max_size can exceed 1.0 under load. Clamping keeps dashboard semantics stable.

Proposed patch

     def health_fields(self) -> dict[str, Any]:
         """Metrics for the ``/health`` endpoint (merged by ``__main__``)."""
         d = self.depth()
         m = settings.mq_max_size
+        util = (d / m) if m else 0.0
         return {
             "mq_depth": d,
             "mq_max_size": m,
-            "mq_utilization": round(d / m, 4) if m else 0.0,
+            "mq_utilization": round(min(max(util, 0.0), 1.0), 4),
             "mq_circuit_state": "closed",
         }
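
Pulled out as a standalone function, the clamped calculation is easy to check against a few values. `mq_utilization` as a free function is a hypothetical refactor for illustration; the patch above keeps the logic inline in `health_fields()`.

```python
def mq_utilization(depth: int, max_size: int) -> float:
    """Queue fill ratio clamped to [0.0, 1.0]; 0.0 when max_size is falsy."""
    util = (depth / max_size) if max_size else 0.0
    return round(min(max(util, 0.0), 1.0), 4)
```

A depth of 5 against a max size of 2 reports 1.0 rather than 2.5, and a max size of 0 reports 0.0 instead of dividing by zero.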

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/paperscout/scout.py` around lines 58 - 67, The mq_utilization calculation
in health_fields can exceed 1.0 because MessageQueue is unbounded; change the
computation in scout.health_fields to compute utilization = (d / m) if m else
0.0 and then clamp it to [0.0, 1.0] (e.g., utilization = min(1.0, max(0.0,
utilization))) before rounding; keep the existing keys
("mq_depth","mq_max_size","mq_utilization","mq_circuit_state") and use
settings.mq_max_size and self.depth() as the sources.

tests/test_main_health_merge.py (1)

59-62: ⚡ Quick win

Assert all allow-listed MQ fields in the inclusion test.

You already pass mq_utilization and mq_circuit_state in mq_extra; asserting them here closes a regression gap in allow-list behavior.

Proposed patch

 def test_merge_includes_allowlisted_mq_fields():
@@
     out = _merge_extra_health_fields(scheduler, mq_extra, {})
     assert out["mq_depth"] == 2
     assert out["mq_max_size"] == 1000
+    assert out["mq_utilization"] == 0.002
+    assert out["mq_circuit_state"] == "closed"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_main_health_merge.py` around lines 59 - 62, The test currently
only asserts mq_depth and mq_max_size after calling
_merge_extra_health_fields(scheduler, mq_extra, {}); update the test to also
assert that the allow-listed MQ fields passed in mq_extra—specifically
mq_utilization and mq_circuit_state—are present and equal to the values provided
in mq_extra, so modify the assertions in the test (around the call to
_merge_extra_health_fields) to include checks for out["mq_utilization"] and
out["mq_circuit_state"] matching the input.

src/paperscout/__main__.py (1)

41-48: ⚡ Quick win

Guard MQ health extraction so one MQ failure doesn’t drop all extras.

If mq.health_fields() raises, _extra_health_fields() fails and /health falls back to base-only. Catching here keeps scheduler/db pool extras available.

Proposed patch

 def _mq_health_fields(mq: MessageQueue) -> dict:
     """MQ metrics for /health; from health_fields() when present, else depth only."""
     if hasattr(mq, "health_fields"):
-        raw = mq.health_fields()
+        try:
+            raw = mq.health_fields()
+        except Exception:
+            log.exception("health: mq.health_fields() failed, using mq_depth only")
+            return {"mq_depth": mq.depth()}
         if isinstance(raw, dict):
             return raw
         log.warning("health: mq.health_fields() returned non-dict, using mq_depth only")
     return {"mq_depth": mq.depth()}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/paperscout/__main__.py` around lines 41 - 48, Wrap the call to
mq.health_fields() in a try/except so a thrown exception from any MessageQueue
implementation doesn’t cancel gathering other extras; in _mq_health_fields, call
mq.health_fields() inside a try block, catch Exception, log a warning including
the exception details and the MQ identity, and then fall back to returning
{"mq_depth": mq.depth()} (preserving the existing non-dict check path). Ensure
you reference _mq_health_fields and use mq.health_fields and mq.depth in the
fix.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8da341de-1054-40e5-be6d-4857363b9673

📥 Commits

Reviewing files that changed from the base of the PR and between f637b98 and cf54eec.

📒 Files selected for processing (13)

src/paperscout/__main__.py
src/paperscout/config.py
src/paperscout/health.py
src/paperscout/models.py
src/paperscout/monitor.py
src/paperscout/scout.py
src/paperscout/sources.py
tests/conftest.py
tests/test_health.py
tests/test_main_health_merge.py
tests/test_message_queue.py
tests/test_models_cycle_result.py
tests/test_monitor.py

💤 Files with no reviewable changes (1)

src/paperscout/sources.py

✅ Files skipped from review due to trivial changes (1)

tests/conftest.py

coderabbitai

🧹 Nitpick comments (2)

tests/test_message_queue.py (1)

33-41: 💤 Low value

Consider asserting the patched mq_max_size value.

The test correctly verifies clamping behavior, but adding an assertion for fields["mq_max_size"] == 2 would confirm the patch was applied successfully and make the test's setup more explicit.

✨ Optional enhancement

     assert fields["mq_depth"] == 5
+    assert fields["mq_max_size"] == 2
     assert fields["mq_utilization"] == 1.0

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_message_queue.py` around lines 33 - 41, In the
test_health_fields_clamps_utilization_when_depth_exceeds_max add an assertion
that the patched mq_max_size was applied by asserting fields["mq_max_size"] == 2
after calling mq.health_fields(); this confirms the patch on
paperscout.scout.settings.mq_max_size took effect and makes the test setup
explicit while keeping the existing assertions (fields["mq_depth"] and
fields["mq_utilization"]) intact.

src/paperscout/__main__.py (1)

46-54: ⚡ Quick win

Guard fallback mq.depth() reads so /health extras remain available.

If mq.depth() fails on Line 54, extra_fields_fn() fails and the handler drops all extras (including scheduler snapshot). Keep _mq_health_fields() exception-safe in both fallback paths.

Proposed defensive patch

 def _mq_health_fields(mq: MessageQueue) -> dict:
     """MQ metrics for /health; from health_fields() when present, else depth only."""
     if hasattr(mq, "health_fields"):
         try:
             raw = mq.health_fields()
         except Exception as exc:
             log.warning(
                 "health: mq.health_fields() failed for %s id=%s: %s",
                 type(mq).__name__,
                 id(mq),
                 exc,
                 exc_info=True,
             )
-            return {"mq_depth": mq.depth()}
+            try:
+                return {"mq_depth": mq.depth()}
+            except Exception:
+                log.warning("health: mq.depth() fallback failed; omitting MQ fields", exc_info=True)
+                return {}
         if isinstance(raw, dict):
             return raw
         log.warning("health: mq.health_fields() returned non-dict, using mq_depth only")
-    return {"mq_depth": mq.depth()}
+    try:
+        return {"mq_depth": mq.depth()}
+    except Exception:
+        log.warning("health: mq.depth() failed; omitting MQ fields", exc_info=True)
+        return {}
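
The two identical try/except blocks in the patch could also be factored into one helper. This is a sketch of that alternative, not the merged code: `safe_depth_fields`, the un-prefixed `mq_health_fields`, the logger name, and the `BrokenQueue` test double are all hypothetical.

```python
import logging
from typing import Any

log = logging.getLogger("paperscout.demo")  # hypothetical logger name


def safe_depth_fields(mq: Any) -> dict[str, Any]:
    """Return {"mq_depth": n}, or {} if mq.depth() raises."""
    try:
        return {"mq_depth": mq.depth()}
    except Exception:
        log.warning("health: mq.depth() failed; omitting MQ fields", exc_info=True)
        return {}


def mq_health_fields(mq: Any) -> dict[str, Any]:
    if hasattr(mq, "health_fields"):
        try:
            raw = mq.health_fields()
        except Exception:
            log.warning("health: mq.health_fields() failed", exc_info=True)
            return safe_depth_fields(mq)
        if isinstance(raw, dict):
            return raw
        log.warning("health: mq.health_fields() returned non-dict")
    return safe_depth_fields(mq)


class BrokenQueue:
    """Test double whose every method raises."""

    def health_fields(self):
        raise RuntimeError("mq down")

    def depth(self):
        raise RuntimeError("mq down")
```

`mq_health_fields(BrokenQueue())` returns an empty dict instead of raising, so the scheduler snapshot and `db_pool` extras would still reach `/health`.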

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/paperscout/__main__.py` around lines 46 - 54, The fallback call to
mq.depth() can itself raise and currently will propagate out of
extra_fields_fn()/ _mq_health_fields(), dropping all health "extras"; wrap any
call to mq.depth() (the one inside the exception handler and the one inside
_mq_health_fields()) in its own try/except and return a safe sentinel (e.g. None
or "unknown") or omit that field on error, while still logging the depth failure
(include exc_info=True) so the rest of the extras (scheduler snapshot, etc.) are
preserved; update both the except block shown and the _mq_health_fields()
implementation to be exception-safe around mq.depth().

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a757c472-99ee-44c9-a03f-83117e29ff55

📥 Commits

Reviewing files that changed from the base of the PR and between cf54eec and c0c5286.

📒 Files selected for processing (4)

src/paperscout/__main__.py
src/paperscout/scout.py
tests/test_main_health_merge.py
tests/test_message_queue.py

resolved issue_01 and issue_04 for week 4

fc58957

henry0816191 self-assigned this May 25, 2026

henry0816191 requested a review from wpak-ai as a code owner May 25, 2026 16:35

coderabbitai Bot reviewed May 25, 2026

View reviewed changes

Comment thread src/paperscout/monitor.py

henry0816191 added 2 commits May 26, 2026 00:47

addressed ai reivew

29108ea

fixed lint error

f637b98

henry0816191 added 2 commits May 26, 2026 23:45

addressed AI reviews and supplemented unit tests

0ad86b6

fixed lint error

cf54eec

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

addressed AI reviews

c0c5286

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

addressed AI reviews

b2a958d

wpak-ai approved these changes May 26, 2026

View reviewed changes

wpak-ai merged commit 01562cf into develop May 26, 2026
10 checks passed

wpak-ai deleted the feat/probe-cycle-health-snapshot branch May 26, 2026 17:22

wpak-ai merged 7 commits into develop from feat/probe-cycle-health-snapshot
