dogfood: run diffscope on evalops platform PRs — replace / complement Cursor Bugbot #90

@haasonsaas

Diffscope advertises "adaptive learning to suppress low-value recurring feedback" and "scoped custom context." That's exactly what evalops platform PRs need. Cursor Bugbot currently reviews there (see `evalops/platform#601`, `#603`, `#613`), and its hit rate was 1-for-3 in the last session (one real bug caught, two false positives).

The evalops org doesn't dogfood its own code-review engine on its own repos today. That's a miss on both sides: diffscope loses the best possible training signal (a busy monorepo with recurring feedback patterns), and platform PRs don't benefit from the adaptive-suppression feature that's supposed to reduce noise over time.

Ask

  • Install diffscope as a GitHub Action on `evalops/platform`, running side by side with Cursor Bugbot initially (to compare, not to replace)
  • Add scoped rules matching the "same template" patterns this codebase produces (populated-field-zero-consumer, sort comparator coverage, etc.)
  • Track false-positive rate over 30 days; if diffscope's adaptive-suppression works as advertised, consider promoting it to the primary review surface
  • If it goes well, roll out to the other active repos (maestro-internal, chat, console, ensemble)
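
To make the 30-day comparison concrete, here is a minimal sketch of how the false-positive rate could be computed per reviewer. It assumes someone triages each bot comment by hand and records the verdict; the `ReviewComment` record, the reviewer labels, and the `false_positive_rate` helper are all hypothetical and not part of diffscope or Bugbot.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class ReviewComment:
    """One bot review comment after human triage (hypothetical record shape)."""
    reviewer: str            # e.g. "diffscope" or "bugbot" (labels are assumptions)
    posted: date             # day the comment was posted on the PR
    is_false_positive: bool  # verdict recorded by the human who triaged it


def false_positive_rate(
    comments: Iterable[ReviewComment],
    reviewer: str,
    window_end: date,
    window_days: int = 30,
) -> Optional[float]:
    """Share of a reviewer's comments in the window that were false positives.

    The window covers the `window_days` days ending on `window_end` (inclusive).
    Returns None when the reviewer posted nothing in the window, so an empty
    sample is not mistaken for a perfect score.
    """
    window_start = window_end - timedelta(days=window_days)
    in_window = [
        c for c in comments
        if c.reviewer == reviewer and window_start < c.posted <= window_end
    ]
    if not in_window:
        return None
    false_positives = sum(1 for c in in_window if c.is_false_positive)
    return false_positives / len(in_window)
```

Running this once for each reviewer over the same window gives two directly comparable numbers. Recomputing it weekly would also show whether diffscope's rate actually falls as suppression learns, which is the claim being tested.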

Related context

  • Cursor Bugbot's recent false positive on `evalops/platform#613` (a case-glob pattern claim, refuted empirically in the PR thread) is the kind of signal adaptive-suppression should learn to quiet
  • `evalops/platform#613`'s rank-coverage-check tool is exactly the kind of structural pattern diffscope's custom-context feature could ingest for higher-precision reviews

Scope

~3 days, covering install, config, and a baseline metrics run. The adaptive learning happens afterward.

Labels: enhancement
