Promote latent specs into a documented conformance contract for diffscope (composable code review engine for automated diff analysis) #99

@haasonsaas

Summary

Turn TODOs, docs promises, and implied API behavior into a versioned contract with conformance checks.

This issue was generated from an org-wide EvalOps mining pass on 2026-05-10 07:57 UTC. It combines live GitHub repo signals with a per-repo arXiv search. Treat the research links as grounding for a concrete implementation, not as a request for a literature review.

Repo Evidence

  • Repository description: A composable code review engine for automated diff analysis
  • Tree signals: 6 docs files, 7 workflows, 0 proto files, 8 test-like files.
  • CLAUDE.md:21 includes latent-spec language: - migrations/ — PostgreSQL migrations (sqlx) - eval/ — Evaluation and benchmarking - examples/ — Usage examples
  • CLAUDE.md:33 includes latent-spec language: - Wide events for observability (OpenTelemetry-compatible) - Self-hosted first: Ollama/vLLM/LM Studio should be first-class providers
  • README.md:89 includes latent-spec language: # Evaluate reviewer quality against fixtures diffscope eval --fixtures eval/fixtures --output eval-report.json ```
  • README.md:125 includes latent-spec language: ### Evaluation Fixtures ```yaml
  • README.md:145 includes latent-spec language: diffscope eval now reports per-rule precision/recall/F1 (micro and macro), and includes top rule-level TP/FP/FN counts in CLI and JSON output. Starter fixtures live in eval/fixtures/repo_regressions.
  • README.md:146 includes latent-spec language: diffscope eval now reports per-rule precision/recall/F1 (micro and macro), and includes top rule-level TP/FP/FN counts in CLI and JSON output. Starter fixtures live in eval/fixtures/repo_regressions. Markdown and smart-review reports now include rule-level issue breakdown tables when rule ids are available.
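
The README snippets above promise per-rule precision/recall/F1 with micro and macro averages. A contract needs to pin down what those words mean, so here is a minimal sketch of one common definition. The `RuleCounts` type, the function names, the zero-denominator convention, and the choice of macro F1 as the mean of per-rule F1 are all assumptions for illustration; they are not taken from diffscope's code and should be checked against the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleCounts:
    """Per-rule confusion counts (hypothetical shape, not diffscope's schema)."""
    tp: int
    fp: int
    fn: int


def prf(tp: int, fp: int, fn: int) -> tuple:
    """Return (precision, recall, F1); a zero denominator yields 0.0 (assumed convention)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def micro_macro(rules: dict) -> dict:
    """Compute micro and macro averages over a mapping of rule id -> RuleCounts."""
    counts = list(rules.values())
    # Micro: pool the counts across all rules, then compute the metrics once.
    micro = prf(
        sum(c.tp for c in counts),
        sum(c.fp for c in counts),
        sum(c.fn for c in counts),
    )
    # Macro: compute the metrics per rule, then take the unweighted mean.
    per_rule = [prf(c.tp, c.fp, c.fn) for c in counts]
    n = len(per_rule)
    if n:
        macro = tuple(sum(m[i] for m in per_rule) / n for i in range(3))
    else:
        macro = (0.0, 0.0, 0.0)
    return {"micro": micro, "macro": macro}
```

Micro averaging weights frequently firing rules more heavily, while macro averaging treats every rule equally. A contract that names both should also state which macro-F1 variant is meant, since averaging per-rule F1 and computing F1 from macro precision and recall can give different numbers.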

Research Grounding

Repo axes: memory, governance, evaluation, tooling

Search keywords: diffscope, review, model, fixtures, diff, eval, git, ollama, bash, docker, pull, github

  • arXiv:2504.08893v1 Knowledge Graph-extended Retrieval Augmented Generation for Question Answering (Jasper Linders, Jakub M. Tomczak), 2025.
  • arXiv:2504.05163v2 Evaluating Knowledge Graph Based Retrieval Augmented Generation Methods under Knowledge Incompleteness (Dongzhuoran Zhou, Yuqicheng Zhu, Xiaxia Wang, Yuan He, Jiaoyan Chen, Steffen Staab), 2025.
  • arXiv:2511.11017v1 AI Agent-Driven Framework for Automated Product Knowledge Graph Construction in E-Commerce (Dimitar Peshevski, Riste Stojanov, Dimitar Trajanov), 2025.
  • arXiv:2502.01113v3 GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation (Linhao Luo, Zicheng Zhao, Gholamreza Haffari, Dinh Phung, Chen Gong, Shirui Pan), 2025.
  • arXiv:2502.06864v1 Knowledge Graph-Guided Retrieval Augmented Generation (Xiangrong Zhu, Yuexiang Xie, Yi Liu, Yaliang Li, Wei Hu), 2025.
  • arXiv:2506.21556v3 VAT-KG: Knowledge-Intensive Multimodal Knowledge Graph Dataset for Retrieval-Augmented Generation (Hyeongcheol Park, Jiyoung Seo, MinHyuk Jang, Hogun Park, Ha Dam Baek, Gyusam Chang), 2025.
  • arXiv:2507.16826v1 A Query-Aware Multi-Path Knowledge Graph Fusion Approach for Enhancing Retrieval-Augmented Generation in Large Language Models (Qikai Wei, Huansheng Ning, Chunlong Han, Jianguo Ding), 2025.
  • arXiv:2508.09460v1 Towards Self-cognitive Exploration: Metacognitive Knowledge Graph Retrieval Augmented Generation (Xujie Yuan, Shimin Di, Jielong Tang, Libin Zheng, Jian Yin), 2025.
  • arXiv:2512.20626v2 MegaRAG: Multimodal Knowledge Graph-Based Retrieval Augmented Generation (Chi-Hsiang Hsiao, Yi-Cheng Wang, Tzung-Sheng Lin, Yi-Ren Yeh, Chu-Song Chen), 2025.
  • arXiv:2405.15436v1 Hybrid Context Retrieval Augmented Generation Pipeline: LLM-Augmented Knowledge Graphs and Vector Database for Accreditation Reporting Assistance (Candace Edwards), 2024.

What To Build

  • Create a versioned contract document for the repo's public or agent-facing behavior.
  • Move the highest-signal latent TODO/doc promises into explicit normative requirements.
  • Add conformance fixtures that detect incompatible behavior changes.
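
One way to read the three items above is as a small machine-checkable contract: a version, a list of normative requirements with stable ids, and a check that reports violations. The sketch below is only an illustration. The version string, the requirement ids, and every field path (`contract_version`, `metrics.micro.f1`, `metrics.macro.f1`, `rules`) are hypothetical and do not describe the real schema of `eval-report.json`.

```python
CONTRACT_VERSION = "0.1.0"  # hypothetical starting version

# Each requirement pairs a stable id and normative text with a dotted path
# that must exist in the report. All paths here are assumed, not real.
REQUIREMENTS = [
    {"id": "EVAL-001", "text": "report MUST include micro-averaged F1", "path": "metrics.micro.f1"},
    {"id": "EVAL-002", "text": "report MUST include macro-averaged F1", "path": "metrics.macro.f1"},
    {"id": "EVAL-003", "text": "report MUST include a rule-level breakdown", "path": "rules"},
]


def has_path(doc, dotted: str) -> bool:
    """Return True if every key along the dotted path exists in nested dicts."""
    node = doc
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def check_conformance(report: dict) -> list:
    """Return human-readable violations; an empty list means the report conforms."""
    violations = []
    got = report.get("contract_version")
    if got != CONTRACT_VERSION:
        violations.append(f"CONTRACT-VERSION: expected {CONTRACT_VERSION}, got {got!r}")
    for req in REQUIREMENTS:
        if not has_path(report, req["path"]):
            violations.append(f'{req["id"]}: missing {req["path"]}')
    return violations
```

A CI step could load the JSON produced by `diffscope eval`, pass it to `check_conformance`, and fail the job when the returned list is non-empty. Keeping requirement ids stable lets the contract document and the check cite the same identifiers.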

Acceptance Criteria

  • A short design note names the repo-specific workflow, threat or correctness model, and the research assumptions being adopted.
  • A runnable check, fixture, or verifier exercises the new contract in CI or an equivalent local command documented in the repo.
  • The implementation emits or stores enough evidence for a downstream agent/operator to cite inputs, decisions, and outputs.
  • At least one negative/degraded-mode case is covered so failures are observable rather than silently accepted.
  • Documentation links the new behavior to the relevant EvalOps platform primitive or explicitly records why this repo remains standalone.
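
The evidence and degraded-mode criteria above can be sketched together: each conformance decision produces a record that cites its input by hash, and malformed input becomes an explicit failure rather than a silent pass. The function name, the record fields, and the idea of checking top-level keys are assumptions made for this example, not an existing diffscope or EvalOps interface.

```python
import hashlib
import json


def evaluate_with_evidence(raw: bytes, required_keys: tuple) -> dict:
    """Return an evidence record for one conformance decision over raw report bytes."""
    evidence = {
        # Hash of the exact input so a later reader can cite what was checked.
        "input_sha256": hashlib.sha256(raw).hexdigest(),
        "required_keys": list(required_keys),
        "missing": [],
    }
    try:
        report = json.loads(raw)
    except ValueError as exc:
        # Covers invalid JSON and undecodable bytes: fail loudly, keep the reason.
        evidence.update(decision="fail", reason=f"unparseable report: {exc}")
        return evidence
    if not isinstance(report, dict):
        evidence.update(decision="fail", reason="report is not a JSON object")
        return evidence
    missing = [key for key in required_keys if key not in report]
    evidence.update(
        decision="fail" if missing else "pass",
        reason="missing required keys" if missing else "all required keys present",
        missing=missing,
    )
    return evidence
```

Feeding the function a truncated report, for example `b'{"rules": '`, is one candidate for the required negative case: the record comes back with `decision` set to `"fail"` and a parse error in `reason`, so the failure is observable in whatever log or artifact stores the record.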

Notes

  • Generated issue 5/5 for evalops/diffscope by evalops_org_miner.py.
  • Before implementation, confirm the sampled latent-spec snippets still match main; this issue intentionally cites exact file paths/lines where the mining pass saw them.
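
The pre-implementation check in the last note can be partly automated. The sketch below assumes the file content has already been read from a checkout of main and tests whether an expected fragment still appears at the cited 1-indexed line, optionally tolerating a small drift. The function name and the `window` parameter are hypothetical helpers, not part of `evalops_org_miner.py`.

```python
def snippet_still_present(text: str, line_no: int, expected: str, window: int = 0) -> bool:
    """Return True if `expected` occurs on line `line_no` (1-indexed) or within `window` lines of it."""
    lines = text.splitlines()
    start = max(0, line_no - 1 - window)
    stop = min(len(lines), line_no + window)
    # An out-of-range line number gives an empty slice, which reports False.
    return any(expected in line for line in lines[start:stop])
```

For example, the README.md:89 citation could be checked by reading the file with `pathlib.Path("README.md").read_text(encoding="utf-8")` and calling `snippet_still_present(text, 89, "diffscope eval --fixtures")`. A `False` result signals that the cited line has moved or changed and the evidence should be resampled before writing requirements from it.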
