Skip to content

docs(mcp): spec send_scpi v0.2 safety design + structured error contract#232

Open
cptkoolbeenz wants to merge 4 commits into
claude/youthful-mclaren-c56954from
claude/mcp-v01-sendscpi-and-errors
Open

docs(mcp): spec send_scpi v0.2 safety design + structured error contract#232
cptkoolbeenz wants to merge 4 commits into
claude/youthful-mclaren-c56954from
claude/mcp-v01-sendscpi-and-errors

Conversation

@cptkoolbeenz
Copy link
Copy Markdown
Member

Builds on #231. See the commit / diff for details. Adds: (1) §Safety: send_scpi v0.2 design — destructive deny-list (per-call allowDestructive) + wiki-pattern hallucination-warn (LLM-self-check phrasing). (2) §Structured error response contract for all SCPI tools in v0.1 — drain SYST:ERR? before/after + optional SYST:LOG? scrape + readback failures, uniform {success,response,errors,warnings} shape. (3) v0.2 fast-follow priority note per chat ('ship v.2 quickly so the mcp tests new features'). Tool table phasing untouched; send_scpi stays in v0.2. Targets your PR branch (claude/youthful-mclaren-c56954) so additions land inside #231 rather than racing it.

Three additions to the RFC (#231):

1. Safety: send_scpi v0.2 design — two-tier net (destructive deny-list per-call allowDestructive, wiki-pattern hallucination warn with edit-distance suggestions). Mode-interaction matrix. Wiki snapshot strategy. Non-blocking — firmware response is ground truth.

2. Structured error response contract (v0.1 work) — drain SYST:ERR? before/after, optional SYST:LOG? scrape, readback failures as structured errors. Uniform {success,response,errors[],warnings[]} shape across all SCPI tools so the model never has to grep free text.

3. v0.2 fast-follow priority note — per chat: ship v0.2 quickly so the MCP becomes the primary firmware-feature-testing harness, replacing the per-feature Python harness pattern.

Tool table phasing, marketing, open questions, tech stack untouched. send_scpi stays in v0.2 per original RFC.
@cptkoolbeenz cptkoolbeenz requested a review from a team as a code owner May 27, 2026 06:33
@qodo-code-review
Copy link
Copy Markdown

Review Summary by Qodo

Spec send_scpi v0.2 safety design and structured error contract

📝 Documentation ✨ Enhancement

Grey Divider

Walkthroughs

Description
• Add comprehensive safety design for send_scpi v0.2 with two-tier protection
  - Destructive deny-list (hard block) requiring explicit allowDestructive flag
  - Wiki-pattern hallucination detection with edit-distance suggestions
• Define structured error response contract for all SCPI tools
  - Uniform {success, response, errors[], warnings[]} shape across v0.1 and v0.2
  - Drain SCPI error queue before/after, optional async log scraping, readback validation
• Establish v0.2 fast-follow priority to replace per-feature Python test harness
• Document mode interactions and wiki snapshot freshness strategy
Diagram
flowchart LR
  A["send_scpi v0.2<br/>Safety Design"] --> B["Tier 1:<br/>Destructive Deny-List"]
  A --> C["Tier 2:<br/>Wiki-Pattern Warn"]
  B --> D["allowDestructive Flag<br/>Per-Call Override"]
  C --> E["Hallucination Detection<br/>Edit-Distance Suggestions"]
  F["Structured Error<br/>Response Contract"] --> G["SYST:ERR? Queue<br/>Before/After Drain"]
  F --> H["SYST:LOG? Buffer<br/>Async Errors"]
  F --> I["Readback Validation<br/>Timeout Failures"]
  G --> J["Uniform Response Shape<br/>success/response/errors/warnings"]
  H --> J
  I --> J
  K["v0.2 Fast-Follow<br/>Priority"] --> L["Replace Python<br/>Test Harness"]

Loading

Grey Divider

File Changes

1. docs/planning/mcp/RFC.md 📝 Documentation +85/-3

Add send_scpi safety design and error contract spec

• Add detailed send_scpi v0.2 safety design section with two-tier protection mechanism
 (destructive deny-list + wiki-pattern hallucination detection)
• Define structured error response contract for all SCPI tools with uniform `{success, response,
 errors[], warnings[]}` shape
• Document error surfaces: SCPI error queue, async log buffer, and readback validation failures
• Add v0.2 fast-follow priority note explaining MCP as primary firmware feature testing harness
• Update tool table reference and v0.2 roadmap with design specification links

docs/planning/mcp/RFC.md


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented May 27, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Grey Divider


Action required

1. Wrong tool name referenced ✓ Resolved 🐞 Bug ≡ Correctness
Description
The warning example tells clients/agents to call sendRawScpi, but the RFC’s tool reference defines
the escape-hatch tool as send_scpi, so following the warning will attempt to call a non-existent
tool.
Code

docs/planning/mcp/RFC.md[226]

Evidence
The warning text explicitly says sendRawScpi, while the RFC tool table defines only send_scpi as
the escape hatch tool, creating a direct naming mismatch.

docs/planning/mcp/RFC.md[212-230]
docs/planning/mcp/RFC.md[426-430]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The RFC warning example references `sendRawScpi`, but the tool is documented everywhere else as `send_scpi`.

### Issue Context
This is in the example warning payload for the proposed v0.2 `send_scpi` safety design.

### Fix Focus Areas
- docs/planning/mcp/RFC.md[219-230]
- docs/planning/mcp/RFC.md[426-430]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

2. CI freshness claim unsupported ✓ Resolved 🐞 Bug ⚙ Maintainability
Description
The RFC states CI enforces a 30‑day max age for the bundled SCPI wiki snapshot, but the repository’s
CI workflow only restores/builds/tests and contains no snapshot freshness validation, making the
statement inaccurate for this repo.
Code

docs/planning/mcp/RFC.md[234]

Evidence
The RFC asserts a CI guarantee, but the only CI workflow present performs .NET restore/build/test
and does not include any step that checks a bundled wiki snapshot age.

docs/planning/mcp/RFC.md[232-240]
.github/workflows/ci.yml[1-28]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
RFC claims CI checks SCPI wiki snapshot age, but the current CI workflow has no such step.

### Issue Context
If the check is intended but not yet implemented, the RFC should be explicit (e.g., “CI SHOULD check…” / “TODO”) or the CI workflow should be updated accordingly.

### Fix Focus Areas
- docs/planning/mcp/RFC.md[234-235]
- .github/workflows/ci.yml[1-28]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Destructive confirmation inconsistent ✓ Resolved 🐞 Bug ⚙ Maintainability
Description
The safety model specifies destructive tools require confirmed: true, but the send_scpi design
uses a different per-call override flag (allowDestructive), which makes the confirmation contract
inconsistent across tools and harder for clients to implement correctly.
Code

docs/planning/mcp/RFC.md[R197-203]

Evidence
The RFC’s general safety model sets confirmed: true as the destructive confirmation mechanism,
while the new send_scpi section requires allowDestructive: true for deny-listed destructive
patterns, creating two competing confirmation fields.

docs/planning/mcp/RFC.md[170-182]
docs/planning/mcp/RFC.md[197-206]
docs/planning/mcp/RFC.md[236-240]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The RFC uses `confirmed: true` as the destructive confirmation pattern, but the `send_scpi` spec introduces `allowDestructive: true` for the same purpose.

### Issue Context
This is primarily a contract/UX consistency problem for client schema generation and tool-call prompting.

### Fix Focus Areas
- docs/planning/mcp/RFC.md[172-182]
- docs/planning/mcp/RFC.md[197-206]
- docs/planning/mcp/RFC.md[236-240]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Comment thread docs/planning/mcp/RFC.md Outdated
cptkoolbeenz and others added 2 commits May 27, 2026 00:45
Second wave of additions to the RFC, focused on items that aren't visible from outside the firmware team but materially affect the MCP UX.

## Firmware-aware constraints

- **Quiescence rule** — any SCPI during a benchmarked stream throttles the encoder enough to corrupt the measurement (firmware CLAUDE.md + memory feedback_no_scpi_during_benchmark). #1 footgun for an LLM driving the device. MCP enforces via DeviceRegistry.isStreaming + RequiresQuiescence attribute; structured quiescence_violation error tells the model exactly what to do. Opt-out via respectQuiescence=false, logged in audit trail.

- **Variant-aware defaults + capability-filtered tool surface** — NQ1/NQ2/NQ3 differ in ADC resolution, channels, peripherals (DAC NQ3-only). connect_device caches BoardCapabilities; start_streaming etc. apply variant-appropriate defaults (precision 4/6/7). DAC tools filter out on NQ1 handles. Capabilities exposed as daqifi://devices/{id}/capabilities.

- **ADC channel enable: bitmask form** — looping configure_analog_channel N times triggers N firmware freq-cap recomputes. set_active_channels(int[]) should issue the bitmask SCPI form (one recompute) instead.

- **Credentials handling** — WiFi passwords never echoed in responses / logs / audit. Direct response to 2026-05-07 Tesla AP password leak. ScpiAuditLog masks pass/secret/key/token keys at log-emission time.

## Cross-cutting infrastructure (v0.1 internal)

- **ScpiAuditLog** — every SCPI call + response logged (with credential masking), exposed as daqifi://session/audit. Bounded at 500 entries. Also captures quiescence_violation opt-outs for forensics.

- **RateLimiter** — per-handle 30/sec cap. Prevents runaway LLM loops; rate-limit errors are a useful loop-detection signal.

- **Logging conventions** — Microsoft.Extensions.Logging to stderr; stdout reserved for JSON-RPC.

## Cross-cutting concern for daqifi-core

Two API additions that would clean up the MCP layer and benefit daqifi-desktop too. Neither blocks v0.1; both batched into a follow-up daqifi-core PR if Tyler agrees:

1. **Task<DeviceMetadata> GetMetadataAsync()** on DaqifiDevice — replaces the TaskCompletionSource-vs-ChannelsPopulated dance for getting typed serial/IP/firmware after connect.

2. **public Task<ConfirmedScpiResult> ExecuteConfirmedAsync(command, drainErrorQueue=true, scrapeLogAfter=false, ct)** — wraps the currently-protected ExecuteTextCommandAsync + DrainErrorQueueAsync pattern that the structured error contract relies on. daqifi-desktop's SCPI console feature would benefit too.

Builds on prior commit in this PR; tool table phasing still untouched.
Addresses the three Qodo flags on PR #232:
- Tighten the UNDOCUMENTED_SCPI warning (~70% fewer tokens) while keeping
  the "you may have hallucinated" prompt-engineering hook; also fixes the
  sendRawScpi → send_scpi tool-name typo in the warning example.
- Reframe wiki-snapshot CI freshness: it lands in v0.1 under #mcp-13
  alongside the snapshot bundle itself, not as a v0.2 aspiration.
- Unify destructive confirmation field: rename allowDestructive →
  confirmed across the send_scpi section so it matches the existing
  safety-model contract on delete_sd_file / format_sd_card / etc.

Also syncs GITHUB_ISSUES.md with the new scope this PR added to the RFC:
- #mcp-2: BoardCapabilities caching at connect time (per-variant smarts)
- #mcp-5: RequiresQuiescence attribute + dispatcher gate (enforces the
  firmware quiescence rule mechanically)
- #mcp-13 (new): structured SCPI error contract (ConfirmedScpiResult),
  bundled wiki snapshot, CI freshness check
- #mcp-14 (new): cross-cutting infra (ScpiAuditLog, RateLimiter,
  stderr-only logging)
- Updated epic table, success criteria, dependency graph
- Total effort: 66h → 90h (~6-8 weeks part-time)
- Daqifi.Core API additions (GetMetadataAsync, ExecuteConfirmedAsync)
  called out as a separate follow-up cleanup, not part of this epic

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tylerkron
Copy link
Copy Markdown
Contributor

Pushed 47d36ec to address the three Qodo flags and sync the issue drafts with the scope you added. Full rationale is in the commit message; one-line summary:

  • Qodo bugs fixed: typo (sendRawScpisend_scpi), CI freshness reframed as v0.1 work under #mcp-13 (it's not future aspiration), allowDestructive renamed to confirmed to match the existing safety-model contract
  • Warning text tightened (~70% fewer tokens) while keeping the "you may have hallucinated" prompt-engineering hook
  • Two new issues added (#mcp-13 structured error contract, #mcp-14 cross-cutting infra), and #mcp-2/#mcp-5 extended for capabilities caching + quiescence enforcement
  • Effort estimate updated: 66h → 90h (~6–8 weeks part-time)
  • Daqifi.Core API additions (GetMetadataAsync, ExecuteConfirmedAsync) called out as a separate follow-up, not part of this epic

Strongest adds from your side that I left untouched

  • Quiescence rule — this would have been a launch-week support disaster without it
  • Three-surface error contract — right architectural call, especially LOG_E scrape
  • Variant-aware capabilities exposed as a resource
  • Audit log with credential redaction (the Tesla AP password story is a good motivator)

Still outstanding from the original RFC

The two blocking open questions are still unanswered:

  1. Default safety mode for npx @daqifi/mcpread-only (demo doesn't work until users add a flag) or control (works first try but risks first-contact mis-config)?
  2. Pricing strategy — does the AI story support a higher ASP or SaaS tier? Needs an answer before launch copy gets written.

Once you weigh in on those, I can start creating actual GitHub issues from GITHUB_ISSUES.md — assuming you're good with the merged scope.

Records Chris's 2026-05-27 decisions on Tyler's 8 open questions, restructured from a flat numbered list into per-question Answer blocks. Tyler can override any in PR review.

## Substantive content changes

- **Recipe clarification** — added an explainer of what an 'MCP prompt' is mechanically (parameterized template surfaced as a slash-command / suggestion menu by Claude Desktop / Cursor / Cline) so future readers don't have to derive it. Cross-link to Q7 for the hosted-library deferred decision.

- **Q4 telemetry — design specced** — new §Telemetry (opt-in) section under Cross-cutting infrastructure. Strict opt-in via --telemetry flag or DAQIFI_MCP_TELEMETRY=1. Per-event shape (tool name, mode, outcome, error_code, latency, server_version, client_id, os, device_variant, session_id, event_ts) and explicit no-list (tool args, response text, serials, MACs, IPs, hostnames, SSIDs, credentials, file contents/names, no per-user identifier). --telemetry-dry-run for procurement review. 90d raw / indefinite aggregates retention.

- **Q5 stability contract — tightened**: tool names stable from v0.1; optional fields freely additive; required fields only via vN-optional → v(N+1)-required deprecation; tool removal needs ≥6-month deprecation window + major bump; experimental: true schemas carry no promise; resource URIs follow same rules as tool names.

- **Q6 LabVIEW interop — inverted framing**: instead of MCP-calls-out-to-LabVIEW (permanent per-VI support burden), document the cleaner inverse: LabVIEW-as-client of the MCP via a small example VI library. Added a v0.3+ phasing line referencing it.

## Open-questions section restructured

Each Q now has the original text + Answered (Chris, 2026-05-27): block with the resolution. Q1, Q2, Q3, Q7, Q8 mostly confirm Tyler's framing. Q4 reverses my earlier no-telemetry suggestion (Chris approved opt-in). Q5 + Q6 + recipe clarification land richer than the original questions.

Tool table phasing, marketing, tech stack, distribution, appendix still untouched.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants