fix(validators): harden mcp-name matching (PyPI/NuGet anchoring, comment-form safe) + cargo follow-ups#1331
Merged
Merged
Conversation
bc87b70 to
d796057
Compare
5b3e6b5 to
57aacf9
Compare
…uGet Stacked on the cargo follow-up (introduces containsMCPNameToken). This extends the boundary-anchored ownership-token match to the PyPI and NuGet validators, replacing their bare strings.Contains checks so a README declaring a longer name (e.g. io.github.acme/widget-pro) no longer satisfies a claim for a shorter prefix (io.github.acme/widget).⚠️ BEHAVIOR CHANGE for PyPI/NuGet (not just additive): The new match is strictly stricter — it can only flip a previously-passing publish to failing, never the reverse. The realistic case that flips is a README whose ONLY occurrence of the token is immediately followed by a server-name character [A-Za-z0-9._/-], e.g. a trailing period in prose ("...published as mcp-name: io.github.acme/widget."). The token on its own line, in backticks, or followed by whitespace/newline/HTML-tag is unaffected. Re-validation runs only at publish time (CreateServer); edits/status updates do not re-check ownership and there is no background re-validation, so already- stored servers are not affected — but an existing PyPI/NuGet publisher pushing a NEW VERSION with the token in the glued form would fail where it previously passed. Given the v0.1 API freeze, this should land deliberately and not be promoted to prod without sign-off. Live positive tests (time-mcp-pypi, TimeMcpServer) still pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The boundary-anchored matcher rejected the documented hidden-comment form when the trailing space was omitted: `<!-- mcp-name: NAME-->` / `<!--mcp-name: NAME-->` fail because the byte after NAME is `-` (a server-name char). Since PyPI/NuGet publishers commonly hide the token in an HTML comment, this would break the recommended form on the next publish or edit. Add isMCPNameBoundary, which treats the HTML comment close (`-->` / `--!>`) immediately after the name as a boundary, so all spacing variants of the comment form validate while a genuine longer name (e.g. `…/widget--pro`) still does not. Tests: comment-form cases (spaced/unspaced/legacy `--!>`) and a double-hyphen longer-name negative; plus FuzzContainsMCPNameToken pinning the safety property that the matcher is strictly stricter than strings.Contains (can only flip pass→fail, never fail→pass) — verified over 2.3M executions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…en 403 Three follow-up hardening fixes to the cargo validator (review of #1330): - SSRF: the README host pin and the redirect policy keyed on Hostname() only, so any port/scheme on crates.io/static.crates.io was accepted. cargoURLAllowed now additionally requires https + the default port for the real crates.io base (test/mock bases still match on host only, so httptest fixtures keep working). - Transient 403 disambiguation: when the README CDN 403s, the crate-version existence probe previously reported "not found" if the probe itself failed (429/5xx/network) — the same misclassification the 5xx handling fixed one layer up. probeCargoVersion now returns a four-state result and a transient probe yields a retryable message instead of "not found". - A 403 with the crate present no longer flatly asserts "no rendered README" (a 403 isn't definitive proof — could be a CDN/WAF block); the message now says the README could not be retrieved and gives the actionable next step. Tests: TestCargoURLAllowed (https/port/userinfo/foreign-host matrix) and a combined-fixture case where the 403 existence-probe is rate-limited (429) and must report transient, not "not found". Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Note in the PyPI and NuGet ownership sections that the token must be followed by a newline, whitespace, an HTML tag, or the comment close `-->`, and must not be glued to trailing punctuation (e.g. a sentence-ending period). The matcher fix handles the comment-close case; this documents the remaining boundary rule. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
55f72ed to
165c1bf
Compare
If a publisher's README contains `mcp-name: NAME` but the boundary-anchored match rejects it (the token is glued to a trailing character such as a sentence-ending period or `/`), the previous error said the name "must appear as 'mcp-name: NAME' in the package README" — which the publisher sees that it already does, giving no clue what's wrong. This is the failure mode of the registry-wide anchoring change, so the message needs to name the cause. Add mcpNameTokenGluedTrailing, which reports the offending trailing character when the literal token is present but unterminated, and use it in the PyPI, Cargo, and NuGet validators to emit an actionable message: "found 'mcp-name: NAME' but it is immediately followed by 'X' — put it on its own line and republish". NuGet gains a GluedReadme state to distinguish this from a genuinely-absent token. Tests: TestMCPNameTokenGluedTrailing for the helper, and a cargo combined-fixture case asserting the explanatory message on a glued trailing period. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Registry-wide hardening of the
mcp-name:ownership-token match, plus the cargo follow-up fixes from the #1330 review — consolidated here per maintainer request. Rebased onto currentmain(the merged #1330 cargo commit drops out), so the diff is just the net-new work.Changes
strings.Contains(readme, "mcp-name: "+name)with the sharedcontainsMCPNameToken, so a README declaring a longer name (…/widget-pro) no longer satisfies an ownership claim for a shorter prefix (…/widget). (Cargo already got this in fix(cargo): harden README fetch, clarify status handling, strengthen tests #1330; NPM is unaffected — it compares an exact metadata field.)<!-- mcp-name: NAME-->/<!--mcp-name: NAME-->(any spacing) validate again — the documented hidden-comment form for PyPI/NuGet — while a genuine longer name (…/widget--pro) still does not.The match is strictly stricter (verified by fuzz, 2.3M execs: it can only flip pass→fail, never fail→pass). After the comment-close fix, the only forms that newly fail are unusual inline ones: the token ending a sentence (
…/my-mcp.) or glued to a trailing/. The documented forms (own line, in<!-- … -->) are unaffected.Correcting an earlier claim: this re-validates on edits too, not just new versions —
edit.go → UpdateServer → ValidateUpdateRequest → validateRegistryOwnershipruns the token check on any edit of a live server (only delete-transitions skip it). So an existing PyPI/NuGet server whose README uses one of the breaking inline forms would fail on its next publish or edit. Given the v0.1 API-freeze posture, this should land as a deliberate, noted change.Testing
go build,vet,gofmt,golangci-lint(prod files),check-schema,validate-examplesall clean. Hermetic + live PyPI/NuGet/cargo suite passes. New:TestCargoURLAllowed, a combined-fixture probe-429→transient case, comment-form matcher cases, andFuzzContainsMCPNameToken(the strictly-stricter property, 2.3M execs).🤖 Generated with Claude Code