Follow-up: cargo validator hardening + registry-wide mcp-name token anchoring#1
Closed
rdimitrov wants to merge 3 commits into
Closed
Conversation
The PyPI, NuGet, and Cargo validators all proved package ownership with a bare `strings.Contains(readme, "mcp-name: "+serverName)`. That substring check is vulnerable to prefix confusion: a README that legitimately declares `mcp-name: io.github.acme/widget-pro` also satisfies an ownership claim for the shorter `io.github.acme/widget`, because the shorter string is a substring of the longer one. This is contained by namespace authorization (a publisher can only claim names within a namespace they already own, so it cannot cross a namespace boundary), but it still weakens the package<->server-name binding the token is meant to prove. Extract a shared, boundary-anchored containsMCPNameToken helper and use it from all three README-token validators: a match now requires the character after the server name to be end-of-content or a non-server-name character (whitespace, newline, or an HTML tag delimiter from a rendered README). NPM is unaffected — it compares an exact metadata field rather than scanning README text. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ngthen tests Addresses review findings on the cargo validator's two-call README retrieval: SSRF hardening (the README pointer + redirects were unconstrained): - Pin the step-2 README URL to an allowed host (static.crates.io for the real base; the base host itself for httptest mocks) before fetching, so a metadata response cannot steer the validator at an internal/attacker-chosen host. - Add a CheckRedirect policy pinning every redirect hop to the same allowlist — the initial URL being host-pinned is not enough if an upstream 3xx is followed. - Cap the README body with io.LimitReader (5 MiB) so an oversized response cannot exhaust validator memory. Clearer status handling (previously any non-5xx, non-200 collapsed to "not found"): - 429 (crates.io rate-limiting) is now reported as transient/retryable rather than as "not found", at both the metadata and README steps. - A 403 from static.crates.io is disambiguated via the crate-version metadata endpoint: a genuinely-missing crate/version stays "not found", but a crate that exists with no rendered README gets an actionable "add a README containing mcp-name and republish" message (mirrors the NuGet validator). - The non-200 branches are extracted into helpers to keep cyclomatic complexity within the project's cyclop limit. Use the boundary-anchored containsMCPNameToken helper; promote the contact User-Agent to a package const (was a function-local shadowing nuget.go's const). Tests: - ServerNameFormats is now a hermetic POSITIVE test (each format must validate when present as an exact token) instead of asserting against a token-less live crate where every case failed trivially. - Combined fixture gains a version-existence endpoint and cases for 403-missing vs 403-no-readme, 429-transient, and prefix-confusion rejection. - New RejectsForeignReadmeHost test for the SSRF host pin. - PositivePathMock no longer calls t.Fatalf from the httptest handler goroutine. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The validator enforces https://crates.io as the only supported Cargo base URL, but the canonical "Restricted Registry Base URLs" list and the Package model's doc comment / struct tag still enumerated only npm/pypi/nuget/oci/mcpb. Add Cargo to both so the supported-registry surface is consistent. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Author
|
Superseding this with follow-up PRs opened directly against |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Stacked follow-up on modelcontextprotocol#1207
This branch is stacked on top of #1207 (base =
feat/cargo-package-registry-support), so the diff here is only the follow-up changes. It collects the actionable items from a review of the cargo work. Entirely optional — merge into the cargo branch if you agree, or cherry-pick the parts you like. Nothing here blocks modelcontextprotocol#1207.What's here
1. Registry-wide: boundary-anchor the
mcp-nametoken match (a5ef071)PyPI, NuGet, and Cargo all matched ownership with a bare
strings.Contains(readme, "mcp-name: "+serverName). That allows prefix confusion: a README declaringmcp-name: io.github.acme/widget-proalso satisfies a claim for the shorterio.github.acme/widget. It's contained by namespace auth (can't cross a namespace you don't own) so it's low-severity, but it weakens the package↔name binding. Extracted a sharedcontainsMCPNameTokenthat requires a trailing boundary (end-of-content or a non-server-name char). NPM is untouched (it compares an exact field).2. Cargo: SSRF hardening + clearer status handling (
3036c47)static.crates.ioin prod; the mock host under test) before fetching, aCheckRedirectpolicy pins every redirect hop, and the body is capped withio.LimitReader(5 MiB). Not publisher-exploitable today (the URL comes from crates.io), but defense-in-depth so the "host is pinned" guarantee holds in code rather than by crates.io's good behavior.libc 0.1.0exists but its README CDN URL 403s.3. Test quality (in
3036c47)ServerNameFormatsis now a hermetic positive test (each format must validate when present as an exact token) instead of asserting against a token-less live crate where every case failed trivially.RejectsForeignReadmeHosttest for the SSRF pin; new internal test forcontainsMCPNameToken;PositivePathMockno longer callst.Fatalffrom the httptest handler goroutine.4. Docs (
351f53a)Add Cargo/
https://crates.ioto the canonical "Restricted Registry Base URLs" list and to thePackagemodel doc comment + struct tag (both still omitted it).Testing
go build,go vet, full./internal/validators/...suite (incl. live PyPI/NuGet/Cargo positives),make check-schema, andvalidate-examplesall pass. No newgolangci-lintfindings vs. base. Branch is rebased on currentmain(via the merge in modelcontextprotocol#1207).🤖 Generated with Claude Code