feat(vendor): download prebuilt patched packages from patch.socket.dev (--vendor-source) #116
Mikola Lysenko (mikolalysenko) merged 9 commits on Jun 26, 2026
…v (npm + shared core)
Add a service-download path to `vendor`: instead of always building the
installable patched artifact locally, download the already-built tarball +
integrity from the patch.socket.dev vendoring service, falling back to the
local build on any miss. This commit lands the shared infrastructure + the
npm flavor (package-lock / pnpm / yarn-classic / yarn-berry / bun); other
ecosystems follow.
- config: --vendor-source {auto|service|build} (SOCKET_VENDOR_SOURCE, default
auto), --vendor-url (SOCKET_VENDOR_URL), --patch-server-url
(SOCKET_PATCH_SERVER_URL); all env-var-backed with parse/tripwire tests
- api client: ApiClient::fetch_vendor_package — two-step package-reference POST
(/v0/orgs/{slug}/patches/package or proxy /patch/package) -> grant-tokenized
serve GET, with host rewrite + status mapping; 12 wiremock tests
- core: VendorServiceConfig, service_fetch (sha512 + golang-h1 verify,
fail-closed), PackedTarball::from_bytes (DRY with pack_deterministic)
- threading: Option<&VendorServiceConfig> through the vendor dispatch chain
(scan --vendor / repair pass None = build-only, unchanged)
- npm: service path in stage_patch_pack with the auto/service/build fallback
table; integrity always re-verified before write; 9 integration tests cover
both the service download and the local-build fallback
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
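The auto/service/build fallback described in this commit can be sketched as a small decision function. This is a minimal illustration, not the crate's code: `ServiceOutcome`, `Decision`, and `decide` are hypothetical names, and treating an integrity mismatch as a hard failure under `auto` as well as `service` is an assumption drawn from the "fail-closed" wording.

```rust
use std::str::FromStr;

/// Mirrors the documented `--vendor-source {auto|service|build}` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VendorSource {
    Auto,
    Service,
    Build,
}

impl FromStr for VendorSource {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "auto" => Ok(Self::Auto),
            "service" => Ok(Self::Service),
            "build" => Ok(Self::Build),
            other => Err(format!("invalid vendor source: {other}")),
        }
    }
}

/// Hypothetical result of one service download attempt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ServiceOutcome {
    /// Artifact downloaded and its integrity verified.
    Verified,
    /// Nothing usable was served (not built yet, wrong artifact kind, ...).
    Miss,
    /// Bytes were served but did not match the reported integrity.
    IntegrityMismatch,
}

/// Hypothetical decision the vendor backend acts on.
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    UseService,
    BuildLocally,
    Fail(&'static str),
}

/// `fetch` is only invoked when the service may be consulted at all.
fn decide(source: VendorSource, fetch: impl FnOnce() -> ServiceOutcome) -> Decision {
    if source == VendorSource::Build {
        // build-only: the service is never contacted
        return Decision::BuildLocally;
    }
    match (source, fetch()) {
        (_, ServiceOutcome::Verified) => Decision::UseService,
        // Assumption: a mismatch never degrades to a local build.
        (_, ServiceOutcome::IntegrityMismatch) => Decision::Fail("integrity mismatch"),
        (VendorSource::Auto, ServiceOutcome::Miss) => Decision::BuildLocally,
        (_, ServiceOutcome::Miss) => Decision::Fail("service miss under --vendor-source=service"),
    }
}
```

Passing the fetch as a closure keeps the `build` arm from touching the network at all, which matches the "scan --vendor / repair pass None = build-only" note above.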
Extend the prebuilt-package download to pypi. `vendor_pypi` now acquires the patched wheel service-first (skipping the installed-dist requirement), falling back to the local wheel build on any miss.
- acquire_patched_wheel: service-first then local-build; the service path writes the downloaded wheel, recomputes sha256 (lockfiles embed sha256 while the service reports sha512), and derives the platform-locked advisory from the wheel filename's tag triple
- only .whl artifacts are usable (pypi vendoring is wheel-based) — an sdist (or any miss) falls back under `auto` and hard-fails under `service`
- in_sync_outcome refactored onto a shared synthesized_apply_result
- 5 integration tests: service success (wheel written + requirements line wired to the recomputed sha256), sdist-fallback (auto) / sdist-hard-fail (service), integrity-mismatch hard-fail, offline+service refusal
- box the large service-decision enum variants (clippy large_enum_variant)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
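For readers unfamiliar with the wheel filename's tag triple, the sketch below parses it under the standard wheel naming convention (`name-version[-build]-python-abi-platform.whl`). The names `WheelTags`, `parse_wheel_tags`, and `is_platform_locked` are hypothetical, and the rule "anything other than `none`/`any` counts as platform-locked" is an assumption about what the advisory checks, not something the commit states.

```rust
/// The three compatibility tags at the end of a wheel filename.
#[derive(Debug, PartialEq, Eq)]
struct WheelTags<'a> {
    python: &'a str,
    abi: &'a str,
    platform: &'a str,
}

/// Parse `name-version[-build]-python-abi-platform.whl`.
/// Returns `None` for anything that is not shaped like a wheel (e.g. an sdist).
fn parse_wheel_tags(filename: &str) -> Option<WheelTags<'_>> {
    let stem = filename.strip_suffix(".whl")?;
    let parts: Vec<&str> = stem.split('-').collect();
    // 5 components without a build tag, 6 with one.
    if parts.len() != 5 && parts.len() != 6 {
        return None;
    }
    let n = parts.len();
    Some(WheelTags {
        python: parts[n - 3],
        abi: parts[n - 2],
        platform: parts[n - 1],
    })
}

/// Assumption: a wheel is portable only when it has no ABI requirement
/// and targets any platform; everything else is treated as platform-locked.
fn is_platform_locked(tags: &WheelTags<'_>) -> bool {
    !(tags.abi == "none" && tags.platform == "any")
}
```

A pure-Python wheel such as `requests-2.31.0-py3-none-any.whl` would not trigger the advisory under this rule, while a `cp312-cp312-manylinux...` wheel would.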
Extend the prebuilt-package download to cargo (the first Tier-B / directory-vendored ecosystem). `vendor_cargo_crate` now materialises the patched copy service-first: download the prebuilt `.crate`, verify sha512, and extract it into `.socket/vendor/cargo/<uuid>/<name>-<version>/` (dropping any `.cargo-checksum.json` so it stays a path dep) — no pristine source needed. Falls back to the existing copy-pristine-and-patch build on any miss.
- expose registry_fetch::extract_tgz as pub(crate) for the .crate extraction
- cargo_service_copy helper + boxed CargoServiceCopy enum; auto/service fallback policy; offline+service refusal; existing config + Cargo.lock wiring is unchanged (it never read the copy contents)
- 4 integration tests: service success (extracts patched crate, wires config, no sidecar, no pristine needed), integrity-mismatch hard-fail, not-built auto-fallback-to-build, offline+service refusal
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Close the fail-closed gap for the partial rollout and document the feature.
- dispatch_vendor_one: under `--vendor-source=service`, ecosystems without a service path yet (golang, gem, composer, maven, nuget) now refuse with `vendor_service_unsupported_ecosystem` instead of silently building locally (which would violate the fail-closed contract). `auto`/`build` are unchanged.
- CLI_CONTRACT.md: --vendor-source/--vendor-url/--patch-server-url flag rows, the env-var table, and a "Prebuilt vendor artifacts" section (two-step flow, fail-closed integrity, per-outcome fallback table, current ecosystem coverage npm/pypi/cargo, and the new event codes)
- README.md: the three new flags + env vars
Service coverage today: npm (all lock flavors), pypi (wheel), cargo (.crate).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… zip)
Extend the prebuilt-package download to golang. `vendor_go_module` now
materialises the patched module service-first: download the prebuilt module
zip, verify it (sha512 + the `h1:` dirhash), extract it into
`.socket/vendor/golang/<uuid>/<module>@<version>/` (stripping the zip's
`{module}@{version}/` prefix), synthesize a minimal go.mod if absent, and wire
the go.mod `replace` via `ensure_replace_entry` — the same end state
`apply_go_redirect` produces, minus the copy + local apply, and with no
pristine module source needed. Falls back to the engine build on any miss.
- expose registry_fetch::extract_zip_with_prefix + go_redirect::ensure_module_go_mod
as pub(crate)
- go_service_redirect helper + boxed GoServiceRedirect enum; auto/service
fallback; offline+service refusal; empty-files patches defer to the engine
- add golang to dispatch_vendor_one's SERVICE_ECOSYSTEMS allowlist
- 4 integration tests: service success (extracts module, wires replace, no
pristine needed), wrong-h1-dirhash hard-fail (exercises the golang dirhash
check), not-built auto-fallback, offline+service refusal
Service coverage now: npm, pypi, cargo, golang.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
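The prefix stripping mentioned for the module zip can be illustrated with a small path helper. This is a sketch under assumptions: `strip_module_prefix` is a hypothetical name (the commit's real helper is `extract_zip_with_prefix`, whose signature is not shown here), and refusing any entry with `..`, absolute, or otherwise non-normal components is an added safety assumption rather than documented behavior.

```rust
use std::path::{Component, Path, PathBuf};

/// Map a zip entry name such as `example.com/mod@v1.2.3/sub/file.go`
/// to the path it should have inside the vendor copy dir (`sub/file.go`).
/// Returns `None` when the entry is outside the expected prefix or unsafe.
fn strip_module_prefix(entry: &str, module: &str, version: &str) -> Option<PathBuf> {
    let prefix = format!("{module}@{version}/");
    let rest = entry.strip_prefix(&prefix)?;
    if rest.is_empty() {
        // The prefix directory entry itself carries no file.
        return None;
    }
    let rel = Path::new(rest);
    // Only plain path segments are accepted, so the result cannot
    // escape the destination directory when joined onto it.
    if rel.components().all(|c| matches!(c, Component::Normal(_))) {
        Some(rel.to_path_buf())
    } else {
        None
    }
}
```

A caller would join the returned relative path onto `.socket/vendor/golang/<uuid>/<module>@<version>/` and skip entries for which the helper returns `None`.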
… zip)
Extend the prebuilt-package download to composer. `vendor_composer` now materialises the patched copy service-first: download the prebuilt dist zip, verify sha512, and extract it into `.socket/vendor/composer/<uuid>/<vendor>/<name>@<version>/` (dropping the zip's variable top-level dir) — no installed package needed. Falls back to copy-installed-and-patch on any miss.
- expose registry_fetch::extract_zip as pub(crate)
- composer_service_copy helper + boxed ComposerServiceCopy enum; auto/service fallback; offline+service refusal; composer.lock dist->path rewiring unchanged
- add composer to dispatch_vendor_one's SERVICE_ECOSYSTEMS allowlist
- 4 integration tests: service success (extracts dist, rewrites lock, no install needed), integrity-mismatch hard-fail, not-built auto-fallback, offline+service refusal
Service coverage now: npm, pypi, cargo, golang, composer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…omposer); gem gated
gem stays build-local: a path-sourced gem needs a stub gemspec that the `.gem` archive doesn't carry in bundler's required eval-able form (it's metadata.gz YAML; RubyGems generates the stub into specifications/). A clean service path can't produce it without the local install or Ruby-specific serialization.
- dispatch_vendor_one gate comment + detail message updated to the final set
- CLI_CONTRACT.md "Coverage today" + README.md flag doc updated; note Tier-B build-equivalence is exercised by the toolchain-backed e2e suites
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nd artifact)
Close the last gap in `vendor --vendor-source`: gem now downloads the prebuilt patched `.gem` from patch.socket.dev instead of always building locally, like npm/pypi/cargo/golang/composer.
Bundler's path source needs an eval-able Ruby `<name>.gemspec`, but a `.gem` only carries the gemspec as YAML inside `metadata.gz`. The converter generates that stub and serves it as a `gem-stub-gemspec` SECOND artifact alongside the `.gem` (mirroring npm's `yarn-berry-zip`); the gem backend downloads and integrity-verifies BOTH, extracts the `.gem`'s `data.tar.gz` into the vendor copy dir, and writes the stub as `<name>.gemspec`. The Gemfile + Gemfile.lock pair wiring is unchanged — only how the copy dir + its `.gemspec` are produced differs.
- api/client.rs: surface non-tarball served artifacts on `FetchedVendorPackage` as `secondary_artifacts` (host-rewritten URL + sha512), and add `download_artifact` to fetch one lazily.
- service_fetch.rs: carry the secondary refs on `VerifiedArchive` and add `fetch_verified_secondary` (download + fail-closed sha512 verify).
- registry_fetch.rs: factor a `pub(crate) extract_gem_data` out of `fetch_gem` so the service path reuses the exact same `.gem` extraction.
- gem.rs: thread `service` through `vendor_gem`; `gem_service_copy` downloads + verifies the `.gem` and the stub (absent stub => miss: native-ext gem or a pre-rollout patch), refuses a native-ext stub, extracts, writes the stub; `materialise_patched_copy` unifies service-first / local-fallback across both the full path and the hot-path artifact rebuild. The local stub read is now non-fatal so an auto-fetched (not-installed) gem can still vendor via the service. 8 new wiremock-backed tests.
- vendor.rs: add `gem` to `SERVICE_ECOSYSTEMS`; pass `service` to `vendor_gem`.
- README / CLI_CONTRACT: gem is now service-covered.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The `test (macos-latest)` matrix job installs vexctl via `go install` and runs
tests/e2e_vex.rs against it. The macOS-latest runner image (Sequoia+) has a
dyld that refuses to load a Mach-O binary lacking an LC_UUID load command, and
Go's linker only began emitting one in 1.24 — so the 1.22-built vexctl crashed
on launch ("dyld: missing LC_UUID load command in .../vexctl") and every
e2e_vex assertion failed with "vexctl rejected the document". Environmental,
not a code regression (ubuntu/windows were unaffected); the shared matrix pin
just needed bumping.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wenxin Jiang (Wenxin-Jiang) approved these changes on Jun 26, 2026
Mikola Lysenko (mikolalysenko) added a commit that referenced this pull request on Jul 1, 2026
Pure rustfmt reflows (rewraps, trailing commas, brace elision) — no logic changes. Normalizes the #116 merge and this branch's test files to plain `cargo fmt` output. Assisted-by: Claude Code:claude-fable-5
Mikola Lysenko (mikolalysenko) added a commit that referenced this pull request on Jul 2, 2026
…cosystem + VEX coverage (#117) * feat(cli): add `scan --redirect` hosted-vendored-patch mode Adds a new patch-apply mode that rewrites lockfiles/manifests so ONLY patched dependencies resolve from Socket's hosted vendored patches (patch.socket.dev), instead of vendoring local artifact bytes or writing .socket/manifest.json. - `scan --redirect` flag (conflicts_with_all apply/sync/vendor) - `--patch-server-url` / `SOCKET_PATCH_SERVER_URL` global arg, defaulting to https://patch.socket.dev (DEFAULT_PATCH_SERVER_URL) - api/client.rs `fetch_registry_references` (authed /v0/orgs/{org}/patches/package, or proxy /patch/package) - patch/redirect/mod.rs rewriters for 9 ecosystems (npm package-lock, pnpm, yarn-classic, pypi requirements, uv, cargo, composer, nuget, gem); golang documented as a limitation (no per-dependency remote redirect without a global GOPROXY) Shared golden fixtures (tests/fixtures/redirect/**) are consumed by BOTH this crate's redirect_golden.rs and the depscan backend's golden.test.ts, keeping the two rewriter implementations byte-identical. Behavioral coverage: tests/in_process_redirect.rs. Assisted-by: Claude Code:opus-4-8 * feat(vex): attest redirected patches with a (redirected) provenance marker `scan --redirect` previously dropped `--vex` silently and left no trail VEX could attest: the redirect ledger carried only file edits, no patch records, and the VEX verifier never read it. 
- redirect ledger (`.socket/vendor/redirect-state.json`) now embeds the manifest PatchRecord (file hashes + vulnerabilities) per redirected PURL, fetched from the view endpoint during the redirect run; schema lives in patch/redirect/state.rs (RedirectState, load_redirect_state) - vex: new augment_with_redirect folds ledger records into the manifest view (mirror of augment_with_detached) for both the standalone `vex` command and the embedded --vex paths, so redirected patches attest post-install against the installed tree exactly like `apply` - build: build_document_with_provenance generalizes the vendored marker with a `(redirected)` impact phrasing; redirected PURLs bypass the property-7 ecosystem filter for the same reason vendored ones do (the committed lockfile rewrite is the persistence mechanism) - scan --redirect --vex now emits the OpenVEX document in-run (the redirected bytes are remote pre-install, so the in-run attestation is built from the ledger records; a post-install `socket-patch vex` hash-verifies the installed tree) Tests: e2e_vex_redirect.rs (installed-tree verify, property-7 bypass, tampered→hash_mismatch fail-closed, no-verify ledger attestation, cross-ecosystem incl. maven/nuget/composer + qualified PURLs), in_process_redirect.rs --vex leg, core unit tests for the ledger round-trip and the (redirected) phrasing, and real-toolchain VEX legs in the npm/cargo/pnpm/yarn-classic/pypi/golang e2e_vendor_*_build capstones. Assisted-by: Claude Code:claude-fable-5 * style: cargo fmt Pure rustfmt reflows (rewraps, trailing commas, brace elision) — no logic changes. Normalizes the #116 merge and this branch's test files to plain `cargo fmt` output. 
Assisted-by: Claude Code:claude-fable-5 * test(vex): real-install npm redirect capstone + docker gem/composer VEX legs e2e_redirect_npm_build.rs proves the full redirect chain with the real toolchain: a mock patch server serves real patched tarball bytes; scan --redirect rewrites package-lock.json; a fresh-checkout npm ci (empty cache) installs the patched bytes from the served URL; a post-install socket-patch vex HASH-VERIFIES the installed tree and emits the (redirected) statement. The negative leg serves tampered bytes against the real lockfile pin and asserts npm ci fails with EINTEGRITY — the pin is enforcement, not decoration. The docker vendor suites (gem via bundler, composer) gain a VEX leg inside the existing stage-1 container run: the staged manifest now carries a vulnerability, socket-patch vex runs in-container against the vendored artifact, and a host-side oracle asserts the document (one statement, not_affected, correct PURL subcomponent, "(vendored)" marker). Both suites executed against real Docker containers. Assisted-by: Claude Code:claude-fable-5 * fix(redirect,vex): fail-closed attestation, idempotent rewriters, surfaced record-fetch failures Adversarial-review findings on the redirect->VEX wiring, each with a pinned regression test: - CRITICAL: the in-run --vex exemption trusted the whole redirect ledger, so a granted patch whose rewriter never touched a file (no lockfile present), a stale ledger record, or a manifest patch that failed verification could all be attested not_affected — suppressing a live CVE. run_redirect now derives a CONFIRMED set (the dep's hosted URL actually present in a project file after the rewrite): only confirmed deps are recorded in the ledger, exempted from in-run verification (VexBuildParams.assume_applied), counted as redirected, or allowed through the property-7 bypass. The no-lockfile case now warns, writes no ledger, and a requested attestation fails (exit 1). 
- record-fetch failures (view endpoint down/404) after a successful redirect were silently swallowed, leaving the patch permanently unattestable with zero signal. They now surface as record_fetch_failed warnings in the JSON envelope and on stderr. - five rewriters recorded an edit even when the entry was already at its target values, growing the committed ledger on every re-run and poisoning a future revert (npm entry, pnpm, Cargo.lock via a tri-state that keeps the pkg-not-found warning honest, nuget lock, gem CHECKSUMS). A second pass over rewritten output now records zero edits; the ledger merge keeps append semantics for real changes. - requirements.txt: the environment marker was captured to end-of-line, swallowing a previously appended --hash and duplicating it on every re-run; markers are now taken from the requirement portion only. pip-compile --generate-hashes backslash continuations are refused with redirect_requirements_continuation instead of corrupted (orphaned --hash lines / mid-marker backslash = pip InvalidMarker). - vendor::load_state treats a mode-tagged NON-vendor ledger squatting on .socket/vendor/state.json (an early depscan GitHub-app registry ledger) as empty instead of bricking remove/vendor/repair with vendor_state_unreadable; genuinely corrupt vendor ledgers stay fail-closed. Assisted-by: Claude Code:claude-fable-5 * fix(ci): check the redirect golden fixtures out byte-exact on Windows The fixtures are a byte-for-byte cross-language contract; Windows runners' CRLF checkout conversion made redirect_golden_fixtures_match fail on the generated-vs-expected comparison. -text disables EOL conversion for the fixture tree (contents are already committed LF). Assisted-by: Claude Code:claude-fable-5 * chore: exclude test fixtures from Socket dependency scanning The redirect golden fixtures commit manifests/lockfiles that deliberately pin old, vulnerable dependency versions — they are the test inputs for a security-patching tool. 
Socket's PR scan flagged their CVEs (Puma, Flask) as project dependency alerts; socket.yml now scopes scanning away from the fixture trees.
Assisted-by: Claude Code:claude-fable-5
* feat(cli): three-mode selector, maven hosted rewriter, nuget config fixes, golang NO-GO
- scan --mode <hosted|vendored|agent> is the documented mode selector; the legacy spellings (--redirect/--vendor/--apply) stay as aliases. Cross-mode combinations error; --detached composes with either vendored spelling. The hosted ledger writes mode: "hosted" (loader stays tolerant of pre-rename strings).
- rewrite_maven_pom: hosted maven support for pom projects — surgical <repository> insert (socket-patch-<uuid>, checksumPolicy=fail, snapshots disabled), same-GAV verify-only version check with unpinned/not-found/packaging warnings; gradle build files are never machine-edited (settings-level repos are silently ignored under PREFER_PROJECT) — a paste-able exclusiveContent snippet is emitted instead. Byte-identical to the TS twin via new shared maven/pom golden fixtures.
- nuget config correctness: creating a packageSourceMapping now fans a '*' catch-all out to pre-existing sources (else every other package fails NU1100), seeds nuget.org when the sources list is empty, and a self-closing <packageSources /> is expanded in place instead of leaving a duplicate element. New shared golden cases pin all three.
- golang hosted is a documented NO-GO (docs/design/golang-hosted-no-go.md): day-2 sumdb failures need uncommittable GOPRIVATE, replacement-module identity would force per-grant artifacts, and the default GOPROXY chain would leak tokened URLs or licensed bytes to a public mirror. The rewriter warning now names the decision and the remedy (vendored mode).
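The mode selector and its legacy aliases can be pictured with the sketch below. It is illustrative only: `resolve_mode` and its boolean parameters are hypothetical, the real CLI resolves this through clap, and two points are assumptions: that a legacy flag naming the same mode as `--mode` is accepted, and that no flag at all simply yields no explicit mode.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Hosted,
    Vendored,
    Agent,
}

/// Resolve `--mode <hosted|vendored|agent>` together with the legacy
/// spellings `--redirect` (hosted), `--vendor` (vendored), `--apply` (agent).
fn resolve_mode(
    mode: Option<&str>,
    redirect: bool,
    vendor: bool,
    apply: bool,
) -> Result<Option<Mode>, String> {
    // Every spelling the user passed, paired with the mode it selects.
    let mut requested: Vec<(String, Mode)> = Vec::new();
    if let Some(value) = mode {
        let parsed = match value {
            "hosted" => Mode::Hosted,
            "vendored" => Mode::Vendored,
            "agent" => Mode::Agent,
            other => return Err(format!("invalid value '{other}' for '--mode <MODE>'")),
        };
        requested.push((format!("--mode {value}"), parsed));
    }
    for (set, flag, selected) in [
        (redirect, "--redirect", Mode::Hosted),
        (vendor, "--vendor", Mode::Vendored),
        (apply, "--apply", Mode::Agent),
    ] {
        if set {
            requested.push((flag.to_string(), selected));
        }
    }
    match requested.split_first() {
        // Assumption: with no selector the caller falls back to its default.
        None => Ok(None),
        Some(((first_flag, first_mode), rest)) => {
            for (flag, selected) in rest {
                if selected != first_mode {
                    // Cross-mode combination: phrase it the way clap does.
                    return Err(format!(
                        "the argument '{first_flag}' cannot be used with '{flag}'"
                    ));
                }
            }
            Ok(Some(*first_mode))
        }
    }
}
```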
Assisted-by: Claude Code:claude-fable-5 * feat(vendor): NuGet and Maven vendored backends with fragment-level revert - nuget_feed.rs: flat-folder feed at .socket/vendor/nuget/<uuid>/ (relative-path <add> source resolves against nuget.config — committable), packageSourceMapping with the '*' catch-all rule, packages.lock.json contentHash = base64(sha512(nupkg)) recomputed locally. Service-first with local-cache fallback (deterministic re-zip of the patched extraction). dotnet restore --locked-mode turns cache collisions into a loud NU1403. - maven_repo.rs: the vendored uuid dir IS a file:// maven2 repository (jar + REAL upstream pom + sha1/md5 sidecars — an authored minimal pom would drop transitive deps). Multi-module poms and gradle-only projects refuse loudly; warm ~/.m2 shadowing warned with a purge one-liner. - Fragment-level revert for both backends: per-purl removal excises ONLY our wiring fragment, preserving sibling patches' wiring and post-vendor user edits; whole-file restore only on the byte-identical fast path; drift warns per the ledger contract. - Shared plumbing: nuget/maven in ECOSYSTEM_DIRS + dispatch + leaf_to_purl; .nupkg/.jar handled by the zip verifier (VEX/repair); both cargo features promoted to defaults. Assisted-by: Claude Code:claude-fable-5 * test: three-mode × ecosystem behavioral matrix - Agent mode: VEX legs in all 8 docker apply suites (real container apply → in-container vex → plain-marker statement asserted host-side); dedicated in-process apply tests for npm/composer/maven/nuget; a cross-ecosystem agent-mode VEX matrix (8 ecosystems + qualified PURLs) in e2e_vex.rs. 
- Vendored mode: docker capstones for the new nuget (fresh-checkout cold-cache --network none restore, NU1403 tamper leg) and maven (offline mvn against the file:// repo, checksum negative, warm-cache shadow warning) backends; host capstones for gem/composer (toolchain-gated, CI-pinned via bundler 2.5 + setup-php); vendored VEX matrix in e2e_vex_vendor.rs including the new backends.
- Hosted mode: e2e_redirect_cargo_build capstone (wiremock sparse index + real patched .crate; fresh CARGO_HOME cargo fetch --locked; cksum tamper leg; post-install vex (redirected)); scan --mode parse contract tests.
- CI: docker vendor suites wired into the coverage-docker matrix; gem/composer host capstones pinned into the e2e job.
Assisted-by: Claude Code:claude-fable-5
* docs: three-mode README, CHANGELOG, CLI_CONTRACT + support matrix
README gains 'Choosing a patch mode' — the hosted/vendored/agent comparison table (mechanism, user-code changes, CI requirements, offline story, integrity story), an honest mode × ecosystem support matrix derived from the code, per-mode walkthroughs, VEX provenance-marker documentation (plain = agent, (vendored), (redirected) = hosted), and per-ecosystem caveats (maven ~/.m2 shadowing + purge one-liner, mirrorOf mirrors, nuget locked-mode). Agent mode is positioned as fully supported but not recommended for new setups. CHANGELOG covers hosted mode, --mode, the vendored backends, and the nuget mapping fixes. CLI_CONTRACT adds scan --mode/--redirect and the two ledger surfaces as contract.
Assisted-by: Claude Code:claude-fable-5
* test(vendor): jsr is the unsupported-ecosystem exemplar now that nuget vendors
unsupported_ecosystem_purl_is_a_benign_skip used pkg:nuget as its cannot-vendor ecosystem; the new NuGet vendored backend made that purl vendorable (and the feature a default), so the fixture failed as package-not-found instead of skipping benignly.
pkg:jsr is the one compiled-in ecosystem with no vendor backend by design — the contract being pinned (benign skip, npm still vendors, exit 0) is unchanged. Assisted-by: Claude Code:claude-fable-5 * fix(cli): mode-conflict errors match the clap contract phrasing scan_vendor_flag_conflicts_are_clap_errors pins that flag-misuse errors read like clap's own ('cannot be used with' / 'required'); the --mode resolution errors said 'cannot be combined' and 'requires', failing the contract on every CI OS. Exit code was already 2. Assisted-by: Claude Code:claude-fable-5 * docs: spell out Maven's checksum-failure fallback for hosted mode A -C checksum failure on the Socket repository does not fail the build when the GAV also exists on Maven Central — Maven falls back and silently installs the unpatched artifact (the patch is dropped, though mismatched bytes are never consumed). Surfaced by the depscan install-verify negative leg; gradle dependency verification or vendored mode remain the strong paths. Assisted-by: Claude Code:claude-fable-5 * refactor(core): factor bun text-lock grammar + share uri encode; generalize pnpm redirect basename Groundwork with no behavior change to existing single-lock flows. Bun grammar: move the conservative `bun.lock` line grammar (BunEntry, parse_packages_section, parse_entry_line, packages_bounds, scan_json_string, scan_balanced_array, split_top_level, decode_json_string, split_name_spec, check_lock_version, SUPPORTED_LOCK_VERSION) out of vendor/bun_lock.rs into a new patch/bun_lock_text.rs. Fields on BunEntry become pub(crate) so both the vendor classifier and lock_inventory can consume it. The vendor-specific classify/TupleShape stay in bun_lock.rs. The grammar's unit test moves with it. lock_inventory.rs now imports the grammar from the shared module. URI: move encode_uri_component out of vendor/yarn_berry_lock.rs into a shared utils/uri.rs as a pub fn. 
Its impl already matches JS encodeURIComponent exactly (uppercase hex, `-_.!~*'()` unreserved), verified against yarn 4.12 with an added oracle vector. pnpm redirect: generalize rewrite_pnpm_lock to rewrite every files-map key equal to `pnpm-lock.yaml` or ending in `/pnpm-lock.yaml` (e.g. Rush's common/config/rush/pnpm-lock.yaml), iterating keys sorted for stable goldens. The per-dep entry-not-found warning now fires only when the dep matched in no pnpm lock across the whole set. Adds the npm/pnpm/nested-rush-lock golden case (shared by the Rust and TS suites). FileEdit.path and the output-files key are the actual key. Assisted-by: Claude Code:claude-fable-5 * feat(patches): Rush monorepo support — vendored refusal, scan inventory fallback, redirect-only repair no-op Rush keeps a single pnpm source-of-truth lock under common/config/rush/ (no root package.json/lock pair) and runs installs from a generated common/temp workspace, so vendor's relative file: rewiring cannot survive. Three disjoint pieces land the read/refuse side of hosted Rush support: - npm_flavor: when no root lockfile matches and rush.json is present, the flavor probe refuses with vendor_rush_unsupported, naming the generated-workspace install model (overrides in pnpm-config.json, installs from common/temp) and routing to `scan --mode hosted`. The check fires only in the otherwise-missing arm, so a stray rush.json beside a real root lock still routes normally. - lock_inventory: inventory_npm_lock (and thus inventory_project) falls back to the Rush locks when the root lock is absent but rush.json exists — the common source-of-truth lock plus every common/config/subspaces/*/pnpm-lock.yaml, read_dir sorted for determinism, repo-relative paths preserved. - repair: a project whose only trace is .socket/vendor/redirect-state.json (hosted mode; no manifest, no vendor ledger, no vendored references) is a no-op for repair, not a manifest_not_found error. 
Exit success with an informational redirect_only_project skip routing to `scan --mode hosted`; contents are not validated.
Assisted-by: Claude Code:claude-fable-5
* feat(core): yarn-berry + bun hosted registry-redirect rewriters + shared goldens
Adds `rewrite_yarn_berry` and `rewrite_bun_lock` to the registry-redirect engine, the Rust half of the cross-language rewriter contract with the depscan backend's TS `registry-rewrite/{yarn-berry,bun}` twins.
- Berry: rewrites ONLY the lock entry — `resolution:` gains yarn's own `::__archiveUrl=<encodeURIComponent(url)>` binding and `checksum:` becomes the override's `yarnBerry10c0` (berry verifies the converted cache zip, not the tarball). Descriptor key + package.json untouched, so `--immutable` still passes. Whole-file gates refuse a cacheKey != 10c0 or a `.yarnrc.yml compressionLevel != 0` (no offline-reproducible checksum).
- Bun: rewrites a registry 4-tuple `["name@ver","<reg>",{deps},"sha512-…"]` to a URL 3-tuple `["name@<url>",{deps verbatim},"<sha512>"]` using the shared `bun_lock_text` grammar (fail-CLOSED on any deviation). Binary `bun.lockb` is never parsed — presence without a text lock is a refusal.
Content-based dispatch: berry is detected by `^__metadata:` (classic already declines it). New golden fixtures under `tests/fixtures/redirect/npm/{yarn-berry,bun}/` (authored TS-side, byte-matched here); `RUST_IMPLEMENTED` gains both. Warning-path unit tests cover every refusal branch.
Assisted-by: Claude Code:claude-fable-5
* feat(cli): berry/bun redirect plumbing + bun.lockb auto-migration
Wires the CLI's `scan --redirect` for the new berry/bun rewriters:
- `REDIRECT_CANDIDATE_FILES` gains `.yarnrc.yml` (berry cache-config gate) and `bun.lock`.
- The DepOverride merges the reference's `yarn-berry-zip` artifact — `yarnBerry10c0` into integrity and `berry_zip_url` from its URL — so the berry rewriter can pin the cache-zip checksum.
- Confirmed-redirect gating also matches `encode_uri_component(artifact_url)`, since the berry rewriter writes the URL percent-encoded into the lock's `::__archiveUrl=` binding (the raw form never appears there). - bun.lockb auto-migration in `run_redirect`, before candidate reads: a binary lockfile with no text `bun.lock` is re-locked via `bun install --save-text-lockfile --frozen-lockfile --lockfile-only` (writes bun.lock, deletes bun.lockb, offline, fails closed). The removal is recorded as a ledger `FileEdit { action: "removed" }` (git history is the restore path); a dry-run only warns, and a failure degrades to the rewriter's own presence-only refusal. In-process tests cover the berry leg (encoded __archiveUrl + yarnBerry10c0 + preserved key + idempotent re-run), the bun leg (4-tuple → URL 3-tuple), and the lockb migration via a fake `bun` shim on PATH. Assisted-by: Claude Code:claude-fable-5 * feat(cli): scan --redirect discovers Rush pnpm locks + stale repo-state warning Rush monorepos have no root package.json/lock pair: the single pnpm source-of-truth lock lives at common/config/rush/pnpm-lock.yaml and, when subspaces are enabled, one lock per subspace under common/config/subspaces/<name>/. After the static REDIRECT_CANDIDATE_FILES loop, run_redirect now reads those locks (when rush.json is present) into the files map under their repo-relative keys — the pnpm rewriter is basename-generalized, so nested locks are rewritten in place and the path-generic write-back / confirmed-gating already handle them. Discovery of the packages themselves flows through the lock_inventory Rush fallback landed earlier. 
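The percent-encoding that the confirmed-redirect gating depends on can be sketched as follows. The encoder follows the rules the refactor commit states for `encode_uri_component` (uppercase hex, `-_.!~*'()` left unreserved, like JS `encodeURIComponent`); the body here is a re-derivation, not the crate's source, and `lock_mentions_artifact` is a hypothetical helper standing in for the confirmed check.

```rust
/// Percent-encode like JS `encodeURIComponent`: ASCII letters, digits and
/// `-_.!~*'()` pass through; every other UTF-8 byte becomes `%XX`
/// with uppercase hex digits.
fn encode_uri_component(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for &byte in input.as_bytes() {
        let unreserved = byte.is_ascii_alphanumeric() || b"-_.!~*'()".contains(&byte);
        if unreserved {
            out.push(byte as char);
        } else {
            out.push_str(&format!("%{byte:02X}"));
        }
    }
    out
}

/// Hypothetical confirmed check: a redirect counts only if the hosted URL
/// appears in the rewritten file, either raw or in the encoded form the
/// berry rewriter writes into `::__archiveUrl=`.
fn lock_mentions_artifact(lock_text: &str, artifact_url: &str) -> bool {
    lock_text.contains(artifact_url)
        || lock_text.contains(&encode_uri_component(artifact_url))
}
```

Checking both forms matters because a berry lock never contains the raw URL, so a raw-only search would leave every berry redirect unconfirmed.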
When at least one Rush lock is read and common/config/rush/repo-state.json exists, the run emits redirect_rush_repo_state_stale: editing the lock outside `rush update` desyncs the pnpmShrinkwrapHash that file records, so `rush install` fails until `rush update` refreshes it if preventManualShrinkwrapChanges is enabled — but the redirect survives that refresh (pnpm keeps locked resolutions for unchanged specifiers). repo-state is Rush's to manage; the redirect never touches it. Tests: a rush-shaped in_process_redirect fixture (rush.json + common lock + subspace lock, no root pair) asserts both nested locks are repointed in place; a subprocess JSON test pins the stale warning present-with-repo-state and absent in the twin without it. Assisted-by: Claude Code:claude-fable-5 * test(redirect,repair): pnpm hosted-lock legs + flavor repair matrix Add `in_process_redirect_pnpm.rs`: the plain single-project pnpm (lockfileVersion 9.0) root-lock counterpart of the npm hosted-redirect legs. Asserts the `resolution:` splice to `{integrity: <patched>, tarball: <hosted>}` (shared `npm/pnpm` golden shape), the `redirect_pnpm_resolution` ledger edit, idempotency, and the `--vex` `(redirected)` attestation. Add `repair_vendor_flavors_e2e.rs`: generalize the npm-classic repair invariants (`repair_vendor_e2e.rs`) across pnpm / yarn-berry / bun vendored projects — deleted-tarball rebuild, corrupt-tarball rebuild, tampered-ledger-sha fail-closed, and ledger-gone reconstruction from the lockfile's vendored-tarball reference. Fixtures run the real `scan --vendor` flow in-test against a mock API (no real package manager) using each backend's pre-vendor lock shape. 
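The Rush lock handling in this series relies on the pnpm rewriter matching any files-map key that is `pnpm-lock.yaml` or ends in `/pnpm-lock.yaml`, visited in sorted order. A minimal sketch of that selection, assuming the files map is keyed by repo-relative path; `is_pnpm_lock_key` and `pnpm_lock_keys` are hypothetical names:

```rust
use std::collections::BTreeMap;

/// True for the root lock and for nested locks such as
/// `common/config/rush/pnpm-lock.yaml`.
fn is_pnpm_lock_key(key: &str) -> bool {
    key == "pnpm-lock.yaml" || key.ends_with("/pnpm-lock.yaml")
}

/// Select the pnpm locks from a path -> contents map.
/// `BTreeMap` iterates keys in sorted order, which keeps output stable.
fn pnpm_lock_keys(files: &BTreeMap<String, String>) -> Vec<&str> {
    files
        .keys()
        .map(String::as_str)
        .filter(|key| is_pnpm_lock_key(key))
        .collect()
}
```

Requiring the leading `/` in the suffix test keeps look-alike names such as `not-pnpm-lock.yaml` out of the set.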
Assisted-by: Claude Code:claude-fable-5 * test(apply): agent-mode legs for bun, yarn-berry node-modules, rush farm Extend `in_process_alternate_installers.rs` with three real-layout agent-mode apply legs: - bun: real `bun install` (private cache) → apply patches the hoisted node_modules copy; - yarn-berry node-modules linker: `.yarnrc.yml nodeLinker: node-modules` + corepack-dispatched `yarn@4.12.0`, install → apply (complements the PnP refusal test in `e2e_safety_yarn_pnp.rs`); - rush pnpm symlink farm: a hand-built `common/temp/node_modules/.pnpm/<pkg>@<v>/…` store with per-project `apps/{a,b}/node_modules/<pkg>` symlinks and `rush.json` at the root; apply at the root patches the canonical `.pnpm` file once and the bytes are visible through BOTH symlinks — pinning that the crawl finds the package via the project symlinks even though `common/temp` is in SKIP_DIRS. Assisted-by: Claude Code:claude-fable-5 * feat(redirect): fail-closed maven suffixing + trusted checksums Overhaul the maven pom rewriter to consume the new server-side version-suffixing contract. When a maven2 override carries `mavenSuffixedVersion` + `mavenPomSha256`, pin the patched jar to the Socket-only suffixed version explicitly — rewrite the literal `<version>` or, for a transitive/managed dependency, author a `<dependencyManagement>` entry — so an outage or tamper on the Socket repo hard-fails the build instead of silently resolving the unpatched upstream artifact. A property version is refused (a literal edit would break the reference); a mismatched literal version is skipped. When a pin lands and both the jar and served-pom sha256 are known, emit Maven Trusted Checksums files (`.mvn/maven.config` resolver args + `.mvn/checksums/checksums.sha256` entries), merging into any existing user config/checksum set (dedupe by key, never override a conflicting value, sort checksums by path). 
Absent a suffixed version, fall back to today's same-GAV repository injection with a `redirect_maven_same_gav_fallback` warning (not fail-closed). Add `maven_suffixed_version` + `maven_pom_sha256` to the identifiers struct, list the two `.mvn/*` files in REDIRECT_CANDIDATE_FILES, and confirm a maven redirect by the globally-unique suffixed version string (the `.pom` URL never lands in a pom). The Gradle manual snippet now pins the suffixed version and reminds the user to bump the declaration.

Update the shared golden fixtures (basic, existing-repositories, property-version-warn now a zero-edit refusal, rerun-noop) and add new cases (transitive-depmgmt, existing-depmgmt, no-suffix-fallback, mvn-config-merge, version-mismatch-skip). The Rust rewriter is byte-identical to the TS twin.

Assisted-by: Claude Code:claude-fable-5

* test(redirect): real-install berry + bun hosted capstones

Add `e2e_redirect_yarn_berry_build.rs` and `e2e_redirect_bun_build.rs`, the hosted-mode (`scan --mode hosted`) full-chain analogs of `e2e_redirect_npm_build.rs` for the yarn-berry and bun flavors: `scan --mode hosted` rewrites the lock to resolve the patched dep from the wiremock hosted tarball, then a fresh-checkout install (`yarn install --immutable --check-cache` / `bun install --frozen-lockfile`, offline from the registry) MUST land the patched bytes; a tamper twin serving different bytes must fail (YN0018 / integrity). berry derives the exact `10c0/<hex>` cache-zip checksum the mock hands back by a bootstrap real-yarn `resolutions: file:` install (yarn recomputes the same zip checksum for the archiveUrl locator).

Two harness robustness measures: the corepack gate probes from a neutral tempdir (the monorepo root's `packageManager` field otherwise makes corepack refuse yarn), and every yarn invocation sets `YARN_ENABLE_GLOBAL_CACHE=false` so the persistent `~/.yarn/berry` cache can't serve the tampered twin honest bytes.
`bun.lockb` migration is not exercised here (bun 1.3.x has no flag to emit the binary lockfile); it stays covered by the in-process shim test `scan_redirect_migrates_bun_lockb_then_redirects`.

Assisted-by: Claude Code:claude-fable-5

* test(docker): yarn berry 4.x agent + vendored e2e legs

Dockerfile.npm: cache `yarn@4.12.0` in corepack WITHOUT `--activate` (the global `yarn` stays classic 1.22.22); berry fixtures opt in per-project via package.json `"packageManager": "yarn@4.12.0"`, which the corepack shim dispatches. A build-time check pins that the shim resolves yarn 4 under a packageManager pin while the global stays classic.

docker_e2e_npm.rs: add two berry legs:

- an agent-mode install→apply chain (container twin of the npm agent leg with a yarn 4 install front-end, reusing the shared mock fixture);
- a vendored offline-frozen-install chain (container twin of `e2e_vendor_yarn_berry_build.rs`: stage a manifest from the installed bytes, `vendor --offline`, then a fresh-checkout `yarn install --immutable --check-cache` must land the patched bytes, proving the CLI's offline `10c0/<hex>` checksum is what real yarn 4 accepts).

Both gated behind the existing `docker-e2e` feature + `skip_if_no_docker_image`.

Assisted-by: Claude Code:claude-fable-5

* test(redirect): Rush hosted-mode redirect capstone (sim + gated real rush)

Add `e2e_redirect_rush_sim.rs`.

Tier 1 (default-runnable, gated on corepack pnpm@9): run the real CLI `scan --mode hosted` over a committed Rush-shaped fixture (rush.json + common/config/rush/pnpm-lock.yaml), then REPLICATE `rush install` in-test: copy the rewritten common lock to common/temp/pnpm-lock.yaml, write a generated-style common/temp/package.json, and run `pnpm@9 install --frozen-lockfile` with the registry pinned to a dead port so the only reachable artifact URL is the wiremock hosted tarball. Asserts the patched bytes land in common/temp/node_modules, plus a tamper twin (serve wrong bytes → pnpm integrity failure).
Tier 2 (gated on `RUSH_E2E=1`, network-dependent, off by default): real `npm x @microsoft/rush`, running `rush update` → `scan --mode hosted` → `rush install`, asserting patched bytes, then the preventManualShrinkwrapChanges failure + `rush update` recovery. All `RUSH_*` env vars (including the gate) are stripped before invoking rush since it rejects unrecognized ones.

Assisted-by: Claude Code:claude-fable-5

* test(redirect): clippy cleanups in berry + rush redirect legs

Silence `-D warnings` in the new redirect capstones: fold the berry bootstrap-checksum `let...else { return None }` into the `?` operator, and drop the unused `copy_dir_recursive` helper from the rush sim (it rewrites locks in place and never copies a `.socket/` tree).

Assisted-by: Claude Code:claude-fable-5

* docs: berry/bun + Rush hosted support, maven fail-closed, de-document scan aliases

README: support matrix hosted npm-family cell now covers yarn-berry (cacheKey 10c0 / compressionLevel 0, node-modules linker e2e-covered, PnP untested for hosted) + bun (text bun.lock v1, bun.lockb auto-migrated); adds a Rush note (hosted repoints common/config lockfiles in place incl. subspaces, agent works through the symlink farm, vendored refused → hosted). The maven caveats section is rewritten from the old silent-fallback story to fail-closed version suffixing (a Socket-only <version>-socket.<hex8> the mirror/warm-~/.m2 can't shadow, build hard-fails on outage/tamper) plus optional Maven 3.9+ Trusted Checksums (silently inert below 3.9; 3.9.9 MNG-8182 readability fix); ~/.m2 shadow re-scoped to vendored-only, mirrorOf flipped to a loud failure, gradle snippet now carries the suffixed version. Legacy scan spellings --vendor/--apply/--sync are removed from every example and prose line; scan --mode <hosted|vendored|agent> is the only documented selector (the booleans stay as back-compat aliases, noted once). The standalone vendor command section stays (home of vendor --revert).
VEX (redirected) marker docs unchanged (still accurate: nothing user-visible changed).

CHANGELOG: rewrites the stale verify-only maven-hosted entry to the fail-closed suffixing + trusted checksums behavior; adds Unreleased entries for berry/bun hosted (+ bun.lockb auto-migration), Rush hosted/agent/vendored stances, the pnpm nested-lockfile generalization, and the repair redirect-only exit-0 no-op fix.

CLI_CONTRACT: documents the redirect candidate-file set (.yarnrc.yml, bun.lock, .mvn/*), berry/bun/rush hosted coverage and the new redirect_* codes, the Rush lockfile-supplement discovery fallback, and repair's redirect_only_project exit-0.

Assisted-by: Claude Code:claude-fable-5

* style(core): satisfy clippy cloned_ref_to_slice_refs in rewriter tests

The berry/bun rewriter unit tests built single-override slices with `&[ovr.clone()]`, which trips clippy::cloned_ref_to_slice_refs under the --tests lint gate. Use std::slice::from_ref instead: no clone, same slice.

Assisted-by: Claude Code:claude-fable-5

* test(cli): fix windows + release CI failures in in-process suites

Two CI-only failures:

- scan_redirect_migrates_bun_lockb_then_redirects is unix-only now: its fake bun shim is a #!/bin/sh script (Windows would need a .cmd twin and ';' PATH joining). The migration code path itself is OS-agnostic and keeps real-bun coverage in the gated e2e capstone.
- in-process vendor/revert runs acquire the apply lock with a 5s timeout. flock guards are OFD-based, so a concurrent test's fork()ed pre-exec child briefly holds copies of every parent fd (including a just-dropped lock fd), making back-to-back in-process runs see their own lock as held for the fork→exec window. Seen twice as rare release-only lock_held failures (revert_round_trip previously, revert_works_without_manifest now); the wait absorbs the window via the acquire loop's 100ms retry while a real deadlock still fails.

Assisted-by: Claude Code:claude-fable-5
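The lock fix above amounts to a bounded acquire loop: retry on a fixed interval so a short fork→exec window is absorbed, but still fail once the timeout elapses. A minimal sketch of that shape, assuming nothing about the CLI's real lock API: `acquire_with_retry`, `LockHeld`, and the `try_acquire` closure are hypothetical stand-ins.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical error: the lock was still held when the timeout elapsed.
#[derive(Debug, PartialEq, Eq)]
pub struct LockHeld;

/// Call `try_acquire` until it yields a guard or `timeout` elapses,
/// sleeping `retry_every` between attempts (never past the deadline).
pub fn acquire_with_retry<T>(
    mut try_acquire: impl FnMut() -> Option<T>,
    timeout: Duration,
    retry_every: Duration,
) -> Result<T, LockHeld> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(guard) = try_acquire() {
            return Ok(guard);
        }
        let now = Instant::now();
        if now >= deadline {
            // A genuinely stuck lock still surfaces as an error.
            return Err(LockHeld);
        }
        sleep(retry_every.min(deadline - now));
    }
}
```

With the values named in the commit message, a caller would pass `Duration::from_secs(5)` and `Duration::from_millis(100)`; a transient holder that releases within a few retries is invisible to the caller.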
What

`socket-patch vendor` now downloads the already-built, integrity-verified patched package from the patch.socket.dev vendoring service instead of always building it locally, across every vendorable ecosystem: npm, pypi, cargo, golang, composer, and gem.

New

`--vendor-source <mode>` / `SOCKET_VENDOR_SOURCE`:

- `auto` (default): download the prebuilt package; silently fall back to a local build on any miss.
- `service`: require the service, fail closed.
- `build`: always build locally (the pre-service behavior).

Plus host overrides for testing against staging / local dev: `--vendor-url` / `SOCKET_VENDOR_URL` (step-1 package-reference host) and `--patch-server-url` / `SOCKET_PATCH_SERVER_URL` (step-2 download host).

Verification is fail-closed: a downloaded artifact that fails its sha512 (and, for golang, the `h1:` dirhash) is never used. It's a hard error, never a silent fallback to wrong bytes.

Per-ecosystem

`gem-stub-gemspec` second artifact (the converter-generated path-source stub gemspec a `.gem` doesn't carry in bundler's required form) and writes it as `<name>.gemspec`. An absent stub (native-extension gem, or a patch built before the server-side rollout) is a service miss: `auto` falls back to the local build, `service` refuses. Server side: SocketDev/depscan#21768.

Notable internals

- `api/client.rs`: surface non-tarball served artifacts as `secondary_artifacts` (host-rewritten URL + sha512); `download_artifact` fetches one lazily.
- `service_fetch.rs`: shared download-and-verify funnel; `fetch_verified_secondary` for named second artifacts.
- `registry_fetch.rs`: `pub(crate)` extract helpers reused by the service paths (incl. a factored `extract_gem_data`).
- `--vendor-source=service` → `vendor_service_offline_conflict`. A successful service vend emits `vendor_prebuilt_downloaded`. Unrelated to `--download-mode` (which selects the local build's patch-content format).

Verification

- `cargo clippy --workspace --all-features -- -D warnings`: clean.
- `cargo test`: full core lib (1514) + CLI (297) green; per-ecosystem service paths covered by wiremock-backed unit tests (success, integrity-mismatch hard-fail, pending/unavailable fallback, offline refuse; gem also stub-missing fallback-vs-refuse and native-ext refuse).
- `bundle install` e2e that validates the TS-generated stub's fidelity.

🤖 Generated with Claude Code
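The `--vendor-source` modes and the fail-closed rule described under "New" can be summarized as a small decision table. The sketch below is an illustration, not the CLI's code: `VendorSource`, `ServiceOutcome`, `Action`, and `decide` are hypothetical names, and the three-way outcome is an assumed simplification of whatever the service path actually reports.

```rust
use std::str::FromStr;

/// The three documented modes; `auto` is the documented default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum VendorSource {
    #[default]
    Auto,
    Service,
    Build,
}

impl FromStr for VendorSource {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "auto" => Ok(Self::Auto),
            "service" => Ok(Self::Service),
            "build" => Ok(Self::Build),
            other => Err(format!("unknown vendor source: {other}")),
        }
    }
}

/// Hypothetical summary of one service attempt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ServiceOutcome {
    Verified,          // downloaded and integrity checks passed
    Miss,              // nothing usable was served
    IntegrityMismatch, // bytes arrived but failed verification
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    UsePrebuilt,
    BuildLocally,
    Fail,
}

/// Decide what to do; the service is only contacted when the mode allows it.
pub fn decide(source: VendorSource, attempt_service: impl FnOnce() -> ServiceOutcome) -> Action {
    match source {
        VendorSource::Build => Action::BuildLocally,
        VendorSource::Auto | VendorSource::Service => match attempt_service() {
            ServiceOutcome::Verified => Action::UsePrebuilt,
            // Wrong bytes are a hard error in every mode, never a fallback.
            ServiceOutcome::IntegrityMismatch => Action::Fail,
            ServiceOutcome::Miss if source == VendorSource::Auto => Action::BuildLocally,
            ServiceOutcome::Miss => Action::Fail,
        },
    }
}
```

Taking the service attempt as a closure keeps `build` from touching the network at all, which matches the description of `build` as the pre-service behavior.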