Objective
Add an agentv skills subcommand that serves skill content from inside the CLI tarball, version-matched to the binary. Bundle the contents of plugins/agentv-dev/skills/** into the published agentv npm package so a single npm i -g agentv is sufficient for skill-driven use, without requiring a separate allagents plugin install step.
Why
Today the agentv npm package and its skills ship as two independently-installed artifacts with no version coupling:
-
apps/cli/package.json declares files: ["dist", "README.md"] — no skill content in the tarball.
-
Skills live in plugins/agentv-dev/skills/** and are distributed through the marketplace declared in .claude-plugin/marketplace.json. The canonical setup in apps/web/src/content/docs/docs/getting-started/installation.mdx is:
npm install -g agentv
npx allagents plugin marketplace add EntityProcess/agentv
npx allagents plugin install agentv-dev@agentv
The CLI version a user installs (e.g. agentv@4.20) is not coupled to the skill version their plugin manager resolves. This permits silent drift:
- Skill references a CLI command that doesn't exist yet. User on
agentv@4.20 with agentv-dev at main, where agentv-bench/SKILL.md documents commands introduced in agentv@4.22. The agent runs them, the CLI errors with "unknown command", and there's no version-skew signal.
- CLI exposes flags the skill never mentions. User on
agentv@4.25 with a months-old plugin install. Newer flags (--target copilot-sdk, etc.) are unknown to the skill — the agent works around them.
- Renamed primitives appear inconsistently. Grader type renames like
llm_judge → llm-grader are normalized in YAML inputs by grader-parser.ts, but skill text and JSONL output use different names, which is hard to diagnose without the version context.
The skills also reference each other (agentv-bench hands off to agentv-eval-review and agentv-trace-analyst). A partial mismatch between two skills installed at different times is harder to diagnose than a clean version error.
A second, related limitation: today's skill content is reachable only via a plugin host. Agents running in Cursor, Codex, raw Claude API, or any harness that can run a shell can't load AgentV's skill content without first reaching into a Claude-specific plugin install path.
Prior art
vercel-labs/agent-browser (v0.26.0) ships skill content inside the CLI npm package and serves it via a built-in agent-browser skills subcommand. The package's files array lists skill-data and skills alongside bin. The Rust binary's cli/src/skills.rs (~600 LOC including tests) implements skills list/get/path with a deterministic resolution order. The discovery stub in skills/agent-browser/SKILL.md carries hidden: true and its entire body redirects the agent to agent-browser skills get core, with the design statement made explicit:
The CLI serves skill content that always matches the installed version, so instructions never go stale. The content in this stub cannot change between releases, which is why it just points at skills get core.
Drift is impossible by construction because skill content and binary live in the same tarball at the same version. npm i -g agent-browser@latest updates both atomically. The skills are also harness-portable — anything that can shell out to agent-browser skills get core can load them.
Full comparison: https://github.com/agentevals/agentevals-research/blob/main/research/findings/agent-browser/skill-delivery-vs-agentv.md
Design latitude
Surface
agentv skills list
agentv skills get <name>
agentv skills get <name> --full # also print files under references/ and templates/
agentv skills get --all
agentv skills get <name> --json # machine-readable
agentv skills path [<name>]
skills get output is the SKILL.md content (frontmatter included), matching how Claude Code consumes skills today.
Bundling
- Bundle skill content under
plugins/agentv-dev/skills/** into apps/cli/dist/skills/ at build time (tsup onSuccess step, or a cp script run before tsup). The existing files: ["dist", "README.md"] then carries the skills into the published tarball without changes.
- Implement
agentv skills under apps/cli/src/commands/skills/. Resolution walks from the CLI binary's location to find the bundled dist/skills/ directory. agent-browser's find_package_root in cli/src/skills.rs is the model — adapt the npm install layout (binary in dist/, skills in dist/skills/).
- Estimated cost: ~300–500 LOC TypeScript plus tests, plus one wiring change in
apps/cli/tsup.config.ts (or equivalent build script).
Marketplace plugin
Keep agentv-dev in the marketplace for Claude Code users who prefer plugin-based discovery. Rewrite each plugin SKILL.md to be a discovery stub: short body, redirect to agentv skills get <name>. This is the agent-browser pattern — the stub lives outside the install, so it must not try to teach. Concretely, today's full SKILL.md content moves into the bundled dist/skills/<name>/SKILL.md, and the marketplace plugin's SKILL.md becomes a few lines pointing the agent at the CLI subcommand.
Open design choices
- Subset vs. all. Minimum viable bundle is
agentv-bench (most invoked) plus agentv-eval-writer. Bundling all six (agentv-bench, agentv-eval-writer, agentv-eval-review, agentv-trace-analyst, agentv-onboarding, agentv-governance) is also fine — skill markdown is small.
- Source path. Today's
plugins/agentv-dev/skills/** can stay as the source of truth and be copied into dist/skills/ at build time, or skills can be moved to apps/cli/skills/ with the marketplace plugin referencing them from there. Either is acceptable; the first preserves the marketplace plugin's current layout.
- Subcommand naming.
agentv skills matches agent-browser skills. agentv skill (singular) is also fine.
--json. agent-browser exposes --json for programmatic agents loading skills into their own context machinery. AgentV could ship --json from day one or land it as a follow-up — the win from bundling is independent of output format.
Acceptance signals
agentv skills list returns the same set of skill names that the marketplace plugin currently exposes (or the explicitly-chosen subset, if subsetting).
agentv skills get agentv-bench returns content byte-equivalent to the agentv-bench SKILL.md at the same release tag.
- A clean
npm i -g agentv@<release> is sufficient to drive evals via skills, with no allagents plugin install step required.
- The marketplace plugin still installs in Claude Code, but each plugin SKILL.md is a stub redirecting to the CLI subcommand. No duplicate skill bodies in two locations.
- The release pipeline has exactly one source of truth for skill content. CI (or a pre-publish check) verifies bundled skills match source.
agentv skills get <name> --json parses as valid JSON with a stable schema (mirroring agent-browser: { "success": true, "data": [{ "name", "content", "files"? }] }).
- Tests cover: discovery from the install layout, frontmatter parsing including
hidden: true, --full collecting references/ and templates/, and the not-found error path.
Non-goals
- Does not move
agentic-engineering or agentv-claude-trace out of the marketplace. Those plugins are not tightly coupled to the CLI version and the existing model fits them.
- Not a rewrite of skill content. Skills move location but bodies are unchanged.
- No deprecation of the marketplace path — it stays as a Claude Code surface, just no longer the canonical setup in install docs.
- Not a new skill format or schema. Existing SKILL.md frontmatter (Claude Code skill conventions) is preserved.
Docs follow-up (in the same PR)
- Update
apps/web/src/content/docs/docs/getting-started/installation.mdx so the "Canonical Setup" is npm install -g agentv alone. The npx allagents plugin marketplace add ... block becomes a "Claude Code plugin" section, not the canonical path.
- Update
agentv init's printSkillFirstInstructions() (in apps/cli/src/commands/init/index.ts) to drop the marketplace step from the recommended setup, or reframe it as the Claude-Code-specific path.
- Update each
plugins/agentv-dev/skills/<name>/SKILL.md to its discovery-stub form.
Related
Objective
Add an
agentv skillssubcommand that serves skill content from inside the CLI tarball, version-matched to the binary. Bundle the contents ofplugins/agentv-dev/skills/**into the publishedagentvnpm package so a singlenpm i -g agentvis sufficient for skill-driven use, without requiring a separateallagents plugin installstep.Why
Today the
agentvnpm package and its skills ship as two independently-installed artifacts with no version coupling:apps/cli/package.jsondeclaresfiles: ["dist", "README.md"]— no skill content in the tarball.Skills live in
plugins/agentv-dev/skills/**and are distributed through the marketplace declared in.claude-plugin/marketplace.json. The canonical setup inapps/web/src/content/docs/docs/getting-started/installation.mdxis:The CLI version a user installs (e.g.
agentv@4.20) is not coupled to the skill version their plugin manager resolves. This permits silent drift:agentv@4.20withagentv-devatmain, whereagentv-bench/SKILL.mddocuments commands introduced inagentv@4.22. The agent runs them, the CLI errors with "unknown command", and there's no version-skew signal.agentv@4.25with a months-old plugin install. Newer flags (--target copilot-sdk, etc.) are unknown to the skill — the agent works around them.llm_judge→llm-graderare normalized in YAML inputs bygrader-parser.ts, but skill text and JSONL output use different names, which is hard to diagnose without the version context.The skills also reference each other (
agentv-benchhands off toagentv-eval-reviewandagentv-trace-analyst). A partial mismatch between two skills installed at different times is harder to diagnose than a clean version error.A second, related limitation: today's skill content is reachable only via a plugin host. Agents running in Cursor, Codex, raw Claude API, or any harness that can run a shell can't load AgentV's skill content without first reaching into a Claude-specific plugin install path.
Prior art
vercel-labs/agent-browser(v0.26.0) ships skill content inside the CLI npm package and serves it via a built-inagent-browser skillssubcommand. The package'sfilesarray listsskill-dataandskillsalongsidebin. The Rust binary'scli/src/skills.rs(~600 LOC including tests) implementsskills list/get/pathwith a deterministic resolution order. The discovery stub inskills/agent-browser/SKILL.mdcarrieshidden: trueand its entire body redirects the agent toagent-browser skills get core, with the design statement made explicit:Drift is impossible by construction because skill content and binary live in the same tarball at the same version.
npm i -g agent-browser@latestupdates both atomically. The skills are also harness-portable — anything that can shell out toagent-browser skills get corecan load them.Full comparison: https://github.com/agentevals/agentevals-research/blob/main/research/findings/agent-browser/skill-delivery-vs-agentv.md
Design latitude
Surface
skills getoutput is the SKILL.md content (frontmatter included), matching how Claude Code consumes skills today.Bundling
plugins/agentv-dev/skills/**intoapps/cli/dist/skills/at build time (tsuponSuccessstep, or acpscript run beforetsup). The existingfiles: ["dist", "README.md"]then carries the skills into the published tarball without changes.agentv skillsunderapps/cli/src/commands/skills/. Resolution walks from the CLI binary's location to find the bundleddist/skills/directory. agent-browser'sfind_package_rootincli/src/skills.rsis the model — adapt the npm install layout (binary indist/, skills indist/skills/).apps/cli/tsup.config.ts(or equivalent build script).Marketplace plugin
Keep
agentv-devin the marketplace for Claude Code users who prefer plugin-based discovery. Rewrite each plugin SKILL.md to be a discovery stub: short body, redirect toagentv skills get <name>. This is the agent-browser pattern — the stub lives outside the install, so it must not try to teach. Concretely, today's full SKILL.md content moves into the bundleddist/skills/<name>/SKILL.md, and the marketplace plugin's SKILL.md becomes a few lines pointing the agent at the CLI subcommand.Open design choices
agentv-bench(most invoked) plusagentv-eval-writer. Bundling all six (agentv-bench,agentv-eval-writer,agentv-eval-review,agentv-trace-analyst,agentv-onboarding,agentv-governance) is also fine — skill markdown is small.plugins/agentv-dev/skills/**can stay as the source of truth and be copied intodist/skills/at build time, or skills can be moved toapps/cli/skills/with the marketplace plugin referencing them from there. Either is acceptable; the first preserves the marketplace plugin's current layout.agentv skillsmatchesagent-browser skills.agentv skill(singular) is also fine.--json. agent-browser exposes--jsonfor programmatic agents loading skills into their own context machinery. AgentV could ship--jsonfrom day one or land it as a follow-up — the win from bundling is independent of output format.Acceptance signals
agentv skills listreturns the same set of skill names that the marketplace plugin currently exposes (or the explicitly-chosen subset, if subsetting).agentv skills get agentv-benchreturns content byte-equivalent to theagentv-benchSKILL.md at the same release tag.npm i -g agentv@<release>is sufficient to drive evals via skills, with noallagents plugin installstep required.agentv skills get <name> --jsonparses as valid JSON with a stable schema (mirroring agent-browser:{ "success": true, "data": [{ "name", "content", "files"? }] }).hidden: true,--fullcollectingreferences/andtemplates/, and the not-found error path.Non-goals
agentic-engineeringoragentv-claude-traceout of the marketplace. Those plugins are not tightly coupled to the CLI version and the existing model fits them.Docs follow-up (in the same PR)
apps/web/src/content/docs/docs/getting-started/installation.mdxso the "Canonical Setup" isnpm install -g agentvalone. Thenpx allagents plugin marketplace add ...block becomes a "Claude Code plugin" section, not the canonical path.agentv init'sprintSkillFirstInstructions()(inapps/cli/src/commands/init/index.ts) to drop the marketplace step from the recommended setup, or reframe it as the Claude-Code-specific path.plugins/agentv-dev/skills/<name>/SKILL.mdto its discovery-stub form.Related