Commit 705b5b4

Merge pull request #24889 from dvdksn/agent-readiness-audit-skill
feat: add agent readiness audit skill
2 parents 7ebbbdc + ee06a88

13 files changed: 930 additions & 83 deletions

Lines changed: 228 additions & 0 deletions

---
name: agent-readiness-audit
description: >
  Audit a documentation site for agent-friendliness: discovery, markdown
  delivery, crawlability, semantic structure, machine-readable surfaces,
  and content legibility. Use when asked to assess docs.docker.com or any
  docs site for AI/agent readiness, produce a scored report, compare with
  external scanners, or generate a remediation list. Triggers on:
  "audit docs for agent readiness", "how agent-friendly is docs.docker.com",
  "score our docs for AI agents", "review llms.txt / markdown / crawlability",
  "create an agent-readiness remediation plan".
argument-hint: "<base-url>"
---

# Agent Readiness Audit

Audit the live site, not the source tree alone. Prefer the same fetch path
an external agent would use in the wild: direct HTTP requests, sitemap
sampling, and page-level inspection.

Do not reduce the result to a homepage-only scan or a binary checklist.

## 1. Set scope

Use `$ARGUMENTS` as the base URL when provided. Otherwise infer the base
URL from context and state the assumption.

Decide whether the host being audited is:

- a docs-only host
- an app/tool host
- a mixed host

This matters for optional checks such as MCP, plugin manifests, or other
tool discovery files. Do not penalize a docs-only host for missing
tooling manifests that belong on a separate service.

For `docs.docker.com`, treat the public docs host as docs-only. Docker's
MCP server is published separately, so missing MCP files on the docs host
should be reported as `N/A`, not as a failure.

## 2. Gather sitewide signals

Always check these resources first (a manual probe sketch follows the list):

- `/llms.txt`
- `/llms-full.txt`
- `/robots.txt`
- `/sitemap.xml`
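
To spot-check these by hand before running the bundled script, a minimal
probe sketch (the `BASE` value is only an example target):

```bash
BASE="https://docs.docker.com"   # example host; substitute the audit target
for path in /llms.txt /llms-full.txt /robots.txt /sitemap.xml; do
  printf '%-15s ' "$path"
  curl -sSL -o /dev/null \
    -w '%{http_code} %{content_type} -> %{url_effective}\n' \
    "$BASE$path"
done
```
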
Only check host-level tool manifests when the host is an app/tool host,
mixed host, or explicitly advertises them:

- `/.well-known/ai-plugin.json`
- `/.well-known/agent.json`
- `/.well-known/agents.json`

Use the bundled script for a baseline:

```bash
bash .agents/skills/agent-readiness-audit/scripts/baseline-probes.sh \
  "$ARGUMENTS"
```

The script produces baseline evidence only. You still need to interpret
what matters for a docs property and score it with the rubric.

For docs-only hosts, you may skip tool-manifest probes to reduce noise:

```bash
CHECK_TOOL_MANIFESTS=0 \
  bash .agents/skills/agent-readiness-audit/scripts/baseline-probes.sh \
  "$ARGUMENTS"
```

## 3. Sample representative pages

Use the sitemap when available. Do not rely on the homepage alone. A
sampling sketch follows at the end of this section.

If `llms.txt` exists, sample some URLs from it as well. This helps catch
stale or misleading discovery surfaces that a sitemap-only sample would miss.

Sample at least 12 pages when the site is large enough, and cover multiple
page types:

- homepage or docs landing page
- section landing pages
- task guides
- product manuals
- reference or API pages
- tutorial or learning pages

If the sitemap is missing or unusable, discover pages through internal
links and note the lower confidence.

If the site has distinct delivery patterns, sample each one. For example:

- normal content pages
- generated reference pages
- versioned docs
- localized docs
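
A minimal sampling sketch, assuming a flat `sitemap.xml` (a sitemap index
needs one extra fetch per child sitemap) and GNU `shuf`:

```bash
BASE="https://docs.docker.com"   # example host; substitute the audit target
curl -sL "$BASE/sitemap.xml" |
  grep -o '<loc>[^<]*</loc>' |
  sed 's/<[^>]*>//g' |
  shuf -n 12   # swap the random sample for hand-picked URLs per page type
```
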
## 4. Run fetch-path checks on each sample

For each sampled page, verify (a probe sketch follows this list):

- HTML fetch status, content type, and final URL
- `Accept: text/markdown` behavior
- direct markdown route behavior such as `<page>.md` or another stable path
- page-level markdown alternate links and whether they actually resolve
- whether page actions such as "Open Markdown" agree with the working route
- whether the HTML title or H1 matches the markdown H1 closely enough for
  retrieval parity
- whether main content is present in the initial HTML
- redirect chain length and canonical URL consistency
- obvious chrome/noise in the markdown response
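
A per-page probe sketch with `curl`; the `.md` suffix route here is an
assumption to verify against the actual site, not a given:

```bash
URL="https://docs.docker.com/get-started/"   # one sampled page (example)

# HTML fetch: status, content type, and final URL after redirects
curl -sSL -o /dev/null \
  -w 'html:       %{http_code} %{content_type} -> %{url_effective}\n' "$URL"

# Content negotiation: does Accept: text/markdown change the response?
curl -sSL -o /dev/null -H 'Accept: text/markdown' \
  -w 'negotiated: %{http_code} %{content_type}\n' "$URL"

# Direct markdown route, if the site exposes one in this shape
curl -sSL -o /dev/null \
  -w 'direct md:  %{http_code} %{content_type}\n' "${URL%/}.md"

# Advertised markdown alternates in the page head; resolve each one found
curl -sL "$URL" | grep -io '<link[^>]*text/markdown[^>]*>'
```
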
Do not assume a `.md` mirror exists just because another site uses one.
Verify the actual markdown path the site exposes.

Treat these as separate signals:

- negotiated markdown works
- a stable direct markdown URL works
- the page advertises the correct markdown URL

If the page advertises dead markdown alternates but a working markdown route
exists, do not fail markdown delivery outright. Score it as a discoverability
and consistency problem instead.

For API or generated reference pages, also verify whether a machine-readable
asset such as OpenAPI YAML is directly linked and fetchable.
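
A quick resolvability check for a linked spec (the URL below is a
placeholder for whatever the page actually advertises):

```bash
SPEC_URL="https://docs.example.com/reference/api/openapi.yaml"  # placeholder
curl -sSL -o /dev/null \
  -w '%{http_code} %{content_type} %{size_download} bytes\n' "$SPEC_URL"
```
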
## 5. Judge structure and legibility

Measure structural signals (a probe sketch follows this list):

- exactly one `h1`
- sane heading hierarchy
- `main` and `article` presence where appropriate
- canonical tags
- JSON-LD or breadcrumb structured data
- stable anchors and deep-linkable headings
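
A rough probe for these signals, assuming server-rendered HTML; treat the
counts as evidence to inspect, not as the score itself:

```bash
URL="https://docs.docker.com/get-started/"   # one sampled page (example)
html=$(curl -sL "$URL")

# grep -c counts matching lines, which is close enough for a rough signal
printf 'h1 tags:   %s\n' "$(printf '%s' "$html" | grep -oi '<h1[ >]' | wc -l)"
printf 'main:      %s\n' "$(printf '%s' "$html" | grep -ci '<main')"
printf 'article:   %s\n' "$(printf '%s' "$html" | grep -ci '<article')"
printf 'canonical: %s\n' "$(printf '%s' "$html" | grep -ci 'rel="canonical"')"
printf 'json-ld:   %s\n' "$(printf '%s' "$html" | grep -ci 'application/ld+json')"
```
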
Also make a qualitative judgment about agent legibility:

- markdown strips site chrome cleanly
- headings are specific and task-oriented
- code blocks stay intelligible without client-side JS
- the page is not dominated by banners, injected chat, or nav noise

Measure code block labeling explicitly when code samples are common. A page
type with many untagged fenced blocks should lose points even if the prose is
otherwise clean.
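
One way to measure fence labeling in a fetched markdown response, assuming
triple-backtick fences where closing fences are bare:

```bash
URL="https://docs.docker.com/get-started/"   # one sampled page (example)
md=$(curl -sL -H 'Accept: text/markdown' "$URL")   # or the direct .md route

fences=$(printf '%s\n' "$md" | grep -c '^```')             # openers + closers
tagged=$(printf '%s\n' "$md" | grep -cE '^```[[:alnum:]]') # tagged openers
echo "approx code blocks: $((fences / 2)), language-tagged: $tagged"
```
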
For page types that intentionally render interactive UIs with JavaScript,
judge them separately from normal docs pages. If the HTML shell is thin,
check whether the page still provides (a crude check follows this list):

- a fetchable markdown summary
- a directly linked machine-readable asset
- a usable non-JS fallback
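
A crude thin-shell check; `sed` tag-stripping is approximate and inline
scripts inflate the count, so eyeball the output as well:

```bash
URL="https://docs.example.com/interactive-page"   # placeholder JS-heavy page
curl -sL "$URL" |
  sed 's/<[^>]*>//g' |   # strip tags; approximate, not a real HTML parser
  tr -s '[:space:]' ' ' |
  wc -c                  # a very small count suggests a JS-only shell
```
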
## 6. Score with the rubric

Use [references/rubric.md](references/rubric.md).

Rules:

- score only what you verified
- mark non-applicable checks as `N/A`
- normalize the final score against applicable points only (worked example
  below)
- do not let optional manifest checks dominate the grade
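
For example, with hypothetical numbers: if 12 of 100 rubric points are `N/A`
on a docs-only host and the site earns 66 of the remaining 88, report
66/88 = 75/100, not 66/100.
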
Apply the foundational caps from the rubric. A site with broken discovery
or broken markdown delivery should not earn a high grade because it has
clean metadata.

Do not average away a weak page type. If one major page type, such as API
reference, is materially worse than the rest of the corpus, call it out as
the weakest segment and reflect it in the category notes.

## 7. Compare with external scanners when useful

If external scanner results are available, compare them to your live
findings. Treat them as secondary evidence.

If a scanner and the live fetch disagree:

- trust the live fetch
- report the mismatch explicitly
- explain whether the scanner is testing a different assumption

## 8. Produce a remediation list

Turn findings into a short backlog:

- `P0`: fetchability or discovery blockers
- `P1`: recurring structural or parity issues
- `P2`: polish, optional manifests, or low-impact enhancements

For each remediation, include:

- the failing signal
- why it matters to agents
- a concrete fix
- whether it is sitewide or page-type-specific

## 9. Report in a stable format

Use [references/report-template.md](references/report-template.md).

Always include:

- overall score and grade
- confidence level
- sampled URLs or sample strategy
- category scores
- highest-priority findings
- remediation backlog

## Notes

- Favor docs-delivery checks over marketing-site heuristics.
- Do not fail a docs host for lacking MCP or plugin manifests unless the
  host itself is meant to expose tools.
- Treat raw byte size as supporting evidence, not as a primary scoring input.
- Prefer short evidence excerpts and commands over long copied page text.

Lines changed: 63 additions & 0 deletions

# Agent Readiness Report Template

Use this structure for final audit output.

```markdown
## Agent Readiness Audit

**Site:** <base-url>
**Date:** <YYYY-MM-DD>
**Overall score:** <score>/100
**Grade:** <A-F>
**Confidence:** <High|Medium|Low>

### Summary

<2-4 sentence verdict focused on what an external agent can actually
discover, fetch, and interpret on this site.>

### Category Scores

| Category | Score | Notes |
| --- | ---: | --- |
| Discovery and policy | <x>/<y> | <short note> |
| Retrieval and markdown delivery | <x>/<y> | <short note> |
| Structure and semantics | <x>/<y> | <short note> |
| Crawlability and delivery behavior | <x>/<y> | <short note> |
| Machine-readable surfaces | <x>/<y> | <short note or N/A> |
| Content legibility | <x>/<y> | <short note> |

### Sample

- Sample strategy: <sitemap / internal links / explicit URLs>
- Sampled pages: <count>
- Page types covered: <landing, guide, manual, reference, ...>
- Weakest page type: <if any>

### Findings

- `P0`: <highest-priority blocker with evidence>
- `P1`: <important recurring issue with evidence>
- `P2`: <lower-priority or optional improvement>

### Remediation

- `P0`: <fix>, because <why it matters to agents>
- `P1`: <fix>, because <why it matters to agents>
- `P2`: <fix>, because <why it matters to agents>

### Evidence

- Sitewide checks: <llms.txt, robots.txt, sitemap.xml, manifests>
- Fetch-path checks: <markdown negotiation, direct markdown routes,
  advertised alternates, parity>
- Structural checks: <h1/main/article/canonical/json-ld/title-h1 parity>
- Code block checks: <fence count, language-tag coverage>
- Scanner comparison: <optional>
```

## Notes

- Keep the summary short and outcome-oriented.
- Findings should refer to concrete URLs or page types.
- If a criterion is `N/A`, say why instead of leaving it blank.
