test: surface-wide guard that no command response emits the internal apple platform by thymikee · Pull Request #1005 · callstack/agent-device

thymikee · 2026-07-01T19:05:12Z

Summary

A surface-wide guard enforcing the Platform-collapse non-breaking contract: NO public command response may emit the internal apple platform token on the wire (approach (b) keeps machine output leaf ios/macos). Three leaks were previously found one-at-a-time (open/perf response, perf-memory support, doctor byPlatform), proving per-site guards are insufficient — so this replaces them with one catalog-driven net.

New file: test/integration/provider-scenarios/apple-platform-output-guard.test.ts (runs in the provider-integration vitest project).

What the guard drives

Stands up a fake-provider daemon for BOTH a macOS Apple session (platform:'apple', appleOs:'macos') and an iOS-simulator Apple session (platform:'apple', iOS family) — no real hardware, permissive runner/tool/recording provider seams so every command reaches a scannable response.
Drives every public command against both worlds and deep-scans each serialized response, flagging a leak when any string VALUE or any object KEY exactly equals 'apple' (case-sensitive exact match — zero false positives on com.apple.Preferences bundle ids or the Apple TV device name).
Covers success responses (priority platform-bearing commands emit the leaf) and key error responses — driving the same set against both worlds naturally produces UNSUPPORTED-on-the-wrong-Apple-OS errors (e.g. shutdown/push on macOS), which are scanned too.
Exercise assertion: each world must emit its public leaf (ios/macos) in at least one response, so a regression that swapped the leaf back to apple trips the leak scan AND drops the leaf.
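For readers who have not opened the test file, the deep scan described above can be sketched roughly as follows. This is a minimal illustration, not the code in apple-platform-output-guard.test.ts: the `Leak` type and the `scanForLeaks` and `scanResponse` names are hypothetical, and the `$`-rooted path format is only assumed from the failure messages quoted in this PR.

```typescript
// Hypothetical shapes and names; the real guard's helpers are not shown in this PR text.
type Leak = { kind: 'value' | 'key'; path: string };

const INTERNAL_TOKEN = 'apple';

// Walk a JSON-like value and record every string VALUE or object KEY that
// exactly equals the internal token (case-sensitive, whole-string match).
function scanForLeaks(node: unknown, path = '$', leaks: Leak[] = []): Leak[] {
  if (typeof node === 'string') {
    if (node === INTERNAL_TOKEN) leaks.push({ kind: 'value', path });
    return leaks;
  }
  if (Array.isArray(node)) {
    node.forEach((item, index) => scanForLeaks(item, `${path}[${index}]`, leaks));
    return leaks;
  }
  if (node !== null && typeof node === 'object') {
    for (const [key, value] of Object.entries(node as Record<string, unknown>)) {
      const childPath = `${path}.${key}`;
      if (key === INTERNAL_TOKEN) leaks.push({ kind: 'key', path: childPath });
      scanForLeaks(value, childPath, leaks);
    }
  }
  return leaks;
}

// Scan the serialized form, since that is what actually goes on the wire.
function scanResponse(response: unknown): Leak[] {
  const wire = JSON.stringify(response);
  return wire === undefined ? [] : scanForLeaks(JSON.parse(wire));
}

// Only the exact token is flagged; the bundle id and device name pass.
const exampleLeaks = scanResponse({
  result: {
    data: { platform: 'apple', bundleId: 'com.apple.Preferences', deviceName: 'Apple TV' },
  },
});
console.log(exampleLeaks); // one 'value' leak at $.result.data.platform
```

Because the comparison is whole-string equality rather than a substring search, strings that merely contain the token are never reported, which is the zero-false-positive property claimed above.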

Auto-covers new commands via the catalog

DRIVEN_COMMANDS and SKIPPED_COMMANDS are keyed by PUBLIC_COMMANDS values. A partition test fails if any catalog command is in neither set (or in both), so a newly added command can't silently escape the guard — the author must categorize it.

Driven: all 50 public commands.
Skipped: 0. The skip-set is intentionally empty (kept as an explicit, enforced set): every command is driveable here — an orchestrator with no real workload (test/replay/batch) just returns a fast, still-scanned error response.
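The partition rule can be illustrated with a small sketch. The three-entry catalog below is a stand-in (the real PUBLIC_COMMANDS has 50 entries), and `partitionProblems` is a hypothetical helper name; the actual test may express the same check differently.

```typescript
// Stand-in catalog; the command names are taken from this PR, the shape is assumed.
const PUBLIC_COMMANDS = { open: 'open', doctor: 'doctor', appstate: 'appstate' } as const;
type PublicCommand = (typeof PUBLIC_COMMANDS)[keyof typeof PUBLIC_COMMANDS];

const DRIVEN_COMMANDS = new Set<PublicCommand>(['open', 'doctor', 'appstate']);
const SKIPPED_COMMANDS = new Set<PublicCommand>([]); // intentionally empty, still enforced

// Report catalog commands that are in neither set, or in both.
function partitionProblems(
  catalog: readonly string[],
  driven: ReadonlySet<string>,
  skipped: ReadonlySet<string>,
): { uncategorized: string[]; duplicated: string[] } {
  return {
    uncategorized: catalog.filter((c) => !driven.has(c) && !skipped.has(c)),
    duplicated: catalog.filter((c) => driven.has(c) && skipped.has(c)),
  };
}

const problems = partitionProblems(
  Object.values(PUBLIC_COMMANDS),
  DRIVEN_COMMANDS,
  SKIPPED_COMMANDS,
);
// A partition test would assert that both lists are empty.
console.log(problems);
```

Adding a fourth command to the catalog without touching either set would surface it under `uncategorized`, which is what forces the author to categorize it.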

Leak caught + fixed

doctor data.platform — src/daemon/handlers/session-doctor.ts:132 echoed the raw internal scope.device?.platform ('apple') when doctor ran against a bound session with no --platform flag. Fixed by projecting the resolved device through publicPlatformString(device); the raw inventory selector fallback (a user-typed --platform value) is preserved, so the existing doctor --platform apple test stays green.

This is a leak PR #1004 does not fix — #1004 touches session-doctor-app/device/options.ts and explicitly leaves session-doctor.ts alone (it reasoned the :111 value is "an internal selector, not output"), but it is output via data.platform. This is exactly the class of miss a surface-wide guard exists to catch. Verified with teeth: reverting the fix makes the guard fail on both worlds with doctor: value at $.result.data.platform; independently reverting appstate's projection fails with appstate: value at $.result.data.platform.
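As a rough illustration of the doctor fix, the projection plus the preserved selector fallback might look like the sketch below. Everything here is an assumption made for illustration: the `Device` shape, the `publicPlatformSketch` stand-in for publicPlatformString, the `doctorPlatform` helper, and the rule that any non-macOS Apple device maps to `ios`. The real code in session-doctor.ts is not reproduced in this PR text.

```typescript
// Hypothetical device shape; only `platform` and `appleOs` are named in the PR text.
type Device = { platform: 'apple' | 'android'; appleOs?: string };
type PublicPlatform = 'ios' | 'macos' | 'android';

// Stand-in for publicPlatformString(device); the real implementation is not shown here.
function publicPlatformSketch(device: Device): PublicPlatform {
  if (device.platform !== 'apple') return device.platform;
  // Assumption: anything that is not macOS belongs to the iOS family.
  return device.appleOs === 'macos' ? 'macos' : 'ios';
}

// Project the resolved device when there is one; otherwise keep the raw,
// user-typed --platform selector untouched.
function doctorPlatform(
  device: Device | undefined,
  selector: string | undefined,
): string | undefined {
  return device ? publicPlatformSketch(device) : selector;
}

console.log(doctorPlatform({ platform: 'apple', appleOs: 'macos' }, undefined)); // macos
console.log(doctorPlatform(undefined, 'apple')); // apple (selector echoed as typed)
```

The second call shows why the existing doctor --platform apple test can stay green: with no resolved device, the value echoed back is the one the user typed.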

Tracked (not duplicated)

doctor byPlatform.apple KEY: fixed by PR #1004, "fix: project doctor output platform to the public leaf after the Platform collapse" (open, not yet on main). Tolerated via one narrow, documented allowlist entry (doctor + kind:'key' + /\.byPlatform\.apple$/), removable once #1004 lands. Not re-fixed here.
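The allowlist entry can be pictured as a three-part match on command, leak kind, and path. The sketch below is illustrative only: the `AllowlistEntry` shape, the `reason` field, and the `unexpectedLeaks` helper are hypothetical, and the example path is an assumed location for the byPlatform key.

```typescript
// Hypothetical shapes; only the three match criteria come from the PR text.
type Leak = { kind: 'value' | 'key'; path: string };
type AllowlistEntry = { command: string; kind: Leak['kind']; path: RegExp; reason: string };

const ALLOWLIST: AllowlistEntry[] = [
  { command: 'doctor', kind: 'key', path: /\.byPlatform\.apple$/, reason: 'tracked by PR #1004' },
];

// Keep only leaks that no allowlist entry tolerates for this command.
function unexpectedLeaks(command: string, leaks: Leak[]): Leak[] {
  return leaks.filter(
    (leak) =>
      !ALLOWLIST.some(
        (entry) =>
          entry.command === command && entry.kind === leak.kind && entry.path.test(leak.path),
      ),
  );
}

const remaining = unexpectedLeaks('doctor', [
  { kind: 'key', path: '$.result.data.byPlatform.apple' }, // tolerated
  { kind: 'value', path: '$.result.data.platform' }, // still reported
]);
console.log(remaining);
```

Requiring all three criteria to match keeps the exemption narrow: the same key under another command, or a value leak in doctor, is still reported.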

Out-of-scan observations (documented, not enforced)

The scan is deliberately scoped to exact apple value/key positions (the shapes of all three known leaks). Two comma-list token positions surfaced during development and are out of scope, noted here for follow-up:

error.data.supportedOn = "apple, android" on UNSUPPORTED errors — derived from the capability-family bucket name, which was already apple before the Platform collapse (commit cd1551b), so it is not a collapse regression and not a device-platform projection.
doctor checks[].command = "agent-device apps --platform apple …" — the apps/close suggestion strings, which PR #1004 ("fix: project doctor output platform to the public leaf after the Platform collapse") already projects to the leaf.

Verification (full CI check set, on the final tree)

tsc -p tsconfig.json — clean
oxlint . --deny-warnings — clean (exit 0)
oxfmt --write src test + --check — all files correctly formatted
layering/check.ts — OK (679 source files)
rslib build — succeeds
vitest run --project provider-integration — 85 passed (82 + 3 new)
vitest run --project unit — 2963 passed
vitest run --coverage — 3048 passed; Statements 84.84% (≥78), Lines 87% (≥80)
fallow audit --base origin/main — ✓ no issues

The android fillAndroid timeout test is a known CPU-contention flake — it flaked during one heavy coverage run and passes 92/92 in isolation; a subsequent coverage run was fully green.

Human-review only — do not merge.

github-actions · 2026-07-01T19:06:40Z

Size Report

Metric	Base	Current	Diff
JS raw	1.5 MB	1.5 MB	+2 B
JS gzip	477.9 kB	477.9 kB	+2 B
npm tarball	578.7 kB	578.7 kB	0 B
npm unpacked	2.0 MB	2.0 MB	+2 B

Startup median (7 runs, lower is better):

Scenario	Base	Current	Diff
CLI --version	26.6 ms	26.6 ms	-0.0 ms
CLI --help	47.4 ms	49.9 ms	+2.6 ms

Top changed chunks:

Chunk	Raw diff	Gzip diff
`dist/src/session.js`	+2 B	+2 B

…e platform

Adds a provider-integration guard (apple-platform-output-guard.test.ts) that stands up a fake-provider daemon for BOTH a macOS Apple session and an iOS-simulator Apple session, drives EVERY public command off PUBLIC_COMMANDS, and deep-scans each serialized response for the internal 'apple' platform token (any string VALUE or object KEY that exactly equals 'apple').

The guard is catalog-driven: a partition test fails if a new public command is neither in DRIVEN_COMMANDS nor SKIPPED_COMMANDS, so a new command can't silently escape the check. All 50 public commands are driven; the skip-set is empty.

Caught + fixed a leak PR #1004 misses: doctor's data.platform (session-doctor.ts) echoed the raw internal device.platform ('apple') when doctor ran against a bound session with no --platform flag. Now projected via publicPlatformString(device).

doctor's byPlatform.apple KEY leak stays tracked to PR #1004 via a narrow, documented allowlist entry (not duplicated here).

github-actions · 2026-07-02T05:19:11Z

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-07-02 05:19 UTC

thymikee force-pushed the guard-apple-output-leaks branch from dfb8c85 to eda2965 Compare July 1, 2026 19:09

thymikee merged commit 3691715 into main Jul 2, 2026
21 checks passed

thymikee deleted the guard-apple-output-leaks branch July 2, 2026 05:18

thymikee merged 1 commit into main from guard-apple-output-leaks
