ci: get Hub-Client E2E Playwright Tests running by gordonwoodhull · Pull Request #172 · quarto-dev/q2

gordonwoodhull · 2026-05-11T02:08:02Z

Get E2E Playwright tests running.
Enable them for PRs.

Summary

The Hub-Client E2E Tests workflow had never actually executed on a runner since it was added — every push to main queued for the 24-hour Actions maximum and was cancelled because runs-on: ubuntu-latest-8x isn't a runner label this repo can request. This PR gets the workflow running and green on a stock ubuntu-latest runner.

Three commits, deliberately split by concern:

- "ci: enable Hub-Client E2E Tests workflow" covers every fix needed in .github/workflows/hub-client-e2e.yml: runner label, build ordering, WASM tooling (wasm-bindgen-cli pinned to Cargo.lock, rust-src component, nightly pin to dodge a rustc SIGSEGV on tokio for wasm32), dtolnay action ref form, TS build scoping, hub binary pre-build, and a fix for the broken baseline auto-commit step (glob + permissions).
- "test(e2e): make hub-client e2e suite pass on CI" covers every fix needed in the test code that the workflow surfaced once it could actually run: a TS2345 in the vitest mock helper, an indexedDB shim in the Playwright Node-side helper, a Vite proxy env to point /auth/* at the e2e hub's port, a testIgnore for visual specs so they only run via the dedicated config, cutting CI workers from 4 → 2 to match the 2-core runner, and a SKIP_WASM_UNSUPPORTED entry for the one fixture that exercises a real WASM gap.
- "Add missing Playwright visual regression baselines" adds 6 auto-generated chromium-linux baselines committed back by the workflow itself on its first successful run; kept as a separate commit by github-actions[bot] to preserve provenance.
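
As a rough illustration of the test-side config changes named in the second commit (the `testIgnore` for visual specs, the 4 → 2 worker cut, and the proxy env), a functional Playwright config might look something like the sketch below. `testDir`, `testIgnore`, `workers`, and `webServer.env` are standard Playwright options; the function name `makeFunctionalConfig`, the glob, the `command`, and the `url` are assumptions for illustration and are not taken from the real `playwright.config.ts`.

```typescript
// Hypothetical sketch, not the repository's actual playwright.config.ts.
// In a real config file the returned object would be the default export.
function makeFunctionalConfig(isCI: boolean) {
  return {
    testDir: './e2e',
    // Keep visual specs out of the functional run; they belong to the
    // dedicated visual config. (The glob is an assumption.)
    testIgnore: '**/*.visual.spec.ts',
    // 2 workers on CI to match the 2-core runner; undefined keeps
    // Playwright's default locally.
    workers: isCI ? 2 : undefined,
    webServer: {
      command: 'npm run dev',       // assumed command
      url: 'http://localhost:5173', // assumed dev-server URL
      // Point the Vite proxy at the e2e hub's port instead of the default 3000.
      env: { VITE_HUB_SERVER: 'http://localhost:3030' },
    },
  };
}

const ciConfig = makeFunctionalConfig(true);
console.log(ciConfig.workers, ciConfig.testIgnore);
```

Taking `isCI` as a parameter rather than reading the environment directly keeps the sketch free of Node-specific globals; a real config would more likely branch on `process.env.CI`.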

The remaining WASM gap is bd-izfv (Phase 9 follow-up: thread user_grammars through RenderToHtmlRenderer). A complete TDD plan for that fix is already on branch beads/bd-izfv-thread-user-grammars. When bd-izfv lands, the SKIP_WASM_UNSUPPORTED entry added in this PR can be removed and the user-grammar fixture will run; Phase 5 of that plan explicitly calls out the restore step.
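
For readers unfamiliar with the pattern, a skip list of this kind could be sketched as below. This is only a guess at the shape: the fixture path and the bd-izfv reference come from the commit message, while the use of a `Map`, the key format, and the helper name `wasmSkipReason` are assumptions rather than the contents of smokeAllDiscovery.ts.

```typescript
// Hypothetical sketch of a skip map; the real smokeAllDiscovery.ts is not shown here.
// Keys are fixture paths (assumed relative to the smoke-all root); values record why.
const SKIP_WASM_UNSUPPORTED = new Map<string, string>([
  [
    'highlighting/03-user-grammar/03-user-grammar-toml.qmd',
    'bd-izfv: user_grammars not threaded through RenderToHtmlRenderer',
  ],
]);

// Returns the skip reason for a fixture, or undefined if it should run.
function wasmSkipReason(fixturePath: string): string | undefined {
  return SKIP_WASM_UNSUPPORTED.get(fixturePath);
}

console.log(wasmSkipReason('highlighting/03-user-grammar/03-user-grammar-toml.qmd'));
```

Keeping the reason next to the path means removing the entry when bd-izfv lands is a one-line change that is easy to grep for.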

Filed during this work:

- bd-1jnb (P2): q2-demos/* vite build fails resolving /src/wasm-js-bridge/cache.js; the workflow scopes the TS build to skip them.
- bd-233j (P3): noisy [TAG_RESOLVE_FAILED] YAMLWarning from !str tags in playwright stdout; cosmetic, not a failure cause.

Test plan

- Successful CI run: https://github.com/quarto-dev/q2/actions/runs/25645506576 (conclusion success, 76 tests passed, 6 baselines auto-committed)
- Local repro of the workers-contention failure (4 workers reproduced random preview-iframe timeouts on macOS; dropping to 2 cleared them)
- Single-test local pass via `npx playwright test --grep "01-builtin-python"` in the worktree (5.7s)
- `git diff origin/main HEAD` reviewed; the 3-commit history is lossless against the pre-squash branch

gordonwoodhull · 2026-05-11T02:38:23Z

Two additions after the initial PR:

- pull_request trigger on hub-client-e2e.yml (folded into the first commit). PRs that touch hub-client/** or the workflow file now run the e2e suite automatically, so this very PR will prove itself green before merge. Same path filter as the existing `push: branches: [main]` trigger.
- Cherry-picked ts-test-suite.yml nightly pin (new commit af0eec95, originally aa3e9ad8 on feature/q2-preview). The pre-existing TS Test Suite workflow was failing on this PR with the same rustc SIGSEGV on tokio for wasm32 that we already worked around for the e2e workflow; picking up the same fix keeps the matrix legs green. If feature/q2-preview lands first, this commit will deduplicate out (or be dropped during the eventual rebase).

The branch is now four commits, with all .github/workflows/ changes grouped at the start of the stack and the bot's baselines commit kept separate at the tip.

PR #172's first pull_request-triggered run hit a worker-contention timeout in metadata/format-specific/doc-format-overrides.qmd — the preview iframe didn't render within 45s on any of the 3 attempts. Same shape as the failures we saw with 4 workers (cleared by dropping to 2), but now appearing at 2 workers because the runner happened to be slower this time. 8 other smoke-all tests passed only on retry, i.e. they were right at the edge. Bump waitForPreviewRender from 45s to 75s, and the per-test ceiling from 60s to 90s so the wider preview wait can actually fire. Fast tests still pass fast; slower fixtures under contention get the breathing room they need.
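
The reason both numbers have to move together can be checked with a little arithmetic. The sketch below uses the timeout values quoted above; the object and function names are illustrative only and do not correspond to identifiers in the test suite.

```typescript
// Values taken from the text above; the names are illustrative only.
const OLD_TIMEOUTS = { previewWaitMs: 45000, testTimeoutMs: 60000 };
const NEW_TIMEOUTS = { previewWaitMs: 75000, testTimeoutMs: 90000 };

// Time left in the per-test budget after a worst-case preview wait.
function headroomMs(t: { previewWaitMs: number; testTimeoutMs: number }): number {
  return t.testTimeoutMs - t.previewWaitMs;
}

// Raising only the preview wait would leave it above the old per-test ceiling,
// so the test-level timeout would fire first and the wider wait would never be used.
const previewOnlyBump = headroomMs({
  previewWaitMs: NEW_TIMEOUTS.previewWaitMs,
  testTimeoutMs: OLD_TIMEOUTS.testTimeoutMs,
});

console.log(headroomMs(OLD_TIMEOUTS), headroomMs(NEW_TIMEOUTS), previewOnlyBump);
```

Both the old and the new pair leave the same 15s for everything that is not the preview wait, which is why fast tests are unaffected.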

5-phase plan to migrate hub-client e2e from `vite dev` to `vite preview` as the real fix for the worker-contention flakes the previous commit papered over with a 75s preview-render timeout.

Key motivating numbers (measured on the local dist/):

- 32 MB wasm_quarto_hub_client_bg.wasm
- 1.8 MB automerge_wasm_bg.wasm
- 192 KB web-tree-sitter.wasm
- ~5 MB dart-sass dynamic-import bundle
- ~3 MB Monaco editor chunks

Total: ~42 MB+ of binary assets served per fresh Playwright browser context.

`vite dev` serves these uncompressed through a single-threaded plugin pipeline that also has to transform-on-demand the hundreds of TS/JSX modules in the hub-client source tree. With 2 Playwright workers contending for one dev server on a 2-core ubuntu-latest runner, the cold-context page-load tail dominates and randomly exceeds 45s. `vite preview` serves the same content from a prebuilt, hash-named, gzip-compressible dist/ directory with no transform pipeline, which should cut the ~50 MB of wire traffic by 3-4x and remove the serialization point entirely.

Target: e2e workflow under 12 min, Run E2E tests step under 5 min, flaky count ≤ 2, zero hard failures across 3 consecutive runs.

Plan is self-contained, scoped to this branch, validated by pushing back to PR #172.
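
The quoted total can be reproduced from the individual sizes. This is only a back-of-the-envelope check: the sizes are the ones listed in the plan, 192 KB is taken as 0.192 MB, and the "~" figures are treated as exact.

```typescript
// Sizes in MB as quoted in the plan (192 KB taken as 0.192 MB).
const assetSizesMB: Record<string, number> = {
  'wasm_quarto_hub_client_bg.wasm': 32,
  'automerge_wasm_bg.wasm': 1.8,
  'web-tree-sitter.wasm': 0.192,
  'dart-sass dynamic-import bundle': 5, // approximate in the source
  'Monaco editor chunks': 3,            // approximate in the source
};

// Sum of all listed assets, and the share taken by the main WASM module.
const totalMB = Object.values(assetSizesMB).reduce((sum, mb) => sum + mb, 0);
const wasmShare = assetSizesMB['wasm_quarto_hub_client_bg.wasm'] / totalMB;

console.log(totalMB.toFixed(1) + ' MB total, main WASM share ' + (wasmShare * 100).toFixed(0) + '%');
```

The sum comes out at roughly 42 MB, with the main WASM module accounting for about three quarters of it.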

The workflow had never actually run on a runner: every push to main since it was added queued for the 24-hour GitHub Actions maximum and was cancelled because `runs-on: ubuntu-latest-8x` isn't a runner label this repo can request. Switching to `ubuntu-latest` unblocked everything else; from there the workflow itself needed several layers of fixes to actually pass.

Workflow fixes (hub-client-e2e.yml, with parallel changes to ts-test-suite.yml where applicable):

- Runner label: ubuntu-latest-8x → ubuntu-latest.
- Build order: build the WASM module before the TypeScript packages, since hub-client's vite build imports the WASM JS glue.
- WASM tooling: install wasm-bindgen-cli pinned to the version in Cargo.lock, and add the rust-src component (required by -Zbuild-std=std,panic_unwind in the wasm-quarto-hub-client crate's .cargo/config.toml).
- Rust nightly: invoke dtolnay/rust-toolchain@master with no `toolchain:` input so the action reads the pin from rust-toolchain.toml (bd-at72, nightly-2026-04-28 from main). Single source of truth, with no duplicate `RUSTUP_TOOLCHAIN` env var.
- dtolnay/rust-toolchain action: dated nightlies need @master with a toolchain input, not @nightly-YYYY-MM-DD as a ref.
- TypeScript build scoping: only build ts-packages + hub-client. The q2-demos workspaces fail their vite build (vite can't resolve an absolute import in the wasm-bindgen output) and trace-viewer is out of scope. Tracked as bd-1jnb.
- Pre-build the hub binary: globalSetup launches it with `cargo run --bin hub` and waits 120s for "Hub server listening"; on a cold runner the cargo compile exceeded that.
- Baseline-commit step: switch the broken `**` glob to `find ... -name '*-snapshots' -exec git add -f {} +` (bash globstar isn't enabled by default; visual specs sit directly under hub-client/e2e/, not one directory deeper) and grant `contents: write` so the default GITHUB_TOKEN can push the auto-generated baselines back to the branch.
- Add a `pull_request` trigger with the same path filter as `push`, so PRs that touch hub-client or the workflow file have to prove themselves green before merging.

Test-side bugs that surfaced once the workflow could actually run:

- TS2345 in client.test.ts: installMockRepo used `ReturnType<typeof createMockHandle>` without supplying T, so the parameter defaulted to unknown. Forwarding T through makes the helper generic and the test compiles again.
- Node-side IndexedDB shim: projectFactory.ts runs in the Playwright test process (Node, not browser) but createSyncClient instantiates an IndexedDBStorageAdapter unconditionally. Import fake-indexeddb/auto, matching how sync-test-harness already handles vitest.
- Vite proxy target: hub-client's vite.config.ts proxies /auth/* and the websocket to VITE_HUB_SERVER (default http://localhost:3000), but globalSetup starts the e2e hub on port 3030, so vite returned HTTP 500 from /auth/me on every test, blocking the in-browser hub-client from rendering the preview iframe. Pass VITE_HUB_SERVER=http://localhost:3030 to the webServer env.
- Functional config picks up visual specs: testDir './e2e' with no testIgnore was finding setup-screens.visual.spec.ts and running it through the functional config, which has no missing-baseline retry. Skip *.visual.spec.ts so those run only via playwright.visual.config.ts.
- CI workers: drop from 4 to 2. The 4-workers value was sized for the original ubuntu-latest-8x; under ubuntu-latest (2 cores) random tests stalled in the WASM render pipeline and missed the 45s preview-iframe deadline non-deterministically. Reproduced locally with 4 workers (a different test failed each run); 2 workers cleared all flakes.
- Hub-client WASM gap (bd-izfv): the project-render path (RenderToHtmlRenderer) drops the user_grammars provider on the floor, so the smoke-all fixture highlighting/03-user-grammar/03-user-grammar-toml.qmd renders a bare <code> block instead of highlighted TOML. Add a SKIP_WASM_UNSUPPORTED map in smokeAllDiscovery.ts pointing at bd-izfv (which has a complete TDD plan on branch beads/bd-izfv-thread-user-grammars).

Also adds the missing Playwright visual regression baselines under hub-client/e2e/ that the visual config expects on first run.
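
The TS2345 fix described above (forwarding T instead of relying on `ReturnType<typeof createMockHandle>`) can be illustrated with a minimal sketch. Only the names `createMockHandle` and `installMockRepo` come from the commit message; the `MockHandle` interface, its members, and the repo shape are invented here for illustration and will differ from the real helper in client.test.ts.

```typescript
// Hypothetical handle shape; the real mock in client.test.ts is not shown in this PR text.
interface MockHandle<T> {
  doc: () => T;
  change: (fn: (doc: T) => void) => void;
}

function createMockHandle<T>(initial: T): MockHandle<T> {
  const current = initial;
  return {
    doc: () => current,
    // Mutates the document in place, roughly like an Automerge-style change callback.
    change: (fn) => { fn(current); },
  };
}

// Before (sketch): `handle: ReturnType<typeof createMockHandle>` collapses T to unknown.
// After: forward T so the caller's document type survives the helper.
function installMockRepo<T>(handle: MockHandle<T>): { find: () => MockHandle<T> } {
  return { find: () => handle };
}

const handle = createMockHandle({ title: 'index.qmd' });
const repo = installMockRepo(handle);
// `title` is typed as string here because T was inferred as { title: string }.
const title: string = repo.find().doc().title;
console.log(title);
```

With the generic parameter forwarded, callers get their own document type back from `find()` without casts.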

Cuts wall time and removes flakes by replacing the dev-mode hub-client server with a prebuilt bundle served through `vite preview`.

The win is mostly mechanical: a cold Playwright context downloads ~50 MB of assets (WASM dominates at 32 MB). Through `vite dev` that goes uncompressed and serializes through a single-threaded transform pipeline; through `vite preview` with a gzip middleware it's ~5.6 MB on the wire and served as static files. Under 2 Playwright workers on a 2-core CI runner this was the source of the "preview iframe didn't render in 45s" flakes that the previous commit (75s timeout bump) papered over.

CI on chore/e2e-ci across 3 consecutive runs: 0 hard failures, Run E2E tests step 5.3-8.1 min (was 14.1 min, ~42% faster), flakes 3-7 (was 12). Local smoke-all: 1.9 min / 1 flake → 1.1 min / 0 flakes.

See `claude-notes/plans/2026-05-11-vite-preview-for-e2e-tests.md` for the 5-phase plan and motivating numbers.

Major pieces

- hub-client/vite.config.ts: mirror `server.proxy` into `preview.proxy` (preview ignores `server.proxy`). Add a small `configurePreviewServer` plugin that runs `compression()` middleware; override its filter so `application/wasm` is included (mime-db marks it non-compressible by default; in practice it gzips ~6:1).
- hub-client/playwright.config.ts: `webServer.command` → `vite preview`; comment why it's not `vite dev` so the next agent doesn't "helpfully" revert it for HMR.
- hub-client/ast-renderer.html: moved from `public/` to project root. When it lived in `public/` the build emitted two copies: the transformed one at `dist/public/ast-renderer.html` and a raw copy at `dist/ast-renderer.html` (with a dev-only `<script src="/src/...tsx">` reference). The iframe `src="/ast-renderer.html"` hit the raw one in preview mode and the q2-debug E2E test broke. This was a latent prod bug; `vite dev` happened to mask it via source-path resolution.
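
The filter override for `application/wasm` described above boils down to a small predicate. The sketch below isolates that decision so it can be read on its own; `shouldCompress` is a hypothetical name, and the wiring shown in the trailing comment is an assumption about how it might be hooked up, not the code in hub-client/vite.config.ts.

```typescript
// Decide whether a response should be gzipped. `defaultDecision` stands in for
// whatever the compression middleware's stock filter would have returned.
function shouldCompress(contentType: string | undefined, defaultDecision: boolean): boolean {
  // Force-include WASM, which the stock filter skips.
  if (contentType !== undefined && contentType.startsWith('application/wasm')) {
    return true;
  }
  return defaultDecision;
}

console.log(shouldCompress('application/wasm', false), shouldCompress('image/png', false));

// Assumed wiring inside a Vite plugin's configurePreviewServer hook (not run here):
//   server.middlewares.use(compression({
//     filter: (req, res) =>
//       shouldCompress(String(res.getHeader('Content-Type') ?? ''), compression.filter(req, res)),
//   }));
```

Deferring to the stock filter for everything except WASM keeps already-compressed formats such as images from being gzipped a second time.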
Test-hook plumbing

- src/test-hooks.ts (new): registers `window.__quartoTest = { projectStorage, wasmRenderer }`. Tree-shaken out of any build without `VITE_E2E=1`.
- src/main.tsx: kicks off `import('./test-hooks')` and stores the promise on `window.__quartoTestReady`. Top-level `await` here doesn't help (the `load` event fires before module top-level awaits resolve), so tests `await window.__quartoTestReady` before reading the hooks.
- e2e/helpers/testHooks.ts (new): typed global augmentation.
- e2e/helpers/previewExtraction.ts, projectFactory.ts, share-link-project-set.spec.ts: 5 `await import('/src/services/...ts')` call sites replaced with `await window.__quartoTestReady` + `window.__quartoTest.{projectStorage,wasmRenderer}`. The dev-only source-path imports stopped working under `vite preview` (prod bundles don't expose source paths).
- package.json: `test:e2e[:ui]` scripts now set `VITE_E2E=1` and build before running playwright (since preview serves from `dist/`).
- .github/workflows/hub-client-e2e.yml: `VITE_E2E: '1'` on the Build TypeScript packages step.

Side effects worth knowing

- Running `npx playwright test` directly (bypassing `npm run test:e2e`) serves whatever was last built into `dist/` (possibly stale). The npm-script path always rebuilds.
- New devDeps `compression` + `@types/compression` (83 transitive packages, dev-only, no shipped code). No production runtime change.
- `vite dev`, `vite build` defaults, and the production user bundle are untouched: the compression plugin only fires under `vite preview`, and test-hooks is dead-code-eliminated without `VITE_E2E=1`.
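
The `__quartoTestReady` handshake from the test-hook plumbing can be sketched without a browser. In the sketch below a plain object stands in for `window`, a string parameter stands in for the `VITE_E2E` build flag, and `installTestHooks` / `readHooks` are hypothetical names; the real code in src/main.tsx and the e2e helpers is not reproduced here.

```typescript
// Hypothetical shapes; the real services are not shown in the PR text.
interface QuartoTestHooks {
  projectStorage: unknown;
  wasmRenderer: unknown;
}

// Stands in for the `window` properties described above.
interface TestHookTarget {
  __quartoTest?: QuartoTestHooks;
  __quartoTestReady?: Promise<void>;
}

// App side: start loading the hooks and publish a promise tests can wait on.
function installTestHooks(
  target: TestHookTarget,
  e2eFlag: string | undefined,
  loadHooks: () => Promise<QuartoTestHooks>,
): void {
  if (e2eFlag !== '1') return; // stands in for the VITE_E2E build-time gate
  target.__quartoTestReady = loadHooks().then((hooks) => {
    target.__quartoTest = hooks;
  });
}

// Test side: wait for readiness before touching the hooks.
async function readHooks(target: TestHookTarget): Promise<QuartoTestHooks> {
  await target.__quartoTestReady;
  if (!target.__quartoTest) {
    throw new Error('test hooks not registered; was the bundle built with VITE_E2E=1?');
  }
  return target.__quartoTest;
}

const fakeWindow: TestHookTarget = {};
installTestHooks(fakeWindow, '1', async () => ({ projectStorage: {}, wasmRenderer: {} }));
readHooks(fakeWindow).then((hooks) => console.log(Object.keys(hooks)));
```

Publishing a promise rather than the hooks themselves gives the tests something definite to await, which is the point of storing it on `window.__quartoTestReady`.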

gordonwoodhull force-pushed the chore/e2e-ci branch from b30144c to a40b38e Compare May 11, 2026 02:37

gordonwoodhull changed the title ~~ci: get Hub-Client E2E Tests running in CI for the first time~~ ci: get Hub-Client E2E Tests running in CI May 11, 2026

gordonwoodhull changed the title ~~ci: get Hub-Client E2E Tests running in CI~~ ci: get Hub-Client E2E Playwright Tests running in CI May 11, 2026

gordonwoodhull changed the title ~~ci: get Hub-Client E2E Playwright Tests running in CI~~ ci: get Hub-Client E2E Playwright Tests running May 11, 2026

gordonwoodhull force-pushed the chore/e2e-ci branch from b6fed4a to 685a9cb Compare May 12, 2026 13:56

gordonwoodhull added 3 commits May 12, 2026 11:53

gordonwoodhull force-pushed the chore/e2e-ci branch from 685a9cb to dc9fcd0 Compare May 12, 2026 15:53

gordonwoodhull merged commit f514cde into main May 12, 2026
5 checks passed

gordonwoodhull deleted the chore/e2e-ci branch May 12, 2026 16:21

gordonwoodhull merged 3 commits into main from chore/e2e-ci
