feat(deploy): added deploy wizard#260
Conversation
🦋 Changeset detectedLatest commit: fa5821c The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
!snapshot |
Snapshot publishednpm install -g clerk@1.3.1-snapshot.0dd30be
|
|
Stack: wyattjoh/deploy
Part of a stacked-prs chain. Do not merge manually. |
93d7a24 to
bf05f3e
Compare
bf05f3e to
ffb869f
Compare
a3d8b46 to
7e6819c
Compare
ffb869f to
162f269
Compare
- Preserve completed providers when pausing OAuth setup mid-loop, so `clerk deploy --continue` can finish multi-provider stacks. - Surface a warning for OAuth providers enabled in dev that the wizard does not yet support, instead of silently skipping them. - Close the gutter as Paused (not Failed) when DNS verification times out, since the state is recoverable via --continue. - Tighten the production-domain regex to reject malformed inputs like example..com or example-.com before they reach the API.
Move deploy lifecycle endpoint wrappers into the shared PLAPI client while routing the deploy wizard through a command-local adapter that defaults to mocked operations until the backend endpoints are ready.
Switch the deploy command from the in-process mock lifecycle to the live PLAPI endpoints. Add a typed error mapper that translates known PLAPI failures (plan_insufficient, home_url_taken, ssl_retry_throttled, etc.) into CliError with stable codes, with a recovery path for the production_instance_exists case so the wizard re-derives state instead of surfacing the error. Surface per-component progress (DNS / SSL / mail) during deploy_status polling, and drop the four hidden --test-fail-* CLI options now that failure injection is routed through the test's module mock. Collapse the api/mock indirection layer to a thin re-export of the plapi endpoints + a no-op configureMockDeployApi stub for the test seam, and strip the dead mockDeployApi implementation that the indirection used to back.
Swap the "Continue to OAuth setup?" yes/no prompt during initial production setup for the same verify/skip select used when resuming. Choosing skip records DNS as pending and continues to OAuth instead of pausing the deploy, so the dashboard remains the single place to monitor propagation.
- Tolerate getDeployStatus failures inside the reconcile-path snapshot read so the deploy continues to the verify-or-skip prompt instead of failing before the user can react. - Split the snapshot fetch into separate "Reading development configuration" and "Reading production configuration" spinners so the gutter reflects what is actually being loaded. - Add triggerDomainDnsCheck (POST .../dns_check) and call it best-effort when the user picks "Check DNS now" so an active check job is kicked rather than waiting on background reconciliation. - Type the dns_ok/ssl_ok/mail_ok booleans on DeployStatusResponse and use them to name the specific pending component in the timeout warning.
Switch deploy command imports back to lib/plapi.ts directly and remove the ./mock.ts harness plus the --test-force-* / --test-fail-* CLI flags it backed. The wrappers were added for an earlier mockable-deploy-API experiment that is no longer needed.
The Domain Connect URL helper returned Cloudflare's template for every domain regardless of the actual registrar — a misleading prompt for any user not on Cloudflare. Remove it entirely until NS-based registrar detection lands as its own change. Other cleanups from the same review pass: runDeploy throws a NOT_LINKED CliError on unlinked directories instead of warning and exiting zero; startNewDeploy's 409 production_instance_exists recovery now persists the recovered production instance id to the profile; OAuth skip routes through deployPausedError so its exit code matches Ctrl-C (1 instead of silent 0); runDnsVerification loops on timeout retry instead of recursing; and runOAuthSetup drops the redundant startIndex slice since the completed set already skips previously-saved providers.
Switch production deploy verification to the application domain status endpoint and remove obsolete retry-specific handling. Cap domain status polling to the shared 10-second budget while keeping 3-second intervals.
Trigger a fresh domain DNS check before status polling and retry status reads with exponential backoff while updating the spinner countdown.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
scripts/cleanup-test-users.test.ts (1)
11-28: ⚡ Quick winConsider simplifying the mock pattern.
The current mock reads its own call count via
mock.calls.length, which creates a circular type reference requiring the complex casting chain. A simpler approach uses a closure variable:♻️ Simpler mock pattern
test("retries transient Platform API failures before returning JSON", async () => { + let attempts = 0; globalThis.fetch = mock(async () => { - const attempts = (globalThis.fetch as unknown as ReturnType<typeof mock>).mock.calls.length; + attempts++; if (attempts === 1) { return new Response("temporary failure", { status: 500 }); } return Response.json({ instances: [] }); - }) as unknown as typeof fetch; + }) as typeof fetch; const result = await plapiGet("/v1/platform/applications/app_123", "ak_test", { delayMs: 0, }); expect(result).toEqual({ instances: [] }); expect(globalThis.fetch).toHaveBeenCalledTimes(2); });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/cleanup-test-users.test.ts` around lines 11 - 28, The mock for globalThis.fetch is overcomplicated by reading its own mock.calls.length (causing the cast chain); change the mock in the test "retries transient Platform API failures before returning JSON" to use a closure-scoped counter variable (e.g., let attempts = 0) that you increment each invocation inside the mock instead of querying (globalThis.fetch as ...).mock.calls.length, and keep the existing behavior of returning a 500 on the first call and Response.json({ instances: [] }) thereafter so plapiGet is exercised and the expectation on call count still holds.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/cli-core/src/commands/deploy/prompts.ts`:
- Around line 64-72: The retry prompt in chooseDnsVerificationRetryAction
currently lists "Skip DNS verification for now" first so Enter defaults to skip;
update chooseDnsVerificationRetryAction (returning DnsVerificationAction) to
make "Check again" the safe default by either placing the { name: "Check again",
value: "check" } choice first in the choices array and/or explicitly setting the
select helper's default/initial value to "check" (use the select API's
initial/default property for this prompt if available) so pressing Enter
triggers "check" not "skip".
---
Nitpick comments:
In `@scripts/cleanup-test-users.test.ts`:
- Around line 11-28: The mock for globalThis.fetch is overcomplicated by reading
its own mock.calls.length (causing the cast chain); change the mock in the test
"retries transient Platform API failures before returning JSON" to use a
closure-scoped counter variable (e.g., let attempts = 0) that you increment each
invocation inside the mock instead of querying (globalThis.fetch as
...).mock.calls.length, and keep the existing behavior of returning a 500 on the
first call and Response.json({ instances: [] }) thereafter so plapiGet is
exercised and the expectation on call count still holds.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 20ba068c-6400-4d41-a9d3-8f51af568904
📒 Files selected for processing (21)
.changeset/deploy-wizard.md.claude/rules/logging.mdpackages/cli-core/src/cli-program.test.tspackages/cli-core/src/cli-program.tspackages/cli-core/src/commands/auth/login.tspackages/cli-core/src/commands/deploy/README.mdpackages/cli-core/src/commands/deploy/copy.test.tspackages/cli-core/src/commands/deploy/copy.tspackages/cli-core/src/commands/deploy/errors.tspackages/cli-core/src/commands/deploy/index.test.tspackages/cli-core/src/commands/deploy/index.tspackages/cli-core/src/commands/deploy/prompts.tspackages/cli-core/src/commands/deploy/providers.test.tspackages/cli-core/src/commands/deploy/providers.tspackages/cli-core/src/lib/errors.tspackages/cli-core/src/lib/plapi.test.tspackages/cli-core/src/lib/plapi.tspackages/cli-core/src/lib/spinner.test.tspackages/cli-core/src/lib/spinner.tsscripts/cleanup-test-users.test.tsscripts/cleanup-test-users.ts
💤 Files with no reviewable changes (2)
- packages/cli-core/src/lib/errors.ts
- packages/cli-core/src/commands/deploy/errors.ts
✅ Files skipped from review due to trivial changes (3)
- .claude/rules/logging.md
- packages/cli-core/src/commands/auth/login.ts
- .changeset/deploy-wizard.md
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/cli-core/src/commands/deploy/copy.ts
| let retriesRemaining = DEPLOY_STATUS_MAX_RETRIES; | ||
| let nextRetryDelay = DEPLOY_STATUS_INITIAL_RETRY_DELAY_MS; | ||
|
|
||
| for (const component of DEPLOY_COMPONENT_ORDER) { |
There was a problem hiding this comment.
retriesRemaining and nextRetryDelay live outside the component loop, so all three spinners share one budget. With the constants here (5 retries, 3s base, x2 backoff) that totals ~93s, and whichever component is checked first (mail) can spend all of it. Once it does, the dns and ssl spinners reach while (retriesRemaining > 0) with zero left and return false right away, reporting the domain as unverified even when one more poll would have flipped them. Give each component its own budget:
| let retriesRemaining = DEPLOY_STATUS_MAX_RETRIES; | |
| let nextRetryDelay = DEPLOY_STATUS_INITIAL_RETRY_DELAY_MS; | |
| for (const component of DEPLOY_COMPONENT_ORDER) { | |
| for (const component of DEPLOY_COMPONENT_ORDER) { | |
| let retriesRemaining = DEPLOY_STATUS_MAX_RETRIES; | |
| let nextRetryDelay = DEPLOY_STATUS_INITIAL_RETRY_DELAY_MS; |
| dns: response.dns?.status === "complete", | ||
| ssl: response.ssl ? checkStatusComplete(response.ssl) : false, | ||
| mail: checkStatusComplete(response.mail), |
There was a problem hiding this comment.
These three components handle a missing field three different ways. dns reads as pending when absent and ignores required entirely, ssl reads as pending when absent, but mail goes through checkStatusComplete where !undefined?.required is true, so an absent mail field reports as verified without ever being checked. Worth making all three uniform so the absent case means the same thing everywhere:
dns: checkStatusComplete(response.dns),
ssl: checkStatusComplete(response.ssl),
mail: checkStatusComplete(response.mail),That needs dns?: DomainCheckStatus in plapi.ts and checkStatusComplete taking { status: string; required?: boolean } | undefined. Decide whether an absent component should count as complete or pending and apply it in the one helper rather than per field.
| const productionInstanceId = ctx.productionInstanceId; | ||
| if (!productionInstanceId) return undefined; | ||
|
|
||
| const domain = await loadProductionDomain(ctx); |
There was a problem hiding this comment.
loadProductionDomain and loadDevelopmentOAuthProviders both take only ctx and don't depend on each other, yet they run one after the other, adding a round trip to every resume run. Only loadProductionState needs domain.id, so the first two can go together:
const [domain, oauth] = await Promise.all([
loadProductionDomain(ctx),
loadDevelopmentOAuthProviders(ctx),
]);
if (!domain) return undefined;
const { descriptors: oauthProviderDescriptors, unsupported } = oauth;| /** | ||
| * Prompt fields for existing deploy callers, falling back to standard OAuth credentials. | ||
| */ | ||
| export function providerFields(provider: OAuthProvider): OAuthField[] { |
There was a problem hiding this comment.
These five lookups (providerFields, providerCredentialLabel, providerRedirectLabel, providerSetupCopy, providerGotcha) repeat the same if (hasProviderOverride(provider)) return MAP[provider]; return default shape. A small generic collapses the repetition:
function fromOverride<T>(
provider: OAuthProvider,
map: Record<ProviderWithOverride, T>,
fallback: T,
): T {
return hasProviderOverride(provider) ? map[provider] : fallback;
}
export function providerFields(provider: OAuthProvider): OAuthField[] {
return fromOverride(provider, PROVIDER_FIELDS, [
{ key: "client_id", label: "Client ID" },
{ key: "client_secret", label: "Client Secret", secret: true },
]);
}
// providerCredentialLabel / providerRedirectLabel / providerSetupCopy / providerGotcha follow the same shape.Co-authored-by: Rafael Thayto <rafa.thayto@gmail.com>
Summary
Adds
clerk deploy, an interactive wizard that promotes a linked Clerk application from development to production. The wizard validates clone compatibility, creates the production instance and primary domain, walks the user through the returned CNAME records (with optional BIND zone export to./clerk-<domain>.zone), verifies DNS/SSL/mail with a per-component spinner chain, and collects production OAuth credentials for Google, GitHub, Microsoft, Apple, and Linear.State is derived from the Platform API on each run, not persisted locally. The wizard reads the live application, domains, deploy status, and instance config on entry and resumes from the next pending step. The only profile write is
instances.productiononce the production instance exists. Pressing Ctrl-C or picking "Skip" at an OAuth provider exits withDeploy pausedand exit code 1; rerunning resumes from whatever the API now reports.Platform API errors specific to the deploy lifecycle are mapped to typed
CliErrors incommands/deploy/errors.ts(unsupported plan features, provider/taken domains,production_instance_existsrecovery, SSL/mail retry throttles). The wizard refuses to start in agent mode because production configuration is interactive.Test plan
bun run format:checkbun run lintbun run typecheckbun run testbun run test:e2e:operror: No Clerk project linked; linked directory walks through create → DNS records → BIND export → verification → OAuth → production summary; Ctrl-C at OAuth exits withDeploy pausedand exit code 1; rerun resumes at the skipped provider.