fix(frost): refuse scaffold FFI signing path without operator opt-in#3960
Merged
mswilkison merged 1 commit intoMay 22, 2026
Conversation
#3959 fenced creation of scaffold-era signer material (resolver refuses to build `legacy-wallet-pubkey` material without the env opt-in) and result acceptance (FFI primitive refuses to substitute the Rust-returned key group for the placeholder when source is scaffold). What neither fix covered: scaffold material persisted from a previous opted-in session can still drive the FFI signing path on later runs after the operator has unset the flag. The signing path feeds `buildTaggedTBTCSignerDKGPlaceholderPublicKeyHex(identifier)` — a non-curve-point 3-byte string like `"020001"` — into the Rust signer's `RunDKG` for every included participant. The bootstrap signer is permissive about this shape; a future production signer would not be, and the per-cycle leak of placeholder identifiers into telemetry and blame attribution is its own concern. Close the persistence-vs-execution gap by refusing to enter the FFI signing path at all when the payload is scaffold-era and the operator has not actively opted in for this process. The check sits at the top of `signWithTBTCSignerCoarseEngine` immediately after the signer material is decoded — before the engine availability check, before member-set determination, before `buildTaggedTBTCSignerRunDKGInputsForIncludedMembers` runs — so placeholder participant pubkeys are never built when the flag is unset. The fence is per-call (not cached), matching the contract on `AcceptScaffoldKeyGroupEnabled`: flipping the env back unset recovers fail-closed behavior without a restart. The error wraps `ErrNativeCryptographyUnavailable` so any caller that already handles the "native cryptography is unavailable" condition treats this the same way; downstream callers that want to surface a more specific "scaffold refused" diagnostic can match on `AcceptScaffoldKeyGroupEnvVar` in the error message. New regression test pins the fence behavior: `..._Sign_TBTCSignerPath_RefusesScaffoldMaterialWithoutOptIn` registers a working mock engine, omits the env var, calls `Sign`, and asserts the refusal references both the env var and the placeholder source, plus that `RunDKG`, `StartSignRound`, and `FinalizeSignRound` were all not called. The existing scaffold-path tests (`..._BootstrapVersion_LegacyKeyGroupSourceUsesRunDKGResult`, `..._BootstrapVersion_InvalidCoarseSignatureFallsBack`, `..._NoEngineNoLegacyShare`, `..._AttemptVariationRunDKGConflictFallsBack`, `..._BootstrapVersion_AttemptVariationStartSignRoundConflictFallsBack`, `..._InvalidAttemptPolicy_DoesNotFallback`, `..._ConsumedAttemptReplay_DoesNotFallback`) opt in via `t.Setenv` so they continue to exercise the scaffold path beyond the fence. Verification (local, GOCACHE under /private/tmp): go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/... go test -tags 'frost_native frost_tbtc_signer' ./pkg/tbtc -run \ 'TestConfigureFrostSigningBackend|TestNewNode_ConfiguresFrostSigningBackend|TestSigningExecutor_Sign|TestRegisterSignerMaterialResolverForBuild' All pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Base automatically changed from
fix/frost-readiness-scaffold-opt-in-2026-05-22
to
feat/frost-schnorr-migration-scaffold
May 22, 2026 22:48
8546179
into
feat/frost-schnorr-migration-scaffold
15 checks passed
mswilkison
added a commit
that referenced
this pull request
May 22, 2026
…#3961) ## Summary \`isBuildTaggedTBTCSignerConsumedAttemptReplayError\` previously did a case-insensitive substring match on \`err.Error()\` looking for \`attempt_id\` plus \`already consumed for sign attempt\`. Any cosmetic rewording of the Rust-side format string on either side would silently disable replay detection. This PR makes the detector prefer the structured \`code\` field on the FFI error envelope when it is reachable, falling back to substring matching only for older signer builds. Stacked on #3960. The matching Rust-side change is a separate PR on \`tlabs-xyz/tbtc\` that introduces the dedicated \`EngineError::ConsumedAttemptReplay\` variant with stable code \`consumed_attempt_replay\`. ## Changes - New \`buildTaggedTBTCSignerStructuredError\` type carries \`Code\` and \`Message\` from the FFI envelope. \`buildTaggedTBTCSignerResultStatusError\` wraps it via \`%w\` so callers can extract via \`errors.As\`. - \`buildTaggedTBTCSignerErrorMessage\` refactored into \`buildTaggedTBTCSignerErrorPayload\` returning the struct. Rendered chain string still formats as \`\"code: message\"\` so log readers see no change. - Detector implements a small policy ladder: 1. Structured envelope with \`consumed_attempt_replay\` code → replay. 2. Structured envelope with legacy \`validation_error\` code → substring-match only the \`Message\` field (pre-dedicated-variant signers route the replay path through Validation). 3. Any other recognized code → not a replay (authoritative). 4. No structured envelope reachable → substring-match the whole rendered string (pre-envelope signers). - Two new tests: - \`TestIsBuildTaggedTBTCSignerConsumedAttemptReplayError\` — eight cases covering all four ladder branches. - \`TestBuildTaggedTBTCSignerErrorPayload\` — structured / message-only / empty / malformed payloads. ## Verification - \`go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/... ./pkg/bitcoin\` — PASS - \`go test -tags 'frost_native frost_tbtc_signer' ./pkg/tbtc -run 'TestConfigureFrostSigningBackend|TestNewNode_ConfiguresFrostSigningBackend|TestSigningExecutor_Sign|TestRegisterSignerMaterialResolverForBuild'\` — PASS
3 tasks
mswilkison
added a commit
that referenced
this pull request
May 23, 2026
…#3979) ## Summary **Closes Phase 5 of RFC-21.** Adds the explicit-operator-opt-in gate that protects production builds with the \`frost_roast_retry\` tag from entering the orchestration path without an explicit operator decision. Production builds with the build tag enabled now refuse to wire orchestration unless the operator sets \`KEEP_CORE_FROST_ROAST_RETRY_ENABLED=true\`. The pattern matches the existing \`KEEP_CORE_FROST_TBTC_SIGNER_ACCEPT_SCAFFOLD_KEY_GROUP\` env var from PR #3960: a build tag enables the code path, an env var enables the wiring, both must agree for the feature to be live. Stacked on #3978 (Phase 5.2). ## What lands ### New file: \`pkg/frost/signing/roast_retry_readiness.go\` (untagged) | Surface | Notes | |---|---| | \`RoastRetryReadinessOptInEnvVar\` constant | \`KEEP_CORE_FROST_ROAST_RETRY_ENABLED\` | | \`ErrRoastRetryReadinessOptOut\` sentinel | for \`errors.Is\` detection | | \`EnsureRoastRetryReadinessOptIn() error\` | returns nil if env var is \`"true"\` (case-insensitive, whitespace-trimmed); error otherwise | | \`RoastRetryReadinessOptInEnabled() bool\` | boolean helper | | Per-call (not cached) | operators can flip without restart | ### \`roast_retry_orchestration.go\` extended \`BeginOrchestrationForSession\` now calls \`EnsureRoastRetryReadinessOptIn\` **before** consulting the registry. The env var is the load-bearing gate; missing opt-in short-circuits orchestration even when the registry has a real coordinator. ## Why per-call rather than init-time The scaffold-opt-in env var from PR #3960 is also evaluated per call. Operators can flip it during a debugging session without restart. Same reasoning here -- the readiness env var is a kill switch, not a build constant. ## Test coverage | File | Cases | |---|---| | \`roast_retry_readiness_test.go\` (untagged, 7 cases) | accepts \`"true"\`; case-insensitive; whitespace-trimmed; rejects unset (with sentinel + env-var-name in error); rejects other values; bool helper mirrors error; **env var name matches RFC-21 spec (locks the name)** | | \`roast_retry_orchestration_frost_roast_retry_test.go\` (extended) | New \`TestBeginOrchestrationForSession_ErrorsWhenReadinessOptInUnset\` -- asserts a registered coordinator alone is not enough; missing env var still short-circuits orchestration | ## Verification | Command | Result | |---|---| | \`go build ./...\` | clean | | \`go test ./pkg/frost/signing/...\` | pass (default build) | | \`go test -tags 'frost_roast_retry' ./pkg/frost/signing/...\` | pass | | \`go test -race -tags 'frost_native frost_tbtc_signer frost_roast_retry' ./pkg/frost/...\` | pass (5 packages) | | \`staticcheck -checks '-SA1019' ./pkg/frost/...\` | silent | | \`go vet ./pkg/frost/...\` | clean | | \`gofmt -l ./pkg/frost/signing/\` | silent | ## Phase 5 complete | PR | Scope | State | |---|---|---| | 5.1 (#3977) | Retry adapter scaffolding | open | | 5.2 (#3978) | Orchestration helpers + TTL sweep | open | | **5.3 (this)** | **Readiness-gate env-var guard** | **open** | Phase 6 will wire the production call sites and migrate the three FROST/tbtc-signer receive loops onto the adapter in a single coordinated change. ## Test plan - [ ] CI green. - [ ] Reviewer confirms the case-insensitive / whitespace-trimmed parsing is acceptable. (Alternatives: strict \`"true"\` only.) - [ ] Reviewer confirms the per-call evaluation is acceptable. (Alternatives: read once at init.)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the persistence-vs-execution gap that #3959 left open. Stacked on #3959.
The gap
#3959 fenced creation (resolver refuses to BUILD `legacy-wallet-pubkey` signer material without `KEEP_CORE_FROST_TBTC_SIGNER_ACCEPT_SCAFFOLD_KEY_GROUP=1`) and result acceptance (FFI primitive refuses to substitute the Rust-returned key group for the placeholder). What it left open: scaffold material persisted from a previous opted-in session can still drive the FFI signing path on later runs after the operator has unset the flag. The signing path feeds `buildTaggedTBTCSignerDKGPlaceholderPublicKeyHex(identifier)` — a non-curve-point 3-byte string like `"020001"` — into the Rust signer's `RunDKG` for every included participant.
Fix
Refuse to enter `signWithTBTCSignerCoarseEngine` when the payload's `KeyGroupSource == "legacy-wallet-pubkey"` and the env opt-in is not set. The fence sits immediately after the signer material is decoded, before the engine availability check, before member-set determination, before `buildTaggedTBTCSignerRunDKGInputsForIncludedMembers` runs — so placeholder participant pubkeys are never built when the flag is unset.
Tests
Verification
Local (GOCACHE under `/private/tmp`):
Out of scope
L5 (FFI status-code semantics) and M4 / M7 (ROAST bounded transition evidence + ROAST-aware retry) are tracked as separate follow-ups.