Skip to content

feat(tbtc/signer): FROST/ROAST Rust signer#4005

Open
mswilkison wants to merge 162 commits into
mainfrom
extraction/frost-signer-mirror-2026-05-26
Open

feat(tbtc/signer): FROST/ROAST Rust signer#4005
mswilkison wants to merge 162 commits into
mainfrom
extraction/frost-signer-mirror-2026-05-26

Conversation

@mswilkison

@mswilkison mswilkison commented May 26, 2026

Copy link
Copy Markdown
Contributor

Summary

Lands the Rust FROST/ROAST signer at pkg/tbtc/signer/ alongside the existing Go DKG coordinator. The initial import was extracted from tlabs-xyz/tbtc:feat/frost-schnorr-migration at frozen signed tag frost-extraction-source-v1, then this PR applied targeted canonical-path, CI, and reviewer hardening follow-ups in the canonical keep-core repo.

This is no longer a pure byte-for-byte mirror PR. The post-freeze changes are intentional canonical keep-core changes (canonical-path normalization, CI wiring, and reviewer hardening). After this PR lands, threshold-network/keep-core is the source of truth for the signer; the extracted monorepo tag is provenance for the initial import only.

Source-PR Provenance

The monorepo provenance above is used only to audit the initial import. Ongoing signer development, fixes, CI, and reviews happen in threshold-network/keep-core.

Why This Lives In keep-core

Per extraction plan v38 section 3.1, the signer co-locates with keep-core because:

  • HSM enforcement is external to signer code via standard PKCS#11/KMIP and does not force a separate audit lifecycle.
  • Coordinator coupling dominates; the B-2 Go DKG coordinator already lives in this repo via PR FROST/ROAST readiness branch #3866.
  • TEE adoption later is reversible without forcing a repo split.

Layout Introduced

pkg/tbtc/signer/
├── Cargo.toml + Cargo.lock + build.sh
├── src/
├── tests/
├── benches/
├── include/
├── docs/
│   └── formal/models/
├── scripts/formal/
└── test/vectors/

CI Coverage

This PR now runs:

  • cargo fmt --manifest-path pkg/tbtc/signer/Cargo.toml -- --check
  • cargo clippy --manifest-path pkg/tbtc/signer/Cargo.toml --all-targets -- -D warnings
  • cargo test --manifest-path pkg/tbtc/signer/Cargo.toml
  • cargo test --manifest-path pkg/tbtc/signer/Cargo.toml formal_verification_
  • pkg/tbtc/signer/scripts/formal/run_tla_models.sh

Verification

Latest local verification after review fixes:

  • cargo fmt --manifest-path pkg/tbtc/signer/Cargo.toml -- --check
  • cargo clippy --manifest-path pkg/tbtc/signer/Cargo.toml --all-targets -- -D warnings
  • TBTC_SIGNER_STATE_PATH=/tmp/tbtc-signer-ci-state-local-4005.json cargo test --manifest-path pkg/tbtc/signer/Cargo.toml
  • cargo test --manifest-path pkg/tbtc/signer/Cargo.toml formal_verification_

Latest GitHub CI on this branch is green, including Signer Rust checks, Signer formal invariants, and TLA model checks.

Relationship To PR #3866

PR #3866 lands the Go-side DKG coordinator. This PR lands the Rust-side signer. They are complementary protocol implementations:

  • DKG: keep-core Go side coordinates; signer participates; DKG result digest is computed on both sides and checked for parity.
  • Signing: signer produces FROST signature shares; coordinator aggregates.

PR #3866 and this PR can land independently in either order.

Plan Context

This is one of the mergeable mirror PRs comprising the FROST extraction, alongside tbtc-v2, tbtc-subgraph, and tbtc-v3-indexer mirror PRs.

Lands the Rust signer at pkg/tbtc/signer/ alongside the existing Go DKG
coordinator. Mirrors the signer slice of tlabs-xyz/tbtc:feat/frost-schnorr-migration
(PR #10) at frozen tag frost-extraction-source-v1.

Per extraction plan v38 §3.1, the signer co-locates with keep-core
because: (a) HSM enforcement is external to signer code (standard
PKCS#11/KMIP client), doesn't force a separate audit lifecycle;
(b) coordinator coupling dominates (B-2 DKG coordinator already lives
in this repo via PR #3866); (c) TEE adoption later is reversible
without forcing a repo split.

Layout
- pkg/tbtc/signer/ — Rust crate with own Cargo.toml
- pkg/tbtc/signer/docs/ — signer + ROAST + TEE specs
- pkg/tbtc/signer/docs/formal/models/ — ROAST + TEE TLA+ models
- pkg/tbtc/signer/scripts/formal/ — ROAST vector + TLA runner
- pkg/tbtc/signer/test/vectors/ — roast-attempt-context-v1.json

Files (49 total)
- 47 mirror status (Rust source, signer docs, ROAST docs, TLA models,
  test vector, etc.)
- 2 allowlisted-divergence:
  - pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs
    (path normalization to signer-repo paths)
  - pkg/tbtc/signer/scripts/formal/run_tla_models.sh (MODELS_PATH env
    var refactor; default pkg/tbtc/signer/docs/formal/models/)

Provenance
- Source repository: tlabs-xyz/tbtc
- Source branch: feat/frost-schnorr-migration
- Source tag (frozen): frost-extraction-source-v1
- Source commit (H): 52389bd5cccb5daeef195671feb7ca46be6e2f37
- Source manifest: extraction/frost-extraction-source-manifest.json
  (manifestSha256: f7295fb738104501eb6c0c2447a42122ceb5f684c7a7c5dfb50ecb0bde3a0ea0)
- Source PR(s): #425 (tbtc-signer error codes) + the full FROST migration
  series; complete list per source manifest's commit range over
  tools/tbtc-signer/ and signer-adjacent docs/scripts.

Build wiring (follow-up)
This PR lands the signer source code, docs, vectors, and scripts. Rust
toolchain CI integration (cargo build/test/clippy/fmt as a separate CI
job alongside the existing Go jobs) is a follow-up — Go workspace
ignores non-Go subdirectories automatically; the cargo crate at
pkg/tbtc/signer/Cargo.toml is independently buildable.

Known TBD (resolved pre-merge)
- 2 allowlisted-divergence files have expectedTargetSha256 = <TBD> in
  the source manifest. Path normalization transformations need to be
  applied to make the scripts run in pkg/tbtc/signer/ context; this PR
  ships the source verbatim, follow-up commits on this branch apply
  the transformations before merge.
- Dual signoff required on each allowlisted-divergence entry per plan
  v38 §4.2 (extraction lead + canonical repo maintainer).

Verification (pre-merge per plan v38 §7.2)
- 47 mirror files: sha256 equality between
  git show frost-extraction-source-v1:<sourcePath> and this PR's content
  at pkg/tbtc/signer/<targetPath>.
- 2 allowlisted-divergence files: sha256 == expectedTargetSha256
  (recorded post-transformation in source manifest).
- PR-scoped rogue-file check: git diff --name-only main...HEAD only
  contains files in manifest's fileMap with targetKey "signer".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented May 26, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR introduces pkg/tbtc/signer, a Rust tbtc-signer bootstrap crate with a C ABI JSON-over-buffer interface, FFI translation helpers, API types, error semantics, deterministic coordinator selection, a comprehensive lib.rs FFI surface with tests, an admission-checker CLI, Criterion benchmarks, TLA+ formal models, conformance vectors/checkers, and extensive design/runbook documentation.

Changes

Core Signer Implementation

Layer / File(s) Summary
FFI & API contract definition
include/frost_tbtc.h, src/api.rs
C header declares TbtcBuffer and TbtcSignerResult structs plus ~18 frost_tbtc_* function signatures; Rust API types define serde-serialized request/response payloads for DKG, sign-round lifecycle, transcript audit, quarantine, refresh, emergency rekey, differential fuzzing, and canary rollout.
Error classification and recovery semantics
src/errors.rs
EngineError enum with thiserror integration; code() maps variants to stable string identifiers; recovery_class() classifies errors as "recoverable" or "terminal" with explicit handling for consumed attempt/round replays.
FFI translation layer
src/ffi.rs
C-compatible buffer structs, JSON serialization/deserialization helpers, ffi_entry for panic containment via catch_unwind, and safe free_buffer memory management for the FFI boundary.
Deterministic coordinator selection
src/go_math_rand.rs
Rust port of Go's math/rand with fixed cooked constants table, seedrand, bounded sampling, and shuffle; select_coordinator_identifier returns deterministic coordinators matching keep-core outputs per test vectors.
Lib.rs FFI endpoints & bootstrap gating
src/lib.rs
Module declarations, signer version, bootstrap-mode environment/profile parsing with strict production gating, test-only override, ~18 #[no_mangle] extern "C" endpoints (DKG, sign-round, transcript audit, emergency rekey, canary rollout, refresh-shares), and comprehensive test suite covering idempotency, lifecycle transitions, Taproot validation, and emergency/rollout semantics.

Admission Checking & Policy Enforcement

Layer / File(s) Summary
Admission checker binary
src/bin/admission_checker.rs
JSON-configured operator admission evaluator with policy constraints (per-provider/region caps, custody/attestation/SLA requirements), optional DAO override verification (Schnorr signature validation over SHA-256 payload digest), and replay registry persistence to prevent re-use of override decisions.

Benchmarking & Formal Test Vectors

Layer / File(s) Summary
Phase 5 ROAST benchmark suite
benches/phase5_roast.rs
Criterion benchmarks exercising DKG, start/finalize sign-round, state reload/recovery, and attempt transitions (timeout/invalid-share/stale-replay); includes FFI helpers, hashing/fingerprinting utilities, deterministic coordinator probing, and error validation.
Taproot signature fraud formal verification
tests/p2tr_signature_fraud_vectors.rs
Integration test loading p2tr-signature-fraud-v0.json vectors, computing BIP-341 key-path sighashes, verifying Schnorr signatures, deriving dual challenge identities, and asserting mutation-based identity uniqueness.
ROAST attempt-context test vectors
test/vectors/roast-attempt-context-v1.json
JSON schema and fixtures for deterministic fingerprint/attempt-id computation across multiple participant sets and edge cases, used by formal verification checkers.

TLA+ Formal Verification Models

Layer / File(s) Summary
ROAST attempt state machine model
docs/formal/models/RoastAttemptStateMachine.tla, .cfg
TLA+ module defining attempt numbering, active participants, deterministic coordinator, consumed-attempt registries (transient and durable), transitions (advance/restart/reload), and safety invariants for monotonicity and replay safety.
ROAST rollout policy state machine
docs/formal/models/RoastRolloutPolicy.tla, .cfg
TLA+ module specifying five-stage rollout progression (bootstrap→canary→broad→rollback↔halted) with control signals and temporal properties governing promotion/hold/rollback and emergency stop.
State-key provider policy & production gating model
docs/formal/models/StateKeyProviderPolicy.tla, .cfg, .production.cfg
TLA+ module modeling provider/key selection lifecycle and load outcomes; production-gate enforcement; properties asserting fail-closed behavior and exact binding enforcement.
TEE enforcement modes state machine
docs/formal/models/TeeEnforcementModes.tla, .cfg
TLA+ module for TE admission control with three enforcement modes, attestation states, break-glass override, and properties enforcing valid attestation and admission stability.
Formal models documentation & runner
docs/formal/models/README.md, scripts/formal/run_tla_models.sh
README documenting model coverage and bounds; Bash runner downloading/verifying TLA+ toolchain and executing TLC model checks.

Design Documentation & Specifications

Layer / File(s) Summary
ROAST protocol phase specifications
docs/roast-phase-{0-spec-freeze,1.5-consumed-registry,2-coordinator,3-transcript,4-liveness,5-baseline,5-runbook,5-security-gates}.md
Phase 0 spec defining attempt-context, coordinator validation, strict-mode gating, and error taxonomy; Phases 1.5–5 covering consumed-registry integration, coordinator enforcement, transcript replay hardening, liveness policy with recovery classification, baseline calibration, rollout runbook, and security gates.
ROAST implementation roadmap
docs/roast-implementation-plan.md
Complete ROAST migration plan covering baseline, goals/non-goals, design principles, threat model, target semantics (deterministic attempt/coordinator, authenticated retry, replay-safety), phased deliverables (Phase 0–5), cross-repo dependencies, and Definition-of-Done.
Hardening, security & operational plans
docs/permissioned-signer-hardening-rfc.md, docs/tbtc-signer-secret-material-hardening-plan.md, docs/tee-whitelisted-signer-enforcement-plan.md
RFC with P0–P2 hardening phases; secret-material plan (encrypted-at-rest XChaCha20-Poly1305 envelope); TEE operator enforcement covering admission tokens, runtime behavior, break-glass, and failure modes.
API contract and design decisions
docs/signer-api-contract-decision-brief.md, docs/rust-rewrite-bootstrap.md
Decision brief comparing round-level vs coarse session API (recommending session); bootstrap status documenting implemented features and production gates.
Future considerations documentation
docs/true-late-t_of_n_finalize_considerations.md
Discussion draft on true late t-of-n finalize tradeoffs and current recommendation to defer as future consideration.

Build Configuration & Scripts

Layer / File(s) Summary
Crate metadata and build setup
pkg/tbtc/signer/Cargo.toml, pkg/tbtc/signer/build.sh, pkg/tbtc/signer/.gitignore
Cargo.toml defines tbtc-signer crate (edition 2021, MIT), library config (cdylib+rlib), bench-restart-hook feature, frost-secp256k1-tr dependency; build.sh with strict bash and release cargo build; .gitignore ignores target/.
Comprehensive README documentation
pkg/tbtc/signer/README.md
Full README covering exposed C ABI endpoints, not-yet-implemented items, build/test commands, admission checker CLI, encrypted state provider contracts, benchmark/chaos suite invocation, FFI JSON contract, error codes, and buffer management.
Admission checker sample configurations
pkg/tbtc/signer/scripts/admission-*.sample.json
JSON samples for policy (constraints, DAO overrides), candidates, existing operators, override signatures, and replay registries.
Formal verification & testing scripts
pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs, pkg/tbtc/signer/scripts/run_phase5_chaos_suite.sh, pkg/tbtc/signer/scripts/formal/run_tla_models.sh
Node.js CLI for ROAST vector conformance; Bash runners for chaos suite scenarios and TLA+ model checking.

🎯 4 (Complex) | ⏱️ ~75 minutes

🐰 A bootstrap springs forth, with FFI calls so bright,
Formal models carved in TLA+ light,
ROAST coordinators dancing in Go-lang's stride,
Secrets encrypted with cryptographic pride,
From signer to harness, one system unified!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 47.28% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title clearly summarizes the main change: adding the FROST/ROAST Rust signer under tbtc/signer.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch extraction/frost-signer-mirror-2026-05-26

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

…ripts

Per extraction plan v38 §4.4, allowlisted-divergence files require
content normalization for the canonical context. This commit applies
the transformations declared in the source manifest for the 2 signer-
side allowlisted-divergence entries.

Transformations
- pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs:
  Rewrite vector path from
    `docs/frost-migration/test-vectors/roast-attempt-context-v1.json`
  to canonical signer layout
    `test/vectors/roast-attempt-context-v1.json`
  (both relative to rootDir which is two levels up from the script
  location; in canonical context that's pkg/tbtc/signer/)

- pkg/tbtc/signer/scripts/formal/run_tla_models.sh:
  Rewrite MODEL_DIR default from
    `$ROOT_DIR/docs/frost-migration/formal-verification/models`
  to canonical signer layout
    `$ROOT_DIR/docs/formal/models`
  Plus MODELS_PATH env-var override for alternate environments (CI
  matrices, local dev trees). ROOT_DIR is unchanged
  (`$(dirname $BASH_SOURCE)/../..` resolves to pkg/tbtc/signer/ here).

Verification
- Both files retain identical behavior to their monorepo counterparts
  when invoked from canonical signer layout
- Comments added documenting the path normalization with reference back
  to the source manifest's allowlisted-divergence status

Recompute expectedTargetSha256 for both entries in the source manifest
and collect dual signoff before merge per plan v38 §4.2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

🧹 Nitpick comments (3)
pkg/tbtc/signer/src/api.rs (1)

196-202: 💤 Low value

Missing #[serde(default, skip_serializing_if)] on script_tree_hex.

All other Option<T> fields in this file use #[serde(default, skip_serializing_if = "Option::is_none")], but script_tree_hex does not. This inconsistency means the field will serialize as null when absent, rather than being omitted.

Suggested fix
 pub struct BuildTaprootTxRequest {
     pub session_id: String,
     pub inputs: Vec<TxInput>,
     pub outputs: Vec<TxOutput>,
+    #[serde(default, skip_serializing_if = "Option::is_none")]
     pub script_tree_hex: Option<String>,
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/tbtc/signer/src/api.rs` around lines 196 - 202, The BuildTaprootTxRequest
struct's script_tree_hex Option field lacks the serde attributes used elsewhere;
update the declaration of BuildTaprootTxRequest so the script_tree_hex field is
annotated with #[serde(default, skip_serializing_if = "Option::is_none")] to
match other Option<T> fields (preserving Clone/Debug/Deserialize/Serialize
behavior) so it is omitted from serialized output when None instead of
serializing as null.
pkg/tbtc/signer/src/lib.rs (1)

391-421: 💤 Low value

std::env::set_var and std::env::remove_var are not thread-safe.

These functions are unsound in multi-threaded contexts and deprecated since Rust 1.66. While the tests appear to serialize access via lock_test_state(), this guard must be held across all env mutations and checks within a test to prevent races with parallel test threads.

Current usage appears safe given the locking pattern, but this is fragile. Consider using a dedicated test configuration mechanism that doesn't rely on process-wide environment mutation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/tbtc/signer/src/lib.rs` around lines 391 - 421, EnvVarGuard's methods
(EnvVarGuard::set, EnvVarGuard::unset) and its Drop rely on
std::env::set_var/remove_var which are process-wide and not thread-safe; replace
this pattern with a test-scoped, non-global solution such as using a crate that
provides scoped environment variables (e.g., temp_env or similar) or refactor
tests to accept an injected configuration object instead of mutating process
env; update usages to acquire and hold the new scoped guard for the entire
duration of tests that need env changes (or pass a Config struct into functions
under test) and remove direct calls to std::env::set_var/remove_var and the
EnvVarGuard Drop behavior to avoid races.
pkg/tbtc/signer/src/bin/admission_checker.rs (1)

257-276: 💤 Low value

Consider cleaning up the temp file if rename fails.

If fs::rename fails (e.g., cross-filesystem move or permissions issue), the temp file remains on disk. Adding a cleanup attempt in the error path would improve robustness.

♻️ Proposed cleanup on error
     fs::write(&tmp_path, serialized).map_err(|error| {
         format!(
             "failed to write override replay registry temp file [{}]: {error}",
             tmp_path.display()
         )
     })?;
-    fs::rename(&tmp_path, path).map_err(|error| {
-        format!(
-            "failed to persist override replay registry [{}]: {error}",
-            path.display()
-        )
-    })
+    fs::rename(&tmp_path, path).map_err(|error| {
+        let _ = fs::remove_file(&tmp_path); // Best-effort cleanup
+        format!(
+            "failed to persist override replay registry [{}]: {error}",
+            path.display()
+        )
+    })
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/tbtc/signer/src/bin/admission_checker.rs` around lines 257 - 276,
persist_override_replay_registry currently leaves the temporary file (tmp_path)
if fs::rename fails; modify the rename error path to attempt cleanup of tmp_path
before returning the error. Specifically, call fs::remove_file(&tmp_path)
(ignoring or logging its result) inside the Err branch that handles the rename
failure so the function still returns the original formatted error for
fs::rename but also tries to remove the leftover tmp file; reference
persist_override_replay_registry, path, tmp_path, and fs::rename when making the
change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/tbtc/signer/docs/formal/models/README.md`:
- Around line 32-51: Update the incorrect repository paths in the traceability
matrix of pkg/tbtc/signer/docs/formal/models/README.md so links point to the
actual implementation and docs in this repo: change references to
tools/tbtc-signer/src/engine.rs to pkg/tbtc/signer/src/engine.rs for the entries
mentioning RoastAttemptStateMachine.tla (validate_attempt_context, replay
guards) and StateKeyProviderPolicy.tla (decode_encrypted_state_envelope,
encode_encrypted_state_envelope); change
docs/frost-migration/tee-whitelisted-signer-enforcement-plan.md to
pkg/tbtc/signer/docs/tee-whitelisted-signer-enforcement-plan.md for
TeeEnforcementModes.tla; and change
docs/frost-migration/roast-phase-5-security-rollout-gates.md to
pkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.md for
RoastRolloutPolicy.tla so readers can locate the referenced code and policy
docs.

In `@pkg/tbtc/signer/docs/roast-implementation-plan.md`:
- Line 93: Update the broken doc links in
pkg/tbtc/signer/docs/roast-implementation-plan.md by replacing repo-external
paths like `docs/frost-migration/roast-phase-0-spec-freeze.md` and any
`tools/tbtc-signer/...` references with the correct repo-local paths under
pkg/tbtc/signer/docs (or use correct relative paths from this markdown file);
search the file for all occurrences of `docs/frost-migration/...` and
`tools/tbtc-signer/...` (including the instances similar to the shown
`docs/frost-migration/roast-phase-0-spec-freeze.md`) and normalize each link so
it points to the new location in this package, preserving anchor fragments and
updating any link text as needed.

In `@pkg/tbtc/signer/docs/roast-phase-5-rollout-runbook.md`:
- Around line 26-29: Update the incorrect crate path used in the runbook
commands: replace occurrences of "cd tools/tbtc-signer" with "cd
pkg/tbtc/signer" for the benchmark command (`cargo bench --features
bench-restart-hook --bench phase5_roast`) and the chaos suite script invocation
(`./scripts/run_phase5_chaos_suite.sh`) so the commands run from the correct
crate directory.

In `@pkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.md`:
- Around line 92-99: Update stale repository paths in
roast-phase-5-security-rollout-gates.md: replace any occurrences of the old
tooling path "tools/tbtc-signer" with the new location "pkg/tbtc/signer" (e.g.,
update the run command `cd tools/tbtc-signer && cargo bench --features
bench-restart-hook --bench phase5_roast` to `cd pkg/tbtc/signer ...`), and
update any links referencing
`docs/frost-migration/roast-phase-5-baseline-calibration.md` to the document’s
new location in the repo (search for the exact link text and swap to the correct
path). Ensure all instances at the reported locations (around the run command
and the listed links) are changed consistently so operators following the
runbook hit the correct files and commands.

In `@pkg/tbtc/signer/docs/rust-rewrite-bootstrap.md`:
- Around line 10-13: The documentation still references the old crate path
"tools/tbtc-signer" and the validation command using that path; update every
occurrence to "pkg/tbtc/signer" in rust-rewrite-bootstrap.md (including the
header lines that list the crate, the C ABI include path `include/frost_tbtc.h`,
and any validation/build commands) so links and commands point to the colocated
pkg/tbtc/signer location; search for "tools/tbtc-signer" and replace with
"pkg/tbtc/signer" and verify the validation command and any examples reference
the new path.

In `@pkg/tbtc/signer/docs/signer-api-contract-decision-brief.md`:
- Around line 43-47: Update the two referenced paths in
signer-api-contract-decision-brief.md so they point to the mirrored crate
locations: replace `docs/frost-migration/rust-rewrite-bootstrap.md` with
`pkg/tbtc/signer/docs/frost-migration/rust-rewrite-bootstrap.md` and replace
`tools/tbtc-signer/src/lib.rs` with `pkg/tbtc/signer/src/lib.rs` (look for the
occurrences shown around the paragraph mentioning the bootstrap Rust crate and
file: `tools/tbtc-signer/src/lib.rs` and update those strings accordingly).

In `@pkg/tbtc/signer/docs/tbtc-signer-secret-material-hardening-plan.md`:
- Around line 6-7: The doc still uses the old crate path string
`tools/tbtc-signer`; update that scope reference to `pkg/tbtc/signer` throughout
the file (tbtc-signer-secret-material-hardening-plan.md) so the plan points to
the mirrored crate location, and scan for any other occurrences of
`tools/tbtc-signer` in this document to replace with `pkg/tbtc/signer`.

In `@pkg/tbtc/signer/docs/tee-whitelisted-signer-enforcement-plan.md`:
- Around line 296-298: Update the two broken cross-doc links in
pkg/tbtc/signer/docs/tee-whitelisted-signer-enforcement-plan.md (currently
referencing docs/frost-migration/roast-phase-5-security-rollout-gates.md and
docs/frost-migration/roast-phase-5-rollout-runbook.md on lines ~296–297) to
point to their correct locations inside this PR:
pkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.md and
pkg/tbtc/signer/docs/roast-phase-5-rollout-runbook.md so cross-document
navigation resolves correctly.

In `@pkg/tbtc/signer/README.md`:
- Around line 53-54: Update the README.md occurrence(s) that reference the old
path string "tools/tbtc-signer" to the new crate location "pkg/tbtc/signer":
search for and replace that path in all command snippets, code blocks, and file
references (e.g., cargo build/cd commands and any path bullets) so every
instance uses "pkg/tbtc/signer" consistently; ensure both shell commands and
prose file paths are updated, and run a quick grep for "tools/tbtc-signer" to
confirm no remaining references.

In `@pkg/tbtc/signer/scripts/admission-policy-v1.sample.json`:
- Line 8: Replace the non-hex placeholder value for the JSON key
dao_override_trust_root_pubkey_hex with a syntactically valid 32-byte hex string
(64 lowercase hex characters) so sample files and copy/paste validation don't
break; update the sample value to something like a 64-character hex placeholder
and leave replacement guidance in the README or nearby comment explaining it
must be replaced with the real x-only pubkey hex.

In `@pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs`:
- Around line 15-18: vectorsPath currently resolves to
"docs/frost-migration/test-vectors/roast-attempt-context-v1.json" and will fail
because the vectors live under "test/vectors"; update the path.join call that
constructs vectorsPath (using rootDir and the filename) to point to
"test/vectors/roast-attempt-context-v1.json" so the script loads the correct
file; ensure the change is made where vectorsPath is declared and used in this
module.

In `@pkg/tbtc/signer/scripts/formal/run_tla_models.sh`:
- Around line 4-5: The MODEL_DIR assignment in run_tla_models.sh is pointing to
the wrong path; update the MODEL_DIR variable (currently computed relative to
ROOT_DIR) to "$ROOT_DIR/docs/formal/models" so the directory existence check in
the script (around the directory existence test near line 20) succeeds; modify
the MODEL_DIR definition in the script (look for the MODEL_DIR variable
assignment and references) to use the corrected path and ensure any subsequent
uses of MODEL_DIR still reference this updated variable.

In `@pkg/tbtc/signer/tests/p2tr_signature_fraud_vectors.rs`:
- Around line 375-376: The test constructs vectors_path using the vectors_path
variable in pkg/tbtc/signer/tests/p2tr_signature_fraud_vectors.rs which
currently joins
"../../docs/frost-migration/test-vectors/p2tr-signature-fraud-v0.json" relative
to CARGO_MANIFEST_DIR and points to a non-existent file; fix by either adding
the missing p2tr-signature-fraud-v0.json to the repo at that path or update the
vectors_path join to the correct relative path where the JSON actually lives
(adjust the "../" segments or point to the canonical test-vectors location),
ensuring the variable name vectors_path and its usage remain unchanged.

---

Nitpick comments:
In `@pkg/tbtc/signer/src/api.rs`:
- Around line 196-202: The BuildTaprootTxRequest struct's script_tree_hex Option
field lacks the serde attributes used elsewhere; update the declaration of
BuildTaprootTxRequest so the script_tree_hex field is annotated with
#[serde(default, skip_serializing_if = "Option::is_none")] to match other
Option<T> fields (preserving Clone/Debug/Deserialize/Serialize behavior) so it
is omitted from serialized output when None instead of serializing as null.

In `@pkg/tbtc/signer/src/bin/admission_checker.rs`:
- Around line 257-276: persist_override_replay_registry currently leaves the
temporary file (tmp_path) if fs::rename fails; modify the rename error path to
attempt cleanup of tmp_path before returning the error. Specifically, call
fs::remove_file(&tmp_path) (ignoring or logging its result) inside the Err
branch that handles the rename failure so the function still returns the
original formatted error for fs::rename but also tries to remove the leftover
tmp file; reference persist_override_replay_registry, path, tmp_path, and
fs::rename when making the change.

In `@pkg/tbtc/signer/src/lib.rs`:
- Around line 391-421: EnvVarGuard's methods (EnvVarGuard::set,
EnvVarGuard::unset) and its Drop rely on std::env::set_var/remove_var which are
process-wide and not thread-safe; replace this pattern with a test-scoped,
non-global solution such as using a crate that provides scoped environment
variables (e.g., temp_env or similar) or refactor tests to accept an injected
configuration object instead of mutating process env; update usages to acquire
and hold the new scoped guard for the entire duration of tests that need env
changes (or pass a Config struct into functions under test) and remove direct
calls to std::env::set_var/remove_var and the EnvVarGuard Drop behavior to avoid
races.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c0224eef-499a-42a2-a346-f06ef353278b

📥 Commits

Reviewing files that changed from the base of the PR and between 980587b and 335ce60.

⛔ Files ignored due to path filters (1)
  • pkg/tbtc/signer/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (48)
  • pkg/tbtc/signer/.gitignore
  • pkg/tbtc/signer/Cargo.toml
  • pkg/tbtc/signer/README.md
  • pkg/tbtc/signer/benches/phase5_roast.rs
  • pkg/tbtc/signer/build.sh
  • pkg/tbtc/signer/docs/formal/models/README.md
  • pkg/tbtc/signer/docs/formal/models/RoastAttemptStateMachine.cfg
  • pkg/tbtc/signer/docs/formal/models/RoastAttemptStateMachine.tla
  • pkg/tbtc/signer/docs/formal/models/RoastRolloutPolicy.cfg
  • pkg/tbtc/signer/docs/formal/models/RoastRolloutPolicy.tla
  • pkg/tbtc/signer/docs/formal/models/StateKeyProviderPolicy.cfg
  • pkg/tbtc/signer/docs/formal/models/StateKeyProviderPolicy.production.cfg
  • pkg/tbtc/signer/docs/formal/models/StateKeyProviderPolicy.tla
  • pkg/tbtc/signer/docs/formal/models/TeeEnforcementModes.cfg
  • pkg/tbtc/signer/docs/formal/models/TeeEnforcementModes.tla
  • pkg/tbtc/signer/docs/permissioned-signer-hardening-rfc.md
  • pkg/tbtc/signer/docs/roast-implementation-plan.md
  • pkg/tbtc/signer/docs/roast-phase-0-spec-freeze.md
  • pkg/tbtc/signer/docs/roast-phase-1.5-consumed-registry-integration.md
  • pkg/tbtc/signer/docs/roast-phase-2-coordinator-policy-enforcement.md
  • pkg/tbtc/signer/docs/roast-phase-3-attempt-transcript-replay-hardening.md
  • pkg/tbtc/signer/docs/roast-phase-4-liveness-policy-recovery.md
  • pkg/tbtc/signer/docs/roast-phase-5-baseline-calibration.md
  • pkg/tbtc/signer/docs/roast-phase-5-rollout-runbook.md
  • pkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.md
  • pkg/tbtc/signer/docs/rust-rewrite-bootstrap.md
  • pkg/tbtc/signer/docs/signer-api-contract-decision-brief.md
  • pkg/tbtc/signer/docs/tbtc-signer-secret-material-hardening-plan.md
  • pkg/tbtc/signer/docs/tee-whitelisted-signer-enforcement-plan.md
  • pkg/tbtc/signer/docs/true-late-t-of-n-finalize-considerations.md
  • pkg/tbtc/signer/include/frost_tbtc.h
  • pkg/tbtc/signer/scripts/admission-candidate.sample.json
  • pkg/tbtc/signer/scripts/admission-existing.sample.json
  • pkg/tbtc/signer/scripts/admission-override-registry.sample.json
  • pkg/tbtc/signer/scripts/admission-override.sample.json
  • pkg/tbtc/signer/scripts/admission-policy-v1.sample.json
  • pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs
  • pkg/tbtc/signer/scripts/formal/run_tla_models.sh
  • pkg/tbtc/signer/scripts/run_phase5_chaos_suite.sh
  • pkg/tbtc/signer/src/api.rs
  • pkg/tbtc/signer/src/bin/admission_checker.rs
  • pkg/tbtc/signer/src/engine.rs
  • pkg/tbtc/signer/src/errors.rs
  • pkg/tbtc/signer/src/ffi.rs
  • pkg/tbtc/signer/src/go_math_rand.rs
  • pkg/tbtc/signer/src/lib.rs
  • pkg/tbtc/signer/test/vectors/roast-attempt-context-v1.json
  • pkg/tbtc/signer/tests/p2tr_signature_fraud_vectors.rs

Comment thread pkg/tbtc/signer/docs/formal/models/README.md
Comment thread pkg/tbtc/signer/docs/roast-implementation-plan.md Outdated
Comment thread pkg/tbtc/signer/docs/roast-phase-5-rollout-runbook.md Outdated
Comment thread pkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.md Outdated
Comment thread pkg/tbtc/signer/docs/rust-rewrite-bootstrap.md Outdated
Comment thread pkg/tbtc/signer/README.md Outdated
Comment thread pkg/tbtc/signer/scripts/admission-policy-v1.sample.json Outdated
Comment thread pkg/tbtc/signer/scripts/formal/run_tla_models.sh Outdated
Comment thread pkg/tbtc/signer/tests/p2tr_signature_fraud_vectors.rs Outdated
Adds a focused workflow that runs the Rust signer's formal-invariant
test suite + TLA model checks. Moved from
threshold-network/tbtc-v2/.github/workflows/ci-formal-verification.yml
(jobs `signer-formal-invariants` + `tla-model-checks`) per extraction
plan v38 §3.1 — the signer code lives here at pkg/tbtc/signer/, not
in tbtc-v2, so the CI jobs that exercise it belong here too.

Jobs
- signer-formal-invariants: cargo test --manifest-path pkg/tbtc/signer/
  Cargo.toml formal_verification_ (filter to formal-only cases)
- tla-model-checks: pkg/tbtc/signer/scripts/formal/run_tla_models.sh
  (iterates over .cfg files in pkg/tbtc/signer/docs/formal/models/ and
  runs TLC against each; MODELS_PATH env var allows override per the
  path-normalization commit b84b574c on this branch)

Triggers
- pull_request on pkg/tbtc/signer/** changes + this workflow file
- schedule nightly at 05:23 UTC (mirrors monorepo's pattern of running
  formal invariants both on PRs and nightly)
- workflow_dispatch for manual runs

Related changes in companion PR threshold-network/tbtc-v2#971:
- Removed these jobs from canonical tbtc-v2's ci-formal-verification.yml
- Added a comment in that file pointing here

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/tbtc-signer-formal.yml:
- Around line 25-26: The checkout steps using actions/checkout@v4 persist git
credentials by default; update each Checkout step (the uses: actions/checkout@v4
entries) to add with: persist-credentials: false so credentials are not stored
in the runner after checkout. Ensure both occurrences of actions/checkout@v4 in
the workflow are modified accordingly.
- Line 26: Update the GitHub Actions workflow to pin action versions and harden
checkout credentials: replace occurrences of actions/checkout@v4 with
actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 and add with:
persist-credentials: false to both checkout steps that use checkout, replace
dtolnay/rust-toolchain@stable with
dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8, and replace
actions/setup-java@v4 with
actions/setup-java@c1e323688fd81a25caa38c78aa6df2d33d3e20d9 so the workflow pins
SHA-based commits and disables persisting credentials on checkout.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 88553e79-6db4-4e95-98fe-00c56e1bc640

📥 Commits

Reviewing files that changed from the base of the PR and between bfd7658 and 551bd42.

📒 Files selected for processing (1)
  • .github/workflows/tbtc-signer-formal.yml

Comment on lines +25 to +26
- name: Checkout
uses: actions/checkout@v4

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE=".github/workflows/tbtc-signer-formal.yml"
echo "== File exists? =="
ls -l "$FILE"
echo "== Lines 1-120 with numbers =="
nl -ba "$FILE" | sed -n '1,120p'

Repository: threshold-network/keep-core

Length of output: 272


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE=".github/workflows/tbtc-signer-formal.yml"
echo "== File exists? =="
ls -l "$FILE"
echo "== Lines 1-120 with numbers =="
nl -ba "$FILE" | sed -n '1,120p'

Repository: threshold-network/keep-core

Length of output: 272


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE=".github/workflows/tbtc-signer-formal.yml"

echo "== Showing lines 1-80 =="
awk 'NR>=1 && NR<=80 {printf "%4d: %s\n", NR, $0}' "$FILE"

echo "== Showing checkout-related lines (search) =="
rg -n "name:\s*Checkout|uses:\s*actions/checkout|persist-credentials" "$FILE" || true

Repository: threshold-network/keep-core

Length of output: 2302


Set persist-credentials: false on both actions/checkout@v4 steps

The current workflow checks out without disabling persisted git credentials; add with: persist-credentials: false to reduce credential exposure.

Suggested patch
       - name: Checkout
         uses: actions/checkout@v4
+        with:
+          persist-credentials: false
@@
       - name: Checkout
         uses: actions/checkout@v4
+        with:
+          persist-credentials: false
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: Checkout
uses: actions/checkout@v4
- name: Checkout
uses: actions/checkout@v4
with:
persist-credentials: false
🧰 Tools
🪛 zizmor (1.25.2)

[warning] 25-26: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)


[error] 26-26: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/tbtc-signer-formal.yml around lines 25 - 26, The checkout
steps using actions/checkout@v4 persist git credentials by default; update each
Checkout step (the uses: actions/checkout@v4 entries) to add with:
persist-credentials: false so credentials are not stored in the runner after
checkout. Ensure both occurrences of actions/checkout@v4 in the workflow are
modified accordingly.

timeout-minutes: 30
steps:
- name: Checkout
uses: actions/checkout@v4

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

for repo in actions/checkout actions/setup-java dtolnay/rust-toolchain; do
  echo "=== $repo ==="
  git ls-remote "https://github.com/${repo}.git" | head -n 5
  echo
done

Repository: threshold-network/keep-core

Length of output: 535


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Inspect workflow around the referenced lines
echo "=== Workflow file context ==="
nl -ba .github/workflows/tbtc-signer-formal.yml | sed -n '1,120p'

# 2) Resolve refs to commit SHAs (avoid SIGPIPE by capturing then slicing)
resolve_ref() {
  local repo="$1" ref="$2"
  local url="https://github.com/${repo}.git"
  local tmp="/tmp/lsremote_${repo//\//_}_${ref//\//_}"
  git ls-remote "$url" "$ref" > "$tmp" 2>/dev/null || true
  echo "--- ${repo} ${ref} ---"
  cat "$tmp" || true
  echo
  rm -f "$tmp" || true
}

echo "=== Resolve action refs to SHAs ==="
resolve_ref "actions/checkout" "refs/tags/v4"
resolve_ref "actions/checkout" "refs/heads/v4"
resolve_ref "actions/setup-java" "refs/tags/v4"
resolve_ref "actions/setup-java" "refs/heads/v4"
resolve_ref "dtolnay/rust-toolchain" "refs/heads/stable"
resolve_ref "dtolnay/rust-toolchain" "refs/tags/stable"

Repository: threshold-network/keep-core

Length of output: 162


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Workflow file context (numbered) ==="
python3 - <<'PY'
from pathlib import Path
p = Path(".github/workflows/tbtc-signer-formal.yml")
lines = p.read_text().splitlines()
start, end = 1, min(len(lines), 140)
for i in range(start, end+1):
    print(f"{i}\t{lines[i-1]}")
PY

echo
echo "=== Resolve action refs to commit SHAs ==="
resolve_ref() {
  local repo="$1" ref="$2"
  local url="https://github.com/${repo}.git"
  echo "--- ${repo} ${ref} ---"
  git ls-remote "$url" "$ref" || true
  echo
}

resolve_ref "actions/checkout" "refs/tags/v4"
resolve_ref "actions/checkout" "refs/heads/v4"
resolve_ref "actions/setup-java" "refs/tags/v4"
resolve_ref "actions/setup-java" "refs/heads/v4"
resolve_ref "dtolnay/rust-toolchain" "refs/heads/stable"
resolve_ref "dtolnay/rust-toolchain" "refs/tags/stable"

Repository: threshold-network/keep-core

Length of output: 2438


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Total lines: $(python3 - <<'PY'\nimport pathlib\np=pathlib.Path('.github/workflows/tbtc-signer-formal.yml')\nprint(len(p.read_text().splitlines()))\nPY)"

echo
echo "=== uses: occurrences ==="
rg -n "uses:" .github/workflows/tbtc-signer-formal.yml || true

echo
echo "=== persist-credentials occurrences ==="
rg -n "persist-credentials" .github/workflows/tbtc-signer-formal.yml || true

Repository: threshold-network/keep-core

Length of output: 133


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Total lines: $(python3 - <<'PY'
import pathlib
p=pathlib.Path('.github/workflows/tbtc-signer-formal.yml')
print(len(p.read_text().splitlines()))
PY)"

echo
echo "=== uses: occurrences ==="
rg -n "uses:" .github/workflows/tbtc-signer-formal.yml || true

echo
echo "=== persist-credentials occurrences ==="
rg -n "persist-credentials" .github/workflows/tbtc-signer-formal.yml || true

Repository: threshold-network/keep-core

Length of output: 424


Pin GitHub Actions to commit SHAs and harden checkout credentials.

  • actions/checkout@v4 (lines 26, 44) -> actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5; also add with: persist-credentials: false to both checkout steps.
  • dtolnay/rust-toolchain@stable (line 29) -> dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8.
  • actions/setup-java@v4 (line 47) -> actions/setup-java@c1e323688fd81a25caa38c78aa6df2d33d3e20d9.
🧰 Tools
🪛 zizmor (1.25.2)

[warning] 25-26: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)


[error] 26-26: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/tbtc-signer-formal.yml at line 26, Update the GitHub
Actions workflow to pin action versions and harden checkout credentials: replace
occurrences of actions/checkout@v4 with
actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 and add with:
persist-credentials: false to both checkout steps that use checkout, replace
dtolnay/rust-toolchain@stable with
dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8, and replace
actions/setup-java@v4 with
actions/setup-java@c1e323688fd81a25caa38c78aa6df2d33d3e20d9 so the workflow pins
SHA-based commits and disables persisting credentials on checkout.

Resolves CI failures on PR #4005 (signer mirror):

1. TLA model checks: run_tla_models.sh lacked executable bit at
   canonical HEAD. CI ran the script directly (no `bash` prefix),
   which fails with `Permission denied`. Fixed via
   `git update-index --chmod=+x`.

2. Signer formal invariants: engine.rs's
   formal_verification_roast_attempt_context_shared_vectors_match_
   expected_values test referenced vectors at a path stale from
   the umbrella's docs/frost-migration/test-vectors/ layout. The
   manifest places the vector at the canonical-signer test/vectors/
   subdir (pkg/tbtc/signer/test/vectors/roast-attempt-context-v1.json
   per the source-to-target map). Updated the
   `PathBuf::from(env!("CARGO_MANIFEST_DIR")).join(...)` argument from
   `../../docs/frost-migration/test-vectors/roast-attempt-context-v1.json`
   (umbrella-relative) to `test/vectors/roast-attempt-context-v1.json`
   (signer-CARGO_MANIFEST_DIR-relative, where the vector actually
   lives at canonical HEAD).

Verified locally:
- ls -l shows executable bit set on run_tla_models.sh
- engine.rs path now resolves to the correct mirror location
- Vector exists at pkg/tbtc/signer/test/vectors/roast-attempt-context-v1.json

Same fix needs to be applied to PR #4007 (stacked on #4005)
in a follow-up commit on its branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mswilkison added a commit that referenced this pull request May 26, 2026
PR #4005 signer formal invariants test
formal_verification_p2tr_signature_fraud_vectors_match_bitcoin_crate
was failing because:

1. The umbrella source manifest only mapped p2tr-signature-fraud-v0.json
   to the tbtc-v2 target (docs/test-vectors/). It was not mirrored to
   the keep-core (signer) target.

2. tests/p2tr_signature_fraud_vectors.rs referenced the vector at the
   umbrella-relative path
   `../../docs/frost-migration/test-vectors/p2tr-signature-fraud-v0.json`
   (CARGO_MANIFEST_DIR-relative -> repo-root + docs/frost-migration/...).
   That directory does not exist on canonical keep-core.

This is a structural omission in the manifest, not a divergence: the
cross-language vector test exists in both Solidity (tbtc-v2 side) and
Rust (signer side), and the vector is genuinely needed in both places
to verify cross-implementation consistency.

Fix
- Mirror p2tr-signature-fraud-v0.json (598 lines, byte-identical
  content from tbtc-v2 mirror at docs/test-vectors/) to
  pkg/tbtc/signer/test/vectors/p2tr-signature-fraud-v0.json.
- Update the test path in tests/p2tr_signature_fraud_vectors.rs from
  `../../docs/frost-migration/test-vectors/p2tr-signature-fraud-v0.json`
  to `test/vectors/p2tr-signature-fraud-v0.json` (CARGO_MANIFEST_DIR =
  pkg/tbtc/signer/, so test/vectors/... resolves correctly).

Two stale comment references remain in
- pkg/tbtc/signer/docs/roast-implementation-plan.md:265
- pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs:19

Both are comment-only doc pointers to the source layout; they do not
affect runtime. Left as-is to preserve the umbrella -> canonical
provenance trail.

Manifest update follows in a stacked PR on tlabs-xyz/tbtc#10:
- Add p2tr-signature-fraud-v0.json -> signer target mapping
  (test/vectors/p2tr-signature-fraud-v0.json).
- Reclassify tests/p2tr_signature_fraud_vectors.rs from mirror to
  allowlisted-divergence (path differs from umbrella).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stacked on extraction/frost-signer-mirror-2026-05-26 / PR #4005. Adds
optional taproot_merkle_root_hex to start/finalize signing rounds, binds
it into request fingerprints and round IDs, signs and aggregates with
frost-secp256k1-tr Taproot tweaks, and verifies tweaked aggregates in
tests. Verification: cargo test in pkg/tbtc/signer.
mswilkison and others added 15 commits June 17, 2026 10:42
Codex re-review (valid): the helper skipped the enforce_provenance_gate() and
validate_session_id() checks that interactive_session_open - and every other
engine endpoint, including the public-material-only verify_signature_share -
runs before validating the attempt context. That let the derivation endpoint
return protocol material on an unattested engine, or for a session_id open would
reject (the session_id is also hashed into attempt_id), i.e. an
enforcement/contract mismatch vs the path it feeds.

I initially leaned to push back, reading the design consult's "no policy
decision / stateless" as "skip the provenance gate" - but the convention is
decisive: verify_signature_share (itself stateless, public-material-only) and
the stateless frost_ops primitives all run enforce_provenance_gate() +
validate_session_id() as their front door. The provenance gate is a UNIVERSAL
engine front-door, not a per-operation policy; "stateless/no policy" meant no
session/nonce state and no blame decision. So Codex is right.

Add both checks at the top of derive_interactive_attempt_context, matching
open's order. Tests: hold lock_test_state() (hermetic development-profile env,
gate off) in the derivation tests so they exercise the math; add an empty
session_id rejection case and a provenance-enforced rejection test
(ProvenanceGateRejected) proving the front door is wired.

cargo fmt + clippy --all-targets -D warnings clean; full lib suite 291 pass / 1
ignored.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
## Phase 7.3 — engine-side attempt-context derivation (Option B, PR 1 of
2)

Per the Codex+Gemini design consult, the host must not re-implement the
engine's domain-separated attempt-context derivations (the
seed-divergence class). This PR exposes them from the engine.

### What
- New FFI export `frost_tbtc_derive_interactive_attempt_context` →
`engine::derive_interactive_attempt_context`. Stateless, secret-free:
takes an attempt's public inputs, returns the canonical `AttemptContext`
(coordinator, included-participants fingerprint, attempt id) + one
canonical FROST identifier per participant.
- Composes the existing ROAST primitives and **re-validates the derived
context against strict-mode `validate_attempt_context`** before
returning, so the host receives a context `interactive_session_open` is
guaranteed to accept.

### Why
The interactive runner ships placeholder
fingerprint/attempt_id/identifier values documented as inert (only the
fake engine is wired). The real engine strict-validates those fields;
deriving them in Go would duplicate domain-separated hashing across
languages. Exposing from the engine = single source of truth.

### Tests
- Engine units: matches the standalone derivations (coordinator,
fingerprint, attempt_id, canonical ordering, identifier encoding);
deterministic; rejects empty message / zero attempt number /
threshold>set / duplicate / empty participants.
- FFI success round-trip (the helper succeeds with no DKG fixture).
- `cargo fmt` + `clippy --all-targets --all-features -D warnings` clean;
full lib suite 289 pass / 1 ignored.

### Scope / follow-up
- Go-host wiring (bridge wrapper + Go-coordinator cross-check +
replacing the runner's three placeholders + threading identifiers) is
the next PR.
- Production real-engine signing remains gated on the
`frost-secp256k1-tr =3.0.0` external audit; this helper is
parse/canonicalization surface only.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…t operators

A multi-seat operator runs concurrent interactive signing per seat in ONE process,
all sharing one member-independent SessionID. The engine state is a process-global
singleton that held exactly ONE open interactive member per session
(interactive_signing: Option<InteractiveSigningState>), so a second seat's
InteractiveSessionOpen(same session, different member) failed with SessionConflict;
round2 freed the whole state (wiping siblings); and the per-attempt consumed-nonce
marker made a sibling's round2 falsely trip ConsumedNonceReplay.

Key interactive_signing by member_identifier (BTreeMap<u16, InteractiveSigningState>),
so local seats are independent - exactly as separate processes would be. Design
reviewed by Gemini (cryptographically sound) and Codex (approve with the refinements
folded below); both rejected a session-wide live-attempt model since it would let one
seat erase another's nonces.

- Per-MEMBER live attempt: open(M, strictly-newer attempt) replaces only M's entry
  (zeroizing M's old nonces); a stale/equal open for M is rejected, but a sibling on
  a different attempt is untouched. No FROST invariant requires seats on one attempt.
- round2 removes only the member's entry (siblings stay live); the marker is the
  durable replay protection.
- consumed_interactive_attempt_markers keyed per-(attempt_id, member_id) via the
  composite marker; a legacy bare attempt_id marker is honored FAIL-CLOSED on read
  (interactive_attempt_consumed). aggregated_interactive_attempt_markers stay
  per-attempt (the aggregate is one signature over public data).
- abort is session-level over the map (remove all matching entries); TTL sweep is
  per-entry by opened_at_unix (siblings survive); capacity counts live member entries.
- interactive_state_for_attempt_mut is a member-keyed lookup; an unopened member now
  yields "no live interactive attempt for member N". Live state still never persists
  (empty map on reload). Zeroization is preserved via InteractiveRound1State::Drop.

New test interactive_multi_seat_two_members_one_process_aggregate_bip340 drives two
seats through the interactive API in one process: both open the same attempt, Round1
independently, member 1's Round2 frees only its entry and writes only its marker
(member 2 stays live and unblocked), and the two interactive shares aggregate to a
valid BIP-340 signature. Existing single-member tests updated for the map (is_empty /
member-keyed access / composite markers). All 292 lib tests pass; cargo fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…view fold)

Codex review (P2, valid): the multi-seat conversion left Round2 gated only by the
per-member consumed marker. When an attempt completes (interactive_aggregate records
aggregated_interactive_attempt_markers) with one threshold subset, an open sibling
seat that never signed has NO per-member consumed marker - so it could still release
a fresh share for the finished attempt, and enough such shares could be aggregated
into a SECOND valid signature over the same message, contrary to the completion
marker's "recovery is a fresh attempt" rule. The single-member engine never had an
unsigned sibling, so this is new to multi-seat.

Gate Round2 on the aggregate-completion marker (InteractiveAttemptAlreadyAggregated)
in addition to the consumed marker, before any share is released.

Tests (folding remaining valid review findings):
- interactive_round2_refused_after_aggregate_for_unsigned_sibling: 3 seats open the
  same attempt, {1,2} sign + interactive_aggregate completes it, then seat 3's Round2
  (in a {1,3} subset it would otherwise be valid for) is refused. Also exercises
  interactive_aggregate over two interactive multi-seat shares (a coverage gap from
  the self-review).
- interactive_open_advances_only_the_opening_member_attempt: seat 1 advancing to a
  newer attempt replaces only its own entry (fresh nonce state); a sibling on the
  older attempt is untouched; a stale re-open is rejected for the advanced seat but
  idempotent for the sibling.

Capacity note (self-review): the cap now counts member entries; the default (64) is
generous for realistic multi-seat and the semantics are documented at the check.

All 294 lib tests pass; cargo fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ge (re-review)

Codex re-review on the Round2 completion gate:

[P1] The gate trusted aggregated_interactive_attempt_markers keyed by attempt_id
alone, but interactive_aggregate never binds attempt_id to the aggregated package/
message - it only checks the shares aggregate under the session key. So a caller
with any valid aggregate could replay it under a DIFFERENT live attempt's id, set
that marker, and the new gate would then refuse the real Round2 before its nonce is
consumed (an availability/DoS regression the single-member engine never had).

Fix: make the completion marker MESSAGE-BOUND - interactive_aggregated_marker =
"{attempt_id}@{message_digest}". interactive_aggregate writes it from the package it
actually aggregated (all five marker sites: both completion pre-checks, the capacity
guard, the insert, and the persist-rollback). The Round2 gate recomputes it from the
message THIS member opened, so it preempts Round2 only for a genuine same-message
completion; a replayed aggregate carrying a different message sets an unrelated
marker that cannot match. Same-message re-aggregation is still rejected (existing
restart/repeat tests pass unchanged - they re-aggregate the same message).

[P2] On that genuine completion rejection, the unsigned sibling's entry is now dead,
so free it (zeroizing its nonces) instead of letting it hold a live-member cap slot
until the TTL sweep. Only the matching member's entry is removed.

Tests: the multi-seat refusal test now also asserts the marker is bound
(attempt_id@digest, not the bare id) and that the refused sibling's entry is freed.
All 294 lib tests pass; cargo fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…review)

Codex re-review (P2, valid): the previous commit made the aggregate completion
marker message-bound (attempt_id@digest), but a completion persisted by the
pre-binding engine is stored as the BARE attempt_id. The new checks only looked for
the bound form, so after an upgrade a previously completed attempt looked unfinished
- repeat InteractiveAggregate and the Round2 completion gate would no longer fail
closed for it.

Add interactive_attempt_aggregated(markers, attempt_id, digest) = bound form OR a
legacy bare attempt_id marker (fail-closed on read), mirroring the consumed-marker
helper. Use it at the three read sites (the Round2 completion gate and both
interactive_aggregate pre-checks). New writes stay bound-only, so the durable record
migrates forward on the next completion while legacy completions stay final.

Test: interactive_honors_legacy_bare_aggregate_completion_marker injects a bare
attempt_id marker (a pre-upgrade completion) and asserts Round2 then fails closed
with InteractiveAttemptAlreadyAggregated. All 295 lib tests pass; cargo fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e-review)

Codex re-review (P1, valid): attempt_id is derived from session/message/attempt#/
coordinator/included - NOT the taproot root. So the message-bound completion marker
(attempt_id@message_digest) still collided across taproot tweaks: a completion
aggregated for one root (or key-path None) recorded the same marker the Round2 gate
checks for a live seat opened with a different root, wrongly preempting that seat's
Round2 (and zeroizing it) even though the signatures differ per tweak.

Bind the canonical taproot root into the marker too -
interactive_aggregated_marker(attempt_id, message_digest, taproot_root) =
"{attempt_id}@{message_digest}@{root}" (root = hex, or "keypath" for None, which
cannot collide with a 64-hex root). interactive_aggregate writes it from the root it
aggregated under; the Round2 gate recomputes it from the root THIS member opened
with. The completion marker now binds the full signing-task identity (attempt + msg
+ root); the legacy bare-id fallback is unchanged.

Test interactive_round2_completion_marker_binds_taproot_root: a completion recorded
for a different root does not preempt a key-path member's Round2, while the same-root
completion does. All 296 lib tests pass; cargo fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e-review)

Codex re-review (P2, valid): when a multi-seat attempt aggregates with a threshold
subset that EXCLUDES a local member which opened/Round1'd the same attempt + root,
that member never calls Round2 (it is not a signer), so the Round2 completion gate
never runs for it - its interactive_signing entry (nonces, key, message) and its
live-member capacity slot stayed resident until the 1h TTL or an explicit abort.

interactive_aggregate's success path now frees the LOCAL siblings finalized by the
completion: after persisting the marker, remove + zeroize every interactive_signing
entry on (attempt_id, taproot root) - the signers' entries were already removed at
their Round2, and a sibling on a DIFFERENT root is a distinct signing task and is
left untouched. The Round2 completion gate stays as the defense for a sibling that
RE-OPENS the finalized attempt and tries to sign.

Test interactive_round2_refused_after_aggregate_for_unsigned_sibling now asserts the
non-signing sibling is freed at aggregation, then re-opens it and confirms the gate
still refuses its Round2. All 296 lib tests pass; cargo fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…o (re-review)

Codex re-review (P2, valid): the finalized-sibling cleanup added in the previous
commit matched entries on (attempt_id, taproot root) but NOT the message. Because
attempt_id is provided by the caller separately from the package, a mismatched
aggregate - a valid signing package for message B submitted under a live message-A
attempt's id (same root) - would delete the message-A seats' live nonce/commitment
state, forcing that unrelated attempt to restart, even though the stored marker (now
message-bound) correctly would not match its Round2.

Match the FULL finalized identity in the cleanup filter: attempt_id AND
hash_hex(entry.message_bytes) == aggregated_message_digest AND taproot root - the
same (attempt + message + root) identity the completion marker binds.

Test interactive_aggregate_cleanup_is_message_bound: a valid stateless aggregate over
message B under message A's attempt id leaves the live message-A seat intact. All 297
lib tests pass; cargo fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t operators (#4098)

## What

Keys a session's interactive signing state by `member_identifier`
(`interactive_signing: Option<InteractiveSigningState>` → `BTreeMap<u16,
InteractiveSigningState>`) so a **multi-seat operator can run
interactive signing for all its seats in one process**.

## Why

A multi-seat operator runs concurrent interactive signing per seat, all
sharing one member-independent `SessionID`, in one process (the engine
state is a process-global singleton). Today that breaks three ways: the
2nd seat's `InteractiveSessionOpen(same session, different member)` →
`SessionConflict`; `round2` frees the whole state (wiping siblings); and
the per-attempt consumed-nonce marker makes a sibling's `round2` falsely
trip `ConsumedNonceReplay`. Surfaced by the keep-core real-cgo e2e
(#4097).

## Design (reviewed: Gemini sound, Codex approve-with-refinements)

Both reviewers rejected a session-wide live-attempt model — it would let
one seat erase another's nonces, the exact bug being removed.

- **Per-member live attempt**: `open(M, strictly-newer attempt)`
replaces only M's entry (zeroizing M's old nonces); a stale/equal open
for M is rejected, but a sibling on a different attempt is untouched.
Seats are independent — exactly as separate processes would be.
- **round2** removes only the member's entry (siblings stay live); the
durable marker carries replay protection.
- **Consumed markers** keyed per-`(attempt_id, member_id)` (composite);
a **legacy bare `attempt_id` marker is honored fail-closed** on read.
`aggregated_interactive_attempt_markers` stay per-attempt (one signature
per attempt over public data).
- **abort** is session-level over the map (removes all matching
entries); **TTL sweep** is per-entry by `opened_at_unix` (siblings
survive); **capacity** counts live member entries (a new member is a
slot; a same-member replacement isn't).
- `interactive_state_for_attempt_mut` is a member-keyed lookup. Live
state still never persists (empty map on reload). Zeroization preserved
via `InteractiveRound1State::Drop`.

## Tests

New `interactive_multi_seat_two_members_one_process_aggregate_bip340`:
two seats open the same attempt in one process, Round1 independently,
member 1's Round2 frees only its entry and writes only its marker
(member 2 stays live and unblocked), and the two interactive shares
**aggregate to a valid BIP-340 signature**. Existing single-member tests
updated for the map. **All 292 lib tests pass; `cargo fmt` clean.**

Follow-ups (additional edge tests Codex itemized — per-member attempt
advance, abort-by-attempt over multiple members, capacity
new-vs-replacement) can land in review.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…attempt edge cases

Two follow-up edge tests Codex itemized for the multi-seat interactive engine fix
(#4098), test-only:

- interactive_capacity_counts_new_members_not_replacements: at a live-member cap of 1,
  a member advancing to a newer attempt replaces its own entry (no new slot, succeeds),
  while a DIFFERENT member is a new entry that trips the cap and fails closed - pinning
  that the cap counts member entries, not sessions.

- interactive_abort_by_attempt_removes_all_members_on_that_attempt: abort with an
  attempt_id filter is session-level over the member map - it removes every local seat
  on that attempt while a sibling seat on a different attempt survives.

All 299 lib tests pass; cargo fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…attempt edge cases (#4100)

## What

Two follow-up edge tests Codex itemized during the multi-seat
interactive engine fix (#4098), test-only:

- **`interactive_capacity_counts_new_members_not_replacements`** — at a
live-member cap of 1: a member advancing to a newer attempt *replaces*
its own entry (no new slot, succeeds), while a *different* member is a
new entry that trips the cap and fails closed. Pins that the cap counts
member entries, not sessions.
- **`interactive_abort_by_attempt_removes_all_members_on_that_attempt`**
— abort with an `attempt_id` filter is session-level over the member
map: it removes every local seat on that attempt while a sibling seat on
a *different* attempt survives.

All 299 lib tests pass; `cargo fmt` clean.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…tc_abi_version)

The Go cgo bridge and libfrost_tbtc are compiled separately and linked at runtime via
dlsym; a symbol that resolves but changed MEANING (struct/JSON contract change) is
silent corruption. The existing frost_tbtc_version returns a human string
("tbtc-signer/0.1.0-bootstrap") that cannot be machine-checked.

Add frost_tbtc_abi_version() returning a structured FFI CONTRACT version
{abi_major, abi_minor} (api::FrostTbtcAbiVersionResult), so a consumer can fail closed
against an incompatible lib (the Go-side assertion lands separately). Starts at 1.0 -
NOT 0.x, to avoid semver pre-1.0 ambiguity. abi_major = any incompatible contract
change (C signatures, JSON field meaning, required fields, enum/status values,
serialization, memory ownership, crypto transcript semantics); abi_minor = a purely
additive, backward-compatible change old consumers safely ignore. A consumer requires
lib.abi_major == its_major AND lib.abi_minor >= the minor it needs. The human
frost_tbtc_version string stays informational and unchanged.

Design validated via Codex consult (major.minor; exact major, minimum minor; the
{abi_major, abi_minor} JSON is the frozen root compatibility surface).

300 lib tests pass (new: abi_version_reports_the_contract_version pins 1.0);
cargo fmt clean; the symbol is exported in the built cdylib.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Codex #4104 P2: the new #[no_mangle] frost_tbtc_abi_version is part of the public C ABI,
but header-based consumers that compile against include/frost_tbtc.h had no prototype for
it - the exported symbol and the advertised header could drift. Add the declaration
alongside frost_tbtc_version (the header is hand-maintained; no cbindgen).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tc_abi_version) (#4104)

## What

Adds `frost_tbtc_abi_version()` — a structured **FFI contract version**
`{abi_major, abi_minor}` — so the Go cgo bridge can **fail closed**
against an incompatible `libfrost_tbtc` instead of silently
misinterpreting a changed contract.

The bridge and the lib are compiled separately and linked at runtime via
`dlsym`; a symbol that still resolves but changed *meaning* (struct/JSON
contract change) is silent corruption. The existing `frost_tbtc_version`
returns a human string (`tbtc-signer/0.1.0-bootstrap`) that can't be
machine-checked — this is the checkable companion.

## Scheme (Codex-validated)

- **`abi_major`** — any *incompatible* change to the Go↔Rust contract: C
signatures, JSON field meaning, required fields, enum/status values,
serialization, memory ownership, crypto transcript semantics.
- **`abi_minor`** — a purely *additive*, backward-compatible change old
consumers safely ignore.
- A consumer requires `lib.abi_major == its_major` **and**
`lib.abi_minor >= the minor it needs`.
- Starts at **1.0** (not 0.x, to avoid semver pre-1.0 ambiguity). The
`{abi_major, abi_minor}` JSON is the **frozen root compatibility
surface**.
- `frost_tbtc_version` (human string) stays informational and unchanged.

## Notes

- The **Go-side assertion** (fail-closed preflight + the
`requiredMajor`/`requiredMinMinor` constants + a CI-gate check) lands as
a separate keep-core PR that also bumps the signer pin to this commit.
- 300 lib tests pass (new `abi_version_reports_the_contract_version`
pins 1.0); `cargo fmt` clean; the symbol is exported in the built
cdylib.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@mswilkison mswilkison changed the title feat(tbtc/signer): mirror FROST/ROAST Rust signer from tBTC monorepo feat(tbtc/signer): FROST/ROAST Rust signer Jun 24, 2026
mswilkison and others added 3 commits June 26, 2026 11:19
…ctive round

start_sign_round cleared the active sign round on an authorized advance
*before* validating the incoming attempt context against the deterministic
RFC-21 coordinator selection. A malformed advance whose transition evidence
was internally consistent but whose coordinator_identifier failed deterministic
validation destroyed the in-memory round, then returned an error without
persisting. Because the original attempt id stayed in consumed_attempt_ids,
that attempt could never be re-signed in-memory, bricking the signing session
until the durable (un-cleared) state was reloaded on restart.

Run validate_attempt_context on the incoming context before
clear_active_sign_round_for_attempt_transition, so a rejected advance leaves
the active round intact. Add a regression test that forges the coordinator on
an otherwise-valid advance and asserts the original attempt remains signable.

This path is gated off in production by enforce_transitional_signing_disabled_in_production,
so the impact is limited to the dev/staging transitional-nonce path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ng window

Two fail-open / DoS fixes in the signing-policy config surface:

- signer_profile_is_production() panicked on any TBTC_SIGNER_PROFILE value
  other than production/development/empty. On the env-fallback path that value
  is unvalidated, so a single typo (e.g. "staging") aborted every
  profile-gated FFI operation as a generic internal error -- a process-wide
  denial of service. Treat an unrecognized profile as production (fail closed)
  and warn once instead of panicking.

- load_signing_policy_firewall_config() accepted an equal-bounds UTC window
  (start == end). utc_hour_in_window treats start == end as "always in
  window", so the time-of-day control was silently disabled (fail open).
  Reject equal bounds, and out-of-range (>= 24) hours, at load time.

Add regression tests for both.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The signer-dependency-audit job runs cargo-deny against the committed
Cargo.lock, but the clippy/test jobs and build.sh resolved dependencies
without --locked. A PR that edits Cargo.toml so the committed lock no longer
satisfies it would let those jobs silently re-resolve to newer, unaudited
crate versions (running their build scripts and proc-macros on the runner)
while the advisory gate still audits the stale lock -- so the security gate
and the code actually exercised could diverge.

Add --locked to the clippy, test, formal-test, and release-build invocations
so a lock/manifest mismatch fails loudly instead of silently re-resolving.
cargo fmt is intentionally left unchanged (it does not accept --locked).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mswilkison and others added 4 commits June 26, 2026 12:11
…ctive round (#4111)

## Summary

Follow-up to #4005 (FROST/ROAST signer). Fixes a session-bricking bug in
the attempt-advance path of `start_sign_round`.

On an authorized attempt advance, `start_sign_round` cleared the active
sign round (`clear_active_sign_round_for_attempt_transition`) **before**
validating the incoming attempt context against the deterministic RFC-21
coordinator selection. A malformed advance whose transition evidence was
internally consistent but whose `coordinator_identifier` failed
deterministic validation destroyed the in-memory round, then returned an
error **without persisting**. Because the original attempt id stayed in
`consumed_attempt_ids`, that attempt could never be re-signed in-memory
— the signing session was bricked until the durable (un-cleared) state
was reloaded on restart.

## Fix

Run `validate_attempt_context` on the incoming context **before**
clearing the active round, so a rejected advance leaves the round
intact. The same validation already ran later on the fresh-attempt path,
so no legitimate advance is newly rejected — the check is simply moved
ahead of the destructive clear.

## Test

Adds `rejected_forged_advance_preserves_active_sign_round`: it forges
the coordinator on an otherwise-valid advance and asserts the original
attempt remains signable. Verified to **fail** against the unfixed code
(`ConsumedAttemptReplay`) and **pass** with the fix.

## Scope

Gated off in production by
`enforce_transitional_signing_disabled_in_production`, so impact is
limited to the dev/staging transitional-nonce path.

_Found during review of #4005._

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…ng window (#4112)

## Summary

Follow-up to #4005 (FROST/ROAST signer). Two fail-open / DoS fixes in
the signing-policy config surface.

### 1. `signer_profile_is_production()` panicked on an unknown profile
value

On the env-fallback path `TBTC_SIGNER_PROFILE` is unvalidated, so any
value other than `production`/`development`/empty hit `panic!`. The
panic is caught at the FFI boundary and redacted to a generic internal
error, so a single typo (e.g. `staging`) aborted **every** profile-gated
operation — a process-wide denial of service until the env var was
corrected.

Now an unrecognized profile is treated as `production` (fail-closed: the
strictest gate set) with a one-time warning, instead of panicking.
Production is the safe/strict direction for every caller (provenance
gate, bootstrap-dealer-DKG disable, transitional-signing disable, ROAST
strict mode, env state-key-provider rejection). The install path
(`init_config`) still validates the profile, so this only affects the
env-fallback path.

### 2. `load_signing_policy_firewall_config()` accepted a zero-width UTC
window

`utc_hour_in_window` treats `start == end` as "always in window", so an
equal-bounds window silently disabled the time-of-day control
(fail-open) rather than restricting it. The loader now rejects equal
bounds — and out-of-range (`>= 24`) hours — at load time.

## Tests

- `unknown_profile_value_fails_closed_to_production`
- `signing_policy_firewall_rejects_equal_utc_window_bounds`

_Found during review of #4005._

🤖 Generated with [Claude Code](https://claude.com/claude-code)
## Summary

Follow-up to #4005 (FROST/ROAST signer). Pins cargo dependency
resolution in CI and the release build.

The `signer-dependency-audit` job runs `cargo-deny` against the
committed `Cargo.lock`, but the clippy/test jobs and `build.sh` resolved
dependencies **without** `--locked`. A PR that edits `Cargo.toml` so the
committed lock no longer satisfies it would let those jobs silently
re-resolve to newer, unaudited crate versions (running their build
scripts and proc-macros on the runner) while the advisory gate still
audits the stale lock — so the security gate and the code actually
exercised could diverge.

## Fix

Adds `--locked` to the `clippy`, `test`, formal-`test`, and
release-`build` invocations so a lock/manifest mismatch fails loudly
instead of silently re-resolving. `cargo fmt` is intentionally left
unchanged (it does not accept `--locked`).

_Found during review of #4005._

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…he outcome

persist_engine_state_to_storage resolved the state-encryption key inside
encode_encrypted_state_envelope, i.e. while the caller held the global
ENGINE_STATE mutex. For the `command` (KMS/HSM) provider that forks and waits on
the key subprocess on every state mutation while the lock is held, so a slow or
hung key command stalled every concurrent signer operation for up to the
configured timeout (default 30s, max 300s).

Split persist into a thin wrapper (cold paths and tests) and
persist_engine_state_to_storage_with_key, which takes pre-resolved key material.
Every hot per-operation path (run_dkg, start/finalize_sign_round,
build_taproot_tx, trigger_emergency_rekey, promote/rollback_canary,
refresh_shares, interactive round2/aggregate) now resolves the key with
state_encryption_key_material() BEFORE acquiring the lock, so the subprocess
never runs under the lock. Only the in-memory serialize + encrypt + atomic write
stay under the lock (they must, to preserve write ordering and avoid lost
updates between concurrent ops).

The resolution outcome is DEFERRED, not propagated eagerly: the key Result is
consulted (via require_resolved_state_key) only at a site that actually
persists. Idempotent replays and pre-persist rejections return without ever
consulting it, so a transient key-provider outage cannot turn a cached read
(e.g. re-serving an already-computed signature on a ROAST retry) into a failure.
In the interactive round2/aggregate paths, the key is resolved into a local
BEFORE the consumption/aggregation marker is inserted, so a key failure returns
cleanly (no marker written) rather than escaping the marker-rollback block.

Semantics preserved: the key is still resolved fresh on every call (no caching),
so external KMS rotation and trigger_emergency_rekey keep picking up the current
key.

Tests: idempotent_build_tx_replay_survives_state_key_outage (a replay returns
its cached tx with a failing key command) and
interactive_round2_state_key_failure_does_not_burn_attempt (a Round2 that fails
at the key step leaves no consumption marker and the attempt still succeeds on
retry).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mswilkison and others added 2 commits June 27, 2026 17:19
… not before it

#4114 resolved the state-encryption key OFF the ENGINE_STATE lock to keep the
command-provider KMS subprocess from stalling the lock. But ENGINE_STATE already
serializes the persisted-state WRITES, so resolving the key before the lock left
key selection out-of-order with the writes: a caller that resolved an older key
(before a provider key rotation) could win the LAST write after a concurrent
caller already persisted the rotated key, tagging the single state file with the
stale key id. decode_encrypted_state_envelope accepts only the currently-resolved
key id, so a restart then rejects the file as a key-id mismatch and the persisted
state is unreadable.

Resolve the key UNDER the held ENGINE_STATE guard, at the persist site, so key
selection is in the same serialized order as the writes -- the last writer always
encrypts with the then-current key. Non-marker sites call
persist_engine_state_to_storage(&guard) (resolves under the guard then persists);
the two interactive marker sites resolve via state_encryption_key_material()?
BEFORE inserting the durable marker (preserving fail-closed-before-marker and the
marker-rollback-on-persist-failure), then persist_with_key. The deferred
require_resolved_state_key helper is removed.

Idempotent replays and pre-persist rejections still return before any key
resolution, so a transient key-provider outage cannot fail a non-persisting call
(the idempotent_build_tx_replay_survives_state_key_outage regression still holds).

TRADEOFF: this reverses #4114's off-lock optimization -- under the `command`
provider the KMS subprocess now runs under the guard during actual persists (a
bounded head-of-line stall; a hung command is bounded by terminate_state_key_command).
A future off-lock-preserving design would need a cheap rotation signal from the
key provider; recovering that is left as a follow-up.

Addresses the Codex P1 review on #4114.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…TATE guard, in write order (#4114)

## Summary

Resolves the at-rest state-encryption key **under the held
`ENGINE_STATE` guard, at the persist site**, so key selection is ordered
with the writes the guard already serializes.

This PR began as a perf change (resolve the key *off* the lock to keep
the `command`-provider KMS subprocess from stalling `ENGINE_STATE`). A
Codex review found that approach introduced a **P1 state-loss bug**, so
the PR now lands the correct ordering instead.

## The bug the off-lock approach introduced

`ENGINE_STATE` already serializes the persisted-state *writes*, but
resolving the key *before* the lock left key selection out-of-order with
the writes: a caller that resolved an older key (before a provider
key-rotation) could win the **last** write after a concurrent caller
already persisted the rotated key, tagging the single state file with
the stale key-id. `decode_encrypted_state_envelope` accepts only the
currently-resolved key-id, so a restart then rejects the file as a
key-id mismatch and the persisted state is **unreadable**.

## Fix

Resolve the key under the held guard, at the persist site — the last
writer always encrypts with the then-current key. Non-marker sites call
`persist_engine_state_to_storage(&guard)` (resolves under the guard,
then persists). The two interactive marker sites resolve via
`state_encryption_key_material()?` **before** inserting the durable
consumed/aggregated marker (preserving fail-closed-before-marker +
marker-rollback-on-persist-failure), then `persist_..._with_key`. The
deferred `require_resolved_state_key` helper is removed.

**Invariants preserved:** idempotent replays / pre-persist rejections
still return *before* any key resolution, so a transient key-provider
outage cannot fail a non-persisting call
(`idempotent_build_tx_replay_survives_state_key_outage` still holds).

## Verification

Clean `cargo build` (no warnings); **all 329 crate tests pass**. An
independent adversarial review confirmed all four ordering invariants at
every converted site (0 correctness findings).

## Tradeoff (tracked in #4121)

This reverses the off-lock optimization: under the `command`/KMS
provider the subprocess now runs under the guard during actual persists
(a bounded head-of-line stall; `env` provider is unaffected). Because
the `production` profile mandates the `command` provider, this is the
production path. Recovering off-lock safely needs added structure (a
cheap provider key-id signal, a coordinated rotation protocol, or a
partial-recovery persist-lock) — analyzed and tracked in **#4121**,
gated on measuring whether the stall actually bottlenecks throughput.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants