fix(frost): harden the cgo native-signer FFI boundary #4115

Merged
mswilkison merged 2 commits into feat/frost-schnorr-migration-scaffold from fix/frost-ffi-boundary-hardening
Jun 27, 2026

@mswilkison

Summary

Follow-up to #3866 (Go FROST/ROAST coordinator). Three defensive fixes at the cgo Go↔Rust signer FFI boundary in pkg/frost/signing, so a malformed/oversized response — or a panic — from the native signer fails a single attempt instead of crashing the node or leaking secrets.

Fixes

  1. Bounds-check the response length before C.GoBytes (parseBuildTaggedTBTCSignerResult). The Rust-supplied buffer.len (C.size_t) was narrowed to C.int with no check: a length ≥ 2³¹ overflows to a negative value → C.GoBytes panics (length out of range) at the cgo boundary, or silently truncates to a wrong length. Now rejected with a clear error (the buffer is still freed by the deferred free).

  2. Zeroize the secret request bytes on the C heap before C.free (the request-call helper). C.CBytes(requestPayload) copies the request (which can carry signing-share / nonce material) to the C heap; plain C.free does not overwrite, so the secret lingered in freed memory. Now scrubbed via the existing zeroBytes, mirroring the Go-side hygiene already applied to the caller's own copy.

  3. recover() at the FFI boundary (nativeExecutionFFIExecutorAdapter.Execute). A panic anywhere along the cgo signing path (e.g. the overflow panic from fix #1, or a nil dereference while decoding a malformed engine response) previously took down the whole signing process. It is now converted to a failed attempt that the outer tBTC signingRetryLoop handles cleanly.

Tests / verification

  • TestNativeExecutionFFIExecutorAdapter_Execute_RecoversCgoBoundaryPanic — a panicking primitive; verified to crash the process without the recover and pass with it. Full untagged pkg/frost/signing suite stays green.
  • The cgo-tagged changes (#1, #2) are compile-verified under -tags 'frost_native frost_tbtc_signer cgo' (build + go vet). They still need a runtime pass in the cgo + linked-libfrost_tbtc environment; I can't exercise the real FFI here.

Addresses the cgo-boundary cluster found during the review of #3866 (length-narrowing crash, secret-in-C-heap, missing recover).

🤖 Generated with Claude Code

Three defensive fixes at the Go<->Rust signer FFI boundary in
pkg/frost/signing, so a malformed/oversized response or a panic from the native
signer fails one attempt instead of crashing the node or leaking secrets:

- parseBuildTaggedTBTCSignerResult: bounds-check the response buffer length
  before the size_t -> C.int narrowing in C.GoBytes. A length >= 2^31 would
  overflow to a negative value and panic ("length out of range") at the cgo
  boundary, or silently truncate to a wrong length; reject it (the buffer is
  still released by the deferred free).

- the request-call helper: scrub the secret request bytes from the C heap (the
  C.CBytes copy) before C.free. The request can carry signing-share / nonce
  material, and plain C.free does not overwrite; this mirrors the existing
  Go-side zeroBytes hygiene applied to the caller's own copy.

- nativeExecutionFFIExecutorAdapter.Execute: recover panics raised along the
  cgo signing path and surface them as a failed attempt, so a single malformed
  native-signer response cannot take down the signing process. Adds a unit test
  with a panicking primitive (verified to crash the process without the
  recover).
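The scrub-before-release idea from the second item above can be illustrated in plain Go. This sketch makes several assumptions: it uses a Go slice in place of the C-heap copy, the body of `zeroBytes` is an assumed equivalent of the existing helper the commit mentions, and `withScrubbedCopy` is a hypothetical name. In a cgo build, the C allocation could be viewed as a byte slice with `unsafe.Slice` and scrubbed the same way before `C.free`.

```go
package main

import "fmt"

// zeroBytes overwrites every byte of b with zero. The commit refers to an
// existing helper of this name; this body is an assumed equivalent.
func zeroBytes(b []byte) {
	for i := range b {
		b[i] = 0
	}
}

// withScrubbedCopy (hypothetical) hands fn a private copy of secret and
// scrubs that copy when fn returns, including when fn panics, because the
// scrub runs in a defer.
func withScrubbedCopy(secret []byte, fn func(buf []byte) error) error {
	buf := append([]byte(nil), secret...) // stand-in for the C.CBytes copy
	defer zeroBytes(buf)                  // scrub before the copy is released
	return fn(buf)
}

func main() {
	var seen []byte
	_ = withScrubbedCopy([]byte{0xde, 0xad}, func(buf []byte) error {
		seen = buf // retained only to inspect the scrubbed bytes afterwards
		return nil
	})
	fmt.Println(seen)
}
```

Running the scrub in a `defer` registered right after the copy is made keeps the cleanup on every exit path, which is the same ordering concern as scrubbing before a deferred `C.free`.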

The cgo paths are compile-verified under -tags 'frost_native frost_tbtc_signer
cgo'; the recover is exercised by an untagged unit test. The two cgo-tagged
changes still need a runtime pass in the cgo + linked-libfrost_tbtc
environment.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 0c93c12a-057d-4336-8f03-6fc877607d18

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

mswilkison merged commit c741008 into feat/frost-schnorr-migration-scaffold on Jun 27, 2026
17 checks passed
mswilkison deleted the fix/frost-ffi-boundary-hardening branch on June 27, 2026 22:06
mswilkison added a commit that referenced this pull request Jun 27, 2026
…ror type (#4122)

## Summary

Addresses a Codex finding (relayed via the #4119 review):
`TbtcChain.GetWallet` derived a **legacy** wallet ID on **any** error
from the canonical `walletID` accessor. For a FROST wallet on a
canonical Bridge, a transient call failure would silently yield the
left-padded legacy ID — and callers use `WalletChainData.WalletID` to
choose **P2TR (FROST)** vs **P2WPKH (legacy)** scripts, so the node
would build or search the **wrong wallet script**.

## Why route by scheme (revised after Codex P1)

The first revision distinguished by error type (a sentinel for the
missing accessor, surface everything else). Codex correctly flagged a
**P1 regression**: a *legacy on-chain Bridge* built with the *current*
generated binding still satisfies the accessor interface, so its missing
`walletID` function returns a normal RPC/ABI error — not the sentinel —
and that revision would surface it and **break `GetWallet` on exactly
the legacy deployments the fallback exists for**. Error type cannot
reliably separate "function absent on-chain" from "transient."

So this routes by **scheme**, using the wallet's `EcdsaWalletID` (which
`GetWallet` already reads, and which the codebase already uses to infer
scheme — zero ⇒ FROST):

- **Legacy ECDSA wallet** (`EcdsaWalletID != 0`): its canonical wallet
ID *equals* its legacy derivation, so fall back on **any** accessor
error — and it's the only option on a legacy Bridge lacking the
accessor.
- **FROST wallet** (`EcdsaWalletID == 0`): requires the canonical ID;
**surface** the error rather than return a wrong legacy ID. A FROST
wallet only exists on a canonical-ID Bridge, so such an error is
genuinely transient.

Logic is extracted into `resolveWalletID(bridge, walletPublicKeyHash,
ecdsaWalletID)`.

## Tests

`TestResolveWalletID` covers all four cases: accessor success →
canonical; FROST + accessor error → surfaced; **legacy + accessor error
→ legacy fallback** (the P1 regression guard — verified to fail if the
routing surfaces errors for legacy wallets); legacy + missing-accessor
binding → legacy fallback. gofmt + `go vet` clean; full
`pkg/chain/ethereum` suite passes.

_Found during the Codex review batch on #4115#4120; revised per the
Codex P1 re-review._

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)