fix(cache): resolve parallel-restore discovery from the shared tier#370

Closed
worstell wants to merge 1 commit into main from worstell/authoritative-discovery

@worstell worstell commented Jul 2, 2026

Parallel snapshot restore discovers the object's total size and ETag from its first ranged read, then pins every subsequent chunk to that ETag with If-Match. Until now that discovery read resolved against whichever tier the landing replica happened to hit first — usually its local disk. Because each replica regenerates snapshots independently, local disks hold divergent revisions, so the pin was frequently a version no other replica (and sometimes not even the shared tier) still held. Every chunk that landed elsewhere then returned 412 and retried, and the retry churn erased the gains of parallel download.
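
The pinning scheme described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `parseTotal` and `pinnedChunks` are hypothetical helpers, and the sketch assumes the discovery response carries a standard `Content-Range` header and a strong ETag.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// parseTotal (hypothetical helper) extracts the object's total size from a
// Content-Range value such as "bytes 0-1023/4096".
func parseTotal(contentRange string) (int64, error) {
	i := strings.LastIndex(contentRange, "/")
	if i < 0 || !strings.HasPrefix(contentRange, "bytes ") {
		return 0, fmt.Errorf("malformed Content-Range %q", contentRange)
	}
	total, err := strconv.ParseInt(contentRange[i+1:], 10, 64)
	if err != nil {
		// Covers the "*" (unknown length) form as well as garbage.
		return 0, fmt.Errorf("unknown total in Content-Range %q: %w", contentRange, err)
	}
	return total, nil
}

// pinnedChunks (hypothetical helper) builds the request headers for every
// chunk after the discovery read. Each chunk carries If-Match, so a replica
// holding a different revision answers 412 instead of returning bytes from
// another version of the object.
func pinnedChunks(etag string, total, chunkSize int64) []http.Header {
	var chunks []http.Header
	for start := chunkSize; start < total; start += chunkSize {
		end := start + chunkSize - 1
		if end >= total {
			end = total - 1 // the last chunk may be short
		}
		h := http.Header{}
		h.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))
		h.Set("If-Match", etag)
		chunks = append(chunks, h)
	}
	return chunks
}

func main() {
	// Pretend the discovery read (bytes=0-1023) returned these values.
	total, err := parseTotal("bytes 0-1023/2500")
	if err != nil {
		panic(err)
	}
	for _, h := range pinnedChunks(`"rev-7"`, total, 1024) {
		fmt.Println(h.Get("Range"), h.Get("If-Match"))
	}
}
```

Because every later chunk inherits the ETag from that one discovery response, a stale ETag there is enough to make every chunk that lands on a different revision fail with 412.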

This makes the discovery read authoritative: it resolves from the tiered cache's deepest (shared) tier, so the pin is always the shared tier's current revision. Subsequent chunks then serve locally where a replica's disk matches and fall through to the shared tier (via the existing deeper-tier probing) where it doesn't, instead of failing.

Changes:

  • New Authoritative request option, plumbed from the client through the X-Cachew-Authoritative header into Tiered.Open.
  • Tiered.Open serves an authoritative read from the deepest tier, falling back to normal tiering only on a miss or a transient failure there, so a healthy local tier can still serve. Non-tiered caches ignore the option.
  • ParallelGet marks its discovery read authoritative; all other chunks are unchanged.

The one remaining race is a shared-tier overwrite during a download's lifetime, which is far rarer than the cross-replica divergence this removes and self-corrects on retry.

ParallelGet's discovery read (the first ranged Open, which has no
validator yet) is what establishes the ETag every subsequent chunk pins
to with If-Match. It previously resolved against whichever tier the
landing replica hit first — usually its local disk. Because replicas
regenerate snapshots independently, local disks hold divergent revisions,
so the pin was often a version other replicas no longer held, and the
resulting 412 retry churn erased the gains of parallel download.

Tiered.Open now treats an unpinned ranged read (a Range with no If-Match
or If-Range) as a discovery read and resolves it from the deepest
(shared) tier, so the pin is always the shared tier's current revision.
It falls back to normal tiering only when that tier misses or is
unavailable, so a healthy local tier can still serve. Full reads and
pinned chunks are unchanged: they still serve from local disk where it
matches.

This needs no client-side flag — the discovery read is already an
unpinned range — so it also covers any future parallel-download caller
automatically.
@worstell worstell force-pushed the worstell/authoritative-discovery branch from 11eb0ea to 58d38ac July 2, 2026 23:19
@worstell worstell closed this Jul 2, 2026