fix(cache): resolve parallel-restore discovery from the shared tier #370
Closed
worstell wants to merge 1 commit into
ParallelGet's discovery read (the first ranged Open, which has no validator yet) establishes the ETag that every subsequent chunk pins to with If-Match. It previously resolved against whichever tier the landing replica hit first, usually its local disk. Because replicas regenerate snapshots independently, local disks hold divergent revisions, so the pin was often a version other replicas no longer held, and the resulting 412 retry churn erased the gains of parallel download.

Tiered.Open now treats an unpinned ranged read (a Range with no If-Match or If-Range) as a discovery read and resolves it from the deepest (shared) tier, so the pin is always the shared tier's current revision. It falls back to normal tiering only when that tier misses or is unavailable, so a healthy local tier can still serve. Full reads and pinned chunks are unchanged: they still serve from local disk where it matches.

This needs no client-side flag, because the discovery read is already an unpinned range, so it also covers any future parallel-download caller automatically.
11eb0ea to 58d38ac
Parallel snapshot restore discovers the object's total size and ETag from its first ranged read, then pins every subsequent chunk to that ETag with If-Match. Until now that discovery read resolved against whichever tier the landing replica happened to hit first, usually its local disk. Because each replica regenerates snapshots independently, local disks hold divergent revisions, so the pin was frequently a version no other replica (and sometimes not even the shared tier) still held. Every chunk that landed elsewhere then returned 412 and retried, and the retry churn erased the gains of parallel download.

This makes the discovery read authoritative: it resolves from the tiered cache's deepest (shared) tier, so the pin is always the shared tier's current revision. Subsequent chunks then serve locally where a replica's disk matches and fall through to the shared tier (via the existing deeper-tier probing) where it doesn't, instead of failing.
Changes:

- Authoritative request option, plumbed from the client through the X-Cachew-Authoritative header into Tiered.Open.
- Tiered.Open serves an authoritative read from the deepest tier, falling back to normal tiering only on a miss or a transient failure there so a healthy local tier can still serve. Non-tiered caches ignore the option.
- ParallelGet marks its discovery read authoritative; all other chunks are unchanged.

The one remaining race is a shared-tier overwrite during a download's lifetime, which is far rarer than the cross-replica divergence this removes and self-corrects on retry.
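For readers unfamiliar with the discover-then-pin pattern this description relies on, here is a minimal client-side sketch. It is not ParallelGet itself: `totalFromContentRange` and `pinnedChunkHeaders` are hypothetical helpers, and it assumes the discovery response carries a standard `Content-Range` header from which the total size is read.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// totalFromContentRange extracts the total object size from a Content-Range
// value such as "bytes 0-1023/4096".
func totalFromContentRange(v string) (int64, error) {
	i := strings.LastIndexByte(v, '/')
	if !strings.HasPrefix(v, "bytes ") || i < 0 {
		return 0, fmt.Errorf("malformed Content-Range %q", v)
	}
	size := v[i+1:]
	if size == "*" {
		return 0, fmt.Errorf("Content-Range %q has unknown total size", v)
	}
	return strconv.ParseInt(size, 10, 64)
}

// pinnedChunkHeaders builds request headers for every chunk after the
// discovery chunk. Each one carries a Range and an If-Match pin to the ETag
// learned from the discovery read, so a replica holding a different revision
// answers 412 instead of returning mismatched bytes.
func pinnedChunkHeaders(total, chunk int64, etag string) []http.Header {
	if chunk <= 0 {
		return nil
	}
	var out []http.Header
	for start := chunk; start < total; start += chunk {
		end := start + chunk - 1
		if end >= total {
			end = total - 1
		}
		h := http.Header{}
		h.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))
		h.Set("If-Match", etag)
		out = append(out, h)
	}
	return out
}

func main() {
	// Values a discovery response might carry (illustrative only).
	total, err := totalFromContentRange("bytes 0-1023/2500")
	if err != nil {
		panic(err)
	}
	for _, h := range pinnedChunkHeaders(total, 1024, `"rev-shared"`) {
		fmt.Println(h.Get("Range"), h.Get("If-Match"))
	}
}
```

Because every later chunk depends on the ETag from the first read, pinning to a revision only one replica holds makes most chunks fail their precondition, which is the churn this change targets.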