perf(start-client-core): O(1) buffer drain in client frame decoder#7663
perf(start-client-core): O(1) buffer drain in client frame decoder#7663anonrig wants to merge 1 commit into
Conversation
The frame decoder dropped consumed chunks from its buffer with bufferList.shift(), which is O(n). When a single large frame (e.g. a big RawStream payload) is assembled from many small network reads, the extract loop calls shift() once per chunk, making reassembly O(n^2). Track the first un-consumed chunk with a head pointer and advance it in O(1) instead of shifting. Consumed slots are released for GC, and the buffer is compacted when fully drained (O(1) reset) or once the consumed prefix grows past a small threshold (amortized O(1) per chunk). A micro-benchmark draining 1000 small chunks is ~11x faster.
📝 WalkthroughWalkthrough
ChangesFrame decoder O(1) buffer optimization
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/start-client-core/tests/frame-decoder.test.ts (1)
567-567: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winUse braces for one-line control statements in tests.
Line 567, Line 600-601, and Line 641 use single-line bodies without braces, which violates the project rule for TS/JS control flow.
As per coding guidelines, "
**/*.{ts,tsx,js,jsx}: Always use curly braces forif,else, loops, and similar control statements. Never write one-line bodies likeif (foo) x = 1."Suggested patch
- for (let i = 0; i < payload.length; i++) payload[i] = (i * 7) % 256 + for (let i = 0; i < payload.length; i++) { + payload[i] = (i * 7) % 256 + } while (true) { const { done, value } = await rawReader.read() - if (done) break - if (value) received.push(...value) + if (done) { + break + } + if (value) { + received.push(...value) + } } while (true) { const { done, value } = await reader.read() - if (done) break + if (done) { + break + } received.push(value) }Also applies to: 600-601, 641-641
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/start-client-core/tests/frame-decoder.test.ts` at line 567, The for loop on line 567 with body payload[i] = (i * 7) % 256 uses a single-line body without braces, violating the project's coding guidelines. Add curly braces around the loop body to properly format it. Additionally, apply the same fix to the other single-line control statements mentioned at lines 600-601 and 641 by wrapping their bodies in curly braces.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/start-client-core/tests/frame-decoder.test.ts`:
- Line 567: The for loop on line 567 with body payload[i] = (i * 7) % 256 uses a
single-line body without braces, violating the project's coding guidelines. Add
curly braces around the loop body to properly format it. Additionally, apply the
same fix to the other single-line control statements mentioned at lines 600-601
and 641 by wrapping their bodies in curly braces.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d9195414-0588-45d1-b334-36aea734698b
📒 Files selected for processing (3)
.changeset/perf-frame-decoder-index-pointer.mdpackages/start-client-core/src/client-rpc/frame-decoder.tspackages/start-client-core/tests/frame-decoder.test.ts
|
View your CI Pipeline Execution ↗ for commit 38c9d48
☁️ Nx Cloud last updated this comment at |
Merging this PR will not alter performance
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | Memory | mem serialization-payload (solid) |
6.9 MB | 7.4 MB | -5.96% |
| ⚡ | Memory | mem aborted-requests (solid) |
1.8 MB | 1.6 MB | +12.59% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing anonrig:perf/frame-decoder-index-pointer (38c9d48) with main (f23ed0f)1
Footnotes
|
Can you fix the conflicts on this one? And then I'll merge it |
What
Replace the O(n)
bufferList.shift()in the client-side frame decoder (packages/start-client-core/src/client-rpc/frame-decoder.ts) with an O(1) head pointer.extractFlattened()dropped each fully-consumed chunk from the front ofbufferListwithshift(). When a single large frame (e.g. a bigRawStreampayload) is assembled from many small network reads, the extract loop callsshift()once per chunk — and eachshift()re-indexes the whole array, so reassembly degrades to O(n²).This PR tracks the first un-consumed chunk with a
bufferHeadindex and advances it in O(1) instead of shifting.readHeader()reads frombufferHeadas well. Consumed slots are released for GC, and the array is compacted:bufferHead === bufferList.length) → reset in O(1) (the common terminal state), orsplice()it off (amortized O(1) per consumed chunk).This mirrors the index-pointer approach already used in
transformStreamWithRouterfor the same reason.Why
Same hot path as the sibling zero-copy PR: decoding streamed server-function responses and
RawStreampayloads. The O(n²) bites specifically when one frame spans many buffered chunks.Standalone micro-benchmark (Node, median of 12 runs), draining N buffered chunks:
shift()Tests
frame-decodersuite passes.CHUNKpayload delivered one byte at a time (forces the header slow path + many whole-chunk consumptions + the fully-drained reset),splice()prefix drop).Notes
extractFlattened, so whichever merges second will need a trivial rebase.@tanstack/start-client-corepatch).Summary by CodeRabbit
Performance
Tests