🧼 tune wasm opt levels per module (layout for speed, input for size) #47
Draft
natemoo-re wants to merge 3 commits into
Merging this PR will degrade performance by 21.95%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | diff render (second frame) | 655.4 µs | 1,031.4 µs | -36.46% |
| ❌ | Simulation | dashboard layout | 408 µs | 595.9 µs | -31.53% |
| ❌ | Simulation | bordered box with corner radius | 218.4 µs | 266.4 µs | -18.03% |
| ❌ | Simulation | simple text | 179.7 µs | 216.7 µs | -17.1% |
| ❌ | Simulation | render with pointer hit testing | 272.4 µs | 316.6 µs | -13.98% |
| ❌ | Simulation | long input burst (200 bytes) | 980.1 µs | 1,102.9 µs | -11.13% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing ref/wasm-opt (3d50b7c) with main (02754a2)
Split the shared -O2 CFLAGS into per-module levels. layout.wasm holds Clay_EndLayout (~33% of the code and the entire render hot path), so it is built -Os; input.wasm is opt-insensitive in the benchmarks, so it is built -Oz. Levels are overridable (LAYOUT_OPT/INPUT_OPT) to A/B on CodSpeed.
-Oz and wasm-opt
Building on #37's layout/input split, this tunes each module's clang opt level independently instead of sharing one `-O2`.

The `-Oz` render regression is concentrated almost entirely in a single function: `Clay_EndLayout` is ~33% of the code section and the entire layout hot path. Under global `-Oz`, every CodSpeed regression was a render/layout benchmark (−17% to −36%); every input-parser benchmark was unchanged. #37's module boundary happens to fall along that same seam, so each side can be optimized for what it needs:

- `layout.wasm` → `-Os`: keeps the render path fast without the full `-O2` size cost
- `input.wasm` → `-Oz`: opt-insensitive in the benchmarks, so optimize purely for size
- Levels are overridable (`make LAYOUT_OPT=-O2`) to A/B on CodSpeed

Artifact sizes (default `-Os`/`-Oz`):

- `layout.wasm`
- `input.wasm`

No `wasm-opt` post-pass: it's a ~600 B no-op on already-optimized output, so this drops the extra build dependency that the previous `-Oz` approach needed.
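For reviewers who want to picture the per-module split, here is a minimal sketch of how overridable opt levels could be wired up in a Makefile. It is not the PR's actual build file: the source file names (`layout.c`, `input.c`), the `WASM_FLAGS` variable, and the specific link flags are assumptions for illustration. Only `LAYOUT_OPT`, `INPUT_OPT`, and the `-Os`/`-Oz` defaults come from the description above.

```makefile
# Hypothetical sketch. File names and WASM_FLAGS are assumptions,
# not taken from this repository.
CC ?= clang

# Per-module optimization levels. A value given on the command line
# (make LAYOUT_OPT=-O2) takes precedence over these defaults.
LAYOUT_OPT ?= -Os
INPUT_OPT  ?= -Oz

# Flags shared by both modules (assumed: freestanding wasm32, no entry point).
WASM_FLAGS := --target=wasm32 -nostdlib -Wl,--no-entry -Wl,--export-dynamic

all: layout.wasm input.wasm

# Render hot path: speed-leaning size optimization by default.
layout.wasm: layout.c
	$(CC) $(WASM_FLAGS) $(LAYOUT_OPT) -o $@ $<

# Input parser: optimize purely for size by default.
input.wasm: input.c
	$(CC) $(WASM_FLAGS) $(INPUT_OPT) -o $@ $<

.PHONY: all
```

With a layout like this, `make` builds both modules at their defaults, while `make LAYOUT_OPT=-O2 INPUT_OPT=-Os` swaps levels for a single A/B run without editing the file.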