
WebGPURenderer: Add compute shader bounds check#33186

Merged
sunag merged 8 commits into mrdoob:dev from marcofugaro:compute-bounds-check
Mar 20, 2026

Conversation

@marcofugaro
Contributor

@marcofugaro marcofugaro commented Mar 15, 2026

Related issue: Closes #33185

Description

See the issue for details. This PR adds the internal WGSL bounds check:

```wgsl
// bounds check
if ( instanceIndex >= count ) {
	return;
}
```
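The guard is needed because WebGPU can only dispatch whole workgroups: when count is not a multiple of the workgroup size, the dispatch is rounded up and the extra invocations still run. Here is a minimal sketch of that arithmetic in plain JavaScript (the workgroup size of 64 is an illustrative assumption, not the renderer's actual value):

```javascript
// Illustrative only: WebGPU runs ceil( count / workgroupSize ) whole
// workgroups, so the invocation count is rounded up to a multiple of
// the workgroup size. The size of 64 here is a hypothetical example.
const WORKGROUP_SIZE = 64;

function invocationCount( count ) {

	// number of whole workgroups needed to cover `count` threads
	const workgroups = Math.ceil( count / WORKGROUP_SIZE );
	return workgroups * WORKGROUP_SIZE;

}

// 100 requested threads actually yield 128 invocations; without the
// bounds check above, the extra 28 would read/write past the buffer.
console.log( invocationCount( 100 ) ); // 128
```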

@github-actions

github-actions Bot commented Mar 15, 2026

📦 Bundle size

Full ESM build, minified and gzipped.

| | Before (min / gzip, kB) | After (min / gzip, kB) | Diff (min / gzip) |
| --- | --- | --- | --- |
| WebGL | 360.15 / 85.57 | 360.15 / 85.57 | +0 B / +0 B |
| WebGPU | 632.13 / 175.39 | 632.52 / 175.48 | +389 B / +92 B |
| WebGPU Nodes | 630.25 / 175.1 | 630.64 / 175.19 | +389 B / +92 B |

🌳 Bundle size after tree-shaking

Minimal build including a renderer, camera, empty scene, and dependencies.

| | Before (min / gzip, kB) | After (min / gzip, kB) | Diff (min / gzip) |
| --- | --- | --- | --- |
| WebGL | 492.34 / 120.15 | 492.34 / 120.15 | +0 B / +0 B |
| WebGPU | 704.56 / 190.19 | 704.95 / 190.46 | +389 B / +264 B |
| WebGPU Nodes | 653.78 / 177.6 | 654.17 / 177.74 | +389 B / +142 B |

@gkjohnson
Collaborator

gkjohnson commented Mar 16, 2026

This is not something that should be automatically done for the user in my opinion. There's workgroup size, dispatch size, and now I see there's a "count" value? I'm actually a bit confused as to how these are supposed to work together. How is a single dimensional "count" value supposed to work with the higher dimensional dispatch and workgroup size values?

Edit: Perhaps this is more of a comment on why this "count" value already exists for compute kernels? There already seem to be so many different ways to do the same thing in TSL, and many of them do not seem to interoperate cleanly.

@sunag sunag added this to the r184 milestone Mar 16, 2026
@marcofugaro
Contributor Author

marcofugaro commented Mar 16, 2026

@gkjohnson what do you think would be a better solution? The count logic is not exposed to the user; it is internal to three.js. It is just the value the user passes to the .compute() call.

The workgroups logic is not exposed to the basic user either. As that user, I'd expect .compute(42) to run 42 times.
Personally, I found out about this issue because the last item of my buffer would somehow get corrupted, as I explained in #33185.

@gkjohnson
Collaborator

gkjohnson commented Mar 17, 2026

what do you think would be a better solution?

I think better documentation may be the first step - but everything needed to run and length check a compute shader is already available.

The workgroups logic is not exposed to the basic user either.

Can you explain what you mean by this? You can set the workgroup size like so:

```js
// set the work group size for the shader
const kernel = fn.computeKernel( [ x, y, z ] );
```

The workgroup size is associated with the shader itself via a compute-kernel function annotation, meaning a new shader needs to be generated when changing it, so it makes sense for this to be associated with the TSL fn kernel rather than anywhere else. It can be accessed via kernel.workgroupSize.

One primary issue with setting "count" on the kernel is that it doesn't actually do what it says. The workgroup size * dispatch size is what actually dictates the number of threads that get run, not this variable. Looking at the code, it would seem more appropriate to call this variable "maxThreadCount", since it will (imo confusingly) truncate the number of threads that were actually requested by the user using the dispatch syntax:

```js
renderer.computeAsync( kernel, [ dispatch_x, dispatch_y, dispatch_z ] );
```

Having a thread count associated with the kernel itself does not make sense because it's common to use one compute kernel for processing multiple different buffers, meaning you want to issue it with different dispatch counts. So if you set this "maxThreadCount" value on the kernel and then switch to processing a differently sized buffer, it will not work as expected.

Users should be writing their kernels to be resilient to the data sizes they are processing. It's possible to interrogate the dimensions of the buffers you're writing to and guard against writing outside the bounds of the buffer:

```js
const fn = wgslFn( /* wgsl */`
	fn compute( globalId: vec3u ) -> void {
		let size = textureDimensions( tex );
		if ( globalId.x >= size.x || globalId.y >= size.y ) {
			return;
		}

		textureStore( tex, globalId.xy, vec4( 0, 0, 0, 1 ) );
	}
` )( { globalId } );
```

Now the kernel is resilient to the size of the buffer being written to and you can still take advantage of the performance benefits of a more optimally set workGroupSize. If "count" were to truncate my compute shader here it may be doing so in a way that isn't aligned with my function definition.
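The effect of such a guard can be sketched CPU-side. The following plain-JavaScript simulation (illustrative only, not three.js API; the function and parameter names are made up) enumerates the global invocation IDs produced by a 2D dispatch and counts how many would pass the bounds check against a texture of a given size:

```javascript
// Hypothetical simulation of the WGSL guard above: enumerate all global
// invocation IDs produced by workgroupSize * dispatchSize and count how
// many fall inside the texture bounds.
function simulateGuard( texWidth, texHeight, workgroupSize, dispatchSize ) {

	const totalX = workgroupSize[ 0 ] * dispatchSize[ 0 ];
	const totalY = workgroupSize[ 1 ] * dispatchSize[ 1 ];
	let total = 0, inBounds = 0;

	for ( let gy = 0; gy < totalY; gy ++ ) {

		for ( let gx = 0; gx < totalX; gx ++ ) {

			total ++;

			// the early-return guard: only in-bounds invocations do work
			if ( gx < texWidth && gy < texHeight ) inBounds ++;

		}

	}

	return { total, inBounds };

}

// a 100×100 texture with 8×8 workgroups needs ceil( 100 / 8 ) = 13
// dispatches per axis, which over-covers the texture:
const result = simulateGuard( 100, 100, [ 8, 8 ], [ 13, 13 ] );
console.log( result.total, result.inBounds ); // 10816 10000
```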

--

If you're not interested in tuning workgroup size for your use case then you can set the work group size to "1" and then dispatch based directly on the size of your buffer. Then you have no risk of overrunning the bounds of your data if you don't want to add a guard:

```js
const kernel = fn( { data } ).computeKernel( [ 1, 1, 1 ] );
renderer.compute( kernel, [ data.length, 1, 1 ] );
```

I wasn't aware of this "count" value, but based on the fact that it's inappropriately shaped as a single dimension, this seems vestigial from before 2D or 3D compute kernels could be run at all (see #31393). In my opinion this should be removed or at least rethought significantly.

@sunag
Collaborator

sunag commented Mar 18, 2026

count should not be used together with dispatchSize; there is no conflict there if you are using computeKernel(), since count is only for compute( number ).

Sorry that the docs are still not clear; I intend to improve that this year.

@marcofugaro
Contributor Author

@gkjohnson yes, this does not apply when using .computeKernel(); it applies only when doing .compute(number). Since we support that codepath as well, we have to fix the issues that codepath has.

I moved the logic to ComputeNode as @sunag suggested; it now should affect only the .compute(number) usage, and the .computeKernel() path remains the same.

@marcofugaro marcofugaro requested a review from sunag March 19, 2026 00:24
@gkjohnson
Collaborator

gkjohnson commented Mar 19, 2026

This doesn't really address what I'm saying. These are objectively competing mental models of how to run a compute shader in TSL: the dispatch count being associated with the kernel vs. being passed during the compute call. Nor is it even a complete implementation, with support for multi-dimensional kernels. I understand the intent is for these to be used separately, but with no warning or anything when using both paths together, this is absolutely going to cause a user bug at some point, either within a single project or through inconsistent use between a library and user expectations. There's value in having a well-defined and consistent way to do things.

And again, why do we need so many ways of doing the same thing here? Is there actually a reason this feature should be included now? Or is it just "because it's already implemented"?

@marcofugaro
Contributor Author

This is more of a question for @sunag, and imo not related to this fix, since I am no longer adding a property to ComputePipeline and therefore not changing the existing API.

@sunag
Collaborator

sunag commented Mar 19, 2026

And again, why do we need so many ways of doing the same thing here? Is there actually a reason this feature should be included now? Or is it just "because it's already implemented"?

From simple to complex: you can do simple things in a complex way if you prefer more control, for example like WebGPU does natively, but that's not the idea of TSL, and that's why we have two main approaches here.

  • compute(count)

    • defines the dispatch phase in a simple and generic way through count
    • does not introduce initial complexity, which also allows the lib to make more internal decisions, like in this PR, where it checks the bounds automatically
    • compatible with WebGLBackend using Transform Feedback or others if needed
  • computeKernel()

    • exclusive to WebGPUBackend, with more advanced options related to WebGPU

@sunag sunag merged commit 82fa236 into mrdoob:dev Mar 20, 2026
10 checks passed
@marcofugaro marcofugaro deleted the compute-bounds-check branch March 29, 2026 22:29


Development

Successfully merging this pull request may close these issues.

WebGPURenderer: Compute shaders lack bounds checking in the shader

3 participants