WebGPURenderer: Introduce dispatchWorkgroupsIndirect #31488
Conversation
📦 Bundle size: full ESM build, minified and gzipped.
🌳 Bundle size after tree-shaking: minimal build including a renderer, camera, empty scene, and dependencies.
The idea of calling it dispatchSize also occurred to me; it's simply more general for everything that contains the size, whether array or buffer.
Please excuse me, sometimes I'm a little slow on the uptake.
Hmm... I was referring to this: #31026. Maybe we can replace …
Hi Sunag, I'm sorry for the delay. I've been very preoccupied with refraction the last few days. What do you think of the following idea?

```js
// ...
} )().compute( workgroupSize );

// ...

await renderer.computeAsync( computeParticles, dispatchSize( particleCount ) );
```

So we have the possibility to specify the dispatchSize itself as a number, array, or buffer, or to use the dispatchSize node, which builds the dispatchSize from the count and the workgroupSize of the compute entity. I realize that this means that over time users will have to adjust their code, but this seems like the cleanest solution to me.
It would be really nice to have this in r182. I'd obviously be hesitant to change the existing compute syntax, but I think workgroupSize acting as the main compute parameter makes sense: the ComputeNode effectively constructs the compute template, while the render loop specifies how many times that compute shader is run, rather than having to, say, construct the same compute node multiple times for different dispatch sizes or different element counts.
I'm currently very busy with other things, so I haven't continued with this. The extension works wonderfully as is, though. It's very easy to use: the buffer is passed directly to dispatchWorkgroupsIndirect in the backend, which makes things straightforward and very robust. I'll update and see where the conflict with the current version lies. @sunag, can this be included in the new version once I've updated it?
My primary thoughts are:

A. In the examples from August, only workgroupSize is passed to the computeNode rather than the count. I'm not sure if that is still the case in this PR or if there is an intermediate solution, but none of the examples have been updated to use this syntax. If the PR is compatible with the existing compute syntax, then there should at minimum be an example demonstrating the functionality and/or benefits of computeIndirect. The simplest idea I can think of is quickly updating one of the particle examples to dynamically update the number of particles being computed/rendered, with the count read from the buffer passed to dispatchWorkgroupsIndirect.

B. I think it would be better if there were some sort of specific flag or declaration that a compute program will be executed indirectly, rather than switching based on the arguments passed to renderer.compute. That way it's more explicitly apparent to the user that the renderer or the shader program is being reconfigured to run indirectly, the same way that setting an indirect flag on geometry does in webgpu_struct_draw_indirect. Either that, or there should be a separate computeIndirect function that takes a buffer as an argument, with the compute function remaining as is. I think either solution would be better than overloading the function definition, and it would align more closely with the WebGPU backend ( dispatchWorkgroups + dispatchWorkgroupsIndirect functions versus a single dispatchWorkgroups function ).
I'm not sure this is a great idea, as the same compute shader could be executed both directly and indirectly without modifying it.
Almost all the other work in those functions would be the same, so implementing them separately would create duplication. I don't think function overloading is much of a problem. Anyway, looking forward to this being released!
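To make the overloading question concrete, here is a minimal sketch ( hypothetical names, not the actual backend code ) of how a single entry point could branch on the argument type, array versus buffer, the way the comments above describe:

```javascript
// Hypothetical sketch of dispatching by argument type. A predicate stands in
// for an `instanceof GPUBuffer` check so the sketch runs outside a WebGPU
// environment.
function selectDispatch( dispatchSizeOrBuffer, isGPUBuffer = ( o ) => o !== null && typeof o === 'object' && o.isIndirectBuffer === true ) {

	if ( isGPUBuffer( dispatchSizeOrBuffer ) ) {

		// Indirect path: the [ x, y, z ] counts live in the buffer itself.
		return { method: 'dispatchWorkgroupsIndirect', args: [ dispatchSizeOrBuffer, 0 ] };

	}

	// Direct path: a number or an array of workgroup counts.
	const size = Array.isArray( dispatchSizeOrBuffer ) ? dispatchSizeOrBuffer : [ dispatchSizeOrBuffer, 1, 1 ];
	return { method: 'dispatchWorkgroups', args: size };

}
```

This keeps the shared setup work in one place while the actual passEncoder call still maps one-to-one onto the two WebGPU methods.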
I will write tomorrow at the latest |
@cmhhelgeson I'm not usually someone who insists on implementing things according to my own ideas. In this case, I consider the explicit introduction of extras that are specifically named indirect to be unnecessary. The only thing the passEncoder needs in the backend is an array ( direct ) or a buffer ( indirect ), and these can be passed directly to the renderer. The indirectness is already apparent and clear when creating the buffer, since it must be an indirect buffer, as in @TheBlek's example.

Please excuse my lack of activity lately. I'm currently working on a demo app that I'd like to present to the University of Berlin, as they like what I'm doing with Threejs.

@TheBlek Thanks for your effort with the example. I already use my extension myself, but my applications aren't suitable as examples due to their scope.
No need to apologize for the inactivity 👍 and apologies for maybe coming on too strong with my own syntactical preferences.


Related issue: #30982
This small extension allows you to control the dispatchSize from the GPU.
The benefit of GPU-side dispatchSize control lies in regressions: iterative passes whose remaining workload shrinks each step. Such regressions aren't otherwise feasible on the GPU, because they require multiple computes. Here's a clear example: my voxelizer. Here I'm voxelizing the BlackPearl, which is incredibly fast with compute shaders. Surface voxels are green. I also voxelize the volume with yellow voxels ( important for buoyancy ), and the flood-fill mechanism requires 52 iterations for this. With an adaptive dispatchSize, the dispatchSize can always be adjusted to the remaining number of voxels.
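To illustrate the adaptive scheme on the CPU side ( a simplified sketch with assumed numbers, not the voxelizer's real code ): each flood-fill pass only needs enough workgroups for the voxels that remain, so the dispatch count shrinks per iteration:

```javascript
// Simplified sketch of adaptive dispatch sizing: as the remaining voxel
// count shrinks with each flood-fill pass, so does the workgroup count.
// The halving of the remaining count per pass is an assumption made purely
// for illustration.
const WORKGROUP_SIZE = 64;

function workgroupsFor( remaining ) {

	return Math.max( 1, Math.ceil( remaining / WORKGROUP_SIZE ) );

}

let remaining = 100000; // assumed initial voxel count
const dispatches = [];

while ( remaining > 0 ) {

	dispatches.push( workgroupsFor( remaining ) );
	remaining = Math.floor( remaining / 2 );

}
```

With direct dispatch, every iteration would have to be sized for the worst case; with indirect dispatch, the GPU itself can write the shrinking counts.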
This is also important for BVH construction if you want to do it on the GPU, because that also requires regressions.
The extension was pleasingly simple. You just have to follow the WebGPU guidelines: no read/write of the same buffer in the same pass. It's only logical that a compute shader can't compute its own dispatchSize; that requires a separate compute that runs with a normal dispatch. Since this usually only needs a count of 1, it's definitely worth it for regressions, as in Unreal and Unity.
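A minimal sketch of that two-pass pattern ( plain JavaScript with a Uint32Array standing in for the indirect buffer; the names are hypothetical ): a tiny direct compute, dispatched with a count of 1, writes the next pass's workgroup counts:

```javascript
// Hypothetical sketch of the "size compute": it writes [ x, y, z ] workgroup
// counts into the indirect args. In WebGPU the target would be a buffer
// created with GPUBufferUsage.INDIRECT and consumed by
// dispatchWorkgroupsIndirect(); here a Uint32Array stands in for it.
const WORKGROUP_SIZE = 64;
const indirectArgs = new Uint32Array( 3 ); // x, y, z workgroup counts

function updateDispatchSize( remainingCount ) {

	indirectArgs[ 0 ] = Math.max( 1, Math.ceil( remainingCount / WORKGROUP_SIZE ) );
	indirectArgs[ 1 ] = 1;
	indirectArgs[ 2 ] = 1;

}

updateDispatchSize( 1000 );
```

Because the size compute runs in its own pass before the main compute reads the buffer, the no read/write-in-the-same-pass rule is respected.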