Add many_morph_targets stress test #18536

Open: greeble-dev wants to merge 14 commits into main

Contributor

greeble-dev commented Mar 25, 2025

Objective

I wanted to benchmark the morph target changes in #18465. I also wanted to test morph targets on multiple meshes, which is not covered by existing examples.

Solution

Add a stress test for morph targets, similar to many_cubes and many_foxes. Spawns a ton of meshes (defaults to 1024) and animates their morph target weights.

Testing

cargo run --example many_morph_targets

# Test different mesh counts.
cargo run --example many_morph_targets -- --count 42

Tested on Win10/Vulkan/Nvidia, Wasm/WebGL/Chrome/Win10/Nvidia.
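As a rough illustration of the `--count` option shown above, here is a minimal std-only sketch of parsing it with the stated default of 1024. The helper name `parse_count` and the error messages are hypothetical, and the actual example may well use an argument-parsing crate instead.

```rust
/// Default mesh count, taken from the PR description.
const DEFAULT_COUNT: usize = 1024;

/// Hypothetical helper: returns the value following `--count`,
/// or the default when the flag is absent.
fn parse_count<I, S>(args: I) -> Result<usize, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut args = args.into_iter();
    while let Some(arg) = args.next() {
        if arg.as_ref() == "--count" {
            // The flag must be followed by a value.
            let value = args
                .next()
                .ok_or_else(|| "--count requires a value".to_string())?;
            // The value must be a non-negative integer.
            return value
                .as_ref()
                .parse::<usize>()
                .map_err(|e| format!("invalid --count value {:?}: {}", value.as_ref(), e));
        }
    }
    Ok(DEFAULT_COUNT)
}
```

In a real binary this would be called as `parse_count(std::env::args().skip(1))`.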

IceSentry added labels on Mar 25, 2025:
A-Rendering: Drawing game state to the screen
C-Performance: A change motivated by improving speed, memory usage or compile times
S-Needs-Review: Needs reviewer attention (from anyone!) to move forward
Contributor

IceSentry left a comment

Did you measure if the performance of this scene is actually bottlenecked on the morph targets part and not just rendering?

@greeble-dev
Contributor Author

I'm not sure I understand? The intent isn't to reproduce a particular bottleneck. I just wanted a benchmark for #18465 and figured it would have some general use. On my machine it's embarrassingly GPU bound, but that's fine as the CPU numbers are still meaningful.

@IceSentry
Contributor

I mean, if we want to say it's a morph targets stress test we need to be sure that's the part that's actually stressing the system when you increase the number of things on screen. If it just gets harder to run because of rendering then it's not as useful without relying on other tools to measure perf.

@greeble-dev
Contributor Author

greeble-dev commented Mar 25, 2025

Aren't morph targets part of rendering since some of the work is in the vertex shader? I guess I could try to cut down other GPU work but I'm not sure there's much else in the test... maybe post-processing? Or put the meshes off-screen and disable frustum culling, but that makes it less useful as a visual test?

Also happy to run GPU profiles and report results. But I'll need advice on which profiler is best - I'm limited to Win10/Nvidia.

@IceSentry
Contributor

Well, I mean, it could be the fragment shader or whatever taking up most of the time. I just want to make sure using this actually measures the right thing if we just look at the frametime. It's possible it's already the case to be clear, I just don't know. I just want to make sure the overhead of everything else isn't higher than the one of morph targets. As in, if you render all the meshes without any morph targets would it be faster. If the overhead is higher then it's just benchmarking rendering many meshes and the morph targets part becomes irrelevant.

@greeble-dev
Contributor Author

greeble-dev commented Mar 26, 2025

I added more options and tested frame times with various combinations.

TLDR: I think the framerate reflects at least part of the morph target work.

--camera far = camera zoomed out to minimise pixel shader costs - each mesh is about one pixel.
--weights zero = all weights zero, morph targets kinda disabled (pipeline the same, but vertex shader does no per-morph work except checking for zero weight).
--weights tiny = all weights tiny, so vertex shader does same work as one but pixel shader has roughly same cost as zero.
--weights one = all weights one.
--weights animated = mix of weights, some zero.
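To make the four `--weights` modes concrete, here is a minimal std-only sketch of how they might map to a per-target weight. The `WeightMode` type, the tiny value of `1.0e-4`, and the clamped-sine animation are all assumptions for illustration, not code from the PR.

```rust
/// Hypothetical mirror of the `--weights` options listed above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum WeightMode {
    Zero,
    Tiny,
    One,
    Animated,
}

impl WeightMode {
    /// Parses the value passed to `--weights`.
    fn parse(s: &str) -> Option<WeightMode> {
        match s {
            "zero" => Some(WeightMode::Zero),
            "tiny" => Some(WeightMode::Tiny),
            "one" => Some(WeightMode::One),
            "animated" => Some(WeightMode::Animated),
            _ => None,
        }
    }

    /// Weight for morph target index `target` at time `t` (seconds).
    fn weight(self, target: usize, t: f32) -> f32 {
        match self {
            // Zero weight lets the vertex shader skip per-morph work.
            WeightMode::Zero => 0.0,
            // Assumed value: non-zero so the shader still does the work,
            // but small enough that the mesh barely changes shape.
            WeightMode::Tiny => 1.0e-4,
            WeightMode::One => 1.0,
            // Assumed animation: a sine wave offset per target and clamped
            // at zero, so some weights are zero at any given time.
            WeightMode::Animated => (t + target as f32).sin().max(0.0),
        }
    }
}
```

The clamp in the animated branch is one way to get the "mix of weights, some zero" behaviour; the real example may animate its weights differently.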

| | --camera default | --camera far |
| --- | --- | --- |
| --weights zero | 7.5ms | 4.9ms |
| --weights tiny | 18.3ms | 13.4ms |
| --weights one | 24.1ms | 13.5ms |
| --weights animated | 12.8ms (default) | 8.1ms |

The main thing is that zero and tiny show a clear variation when the only major difference is morph target specific vertex shader work.
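The gap between zero and tiny can be read straight off the table above. A small sketch of that arithmetic, with the frame times copied from the table and a hypothetical helper name `deltas_vs_zero`:

```rust
/// Frame times in milliseconds from the table above:
/// (weights mode, default camera, far camera).
const FRAME_TIMES_MS: [(&str, f64, f64); 4] = [
    ("zero", 7.5, 4.9),
    ("tiny", 18.3, 13.4),
    ("one", 24.1, 13.5),
    ("animated", 12.8, 8.1),
];

/// Extra frame time of each mode relative to the `zero` baseline,
/// for the default camera and the far camera respectively.
fn deltas_vs_zero() -> Vec<(&'static str, f64, f64)> {
    let (_, base_default, base_far) = FRAME_TIMES_MS[0];
    FRAME_TIMES_MS
        .iter()
        .map(|&(mode, default_ms, far_ms)| {
            (mode, default_ms - base_default, far_ms - base_far)
        })
        .collect()
}
```

For tiny this gives roughly 10.8ms extra with the default camera and 8.5ms extra with the far camera. Assuming the pixel shader cost really is about the same as zero in that mode, that difference is mostly morph target vertex work.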

Without a GPU profile I can't prove that the default setting is meaningful since I can't isolate the vertex shader work.

All tests were GPU bound. But I'm using a potato (Nvidia 1030) with a modern CPU, so someone with a real GPU might not be.

@IceSentry
Contributor

Awesome, thanks, that should be good now.

greeble-dev added a commit to greeble-dev/bevy that referenced this pull request Mar 28, 2025