Vulkan graphics pipelines use excessive amount of memory on Galaxy S23 #101635
Comments
Can you show what numbers we're dealing with here, specifically the number of pipelines created by your project as shown in the monitors described here? https://docs.godotengine.org/en/latest/tutorials/performance/pipeline_compilations.html The amount of memory in use is indeed entirely driver-dependent. The only way to avoid the excessive memory usage caused by pipeline compilation would be to turn off specialization altogether, but that would give you lower performance than 4.3, which opted to just specialize and stutter on the spot. This option is not offered at the moment, and I'd push against offering it, as it would probably lead to users turning off things they shouldn't.
This is not that strange and I wouldn't say it's excessive. It's preloading the content before it shows it so it doesn't stutter. That's by design, and you're likely to end up with a similar amount of memory in use once every pipeline the scene needs has been compiled.
The real project, which is where the 2668.945mb number comes from, creates 110 pipelines from meshes, 43 from surfaces, 12 from specialization and 6 from canvas. The numbers for the OnePlus devices are from the MRP, which uses much less than the real project but enough to show that something weird is going on, with 72mb used on the Galaxy S23. I just added the OnePlus numbers for comparison.
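To put those counts next to the memory figure, here is a rough back-of-the-envelope sketch in Python. It assumes the four monitor categories can simply be added together and that the pipeline memory is spread evenly, neither of which the report confirms, so the average is only an order-of-magnitude indication.

```python
# Pipeline counts reported for the real project, by monitor category.
pipeline_counts = {"meshes": 110, "surfaces": 43, "specialization": 12, "canvas": 6}

# Device memory attributed to pipelines in the same report, in MB.
pipeline_memory_mb = 2668.945

# Assumption: the categories do not overlap, so they can be summed.
total_pipelines = sum(pipeline_counts.values())

# Assumption: memory is evenly distributed, which is almost certainly not true.
average_mb = pipeline_memory_mb / total_pipelines

print(f"total pipelines: {total_pipelines}")
print(f"average per pipeline: {average_mb:.2f} MB")
```

That works out to 171 pipelines and roughly 15.6 MB per pipeline on average, far above the roughly 1mb total reported for 4.3.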
These pipeline numbers are very small. This sounds like a nasty code generation bug on this particular driver. For reference, the third person shooter example project reaches about double the number of pipelines that you've mentioned.
That would likely hint that the problem is the ubershader generation itself rather than the specialization. Indeed, it seems like the only way to fix it here would be to disable ubershaders altogether for this particular device and always generate only the specialized variant and stutter (which is the 4.3 behavior).
Tested with a Pixel 4 (Adreno 640) and I can't reproduce this issue. I get 16.578125 mb from the pipelines. The fact that the S23 seems to be allocating memory in multiples of 24 mb might be a hint as to what is going wrong.
Edit: Actually, testing with dev 3, I get 0.511719 mb from pipelines.
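The "multiples of 24 mb" observation can be checked against the totals quoted in this thread. The sketch below is plain Python arithmetic, not engine code; the 24 MB block size is taken from the reports here and the helper name is made up for illustration.

```python
BLOCK_MB = 24.0  # block size suggested by the reports in this thread


def split_into_blocks(total_mb, block_mb=BLOCK_MB):
    """Return (whole_blocks, remainder_mb) for a reported pipeline total."""
    whole_blocks = int(total_mb // block_mb)
    remainder_mb = total_mb - whole_blocks * block_mb
    return whole_blocks, remainder_mb


# Pipeline memory totals quoted in the thread, in MB.
reported_totals_mb = {
    "Galaxy S23, MRP": 72.0,
    "Galaxy S23, real project": 2668.945,
    "Pixel 4, MRP": 16.578125,
}

for label, total in reported_totals_mb.items():
    blocks, remainder = split_into_blocks(total)
    print(f"{label}: {blocks} x 24 MB + {remainder:.3f} MB")
```

The S23 MRP total is exactly three blocks, and the real project total is 111 blocks plus about 4.9 MB, while the Pixel 4 total does not reach a single block. This is consistent with the hint but does not prove where the blocks come from.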
I still think this fits within reason, though. By design, more pipelines are compiled ahead of time; usually the memory they consume is small compared to the benefit they provide in avoiding stutter. The device reported in the OP seems like an outlier compared to the weaker phones we've tested on, but we don't have much control over the code generation in that regard. My recommendation would be to try disabling the pipeline cache feature in the project settings and see if that affects it.
I was originally comparing the memory usage with the profiler in Android Studio. I did not actually test on 4.3 using the memory report. I only started using the report to find out exactly what was causing the memory use, since the Android Studio profiler only labeled the memory as "Graphics".
Testing with the Forward+ renderer I get
This didn't help, unfortunately.
Testing on a Samsung Galaxy Tab S9 Ultra (16 GB RAM) with Android 14. This tablet has the same SoC as the Samsung Galaxy S23 (Snapdragon 8 Gen 2) but has more RAM in its 1 TB variant (16 GB instead of 12 GB).
4.4.dev3
4.4.beta1
Interestingly, this is present in the output even though the output appears correct:
In fact, it still appears if I add
Edit: The crash on desktop is likely similar to #95967.
@r-eckert Is there any chance you have an S22 or S24 that you can test your main project with? Alternatively, is there any way you could privately share APKs of your project (with the memory report) exported from dev 3 and beta 1?
I have made some headway investigating this. We know a few things:
In #102217 I reduced the total number of pipelines that get generated by making better use of the information we have at startup. That reduced memory usage from pipelines by 1/3, but it doesn't help the pathological explosion of size that each ubershader pipeline has.
To investigate a bit further, I tried running the shader through the Adreno Offline Compiler to see if we can glean more relevant information. The following results are from the fragment shader of the default spatial material captured on the first frame of running:
Dev3
Beta 1
I then used the following diff to force disable optional features:
Diff
Master with all features force disabled
Notably, the total instruction count changes dramatically, and the Beta 1 ubershader uses scratch memory (which I suspect is the key thing here). Ultimately, I think we are just going to need to disable the ubershader for the Adreno 740.
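A per-device opt-out of this kind could be sketched as a simple denylist check. The Python below is illustrative only and is not how the engine implements it: the function names, the denylist, and the assumption that the driver reports a name such as "Adreno (TM) 740" are all hypothetical.

```python
import re

# Hypothetical denylist for illustration; not an engine setting.
UBERSHADER_DENYLIST = ("adreno 740",)


def normalize_device_name(name):
    """Lowercase the name, drop a "(TM)" marker, and collapse whitespace."""
    cleaned = name.lower().replace("(tm)", " ")
    return re.sub(r"\s+", " ", cleaned).strip()


def should_use_ubershader(device_name, denylist=UBERSHADER_DENYLIST):
    """Return False when the device name contains a denylisted entry."""
    normalized = normalize_device_name(device_name)
    return not any(entry in normalized for entry in denylist)


# Assumed device name strings; real drivers may format them differently.
for name in ("Adreno (TM) 740", "Adreno (TM) 640"):
    path = "ubershader" if should_use_ubershader(name) else "specialized only"
    print(f"{name}: {path}")
```

On a denylisted device the renderer would fall back to compiling only the specialized variant, which trades the memory cost for the 4.3-style stutter discussed earlier in the thread.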
@Calinou Can you test the following branches on your device?
With 3, on my Adreno 640 I am back to 0.5 mb in the MRP. With 2 I get 8mb instead of 16. If my theory is correct, 1 won't do anything, 2 will help a bit, but not much, and 3 will fix the problem :)
unroll
my WIP patch
Moving all settings to spec constants
```
Adreno Offline Compiler (AOC)
-----------------------------
AOC Version : 2.0
Compiler Version: E031.42.11.00
======== Shader Stats FS ========
Total instruction count : 458
Total instruction count : 4049
Compilation succeeded.
```
I just tested branch 3 with our full project on my S23, and it no longer crashes. Unfortunately, it still uses more than 1GB for pipelines. It might still kill the game on lower-spec devices, but I can test the full game on my phone again, so that's something.
@r-eckert It looks like that branch is blowing up the number of surface, draw, and specialization compiles. Are you changing quality settings at run time by any chance? Or any of the following:
It's odd that your pipeline size is only cut by half, so there must be something in your project that isn't captured by the MRP. Is there any chance you can produce an MRP that is more representative of your project, or provide us with access to your project for testing? At this point I don't think it is worth spending any more time investigating this issue, since it only appears with your device and your project and we don't have access to either.
Tested versions
System information
Samsung Galaxy S23 Ultra, Android 14, Vulkan (Mobile), Adreno 740
Issue description
After trying to update our project to Godot 4.4, we found that it crashes while loading on my Android phone.
The profiler showed that graphics memory exceeded 4GB before the app closed, while the game running on Godot 4.3 uses only around 800mb of graphics memory.
I have bisected everything between 4.3 and 4.4 and traced the problem to #90400 getting merged.
Here is a memory report generated by RenderingDevice.get_driver_and_device_memory_report from a version of our game with lots of content removed so it starts at all:
You can see that device memory for pipelines is 2668.945mb. The same line in Godot 4.3 (and 4.4 before #90400 got merged) shows a little over 1mb.
I modified my engine build to log every allocation related to pipeline objects and found that it allocates a block of 24mb for some of the pipelines. Adding up those 24mb allocations gives me pretty much exactly the excess memory use compared to a build without that PR.
I suspect that it is related to the ubershader that is used while the optimized pipeline is compiled.
A hacky attempt to disable the feature by preventing the "define UBERSHADER" in the shader from being set made the weird allocations disappear.
But I am not that familiar with the code yet, and I am also running out of time that I can invest in this problem, so hopefully someone here can find a proper workaround.
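The tallying step described above can be sketched in Python as follows. The actual allocation log is not attached to this issue, so the sample list is made up purely to show the grouping; only the 24 MB block size comes from the report.

```python
from collections import Counter


def tally_by_size(allocation_sizes_mb):
    """Map each distinct allocation size to (count, total MB)."""
    counts = Counter(allocation_sizes_mb)
    return {size: (n, size * n) for size, n in counts.items()}


# Made-up sample standing in for a real allocation log (sizes in MB).
sample_log_mb = [24.0, 0.25, 24.0, 0.5, 24.0, 0.25]

# Print the largest allocation sizes first so the outliers stand out.
for size, (count, total) in sorted(tally_by_size(sample_log_mb).items(), reverse=True):
    print(f"{size} MB x {count} = {total} MB")
```

Grouping by size makes a repeated large block stand out immediately, and its subtotal can then be compared with the difference between the two memory reports.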
The attached MRP contains just a camera looking at a single cube and a script to print the memory report.
On my device this already uses 72mb for pipelines. Strangely, on a OnePlus 6 it was only 6.6mb, and on a OnePlus 8 it uses 12.6mb, which still seems excessive compared to 1mb. Apparently this depends heavily on the hardware or driver version.
I could not test yet how this scales with the real project on those other devices.
I understand that this is likely related to a driver issue that we can't fix, but maybe it can be mitigated somehow? If not, maybe ubershaders could be deactivated depending on hardware or through a project setting?
Steps to reproduce
Minimal reproduction project (MRP)
pipeline_memory_mrp.zip