-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Cesium3DTilesetStatistics improvements #12974
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
Beilinson
wants to merge
9
commits into
CesiumGS:main
Choose a base branch
from
Beilinson:3d-tileset-statistics-performance
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+43
−16
Open
Changes from 6 commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
dd82a80
Remove Cesium3DTilesetStatistics.textureReferenceCounterById destructure
Beilinson 1f2458c
tileset traversal visit fix
Beilinson 729a10b
Revert "tileset traversal visit fix"
Beilinson 5150e74
Changelog
Beilinson 137d023
Merge branch 'main' into 3d-tileset-statistics-performance
Beilinson f77f097
Merge branch 'main' into 3d-tileset-statistics-performance
Beilinson 2f5822b
Snapshot example
Beilinson 617a9e5
Fix undefined statisticsLast
Beilinson c1b4757
Merge branch 'main' into 3d-tileset-statistics-performance
Beilinson File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems Github ate this comment and didn't post it with my original review. This is my main remaining concern with this PR that it's no longer a full clone
What do you think about cloning the map here instead of just passing by reference? In some limited testing it seemed like it was still a big improvement over the spread
...but also stays more true to the meaning ofcloneby actually cloning?Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just compared all three options using your sandbox and
CPUPROFILE_FREQUENCY=1000, the results:Before this pr:

Clone with new Map (this suggestion):

No deep clone (my version):
Don't have anything to show here because it doesn't show up on the performance sample
This change makes it no longer a "deep clone", it still is a "shallow clone", and I dont think the deep behavior is needed for a completely internal part of this class which has zero effect on any part of the external runtime or the end result when comparing statistics before/after rendering. IMO, even if its only 4% of total cpu time, thats still 4% that has no added value on the outcome of the statistics counting.
This element seems to only be used by the new version (the clone) and never the old object, the actual
texturesByteLengthwhich this is attribute is used to update is copied over.For context, at 4% this unnecessary clone takes more time than Matrix4.multiplyTransformation or binding the vertices to webgl:
4% rendertime isn't critical, but I think if many of these small percentage improvements can be made that will make a big long-term difference.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jjspace You mentioned the cloning aspect in the now-closed PR at #12968 (comment) (but that referred to the credits, so I'm not sure if this is what you meant...?)
I'm also in strong favor of avoiding surprises:
cloneshould clone (no shallow copy with unpredictable side-effects).But... I had a short look at where
Cesium3DTilesetStatistics.cloneis actually called and how it is used.One call is in
update(every call is in someupdate😆 ). It's not really documented what's happening there. But it seems to fill sometileset._statisticsPerPassarray. This array, in turn, does not seem to be used anywhere.The other call is in
raiseLoadProgressEvent. There, it fills sometileset._statisticsLast. But after that, it only seems to care about thenumberOfPendingRequestsandnumberOfTilesProcessingthere (and certainly not about that map with the texture IDs).Sooo... unless I'm overlooking something, the places that are using that
clonefunction are doing a lot of unnecessary work to begin with, and on top of that, none of them seems to depend on thatMap. I'm sure there is some room for improvement.One point that might be relevant for real peformance comparisons: What's in that
Mapafter all? The performance will to some extent depend on the size of that map. On the other hand: When there are 10 textures, then the map is small, and when there are 1000 textures, then there is other costly stuff - so I'd expect this to not be a bottleneck either way.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@javagl The map gets pretty big (hundreds of entries) of
[string-number]pairs.I checked the instances of the statistics cloning and saw the same as you reported. Either way, the clone is used more as a "Lets get a snapshot of what the statistics currently looks like", wherein the map is not useful.
_statisticsPerPassis used in the specs, and also in theCesium3DTilesInspectorViewModel.getStatistics. Here all the internal properties are needed except theMapagain, because these act as snapshots of what the statistics looked like while processing the pick/render/whatever pass, and aren't actively used for calculating anything.Ideas:
.snapshotor open to other suggestions) that only passes around these values without copying the map. It would return an object that doesn't have the functions to increment/decrement the values, representing that its a static snapshot object that shouldn't be worked on directly._statisticsLastwould become an object containing onlynumberOfPendingRequestsandnumberOTilesProcessing, and the passes would copy over everything except the map. I think the snapshot model is cleaner.thoughts @javagl @jjspace ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed an example of what the snapshot would look like, note that the passes and last statistics now are no longer full statistics objects, so they dont contain functions to increment/decrement counts or any other logic, just container objects. They also don't contain the map.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion, but a few random comments for now:
Cesium3DTilesInspectorViewModelon the radar (usually only searching in./packages/engine/Source...)_statisticsPerPasscould/should carry a comment saying that it's only for the UI. And for me, the inlined "block comments",// Rendering statistics,// Loading statistics... etc. are screaming that there should beRenderingStatisticsandLoadingStatisticssub-structures.The 'snapshot' approach looks conceptually clean for me, but implies new structures (first and foremost that new type,
...Snapshot) and several changes that may warrant some explaining (maybe even just/** A snapshot is a shallow copy ... */or so).If this was only about the
raiseLoadProgressEvent, I could imagine that storing both relevant fields explicitly, astileset._lastNumberOfPendingRequestsand_lastNumberOTilesProcessingcould be an option a well. (Yes, this "litters" the tileset even more, but ... now there's thatstatisticsLastwhich carries mostly unused stuff, so both are not ideal anyhow...)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree about statisticsLast (although future logic may want to use the other attributes for certain logic maybe).
But yes it's both used for specs (which in my opinion would preferably be something hidden by a debug flag) but also for the inspector view to debug either the pick/render pass (from what I saw nearly every member other than the map is used).
I would prefer not to stretch the structural changes in this PR much more than I have already, but let me know what you think is best, just comment the areas for future reference or something different
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, no strong opinion (I assume others will chime in here), but
Map: GoodA compromise:
Iff the performance benefit is really worth it, maybe just renaming the
clonetoshallowCopycould be reasonable? (It's an undocumented/internal function after all.. )Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im fine with that,
shallowClone+ comment explaining that the clones should not be worked on directly sounds good to me.I could also set the map to
undefinedso if in the future someone does try to use one of the clones it shoulderror out.thoughts @jjspace /and or others?