
Conversation

@jjhembd (Contributor) commented Dec 5, 2025

Description

This PR improves memory efficiency in VoxelPrimitive by reworking Megatexture to use the Texture3D class from #12611.

Megatextures

Unlike Cesium3DTileset, which renders each tile with a separate DrawCommand, voxel tilesets render all tiles in a single draw command. This is necessary because of the raymarching approach: foreground tiles affect the rendering of background tiles. As a result, all tiles in the scene must be loaded into a single texture, via the Megatexture class. Different tiles are assigned different regions within the same Megatexture.

Previous Megatexture implementation

Prior to this PR, the Megatexture implementation was backed by a 2D texture. 3D voxel tiles were split into 2D slices, which were then arranged across the 2D texture. This structure made VoxelPrimitive compatible with WebGL1 contexts, but had some drawbacks:

  • Shader complexity: To render a voxel with linear interpolation, we could not rely on hardware sampling; we had to explicitly sample 4 points from two different slices and interpolate them in the shader.
  • Memory inefficiency: The 2D texture was always sized as a square, with width and height each a power of two. This restriction, combined with the complexity of slicing 3D tiles onto a 2D surface, led some datasets to allocate as much as 75% more texture memory than needed.

New Megatexture implementation

This PR reworks Megatexture to use Texture3D, and removes restrictions on the size of the texture. This simplifies the shader code, since we can directly use built-in graphics APIs for both linear and nearest sampling. Also, the texture can be allocated to more closely fit the size of the actual data.

How 3D Textures are sized

The data for each tile is like a 3D box, and the Megatexture is like a bin into which we are packing the boxes. Bin packing in general is a hard optimization problem. The approach used here starts from some simplifying assumptions:

  • All tiles are the same size (it is an implicit octree)
  • For a given maximum number of tiles, allocating <25% unused space is "good enough"
  • For a given hard memory limit, utilizing >80% of the limit is "good enough"

All sizing is done based on the maximum number of tiles, which is either a value from the VoxelProvider, or the available memory divided by the memory size of one tile.
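
In sketch form (provider.maximumTileCount and availableTextureMemoryBytes are actual names from this PR; bytesPerTile is an illustrative stand-in):

const maximumTileCount =
  provider.maximumTileCount ??
  Math.floor(availableTextureMemoryBytes / bytesPerTile);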

We first check for a special case: if all of the tiles can fit in a single 1x1xN stack, without the long dimension exceeding GL_MAX_3D_TEXTURE_SIZE, then the texture is allocated to that size. This guarantees no wasted space, because this shape can be made to fit the tiles exactly. We make sure to stack the tiles along the smallest tile dimension, to increase the chances of achieving this optimal case.

If more than one row of tiles is needed (GL_MAX_3D_TEXTURE_SIZE tends to be smallish), we then proceed as follows:

  1. Find a factorization of the tile count into a product of small primes (2, 3, 5, or 7). The tile count is rounded up to the nearest such composite number, as needed. (There are some further simplifications: we are not checking every possible product of those primes.)
  2. Distribute the factors into 3 dimensions, starting from the axis of the smallest tile dimension, with the constraint tileCount[i] * tileDimension[i] < GL_MAX_3D_TEXTURE_SIZE.
  3. Compute the resulting texture size as textureSize[i] = tileCount[i] * tileDimension[i]. (See the sketch below.)
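
In sketch form, the heuristic looks roughly like this (illustrative only; the actual Megatexture code is structured differently and takes more shortcuts in the factor search):

// Round n up to the nearest number with no prime factor larger than 7,
// and return its prime factors.
function factorsOfNearbyComposite(n) {
  for (let m = n; ; m++) {
    const factors = [];
    let r = m;
    for (const p of [2, 3, 5, 7]) {
      while (r % p === 0) {
        factors.push(p);
        r /= p;
      }
    }
    if (r === 1) {
      return factors;
    }
  }
}

function sketchTextureSize(tileCount, tileDims, maxSize) {
  // Axes ordered by increasing tile dimension.
  const axes = [0, 1, 2].sort((a, b) => tileDims[a] - tileDims[b]);
  // Special case: a 1x1xN stack along the smallest tile dimension fits exactly.
  if (tileCount * tileDims[axes[0]] <= maxSize) {
    const size = [...tileDims];
    size[axes[0]] *= tileCount;
    return size;
  }
  // 1. Factor a nearby composite of small primes, largest factors first.
  const factors = factorsOfNearbyComposite(tileCount).sort((a, b) => b - a);
  // 2. Distribute factors, preferring the axis with the smallest tile
  //    dimension, subject to counts[i] * tileDims[i] <= maxSize.
  const counts = [1, 1, 1];
  for (const f of factors) {
    const axis = axes.find((i) => counts[i] * f * tileDims[i] <= maxSize);
    counts[axis] *= f; // the real code throws if no axis can fit the factor
  }
  // 3. The texture size is tiles-per-axis times pixels-per-tile.
  return counts.map((c, i) => c * tileDims[i]);
}

// e.g. 11 tiles of 32x32x32 with maxSize 256: 11 rounds up to 12 = 2*2*3,
// and the factors distribute to a 6x2x1 tile grid, i.e. a 192x64x32 texture.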

Other changes

  • Updated ContextLimits documentation and specs to assume the default WebGL2
  • Added ContextLimits.maximum3DTextureSize
  • Added Texture3D.prototype.copyFrom method, following the similar method from Texture
  • Cleaned up Voxels Sandcastle example
  • Removed some WebGL1 workarounds from voxel shaders (voxels require WebGL2 since #12933, "Use eye coordinates for voxel raymarching")

Issue number and link

Resolves #12570

Testing plan

Run all specs locally. Changes in ShaderBuilder should not affect non-voxel rendering, but we should verify this.

Load all voxel-related Sandcastles, and verify:

  • Rendering is unchanged
  • Framerate is not affected (let me know if it improves!)
  • Picking works
  • Nearest and linear sampling both work as expected (Open the VoxelInspector to toggle this)

Author checklist

  • I have submitted a Contributor License Agreement
  • I have added my name to CONTRIBUTORS.md
  • I have updated CHANGES.md with a short summary of my change
  • I have added or updated unit tests to ensure consistent code coverage
  • I have updated the inline documentation, and included code examples where relevant
  • I have performed a self-review of my code

github-actions bot commented Dec 5, 2025

Thank you for the pull request, @jjhembd!

✅ We can confirm we have a CLA on file for you.

@jjhembd marked this pull request as ready for review December 8, 2025 15:16

@jjhembd (Contributor Author) left a comment:
I added some review comments in key areas, and in places where I need feedback.

} = options;

if (!context.webgl2) {
if (!context.webgl2 && !defined(context.options.getWebGLStub)) {
@jjhembd (Contributor Author) commented:

We need to allow a WebGL stub, since that is what we use for testing. Much of the following code can still be reasonably tested in a stub.
I'm open to feedback on where to allow the stub. Should we try to set a .webgl2 property on the stubbed context? Or, since WebGL2 is now the default, should we be switching to a context.webgl1 flag?

@donmccurdy (Member) commented:

I suspect the stub would ideally be shaped like a WebGL2 context for forward-compatibility. Does that create other issues in tests? The !defined(context.options.getWebGLStub) check above also seems reasonable, if so, but I would just add a comment explaining why it's there (perhaps with a link to an issue) since that may not be obvious to the next reader.

@jjhembd (Contributor Author) replied:

The Context.webgl2 flag is set as follows:

const webgl2 = webgl2Supported && glContext instanceof WebGL2RenderingContext;

It will be hard to make the stub pass that instanceof check. This makes me think a replacement of the .webgl2 flag with a .webgl1 flag is the best route. However, that would be a breaking change... so maybe it's better to just accept that support for both WebGL1 and WebGL2 will be messy, but temporary (we will drop WebGL1 someday).

I added a comment clarifying the check in Texture3D.

Check.typeOf.number.greaterThan("width", width, 0);

if (width > ContextLimits.maximumTextureSize) {
if (width > ContextLimits.maximum3DTextureSize) {
@jjhembd (Contributor Author) commented:

This appears to have been an oversight in #12611. It matters for large textures: maximum3DTextureSize tends to be smaller than maximumTextureSize.

* }
* });
*/
Texture3D.prototype.copyFrom = function (options) {
@jjhembd (Contributor Author) commented:

Compare to Texture.copyFrom. The Texture3D version here supports fewer source data types.
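
For illustration, usage would look roughly like this (hypothetical sketch; the option names mirror Texture.copyFrom, and the depth/zOffset options are my assumption for the 3D case, not confirmed API):

texture3D.copyFrom({
  source: {
    width: 4,
    height: 4,
    depth: 4, // assumed 3D extension of the Texture.copyFrom source
    arrayBufferView: new Uint8Array(4 * 4 * 4 * 4), // RGBA bytes per texel
  },
  xOffset: 0,
  yOffset: 0,
  zOffset: 0, // assumed 3D offset
});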

Check.typeOf.bool("nearestSampling", nearestSampling);
//>>includeEnd('debug');
if (this._nearestSampling === nearestSampling) {
return;
@jjhembd (Contributor Author) commented:

The texture.sampler setter makes some GL calls, so when possible, we exit early to avoid calling it.

@donmccurdy (Member) commented:

Not sure if I understand why avoiding GL calls would be preferable here. Is it likely that this setting would change frequently? Or does it make testing more difficult? Would it make sense to have a Sampler.LINEAR preset as well, and avoid Sampler construction entirely here without needing to check an internal cache?

@jjhembd (Contributor Author) replied:

The .nearestSampling setter is called every frame, from VoxelPrimitive.prototype.update. The redirection from a setter to a per-frame update method is how we debounce user input. Then, by checking for changes, we avoid extra GL calls every frame.

I suppose we could instead check for changes in the VoxelPrimitive.prototype.nearestSampling setter; let me know if you think that would make more sense. My initial thought was that nearestSampling is really a texture thing, so it made sense to do the state/change tracking in the class that is closer to the texture.

A Sampler.LINEAR preset might make sense. It's less obvious to me what the default edge conditions should be: many linearly sampled textures might use wrapping rather than clamping. If so, we would almost need to include the wrap conditions in the name, e.g., Sampler.LINEAR_CLAMPED?
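
For illustration, the early-exit shape looks like this (simplified sketch, not the exact PR code):

Object.defineProperty(Megatexture.prototype, "nearestSampling", {
  set: function (nearestSampling) {
    if (this._nearestSampling === nearestSampling) {
      return; // called every frame; skip the texture.sampler GL calls
    }
    this._nearestSampling = nearestSampling;
    this._texture.sampler = nearestSampling
      ? Sampler.NEAREST
      : new Sampler(); // Sampler defaults: linear filtering, clamp-to-edge
  },
});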

* @returns {Cartesian3} The computed 3D texture dimensions.
*/
Megatexture.getApproximateTextureMemoryByteLength = function (
Megatexture.get3DTextureDimension = function (
@jjhembd (Contributor Author) commented:

This function is not used outside the class, but we expose it for testing.

@donmccurdy (Member) commented:

Possibly worth adding this as a comment on the method and/or adding a JSDoc @ignore hint? OK with me either way though, I see the class itself is private.

@jjhembd (Contributor Author) replied:

I added a comment and a @private tag. I know the tag is technically redundant, but it's to communicate the intent to the reader. I don't see much use of @ignore elsewhere in the repo.

inputDimensions,
types,
componentTypes,
// TODO: refine this. If provider.maximumTileCount is not defined, we will always allocate 512 MB per metadata property.
@jjhembd (Contributor Author) commented:

Seeking feedback here. What if the tileset has 10 metadata properties? This would allocate 5 GB. Should we default to a total memory per tileset?


it("shows texture memory allocation statistic", function () {
expect(traversal.textureMemoryByteLength).toBe(textureMemoryByteLength);
expect(traversal.textureMemoryByteLength).toBe(32);
@jjhembd (Contributor Author) commented:

The new allocation is smaller if the data is smaller than the suggested byte length.

@donmccurdy (Member) left a comment:

Still reading through things, but just a couple initial comments!

@ggetz (Contributor) commented Dec 10, 2025

For context on the matter of WebGL2/WebGL1 support: while we're not likely to fully deprecate WebGL 1 support in the near future, our implicit policy is that newer features like voxels don't need backwards compatibility with WebGL 1. The rationale is that we default to WebGL 2 at this point, and we want to be able to take advantage of newer features available to us, like 3D textures.

Additionally, the voxel APIs in particular are marked as "experimental", which means they are subject to breaking changes without deprecation.

(The latter is documented in our Coding Guide. Perhaps we should also add a note about our WebGL 1/2 "policy".)

@donmccurdy (Member) left a comment:

When loading the "Voxel Picking" demo and selecting the "Cylinder" dataset, I'm seeing a small rendering difference, near the center of the cylinder:

Before: [image]

After: [image]

After moving the mouse around a bit, the new version updates to match the old version, so possibly something isn't being fully updated during initialization?

Similarly there is a rendering difference on the "Voxel Rendering" example, a change in transparency and interpolation, though it seems more plausible this might be intended:

Before: [image]

After: [image]


Comment on lines 372 to 376
if (PixelFormat.isCompressedFormat(this._pixelFormat)) {
throw new DeveloperError(
"Cannot call copyFrom with a compressed texture pixel format.",
);
}
@donmccurdy (Member) commented:

Question — I believe this is just currently unsupported, and doesn't represent a technical blocker or a decision not to support compressed formats. Would it be worth hinting at that in the comment in case a user runs into it and is willing to try a PR?

Suggested change
if (PixelFormat.isCompressedFormat(this._pixelFormat)) {
throw new DeveloperError(
"Cannot call copyFrom with a compressed texture pixel format.",
);
}
if (PixelFormat.isCompressedFormat(this._pixelFormat)) {
throw new DeveloperError(
"Unsupported copyFrom with a compressed texture pixel format.",
);
}

channelCount,
componentType,
availableTextureMemoryBytes,
availableTextureMemoryBytes = 134217728,
@donmccurdy (Member) commented:

Perhaps worth keeping the explanation of the default from below:

Suggested change
availableTextureMemoryBytes = 134217728,
availableTextureMemoryBytes = 134217728, // 1024x1024 @ 128bpp



Comment on lines +370 to +371
// Find a nearby number with no prime factor larger than 7.
const factors = findFactorsOfNearbyComposite(tileCount, maxTileCount);
@donmccurdy (Member) commented:

I'm curious why small prime factors are preferable over power-of-two dimensions, or arbitrary integers? It makes sense to me that limiting ourselves to square textures could increase memory but I'm less sure about this part.

@jjhembd (Contributor Author) replied Dec 24, 2025:

"power-of-two dimensions" normally refers to the actual pixel size of the texture. This proves to be very memory-inefficient in 3D. For example, suppose we have a single-tile dataset with pixel dimensions 17x17x17 (total 4,913 pixels). A power-of-two restriction would allocate 32x32x32, or 32,768 pixels, over 6x as much memory as is necessary.

This PR removes the power-of-two restriction on the actual pixel size, allowing single-tile datasets to allocate exactly the required amount of memory.

The "small prime factor" restriction applies to the number of tiles. Suppose we have a dataset with 9 tiles (for example, a simple octree with 2 levels of detail). If we required the number of tiles to be a power of two, we would have to allocate enough memory for 16 tiles, or almost double the actual need. The code in this PR would allocate space for 10 tiles (assuming the tiles are too large to fit all of them in a 1x1x9 stack), of which 10% would be "wasted"/unused. We could theoretically allocate a texture that would fit a "pancake" of 3x3x1 tiles, but the current code is too simplistic to find that solution. (Only numbers with at most one non-power-of-two factor are considered.)

The current setup for the tile count can be thought of as "power-of-two" with additional options in between powers of two, computed as 5/4, 6/4, and 7/4 the size of the previous power of two. For example, for a dataset with between 8 and 16 tiles, we can select from the options (8, 10, 12, 14, 16). With this setup, the maximum allocation will be 5/4 the size of the input data, or 25% more than necessary.

We could in theory add more options. The series (9/8, 10/8, 11/8, 12/8, 13/8, 14/8, 15/8) would reduce the maximum excess allocation to 12.5% more than necessary. Or we could allow arbitrary integers, which would eliminate all excess allocation. However, then we would run into problems packing large prime numbers into a box with no dimension exceeding GL_MAX_3D_TEXTURE_SIZE. For example, suppose the input has 13 tiles, each with pixel dimensions 32x32x32. The only way to pack these into a box exactly is to define a box that is 1x1x13 tiles (32x32x416 pixels). However, some contexts may not allow a 3D texture with more than 256 pixels along a side—see GL_MAX_3D_TEXTURE_SIZE in glGet.

In the above 13-tile case, if GL_MAX_3D_TEXTURE_SIZE is the minimum 256, this PR would allocate a 1x2x7 tile texture (32x64x224 pixels), which is one tile larger than the input data, but fits within the maximum dimensions.
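
To make the 25% bound concrete (quick arithmetic, not PR code):

// Candidate tile counts between 8 and 16: 4/4, 5/4, 6/4, 7/4, 8/4 of 8.
const candidates = [4, 5, 6, 7, 8].map((k) => (k * 8) / 4); // [8, 10, 12, 14, 16]
// Worst case is needing just over a power of two, e.g. 17 tiles -> allocate 20:
const waste = (needed, allocated) => (allocated - needed) / needed;
console.log(waste(17, 20)); // ~0.176, approaching but under the 25% bound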

@javagl (Contributor) commented:

I know, it's none of my business, but I hope that it's not toooo distracting: I was curious and pulled that function into a sandcastle, and tried to feed it with some of these numbers...

https://sandcastle.cesium.com/#c=lVhtc9o4EP4rWz50ICEQ4NLMXEKmubSZ6UzbZC65T8AHYURQzpYYWSaQTv77rSRblmxDc51M8cvusy96drVWJHiqICFblmTJ6Msj3apM0gf2SmEMw7NPF1M+5f2joymHI7gRyTpTFAiMvoCyorBgCeUpExzUiiiIBFeE8RTvKDyxDeXAs2ROJYglKBbTtAtCAknRKN/ZJ/ouIhyWTMELUyvGjTbZEBaTeUydrYQmQu56xpnPayJJAr9uiFQ0ZYSP3gzal8KfFB5XnnupdkBwaoSA+W5txJbGaQhr377BfIfo91Q+kGSNehqyVDQvYY03BgJSI9QM5KLJc/zDxPKXQbjmkK01TMwSzIGw8YdRQ6oXBf02RpttTHRsNyLjamY8VUKRuJp/yNNrbqjKkSRFW5ikIJ8aI7KLvmhcc+tHf8ojw6MnqhyH3DogkZYZj5S+bKN4ZZW6+lGYZfPoQL66BYoJFe868Es/sk6oun1OX+AGg8qSXhleu3OhlTTBAfp9uGV8YRJTzdcLLdjZ86zgK10m37hdwrF+VQ0EExsG29vWH+3qj14vSkNYm49FqBjKD6JWvWUshGxbi4d41a+6aTJn42ZLaAfYlzDI8whg2VBLpVF8sxnT+nlOF3TJOF203ZJ0HJCqup4wXsp1g+isX2+AtUib9H1Z3xNcuwchlSO1X/NLKRJIExLjQqKEgJjIJ7z0VjJFXbrw2sYY7ubPNFI9ypVkNG2Hq9PpaY08+23ShXkHxldAJoMZnMAcf1yWc+8esXqJlIQ/Mf4E6EtZiATLmj/FusziLMH7WKCI6X5bZlqWvi4DCMLzgzAoNwbkO+VPaoVhVCObnM60k0dlVh0TGvQvx437Qrm0FXJMmsydzmbakRr6xXsxBg6j4c1g9m6c4V6coYejkymwg8fiqT1tPdx/vfl2/R1urh++fpi2OhfvLo6inRBsPUTOd0VP0bsbcAFriVqwJJHCvdBQUurtk8O5t6b2teYjltfi1t7dLX8aRL0Xi5QpeqiaiupYx8yUR0odKONYDKZiVpL6JdOFfylds5yFdgcp+r7egnJMS8ncay2Zc0WTmpJoVWHpEuNsx1QBw3BOL/DnEkb4c3zs+GSDnmjedyE2LGlcL+YvVtgfr3X1XOvCCRpl43jTz210AjRt/dblHfczZ/gWO0n+pp0nsdtsvLOPlBp8ZmHvpVhkkbpbFpCe4Q5WaOxXSb8f8tJ3ctrq+k539qhYPCMdht2E7UD3Ce2JTKs1vyrIaKjzN+5QGwpZikNFnsmefelY8mxZ8owsKSTyjMCzRxko3k6esfeO/UTgk9z5N68u95fuWzDpXq/XUiBpiJ52uS4WqksUC5lg7Zql073ZFEFQzCm0h10YdeEMM4gN4LxTTGqSbRCtcWzjZtAqjGBdztFq6cGiUQmp9xOuvbKz2i8rhsVnNhCHoIegdCWyeIG9RwHdRrQAdWOfhZ3M7NBHt2sclblyO1AYYxgi2hQpdXlhaYN5hOH5qOiGwYNdjZvi+pkvNffHh+KV27y8UsdR8MMYB75yl1pJ8WIGwC90Q2OBU/ZXKfX01OLGfZPHJMPan7s1SItmnw8j2ggPBqT/B4vfRTEleD1wwPq/FLcDXKx26W1E0Nrgz4LdOV0np13I/1zv04LDmuCgWXDUhDjwBd/KbQdLfMjDBmou9XN0tePNpmtJN0xk6b14ofJu+fgi9FcjHB1ZEE9S4qS4yHArwbXBKamu6MnqRZAbI2uw2tajExha22YbyWW+8QXdFs5GlMVtZ6nvhHyXOZa+YxkqNoRwXEE/cveOcSHKlU9V0+L+SWleNhbdfEVZYfw2VpQsrGhg6OSkTo1AIKTJqVtUNFgPo7riJonN9BgcRsIEnGE2/2hE1MtiIAf72bkP9dMh1EGFoTUq70M9/72v2l3He7M1uKZ0eMs3huy+76SCj098ERRDYxEF2j45g9LTrc63ivNrN8TvBAWWEMZxbPO/e31DmJVKZWoiN6hdjW0K4eNHCOyPZsj00+o34v7Ulr2z2cqnRiuDw1aqtPitlbNGK8P3xDII2+P+SqpRqDbe+cZzq7qNFXvmGAZBW8UAyil04rbbc+tN0zDt67hZyR+vC1NH40AWx2nNiiA5rB50rl3EWp6BBN/Nzecso2HX/Gm6WsXKKYmNPh/ADxxnoNxp/u/C96E4IBiM3GPcBbJY2Vm7fiTVdAxVPYOC3x1CQXAKVQRXDMnWgTJkpXcp+7C3zQ9m/HOhUnBXCu7qgrtS8LUUfK0LvpaC+hvu0Zw4jKGtzBGUOXR6rTqNk70Rm7aO1fZ42jIXu+LiVV/YD0J961DtcN/qti5TtYvplc7LZ4b7nVSQybjd6/UVxZziKJv251n0L1W9KDVql/1C5XLBNsAW42krMgS6sefIVKLNKCZpim+WWRzrb7dp6+qyj/KBWizIAmv/bkNlTHZaZDW4+m4f9nq9yz7e1rWUEPGcSA/xPw

To my understanding, it's putting "three and a half" tiles along one dimension. Did I miss anything here?

@jjhembd (Contributor Author) replied Dec 24, 2025:

Thanks for checking, @javagl! That was clearly a bug in the getDimensionFromFactors helper, which would have been an issue in contexts with small maximum dimension, with small tilesets/small memory limits, where the desired tile count may include only one factor of two. I just pushed a fix. Here is an updated Sandcastle demonstrating the fixed behavior.

The current code can still be broken if GL_MAX_3D_TEXTURE_SIZE is small and the size of each tile is large. However, it should throw an error with a reasonably clear error message in that case.

@javagl (Contributor) replied Dec 25, 2025:

Admittedly, I didn't do the math of this factor computation, e.g. what exactly is done there with these primes (and why...). And I'm lacking some context (e.g. I was surprised that this bytesPerSample was only relevant for the total size, and I don't know what "usual" tile dimensions are, or what usual maximum texture sizes are...).

But I was somewhat curious: Given that the tile sizes are all the same, I think that this might not even be an NP-complete problem.

First, I thought that one could compute the "wasted space" somehow directly. But the maximum texture size, the different tile dimensions, and the resulting (different!) maximum numbers of tiles along each dimension make this more tricky.

I still think that one could map this to something that can be solved deterministically, or at least approximated very well with standard algorithms. Specifically: I have a veeery vague feeling that there is some "shortest path computation" hidden in that.

But I didn't go to the level of pen+paper right now. As a "quicker shot", I thought that ~"some greedy approach" could do it. These can be pretty good for this sort of problem, and often only deliver "bad" results for "very unusual/extreme" configurations. But ... (coming back to the lack of context:) ... I don't know what "common" or "uncommon" configurations are.

I casually hacked around a bit: I created one exhaustive search (searching for the configuration that minimizes the "wasted space"), and an additional simple "greedy" one (that just greedily fills the available space based on the sorted dimensions), and added some corresponding "test functions" around that. It's not really productive in some way, and certainly doesn't count as "Cesium time" - rather some recreational thing - but I'll dump the result here...

https://sandcastle.cesium.com/#c=7Rr9b9s29l9h88POTlQ7TtZ0WOxgubQ9BOiSoMndcIiDA23RsTqJMvTh2B78v997pEiRFOWkxda7G24YGkt833xffNQ05XlBErqKkjI5fnfHVkWZsdtow8iIHL05OR3zMe/v74852ScXabIoC0YoOX5HCglKwihhPI9SToo5Lcg05QWNeA5PjDxGS8YJL5MJy0g6I0UUszwgaUZoDkz5Wr7BpynlZBYV5Ckq5hEX2HRJo5hOYqZ5JSxJs3VPCPPTgmY0Ib9d0KxgeUT58VZQe6fkycnd3BAvRwFSzgQQiUyxlumKxblNVq5uyWQN1G9YdkuTBeAhyRpRLJIFPAgSJBdAfkJam8rGPwtd/ioonHNSLpBMHCVgg1Tqb2tNctwUkFsw9fO4R90u0pIXD0LSIi1o7NqfVOYVD6yoKGUMeIGRLHsijanc9NC751KO/phPhR89skL7kN6H6yx6jDjIMSKzkk8LdJUOoDm7FeAr29ri1Q67BYqKUHnMu+Q3fCNlKRwxgD1nT+QCdCuTXq1lpyt8nJB+n3yIeChs45rsiSkH7RkcYAkj5ZLLXRzhkqsD2NbWs7dqvlo3X21Oa0YQnndKS1DjZ1rMe7M4TbOO5LjLtfqumIjSFcSjGelYpIdkUJmQEOkPDSsKxK00GOJX5gzZLOIs7OjN6GpChSt5EvEaLrCUk3JtCUQj8+GbsKYksHW3aVZotzajfpalCckTGsM+AkRKYpo9wk9jI3PAZaGROEbkevKZTYse40UWsbxjb063hxiV8Ts0IJMuGZ0Rej94IK/JBP4oI1fC3UH40iyjHALhkYAodSRSiGv+GGOcxWUCz3EKICL9rSKRs/B3Lb+lnamDoHIhiHxk/LGYgxauYveHDyjjfm1U7Qge/OHIWxjqnXV8497H7vDhAQVpUD99KY2BpuFZGTy8mM5RK50jgw4aM4UUHqePnfHe7c37i8vzj+Ti/Pb9q/Fe9/TFsaGSCYWkQ7PJWmUULG+Ep2SRARaZ0WkBxVB4ZIb1k5O3xp7KZXRHiK7wg3y6nl0JiliM0zwq2K5gUsGxiCMRHTnTRCMOsSACZp4xM2IC8itji6jyQllCVOLHGlTRlC5ZSY2Qla+gUzM6nTteOgM9OzErSATqHJ7CnyE5hj8HB9qfpNL36PcBiYWXePcrMjfLzo7nGD3nGDhWmvT2N/2KR9eihtw/aLtDQdOMP0AiqVY6lREDP/Num1Mi8QdJ9iZLw3JaXM8USYNxFyI0NqOk37f90hRyvBeYQndbUCQ9AW2r7aOtibYBtWiGaP4l5YzCdT5BfVoyUubQVVSW7MlF7SWfpZd8Bi9REJVFyGfDZYhavf8MqXdkGgLeVMJvjbhsD92t1eqeLxZZCk5Dsd3lGCwMQxQCmULsiq3D3CyCwArmnHSOAnIckDdgQUgAb7uqVcuiJVDz9m1cdFqKCcTlBLjWEoReJHC9K3JuhJ3EfppHEHyigGgK2P7k87SMQ8g9BWGrKVNEdd8nyd4/yK6PrRbQK/NCVyBbR1tF4JnmTNslyj3sgQyvekXdBe7MalwE11W11dzsHtSSLl5GqEMT+GoErV5dpeZZ+iRav3dsyeIU2uz3WYa90x4X4gs7JiXE/kTvQa6SfdWLIBNu9UdfRhYORjGj8HugCeM/OZQD2KxOLe2UArfBj8q7K3e9PwxI9b/OfQh41AAc+AGPfRQHJuC2LjsQ4kfcTqDiJ74HUbtGZ7rI2DJKy/wmfWLZ9ezuKcVjI9nfl0QMyAwaxbAUpwAOTVIT0YDFTciWAlbQ6kiJXpMjyVuUkQrmkodspYSdsijuaE59DWSKzCH0tZcBokeFA4f6vn7WHmdTOTNdVaS4v+esChtJXRyjJDAcjgtGQwlqMXr9uukaFoDtJod6U4FhUw13x4UR/e4x2E0JDPAGrPm9lyJuiyA5aPfONqonu6gOHA9tuHIb1bfPy4riar8XpUEnpd0lXzCSdV9DWcdOWLCCwRtEFrbpnFboYaozuUL/Gtj0u1aAJTTi0LaZJ16TEVjFiUx0ZA/a2UiakHz3HbH4Hz+Apx+6R8R209a508/lxMtlsJuL6xbPcnnj5XL0El0Gdnpsj6SGCzXaO5N5xRXTmKqZIzKw0iooUHeh97rcvpXS+JppE0f3SmZ7rVjtjyxYaKfRKyzjRE2lK+yGrtV46Bcobiy8XdApa5vt1McU9YRTEg0k3oJl8bRDYUedcU6ISd6dpBgmC9fN9bW5vmmumzMWcc65q0YlABpi2g9xMBNaYEoMS3ZLkmLtBTGFKTZeEK886ACFmBmJKZEJIxrokQm6b+th1tSQiYJaj1PaYZ/EZgKwYPBa4uqMAXtkHdnqUwEh0g1Ijn7wY31gdsFqIeDpoH70wleJXOGZFqvQV/AP/JIPa/NhIx/kIVa+0saShNtEFKoLDPzVBlVZSsDJ391G5Mj3jcCZRUX+R4xBn4mjP/XQ0nVPvyehL11QjichvHHAzauXTOe0XqMbyVkwjnCMAaMP35glukToJGygONZ3UeprBnnB4KKP98Qwx329w8IKrGsekbXHCr9xTwWV0UU9ktDPD1hcJ+oGX4m5/mrMjcSUBQ06xvM4T8kcDmioCnbooqkUV1d4OSRPuwmeYWV7bfgbNmDg8/ppYD1BQzE0Mtn/kkvKUWA14hO01FDBnPuhbVzCvj34Ct+Cc5Eoc1kplypYfK3y5o4LpverOYVzdrRk3+yKCXdVqvVK5/GmoQP12sPzmQANTIoVb2FUkQmdrrXk1f3LnzFmsc2Fk9Avqie5EnOq3s317eXd5T/e/+vy6sPl1eXdP5vAFymfRVgHbfvo5nklum74MxwRI8ThjdE1a+i1hF5b0HDSIWt7HqngNxJ+Y8EfAfzGgpe+hOUV6+kGDyzNJKLOu84QNltDQB6sDkS7c7Cu/m66JlJd9a1es/U2dBWQdUA2KnhNIrotbOn52/JN0FysOxX7vSWjuWzLgyarpBnW++2Yi1huo7swL4B2lYYQFsa2ftA/t/aQ+auun5tXBT10Tkc8XxO28mOvPdiefm3tx954sDdt3d5zE/UdiftvGWPi3Pb/pL39FrfRegRaJqos7PJJzGJ6pGIeHU+/6EKPyus4zzXe/aFzk1e86NrOVr++X0JCSjV91aYnaVqTAPjUN7m1euYEWb3ta9jfObSVnF8Y0xrty4JZo311FOsIXWTQLX5ieRkXHU4TFuBkHx4CI687LIzzpypeiCmO5npI4J6sJVXSMTvQLt4vygXTnV9e3KoNaY5vAEPShZ/9lv1oTnRqrLWn/WnB2tRYG0/r4zVGU0Nx0+q+VUZpVGcnw7lbZc/hRBvebcwrspLfsbz4w7/cqvX+VHIRhABftLmJc7JRZjGcz4vmHLgRzX7lR9s1sxBX5u3r7eKrQVjgzsH+S+qVLfO7lOX8L+KYGsjLZPzEE1pR2Kler+d+JOPmTun5xieBOz8Z7NSfgDUU9qr7vLKWqvb9CUpmnSWfOW1+e+l0w7SjnfoWUuFfsxCM99Jqx+
oMrfZwV2E4bRJi2rw1qdrkX0jsUVikJiQt9AyRrf4OxMx9t+ILNpEBG/NUq1fzV56T74MfgsGJuaVODrJvgHZlGoA8rP5zqqAatg7eHMuqXuVsN/87frDTBeysBMbBTzh+b9uQHwLyH7PONwgYMJtjJ3i5F+wN82IdszME+ylKFvgNa5nFnV6vXzDgSIFkf1JOf2VFb5oL7xz2FcowjJYkCkfjvakw6YX88p5lUFimMc1zWJmVcSxmcntnwz7AW2hxSkOQ6HrJspiuEWQ+OPsoX0IqH/bhsYlVpGk8oZlB8d8

(This also contains the original function, but still the one without the fix)

As the name runSingleTest suggests: I considered to add more test cases, with a mix of extreme tile sizes like (1,1,128) and un-extreme ones like (16,16,16), and of course, some tileCount values that are either prime (like 1259) or highly composite (like 1260 - yeah...).

But ... it's ~2am here after all 🤓

@javagl (Contributor) commented Jan 6, 2026:

I think that I roughly understand why that case is not working, but may not fully understand why certain limitations exist.

And as mentioned above: I don't know which configurations really appear in practice. So all this is no reason to hold up this PR. (In the hope that when an issue is opened for some error message like ~"WebGL error: negative texture size", it will be possible to 1. quickly zoom to the relevant code part, 2. find out why a certain case caused a negative texture size, and 3. fix it as appropriate.)

In the last comment, I mentioned the 'greedy2' result, but not what it does: it starts with the maximum number of tiles that fit (along each dimension), and then always decreases the tile count by 1 along the one dimension that causes the greatest reduction in "wasted space". It is guaranteed to find a solution (if one exists), often finds the ideal solution, often finds one that is better than the current one, and I haven't seen a case where it finds one that is worse; but I didn't really (really really) systematically, exhaustively, and fairly compare things here. (A case like a maximum size of 2048 and a single tile of size 1x1x1 would mean that it decrements the size 2047+2047+2047 times, but maybe that's not an issue either.)

The core of the approach (other approaches and sanity checks omitted here):

https://sandcastle.cesium.com/#c=vVdrb+JGFP0rUz5BwhpI2q1UQ1Rt0lZI3bBaUFdVQNVgX4OV8RjNjJPYFf997/g9ftCXVCQSe+bcuee+ByfkUpGAvvlBFNw+bOBNRQLWfgJkQW6+e29vxZbrrxdxR/khJ04YnCIFX6hU4K5P1IGh3idE+Qwe/AC4RJgcV4v3YcRV7V1+AlEC0/UR+TPbdlI67hsqN4+z3mwDEbcRsYlI2ojEQKhQUbZBRG4tar3SB+OfBrCgYxBvMFJxJ8gkpZJOUA8vqcGalNKklImKJLh6vwJfmRYZaN8FynLdaTguol/T2CI8VfIuk84RAjBBeA5J186tFDmJ0I0cNaRlXBkocsIDZ/kpXijIMFPmkdAjFZIg7mpBvBx4NrSeehTmOfkBpFpHjgNShmLYyEfiREIAJmLlBINe4PMvhd2PUbAHYX1arZeb5W8//bF8/Hn5uNz8bnfAS4XaX9wFz+e5Y3IzNdrHzamN/+YFDYsBP6gjrl1f14zPfMKxDFHiybKsHL6zC4Tee/J35F3lTUJ8jwwLr2vAiNwtOuysNJQh7i3n7NNZ1NVWvbSzD4dXcg/SjwLrngqFT5Tfppyeprtxxn5WPNzsRoawR5mEamFkV8/awpz0vHK+YRkxYljL0NZ2PWaaSA11Lh7PXQnYPsHMyMy5B1BlKy2d94sAcOMb1Fim7aXGuY9V2iXWNDgxyBfpC/UZ3TPID/8IQSjiDxra7LdGW51MyGb1sCKrwFfaO9uB5yu5HRDnCM4zOYIATLV6C8CJULSgp8IjH6k6Wh4LsbY6B8ak1bOr6P5j2fg/yCaF7K5WsHkhoUGFbfnm6xFfyFCJCPoK8d/2F6M+v8kKs5awewH02W4kHKkxreXm2RgRjdxKoV11VzBoClh6pOV6sC71NOgety3BuCY46xCM+wSTmuBNh2BiDpimfFZndses8bn6DDJi2PloAGM8QL/UwjBuqDJuGyEDi4WHVJZcY2X8sB2M7PY+Fkt+MqZKxo1IzL7RdlCoHJljvDnm/zpI7SsGymRn4+OkN0btW0clF3eUVq9cUsklHWXV45a2pdonrdVRz/AWEd9gYf1vzdCk/zni3OcHzDep+iNvsirtq+VUj6DJPBU0l/oEL1iXnnJh/5IR2eWv4G90qSxts/BXw+rSLCtvCj23hM6w/Z3ANUJX3QTq1b4dHDIaVQXmvC7Vfn8SrjENNCdMxcbvkUb0++r4+zGZ4fd2Znq0kQLVtS3bvhRoxE7zT6u3ZJFckG+n0+LAopKak6kRh4sBaOZF6aqGh+ziulNlmBtyKEpoMB7MpYoZ3GliP/rBKRT4o0WwoWVNFCAPitom+8h5BmU5UgeGzyeFyNz1X/Bnx2I7cFI334dcUbxYC31nYVRK3PEixvT83w7u5hPEG2IspC6yXb2AYDTWkOPs7tdsEa868wm+tqVUGLI9FbUTvwI
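
In essence, the loop is (simplified sketch, reconstructed from the description above; the Sandcastle version includes more sanity checks and the comparison harness):

function greedyTileCounts(tileCount, tileDims, maxSize) {
  // Start from the largest tile grid that fits along each axis.
  let counts = tileDims.map((d) => Math.floor(maxSize / d));
  const capacity = (c) => c[0] * c[1] * c[2];
  if (capacity(counts) < tileCount) {
    return undefined; // no packing fits within maxSize
  }
  for (;;) {
    // Decrement whichever axis removes the most capacity ("wasted space")
    // while still holding all tiles.
    let best;
    for (let i = 0; i < 3; i++) {
      const next = [...counts];
      next[i] -= 1;
      if (
        capacity(next) >= tileCount &&
        (!best || capacity(next) < capacity(best))
      ) {
        best = next;
      }
    }
    if (!best) {
      return counts; // no axis can shrink further without dropping tiles
    }
    counts = best;
  }
}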

@jjhembd (Contributor Author) replied:

Thanks @javagl, I think your 'greedy2' approach is both simpler and better.

As for the number of decrement iterations, I think we can use some geometric hand-waving to limit it to a 3-count loop:

  1. We start from a "cube" which is filled as full as possible with tiles. The number of tiles along a given axis is inversely proportional to the dimension of one tile along that axis.
  2. The dimension along which a decrement will cause 'the greatest reduction in "wasted space"' is the dimension along which the tile size is largest. A "slice" perpendicular to this axis will necessarily have a larger number of tiles than a "slice" across either of the other axes.
  3. After removing one "slice", the "slice" which will cause 'the greatest reduction' is still in the same dimension. That "slice" has not changed in size; while the other "slices" have only gotten smaller.
  4. We can therefore compute the number of "slices" to remove from each axis sequentially, in a single operation per axis.

Here is a Sandcastle demonstrating the optimized 'greedy2' approach.
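
In code, the sequential trim might look like this (a sketch of the idea, not the exact Sandcastle code):

function trimSlices(counts, tileDims, tileCount) {
  // counts starts as the largest tile grid that fits, i.e.
  // Math.floor(maxSize / tileDims[i]) along each axis. Visit axes in order
  // of decreasing tile dimension: a slice across the largest-tile axis
  // holds the most tiles, per the geometric argument above.
  const axes = [0, 1, 2].sort((a, b) => tileDims[b] - tileDims[a]);
  for (const i of axes) {
    // Product of tile counts along the other two axes.
    const others = counts.reduce((p, c, j) => (j === i ? p : p * c), 1);
    // Smallest count along this axis that still fits all tiles.
    counts[i] = Math.min(counts[i], Math.ceil(tileCount / others));
  }
  return counts;
}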

It is less obvious to me to quantify just how "good" this approach is. My previous attempt had a known bound on the amount of wasted space, when the maximum 3D texture size was not an issue. When bumping up against the maximum texture size, however, it would sometimes allocate a texture that was much smaller than ideal, because the factorization approach didn't tend to find cubical (or nearly cubical) shapes.

In every example that I can think of, the 'greedy2' approach will find a solution that is equivalent or better. But I don't know how to prove this.

@javagl (Contributor) replied:

The "greedy2", in the posted form, was certainly not optimized in any way! It was rather "brainstorming in code". What you suggested (basically, determining beforehand how many of these next[i] -= 1;'s there will be, and doing them in a systematic order (i.e. not "just trying around")) sounds reasonable, and as far as I understood, should always give the same result.

Any proof - on a mathematical level - about "how good" the result will be is difficult.

(Anecdotal: I once did a 1-hour presentation about https://www.sciencedirect.com/science/article/pii/S0166218X0500377X , so... maybe we'll first have to prove whether the problem that we're dealing with actually does have a polynomial-time constant-factor approximation algorithm?)

One of the challenging parts for that would be the two (independent!) limits, namely, the maximum texture size and the available memory. So ... I'm a nerd with some affinity to mathematics, but there's still enough "engineer" in that mix that I'd say "Let's try out". If this was relevant, I'd run that (messy) Sandcastle that literally dumped CSV to the console, with even more configurations, and add a column that contains the original- and greedy2 results, divided by the "exhaustive" result, look at the min/max/average of these columns, and see whether anything stands out.

Is it relevant? Well, I'll probably do this anyhow - not today or tomorrow, but maybe in ~"the next few days". I just don't know whether this should affect this PR in any way...

@javagl (Contributor) replied:

After a bit (too little) sleep, and a quick look at the actual sandcastle code, I'm not sure whether this will always deliver the same result. It only modifies each dimension once. I think that there could be cases where the "best" dimension changes during that process. Imagine something in 2D that is close to a square. The process could be

  1. cut off a bit along x, because that's the best (basically because sizeY>sizeX)
  2. cut off a bit along x, because that's still the best
  3. now cutting along y is better than along x (because sizeY<sizeX) - so cut off a bit along y
  4. cutting along y is still the best - cut off a bit along y
  5. now, cutting along x is better again - so cut off along x
  6. ...

but... that was just an unverified thought/gut feeling behind the original "greedy2". Maybe this cannot happen. I'll have to check.

@javagl (Contributor) replied:

I ran a few more tests, and I'll just dump the result here:

Output00003.xlsx

Most relevant points:

  • There doesn't seem to be a difference between "greedy2" and your optimized version of the "greedy2", which is called "greedy3" here
  • Some key numbers below the table:
    • Maximum waste:
      • Original: 0.12
      • Greedy: 0.12
      • Exhaustive: 0.024
    • Average waste*:
      • Original: 0.0164
      • Greedy: 0.0072
      • Exhaustive: 0.0008

* The "average" doesn't really make sense, given that the test configurations are completely arbitrary (I could have picked configurations where 'all but one' configuration has a waste of 0 for all algorithms).


I'll probably leave it at that for now. The "summary" is that it looks like the "greedy2/3" is simpler (and easier to understand) than the original, never generates worse solutions, and in many cases generates better ones. The tests are not "exhaustive" (and maybe don't even properly reflect "real-world configurations"), and of course, there is no formal proof for anything, whatsoever. But ... Experimental mathematics FTW 🙂

Comment on lines +398 to +402
if (Math.floor(n) !== n) {
throw new DeveloperError("n and maxN must be integers");
} else if (n < 1) {
throw new DeveloperError("n and maxN must be at least 1");
}
@donmccurdy (Member) commented:

Would it make sense to gate these checks behind a //>>includeStart('debug', pragmas.debug);? Asking more for my own understanding of CesiumJS best practice than an actual preference. :)
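
For reference, the gated form, following the pattern used elsewhere in this diff, would be:

//>>includeStart('debug', pragmas.debug);
if (Math.floor(n) !== n) {
  throw new DeveloperError("n and maxN must be integers");
} else if (n < 1) {
  throw new DeveloperError("n and maxN must be at least 1");
}
//>>includeEnd('debug');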

@donmccurdy (Member) commented:

I'm not 100% sure I'm measuring the right thing, but opening the "Voxel Rendering" sandcastle and logging megatexture.textureMemoryByteLength, I'm seeing allocations of 64 MB before and 512 MB after. Other sandcastles (like "Voxel Picking" and "Voxels in 3D Tiles") have decreased, but "Voxel Rendering" is the only example I found with a large voxel set.

Is this expected and/or is there a better way to check total memory allocations? I haven't ventured into setting up something like webgl-memory in a sandcastle quite yet.

@javagl (Contributor) commented Dec 24, 2025

> I haven't ventured into setting up something like webgl-memory in a sandcastle quite yet.

See the "Testing Plan" of #12370 (I think that I did only use it in a sandcastle by "hacking" it into the local build, but think that this snippet there was supposed to work in an actual sandcastle)
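
Roughly, usage looks like this (based on the webgl-memory README; accessing Cesium's GL context via the private _gl property is a debugging-only shortcut):

const gl = viewer.scene.context._gl;
const ext = gl.getExtension("GMAN_webgl_memory");
if (ext) {
  const info = ext.getMemoryInfo();
  console.log(info.memory.texture); // texture bytes currently allocated
}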

@jjhembd (Contributor Author) commented Dec 24, 2025

Thanks a lot for the feedback @donmccurdy! I think I addressed the code comments.

For your testing results:

  • I couldn't reproduce the issue with the cylinder in "Voxel Picking". It looks like an LOD didn't load immediately. Did you run both versions with the same viewport size? The fact that it resolved when you moved the camera makes me think it might be working as intended.
  • The change in translucency in "Voxel Rendering" was intentional; see 51aa022. This restores the example to the intended behavior, as it was before the shader's ray distance was changed from arbitrary "shape" coordinates to actual distances in meters (see #12933, "Use eye coordinates for voxel raymarching"; I neglected to update the Sandcastle in that PR).
  • I still need to investigate the memory usage for "Voxel Rendering"; thanks for flagging that!

@donmccurdy (Member) commented:

About (1), it might be LOD-related, yes. The viewport size is the same, though. Here's a screen capture comparing the PR to production. I'm not actually moving the camera, just hovering the mouse, so perhaps the picking operation is resetting some cache or GL state...

voxel_picking_cylinder_delayed_update.webm
