Skip to content

Latest commit

 

History

History
58 lines (40 loc) · 4.66 KB

InferenceOnLoad.md

File metadata and controls

58 lines (40 loc) · 4.66 KB

Decompressing Texture Sets with Graphis APIs (Inference on Load)

LibNTC provides the means to decompress neural texture sets using either DX12 or Vulkan. The library itself doesn't submit any GPU commands or create any GPU resources; instead, it provides description of how to execute the decompression pass on the application side.

To use graphics API functions, first you need to make sure that the context is created with the right graphics API and device object provided, see the Context section of this guide. The graphics device is currently only used for host-only functions by the library.

To decompress a neural texture set, first create an ITextureSetMetadata object from it:

ntc::FileStreamWrapper inputFile(context);
ntcStatus = context->OpenFile(fileName, /* write = */ false, inputFile.ptr());
if (ntcStatus != ntc::Status::Ok)
    // Handle the error.

ntc::TextureSetMetadataWrapper metadata(context);
ntcStatus = context->CreateTextureSetMetadataFromStream(inputFile, metadata.ptr());
if (ntcStatus != ntc::Status::Ok)
    // Handle the error.

The metadata object can be used to inspect the texture set, see what textures are stored in it and in which channels, and to describe the decompression pass. See the example in the Decompression with CUDA section to find out how to enumerate textures in the texture set - those functions are actually provided by the metadata interface. Based on the enumerated textures, you should create GPU texture objects that the decompressed data will be written into.

To obtain the description for the decompression pass:

ntc::MakeDecompressionComputePassParameters params;
params.textureSetMetadata = metadata;

ntc::ComputePassDesc computePass{};
ntc::Status ntcStatus = context->MakeDecompressionComputePass(params, &computePass);    

if (ntcStatus != ntc::Status::Ok)
    // Handle the error.

The ComputePassDesc returned by MakeDecompressionComputePass contains the following information:

  • Compute shader bytecode. Note that the bytecode may be different depending on the parameters.
  • Constant buffer data.
  • Weight buffer data.
  • Compute dispatch dimensions.

It is the application's responsibility to create the compute pipeline, constant buffer, weight buffer, and fill the buffers with the provided data. The application must also populate the descriptor table (DX12) or descriptor sets (Vk) to bind these resources to the right slots of the compute shader. The bindings are static and described in the comments to IContext::MakeDecompressionComputePass in ntc.h. The actual code needed to perform these operations depends on the graphics API and the abstraction layer (RHI) used by the application. The SDK apps are using NVRHI, and the implementation of the decompression pass with NVRHI can be found in GraphicsDecompressionPass.cpp.

The most important resource for the decompression pass is provided implicitly: it's the latents buffer which contains most of the NTC texture set file. The latents data need to be uploaded into a GPU buffer and bound to the decompression pipeline as a ByteAddressBuffer (DX12) or StorageBuffer (Vk). The exact means of reading the file, uploading it to the GPU, and managing live files in GPU memory is left up to the application; notably, apps can use DirectStorage to load the data.

It's possible to load only a subset of the NTC texture set file for decompression if not all mip levels need to be decompressed. The example above uses the ntc::EntireStream constant to specify that the entire file is loaded; to use a subset of the file, first obtain the required stream range for the desired set of mip levels:

ntc::StreamRange streamRange;
ntcStatus = metadata->GetStreamRangeForLatents(firstMip, numMips, streamRange);

if (ntcStatus != ntc::Status::Ok)
    // Handle the error.

Then use the streamRange fields offset and size to read that portion of the input file or stream into a buffer, and provide the range via MakeDecompressionComputePassParameters::latentStreamRange to make it provide correct data offsets to the decompression shader.

Note that the decompression pass supports specifying a rectangle to limit the set of pixels to decompress. This feature can be used to incrementally decompress a large texture in time-constrained scenarios, such as texture streaming while the game is rendering. In order to advance to the next rectangle, you need to call MakeDecompressionComputePass again, but that call itself is cheap and doesn't require any new resources to be created.