CUDA OOM in get_edges / fill_holes on large meshes #20

@popomore

Description

We hit a CUDA OOM during trellis2 mesh post-processing (fill_holes → get_edges). The error is raised in src/connectivity.cu:get_edges(), and fill_holes() triggers multiple CUB operations (sort/scan/select) that allocate large temporary buffers.

Stack (excerpt)

  • trellis2_image_to_3d.py: decode_latent -> m.fill_holes()
  • trellis2/representations/mesh/base.py: fill_holes
  • cumesh.py: get_edges
  • src/connectivity.cu: get_edges
  • Error: [CuMesh] CUDA error ... Error text: out of memory

Hotspots

  • src/connectivity.cu::CuMesh::get_edges()
    • edges.resize(F*3)
    • temp_storage.resize(F*3*sizeof(uint64_t))
    • cub::DeviceRadixSort::SortKeys
    • cub::DeviceRunLengthEncode::Encode
  • src/clean_up.cu::CuMesh::fill_holes()
    • multiple cudaMalloc + DeviceSegmentedReduce/Select/Scan
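
To put rough numbers on the get_edges hotspots above, here is a back-of-envelope sketch. It counts only the two resize calls listed, and it assumes the edges buffer holds 64-bit keys (only the temp_storage size is stated in terms of uint64_t). The function name is hypothetical, and the real peak is likely higher because the CUB sort/encode outputs and the fill_holes allocations are not included.

```python
UINT64_BYTES = 8  # sizeof(uint64_t)


def estimate_get_edges_bytes(num_faces: int) -> dict:
    """Lower-bound estimate for the two buffers named in the hotspot list."""
    half_edges = num_faces * 3  # three edges per triangle face
    # edges.resize(F*3): assumed to hold one 64-bit key per half-edge.
    edges = half_edges * UINT64_BYTES
    # temp_storage.resize(F*3*sizeof(uint64_t)): size taken from the list above.
    temp_storage = half_edges * UINT64_BYTES
    return {
        "edges": edges,
        "temp_storage": temp_storage,
        "total": edges + temp_storage,
    }


if __name__ == "__main__":
    for faces in (1_000_000, 10_000_000, 100_000_000):
        est = estimate_get_edges_bytes(faces)
        print(f"F={faces:>11,}  listed buffers >= {est['total'] / 2**30:.2f} GiB")
```

Under these assumptions the listed buffers cost 48 bytes per face, so the figure scales linearly with F.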

Likely causes

  • get_edges allocates several buffers proportional to F*3 plus CUB temp storage → high peak VRAM.
  • CuMesh caches are tied to the instance and not automatically freed. Even if users create a new CuMesh() per call, Python GC may not immediately destroy objects, and cudaMalloc buffers
    can linger, causing fragmentation or sustained high usage.

Addendum: lack of clear_cache()

In long-lived services, if the call path doesn’t explicitly call clear_cache(), cached buffers persist and OOM becomes more likely. It would help to aggressively free temp/cache buffers
after heavy ops like fill_holes, or clearly document the requirement.
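
As a caller-side workaround until this is addressed, something like the sketch below could guarantee the cleanup runs even when fill_holes raises. It assumes clear_cache() is a method on the mesh object, as the addendum implies. The cache_cleared helper and the FakeMesh stand-in are hypothetical and exist only so the snippet runs without a GPU.

```python
from contextlib import contextmanager


@contextmanager
def cache_cleared(mesh):
    """Yield the mesh and always call mesh.clear_cache() on exit."""
    try:
        yield mesh
    finally:
        # Runs on normal exit and when the body raises (e.g. on OOM).
        mesh.clear_cache()


class FakeMesh:
    """Stand-in for a CuMesh-like object; not the real API."""

    def __init__(self):
        self.cache = {}

    def fill_holes(self):
        # Pretend a heavy op left a cached buffer behind.
        self.cache["edges"] = bytearray(1024)

    def clear_cache(self):
        self.cache.clear()


if __name__ == "__main__":
    with cache_cleared(FakeMesh()) as m:
        m.fill_holes()
    print("cached entries after block:", len(m.cache))
```

In a long-lived service this keeps the clear_cache() call next to the heavy op instead of relying on Python GC timing.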

Suggestions

  1. Free temp buffers earlier in get_edges / fill_holes
  2. Provide low-mem / chunked path
  3. Reuse CUB temp storage
  4. Expose stats (E/B/L) so upstream callers can guard against oversized inputs
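
For the upstream guard in suggestion 4, a minimal sketch of what a caller could do today with only the face count is shown below. The function name, the 48 bytes-per-face default (F*3 entries of 8 bytes, counted for both edges and temp_storage) and the safety factor are all assumptions for illustration, not measured values. In a PyTorch process the free-memory figure could come from torch.cuda.mem_get_info(); here it is passed in so the snippet runs without a GPU.

```python
def fits_in_budget(
    num_faces: int,
    free_bytes: int,
    bytes_per_face: int = 48,    # assumed: 2 buffers * 3 edges * 8 bytes
    safety_factor: float = 2.0,  # assumed headroom for CUB temporaries
) -> bool:
    """Return True if the estimated peak fits in the available memory."""
    needed = int(num_faces * bytes_per_face * safety_factor)
    return needed <= free_bytes


if __name__ == "__main__":
    free = 8 * 2**30  # pretend 8 GiB are free
    for faces in (5_000_000, 50_000_000, 500_000_000):
        verdict = "ok" if fits_in_budget(faces, free) else "skip or simplify first"
        print(f"F={faces:>11,}: {verdict}")
```

A caller could use the False branch to decimate the mesh or skip fill_holes instead of hitting the OOM.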

Env

  • CuMesh commit: 8290b77
  • GPU: NVIDIA L20 (48GB)
  • PyTorch 2.6.0 / CUDA 12.4
  • OS: Linux x86_64
  • Stage: fill_holes / get_edges
