We hit CUDA OOM during trellis2 mesh post-processing (fill_holes → get_edges). The error originates in src/connectivity.cu:get_edges(), and fill_holes() additionally triggers multiple CUB operations (sort/scan/select), each allocating large temporary buffers.
Stack (excerpt)
- trellis2_image_to_3d.py: decode_latent -> m.fill_holes()
- trellis2/representations/mesh/base.py: fill_holes
- cumesh.py: get_edges
- src/connectivity.cu: get_edges
- Error: [CuMesh] CUDA error ... Error text: out of memory
Hotspots
- src/connectivity.cu::CuMesh::get_edges()
- edges.resize(F*3)
- temp_storage.resize(F*3*sizeof(uint64_t))
- cub::DeviceRadixSort::SortKeys
- cub::DeviceRunLengthEncode::Encode
- src/clean_up.cu::CuMesh::fill_holes()
- multiple cudaMalloc + DeviceSegmentedReduce/Select/Scan
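To see why the peak is proportional to F*3, a back-of-envelope estimator helps. This is a hypothetical memory model (three buffers of 64-bit keys: the edge-key buffer, the radix-sort output, and CUB temp storage), not CuMesh's actual allocation scheme:

```python
def peak_edge_bytes(num_faces: int, key_bytes: int = 8, num_buffers: int = 3) -> int:
    """Rough upper bound on get_edges peak VRAM: each of the ~num_buffers
    device buffers holds F*3 64-bit edge keys. Illustrative model only."""
    return num_faces * 3 * key_bytes * num_buffers

# e.g. a 50M-face mesh under this model peaks around
# peak_edge_bytes(50_000_000) / 2**30 ≈ 3.35 GiB for edge keys alone,
# before counting vertex/face storage and other fill_holes buffers.
```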
Likely causes
- get_edges allocates several buffers proportional to F*3 plus CUB temp storage → high peak VRAM.
- CuMesh caches are tied to the instance and are not automatically freed. Even if users create a new CuMesh() per call, Python GC may not destroy objects immediately, so cudaMalloc-backed buffers can linger, causing fragmentation or sustained high usage.
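The GC point is easy to demonstrate in miniature. The toy class below stands in for a cudaMalloc-backed buffer (it is not CuMesh's API): an explicit free() releases deterministically, while relying on `del` / `__del__` is only prompt under CPython refcounting and can be delayed by reference cycles:

```python
import gc

class DeviceBuffer:
    """Toy stand-in for a cudaMalloc-backed buffer (illustrative only)."""
    live = 0  # counts buffers still holding (pretend) device memory

    def __init__(self):
        DeviceBuffer.live += 1
        self.freed = False

    def free(self):
        if not self.freed:  # deterministic, explicit release
            self.freed = True
            DeviceBuffer.live -= 1

    def __del__(self):  # GC-driven release: timing is not guaranteed
        self.free()

buf = DeviceBuffer()
buf.free()                  # VRAM back immediately
assert DeviceBuffer.live == 0

buf2 = DeviceBuffer()
del buf2                    # drops a reference; CPython happens to free now,
gc.collect()                # but with cycles the buffer can outlive the call
```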
Addendum: lack of clear_cache()
In long-lived services, if the call path doesn't explicitly call clear_cache(), cached buffers persist and OOM becomes more likely. It would help to aggressively free temp/cache buffers after heavy ops like fill_holes, or to clearly document that callers must do so.
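Until such cleanup is automatic, service code can enforce it with a context manager. The sketch below assumes a clear_cache()-style method (proposed here, not confirmed current API) and guards against it being absent:

```python
from contextlib import contextmanager

@contextmanager
def freed_after(mesh):
    """Release cached device buffers even if the heavy op raises.
    Assumes a clear_cache()-style method exists (hypothetical API)."""
    try:
        yield mesh
    finally:
        if hasattr(mesh, "clear_cache"):
            mesh.clear_cache()

# usage in a long-lived service:
# with freed_after(CuMesh(verts, faces)) as m:
#     m.fill_holes()
```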
Suggestions
- Free temp buffers earlier in get_edges / fill_holes
- Provide low-mem / chunked path
- Reuse CUB temp storage
- Expose stats (E/B/L) for upstream guard
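For the low-mem / chunked suggestion, here is a CPU-side sketch of the idea: instead of materialising all F*3 edge keys at once, build and deduplicate keys per chunk of faces, bounding the working set by the chunk size. This is a hypothetical reference shape, not the CUDA implementation:

```python
def edges_chunked(faces, chunk_faces=1_000_000):
    """Hypothetical low-memory path: emit the three undirected edges of each
    face and deduplicate incrementally, one chunk of faces at a time."""
    seen = set()
    for start in range(0, len(faces), chunk_faces):
        for a, b, c in faces[start:start + chunk_faces]:
            for u, v in ((a, b), (b, c), (c, a)):
                seen.add((u, v) if u < v else (v, u))  # canonical order
    return sorted(seen)

# two triangles sharing edge (0, 2):
# edges_chunked([(0, 1, 2), (0, 2, 3)])
```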
Env
- CuMesh commit: 8290b77
- GPU: NVIDIA L20 (48GB)
- PyTorch 2.6.0 / CUDA 12.4
- OS: Linux x86_64
- Stage: fill_holes / get_edges