Commit 27abc1d

Handle dealloc in stream-ordered cudf-polars ops (#20467)
This updates cudf-polars' usage of CUDA streams to safely handle deallocation. Consider the following sequence of operations:

1. Read some data on stream A
2. Read some data on stream B
3. Concat data from A and B on new stream C

cudf-polars currently ensures that C is downstream of `A` and `B` before doing the concat. But then our execution will typically drop all references to the data (on streams A or B), at which point Python's reference counting will call a `cudaFreeAsync` on streams A and B to free the memory used by the data from 1 and 2.

We need to ensure that this stream ordered free happens after the result from 3 (on stream C) is ready, and so we join stream C into each of A and B in each place where we previously just joined the streams.

Authors:
- Tom Augspurger (https://github.com/TomAugspurger)

Approvers:
- Richard (Rick) Zamora (https://github.com/rjzamora)
- Lawrence Mitchell (https://github.com/wence-)

URL: #20467
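
The ordering argument above can be checked with a small toy model. This is only a sketch in pure Python with no GPU involved: `Op`, `Stream`, and `simulate` are hypothetical names made up for illustration and are not part of cudf-polars or any CUDA binding. Each stream runs its own work in order, and `wait_for` stands in for joining one stream into another.

```python
from __future__ import annotations


class Op:
    """A unit of work on a stream, plus the ops it must run after."""

    def __init__(self, name: str, deps: list[Op]) -> None:
        self.name = name
        self.deps = deps

    def happens_after(self, other: Op) -> bool:
        """True if `other` is guaranteed to complete before this op starts."""
        stack, seen = list(self.deps), set()
        while stack:
            op = stack.pop()
            if op is other:
                return True
            if id(op) not in seen:
                seen.add(id(op))
                stack.extend(op.deps)
        return False


class Stream:
    """Toy stand-in for a CUDA stream: work on one stream runs in order."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.last: Op | None = None  # most recently enqueued op

    def enqueue(self, name: str) -> Op:
        deps = [self.last] if self.last is not None else []
        self.last = Op(f"{name}@{self.name}", deps)
        return self.last

    def wait_for(self, other: Stream) -> None:
        """Later work on this stream waits for work already enqueued on `other`."""
        deps = [op for op in (self.last, other.last) if op is not None]
        self.last = Op(f"wait({other.name})@{self.name}", deps)


def simulate(join_back: bool) -> bool:
    """Return True if both frees are ordered after the concat on stream C."""
    a, b, c = Stream("A"), Stream("B"), Stream("C")
    a.enqueue("read")          # 1. read on stream A
    b.enqueue("read")          # 2. read on stream B
    c.wait_for(a)              # make C downstream of A and B
    c.wait_for(b)
    concat = c.enqueue("concat")  # 3. concat on stream C
    if join_back:
        a.wait_for(c)          # join C back into A and B
        b.wait_for(c)
    free_a = a.enqueue("free")  # stream-ordered frees of the inputs
    free_b = b.enqueue("free")
    return free_a.happens_after(concat) and free_b.happens_after(concat)


if __name__ == "__main__":
    print("frees ordered after concat, without join back:", simulate(False))
    print("frees ordered after concat, with join back:", simulate(True))
```

In this model, without the join back nothing orders the frees on A and B after the concat on C. With it, both frees are transitively ordered after the concat.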
1 parent 1226588 commit 27abc1d

1 file changed: +186 -137 lines changed
  • python/cudf_polars/cudf_polars/dsl

