Closed
Labels
P0 (High priority - Must do!), bug (Something isn't working), cuda.bindings (Everything related to the cuda.bindings module)
Milestone
Description
Is this a duplicate?
- I confirmed there appear to be no duplicate issues for this bug and that I agree to the Code of Conduct
Type of Bug
Runtime Error
Component
cuda.bindings
Describe the bug
Multi-threaded applications may attempt to access the symbol table simultaneously before it is populated, causing concurrency issues, as observed previously.
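As a rough illustration (a minimal sketch, not the RapidsMPF test below), the failing pattern amounts to several threads calling into cuda.bindings at about the same time, each racing to trigger the lazy population of the driver symbol table:

import threading

from cuda.bindings import driver


def query_current_context(results, idx):
    # Each thread races to resolve cuCtxGetCurrent from the lazily
    # populated symbol table on its first call into the driver bindings.
    err, _ctx = driver.cuCtxGetCurrent()
    results[idx] = err


results = [None] * 8
threads = [
    threading.Thread(target=query_current_context, args=(results, i))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
# None of the calls should fail with 'Function "cuCtxGetCurrent" not found'.
print(results)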
Sample failure stack
/opt/conda/envs/test/lib/python3.12/site-packages/numba_cuda/numba/cuda/cudadrv/devices.py:125: in ensure_context
with driver.get_active_context():
^^^^^^^^^^^^^^^^^^^^^^^^^^^
/opt/conda/envs/test/lib/python3.12/site-packages/numba_cuda/numba/cuda/cudadrv/driver.py:539: in __enter__
hctx = driver.cuCtxGetCurrent()
^^^^^^^^^^^^^^^^^^^^^^^^
/opt/conda/envs/test/lib/python3.12/site-packages/numba_cuda/numba/cuda/cudadrv/driver.py:396: in safe_cuda_api_call
return self._check_cuda_python_error(fname, libfn(*args))
^^^^^^^^^^^^
cuda/bindings/driver.pyx:20135: in cuda.bindings.driver.cuCtxGetCurrent
???
cuda/bindings/cydriver.pyx:107: in cuda.bindings.cydriver.cuCtxGetCurrent
???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E RuntimeError: Function "cuCtxGetCurrent" not found
This should already have been addressed by #835.
How to Reproduce
There is no known trivial reproducer; the failure is consistently observed with the following test from RapidsMPF and numba-cuda=0.18.1
(required to ensure it uses cuda-bindings instead of its own ctypes implementation):
Reproducer
@pytest.mark.parametrize("partition_count", [None, 3])
@pytest.mark.parametrize("sort", [True, False])
@pytest.mark.parametrize("cluster_kind", ["auto", "single"])
def test_dask_cudf_integration_single(
partition_count: int,
sort: bool, # noqa: FBT001
cluster_kind: Literal["distributed", "single", "auto"],
) -> None:
# Test single-worker cuDF integration with Dask-cuDF
pytest.importorskip("dask_cudf")
df = (
dask.datasets.timeseries(
freq="3600s",
partition_freq="2D",
)
.reset_index(drop=True)
.to_backend("cudf")
)
partition_count_in = df.npartitions
expect = df.compute().sort_values(["id", "name", "x", "y"])
shuffled = dask_cudf_shuffle(
df,
["id", "name"],
sort=sort,
partition_count=partition_count,
cluster_kind=cluster_kind,
config_options=Options({"single_spill_device": "0.1"}),
)
assert shuffled.npartitions == (partition_count or partition_count_in)
got = shuffled.compute()
if sort:
assert got["id"].is_monotonic_increasing
got = got.sort_values(["id", "name", "x", "y"])
dd.assert_eq(expect, got, check_index=False)
Expected behavior
Attempting to create a CUDA context from multiple threads should not cause an error.
Operating System
No response
nvidia-smi output
No response