[path_finder_dev] 2025-05-01 version of `cuda.bindings.path_finder` #578

rwgk · 2025-04-25T16:15:30Z

This work was merged into the path_finder_dev branch (see comment below). Follow-on work is under #604.

Description

Major milestone for the work tracked under #451

This PR introduces only two public APIs:

cuda.bindings.path_finder.SUPPORTED_LIBNAMES (currently ('nvJitLink', 'nvrtc', 'nvvm'))
cuda.bindings.path_finder.load_nvidia_dynamic_library(libname: str) -> LoadedDL

With:

@dataclass                                                                      
class LoadedDL:                                                                 
    handle: int                                                                 
    abs_path: Optional[str]                                                     
    was_already_loaded_from_elsewhere: bool

However, the implementations were actually thoroughly tested (under #558) for all

SUPPORTED_LIBNAMES + PARTIALLY_SUPPORTED_LIBNAMES

enumerated under cuda.bindings._path_finder.supported_libs (note that this module is private).

To make this PR easier to review, the changes to the nvJitLink, nvrtc, and nvvm bindings are NOT included in this PR. Those changes were also already tested under #558. They will be merged into main with two follow-on PRs (one for the nvrtc bindings, one for nvJitLink and nvvm).

Thorough testing of all SUPPORTED_LIBNAMES + PARTIALLY_SUPPORTED_LIBNAMES requires changes to the GitHub Actions configs, to set up suitable CTK installations. This will also be handled separately in follow-on PRs.

Suggested order for reviewing files:

cuda/bindings/_path_finder/supported_libs.py
cuda/bindings/_path_finder/load_nvidia_dynamic_library.py
cuda/bindings/_path_finder/load_dl_common.py
cuda/bindings/_path_finder/load_dl_linux.py
cuda/bindings/_path_finder/load_dl_windows.py
tests/test_path_finder.py
cuda/bindings/_path_finder/find_nvidia_dynamic_library.py
everything else

Discussion points:

Copyright notice for cuda/bindings/_path_finder/cuda_paths.py (the original file under numba-cuda does not have one)
Documentation for the new public APIs
Documentation for maintaining SUPPORTED_LIBNAMES + PARTIALLY_SUPPORTED_LIBNAMES as new CTK versions are released

copy-pr-bot · 2025-04-25T16:15:34Z

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

rwgk · 2025-04-25T16:21:48Z

/ok to test 17478da

github-actions · 2025-04-25T16:40:22Z

Doc Preview CI
🚀 View preview at https://nvidia.github.io/cuda-python/pr-preview/pr-578/
https://nvidia.github.io/cuda-python/pr-preview/pr-578/cuda-core/
https://nvidia.github.io/cuda-python/pr-preview/pr-578/cuda-bindings/
Preview will be ready when the GitHub Pages deployment is complete.

…_PATH or PATH

rwgk · 2025-04-25T16:49:19Z

/ok to test 7da74bd

…ions 12.0, 12.1, 12.2

rwgk · 2025-04-25T19:51:27Z

/ok to test a649e7d

rwgk · 2025-04-25T19:56:50Z

For completeness:

I used these command while working on commit a649e7d:

0db6015-lcedt.nvidia.com:~/ctk_downloads/extracted $ sos=`find . -type f -name 'libnvJitLink.so*' | sort`
0db6015-lcedt.nvidia.com:~/ctk_downloads/extracted $ for so in $sos; do echo $so; nm --defined-only -D $so | grep nvJitLinkVersion; done
./12.0.1_525.85.12/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.0.140
./12.1.1_530.30.02/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.1.105
./12.2.2_535.104.05/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.2.140
./12.2.2_535.104.05/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
./12.3.2_545.23.08/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.3.101
000000000025ed30 T nvJitLinkVersion@@libnvJitLink.so.12
./12.3.2_545.23.08/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
00000000000017da T nvJitLinkVersion@@libnvJitLink.so.12
./12.4.1_550.54.15/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.4.127
0000000000265220 T nvJitLinkVersion@@libnvJitLink.so.12
./12.4.1_550.54.15/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000001bd7 T nvJitLinkVersion@@libnvJitLink.so.12
./12.5.1_555.42.06/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.5.82
00000000002923b0 T nvJitLinkVersion@@libnvJitLink.so.12
./12.5.1_555.42.06/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000002064 T nvJitLinkVersion@@libnvJitLink.so.12
./12.6.2_560.35.03/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.6.77
0000000000264200 T nvJitLinkVersion@@libnvJitLink.so.12
./12.6.2_560.35.03/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
00000000000023f0 T nvJitLinkVersion@@libnvJitLink.so.12
./12.8.1_570.124.06/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.8.93
00000000004ba630 T nvJitLinkVersion@@libnvJitLink.so.12
./12.8.1_570.124.06/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000002c1a T nvJitLinkVersion@@libnvJitLink.so.12
0db6015-lcedt.nvidia.com:~/ctk_downloads/extracted $ for so in $sos; do echo $so; nm --defined-only -D $so | grep __nvJitLinkCreate_12_0; done
./12.0.1_525.85.12/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.0.140
0000000000226010 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.1.1_530.30.02/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.1.105
00000000002438d0 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.2.2_535.104.05/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.2.140
0000000000256fa0 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.2.2_535.104.05/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000000d39 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.3.2_545.23.08/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.3.101
000000000025ed60 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.3.2_545.23.08/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000001389 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.4.1_550.54.15/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.4.127
0000000000265250 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.4.1_550.54.15/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
00000000000016a9 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.5.1_555.42.06/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.5.82
00000000002923e0 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.5.1_555.42.06/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000001a59 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.6.2_560.35.03/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.6.77
0000000000264230 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.6.2_560.35.03/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000001d08 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.8.1_570.124.06/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.8.93
00000000004ba660 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.8.1_570.124.06/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000002378 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12

rwgk · 2025-04-25T20:58:18Z

Test results for commit a649e7d:

log archive for https://github.com/NVIDIA/cuda-python/actions/runs/14672386976
logs_37702343982.zip

$ python path_finder_abs_path_from_test_info.py Test*.txt

Test__linux-64__Python_3.10__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-64__Python_3.10__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-64__Python_3.11__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-64__Python_3.11__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-64__Python_3.12__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-64__Python_3.12__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-64__Python_3.13__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-64__Python_3.13__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-64__Python_3.9__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-64__Python_3.9__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-aarch64__Python_3.10__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-aarch64__Python_3.10__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-aarch64__Python_3.11__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-aarch64__Python_3.11__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-aarch64__Python_3.12__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-aarch64__Python_3.12__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-aarch64__Python_3.13__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-aarch64__Python_3.13__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-aarch64__Python_3.9__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-aarch64__Python_3.9__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__win-64__Python_3.12__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__win-64__Python_3.12__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-64__Python_3.10__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.10__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.10.17/x64/lib/python3.10/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.10.17/x64/lib/python3.10/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.10.17/x64/lib/python3.10/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-64__Python_3.10__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.11__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.11__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-64__Python_3.11__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.12__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.12__CUDA_12.8.0__Runner_H100__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.12__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.12.10/x64/lib/python3.12/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.12.10/x64/lib/python3.12/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.12.10/x64/lib/python3.12/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-64__Python_3.12__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.13__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.13__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.13.3/x64/lib/python3.13/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.13.3/x64/lib/python3.13/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.13.3/x64/lib/python3.13/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-64__Python_3.13__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.9__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.9__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.9.22/x64/lib/python3.9/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.9.22/x64/lib/python3.9/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.9.22/x64/lib/python3.9/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-64__Python_3.9__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.10__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.10__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.10.17/arm64/lib/python3.10/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.10.17/arm64/lib/python3.10/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.10.17/arm64/lib/python3.10/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-aarch64__Python_3.10__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.11__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.11__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.11.12/arm64/lib/python3.11/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.11.12/arm64/lib/python3.11/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.11.12/arm64/lib/python3.11/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-aarch64__Python_3.11__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.12__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.12__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.12.10/arm64/lib/python3.12/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.12.10/arm64/lib/python3.12/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.12.10/arm64/lib/python3.12/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-aarch64__Python_3.12__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.13__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.13__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.13.3/arm64/lib/python3.13/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.13.3/arm64/lib/python3.13/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.13.3/arm64/lib/python3.13/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-aarch64__Python_3.13__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.9__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.9__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.9.22/arm64/lib/python3.9/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.9.22/arm64/lib/python3.9/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.9.22/arm64/lib/python3.9/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-aarch64__Python_3.9__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__win-64__Python_3.12__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\nvidia\nvjitlink\bin\nvJitLink_120_0.dll
    C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\nvidia\cuda_nvrtc\bin\nvrtc64_120_0.dll
    C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\nvidia\cuda_nvcc\nvvm\bin\nvvm64_40_0.dll
Test__win-64__Python_3.12__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvJitLink_120_0.dll
    C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvrtc64_120_0.dll
    C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\nvvm\bin\nvvm64_40_0.dll

$ cat path_finder_abs_path_from_test_info.py

import sys


def get_info_abs_path(filename):
    print_buffer = [filename]
    done = set()
    for line in open(filename).read().splitlines():
        if "Z INFO " in line:
            flds = line.split(": abs_path=", 1)
            assert len(flds) == 2
            abs_path = eval(flds[1])  # eval undoes repr double backslashes
            if abs_path not in done:
                print_buffer.append(f"    {abs_path}")
                done.add(abs_path)
    return print_buffer


def run(args):
    no_info = []
    has_info = []
    for filename in sorted(args):
        print_buffer = get_info_abs_path(filename)
        assert print_buffer
        if len(print_buffer) == 1:
            no_info.append(print_buffer)
        else:
            has_info.append(print_buffer)
    for print_buffer in no_info + has_info:
        print("\n".join(print_buffer))


if __name__ == "__main__":
    run(args=sys.argv[1:])

…d_dl_windows.py with the help of Cursor.

rwgk · 2025-04-25T23:46:11Z

/ok to test b5cef1b

… passed), followed by ruff auto-fixes

rwgk · 2025-04-26T04:06:56Z

/ok to test 001a6a2

…ll tests passed), followed by ruff auto-fixes" This reverts commit 001a6a2. There were many GitHub Actions jobs that failed (all tests with 12.x): https://github.com/NVIDIA/cuda-python/actions/runs/14677553387 This is not worth spending time debugging. Especially because * Cursor has been unresponsive for at least half an hour: We're having trouble connecting to the model provider. This might be temporary - please try again in a moment. * The refactored code does not seem easier to read.

rwgk · 2025-04-26T05:29:53Z

/ok to test 0cd20d8

leofang · 2025-05-01T17:13:50Z

We shouldn't need to determine the CUDA driver version, we just need the major version which we're already capturing in the cuda.bindings version. You can run cuda.bindings 11.x against the 12.x CUDA driver without issue for example.

We just need a map of CUDA major version --> soname per library I think?

There is a catch here, which is what the current cuda.bindings and nvmath.bindings are based on. CUDA 12 driver can run CUDA 11 libraries, but not the other way around. So a mapping from the driver version to the supported CUDA major versions (then to the sonames) is still needed. Ex:

driver 12.x -> support CTK 11 & 12 -> 11 & 12 sonames
driver 11.x -> support CTK 11 -> 11 sonames only

…ual inspection of cuda_paths.py. Minor additional edits.

rwgk · 2025-05-01T17:36:10Z

There is a catch here, which is what the current cuda.bindings and nvmath.bindings are based on. CUDA 12 driver can run CUDA 11 libraries, but not the other way around. So a mapping from the driver version to the supported CUDA major versions (then to the sonames) is still needed. Ex:

driver 12.x -> support CTK 11 & 12 -> 11 & 12 sonames

driver 11.x -> support CTK 11 -> 11 sonames only

Question, scoped to the _find_dll_using_nvidia_bin_dirs and _find_so_using_nvidia_lib_dirs implementations, which are effectively searching for wheels (note that these pre-empt cuda_paths.py searches):

Currently I'm looking through sys.path (which includes site-packages) in order, and I stop as soon as I'm finding libname.so (first), or libname.so* (as fallback).

If we're now targeting, e.g., "just 11" or "12 or 11", do we want to fully traverse sys.path to enumerate all possible matches, then rank them and return the "best" match?

leofang · 2025-05-01T17:50:07Z

(btw I merged #593)

This reverts commit aeaf4f0.

rwgk · 2025-05-01T17:52:12Z

/ok to test aeaf4f0

kkraus14 · 2025-05-01T18:05:10Z

If we're now targeting, e.g., "just 11" or "12 or 11", do we want to fully traverse sys.path to enumerate all possible matches, then rank them and return the "best" match?

We want to find the library as if the extension module was normally built and linked against the library we're finding, which means that it would only search for the SONAME, so I would think we should return the first match of "just 11" in the case of having been built for CUDA 11.

toolshed/build_path_finder_sonames.py

toolshed/build_path_finder_dlls.py

…ind_sub_dirs_all_sitepackages() from find_nvidia_dynamic_library.py

This reverts commit bf9734c.

rwgk · 2025-05-04T05:47:02Z

There are two changes I still need to work on:

Change the Library Search Priority — I already worked this into the README.md in commit bf9734c, but I backed it out again here.
Determine driver version -> determine SONAME / DLL name -> search for the exact name

Doing that work under this PR would be unwieldy. This PR has so many commits and comments already, I have to click "Load more ..." several times to find the things I'm looking for. I'll merge this PR into the path_finder_dev branch, then pick up work on the Search Priority.

@pytest

* First version of `cuda.bindings.path_finder` (#447) * Unmodified copies of: * https://github.com/NVIDIA/numba-cuda/blob/bf487d78a40eea87f009d636882a5000a7524c95/numba_cuda/numba/cuda/cuda_paths.py * https://github.com/numba/numba/blob/f0d24824fcd6a454827e3c108882395d00befc04/numba/misc/findlib.py * Add Forked from URLs. * Strip down cuda_paths.py to minimum required for `_get_nvvm_path()` Tested interactively with: ``` import cuda_paths nvvm_path = cuda_paths._get_nvvm_path() print(f"{nvvm_path=}") ``` * ruff auto-fixes (NO manual changes) * Make `get_nvvm_path()` a pubic API (i.e. remove leading underscore). * Fetch numba-cuda/numba_cuda/numba/cuda/cuda_paths.py from NVIDIA/numba-cuda#155 AS-IS * ruff format NO MANUAL CHANGES * Minimal changes to adapt numba-cuda/numba_cuda/numba/cuda/cuda_paths.py from NVIDIA/numba-cuda#155 * Rename ecosystem/cuda_paths.py -> path_finder.py * Plug cuda.bindings.path_finder into cuda/bindings/_internal/nvvm_linux.pyx * Plug cuda.bindings.path_finder into cuda/bindings/_internal/nvjitlink_linux.pyx * Fix `os.path.exists(None)` issue: ``` ______________________ ERROR collecting test_nvjitlink.py ______________________ tests/test_nvjitlink.py:62: in <module> not check_nvjitlink_usable(), reason="nvJitLink not usable, maybe not installed or too old (<12.3)" tests/test_nvjitlink.py:58: in check_nvjitlink_usable return inner_nvjitlink._inspect_function_pointer("__nvJitLinkVersion") != 0 cuda/bindings/_internal/nvjitlink.pyx:257: in cuda.bindings._internal.nvjitlink._inspect_function_pointer ??? cuda/bindings/_internal/nvjitlink.pyx:260: in cuda.bindings._internal.nvjitlink._inspect_function_pointer ??? cuda/bindings/_internal/nvjitlink.pyx:208: in cuda.bindings._internal.nvjitlink._inspect_function_pointers ??? cuda/bindings/_internal/nvjitlink.pyx:102: in cuda.bindings._internal.nvjitlink._check_or_init_nvjitlink ??? cuda/bindings/_internal/nvjitlink.pyx:59: in cuda.bindings._internal.nvjitlink.load_library ??? /opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/cuda/bindings/path_finder.py:312: in get_cuda_paths "nvvm": _get_nvvm_path(), /opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/cuda/bindings/path_finder.py:285: in _get_nvvm_path by, path = _get_nvvm_path_decision() /opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/cuda/bindings/path_finder.py:96: in _get_nvvm_path_decision if os.path.exists(nvvm_ctk_dir): <frozen genericpath>:19: in exists ??? E TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType ``` * Fix another `os.path.exists(None)` issue: ``` ______________________ ERROR collecting test_nvjitlink.py ______________________ tests/test_nvjitlink.py:62: in <module> not check_nvjitlink_usable(), reason="nvJitLink not usable, maybe not installed or too old (<12.3)" tests/test_nvjitlink.py:58: in check_nvjitlink_usable return inner_nvjitlink._inspect_function_pointer("__nvJitLinkVersion") != 0 cuda/bindings/_internal/nvjitlink.pyx:257: in cuda.bindings._internal.nvjitlink._inspect_function_pointer ??? cuda/bindings/_internal/nvjitlink.pyx:260: in cuda.bindings._internal.nvjitlink._inspect_function_pointer ??? cuda/bindings/_internal/nvjitlink.pyx:208: in cuda.bindings._internal.nvjitlink._inspect_function_pointers ??? cuda/bindings/_internal/nvjitlink.pyx:102: in cuda.bindings._internal.nvjitlink._check_or_init_nvjitlink ??? cuda/bindings/_internal/nvjitlink.pyx:59: in cuda.bindings._internal.nvjitlink.load_library ??? /opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/cuda/bindings/path_finder.py:313: in get_cuda_paths "libdevice": _get_libdevice_paths(), /opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/cuda/bindings/path_finder.py:126: in _get_libdevice_paths by, libdir = _get_libdevice_path_decision() /opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/cuda/bindings/path_finder.py:73: in _get_libdevice_path_decision if os.path.exists(libdevice_ctk_dir): <frozen genericpath>:19: in exists ??? E TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType ``` * Change "/lib64/" → "/lib/" in nvjitlink_linux.pyx * nvjitlink_linux.pyx load_library() enhancements, mainly to avoid os.path.join(None, "libnvJitLink.so") * Add missing f-string f * Add back get_nvjitlink_dso_version_suffix() call. * pytest -ra -s -v * Rewrite nvjitlink_linux.pyx load_library() to produce detailed error messages. * Attach listdir output to "Unable to load" exception message. * Guard os.listdir() call with os.path.isdir() * Fix logic error in nvjitlink_linux.pyx load_library() * Move path_finder.py to _path_finder_utils/cuda_paths.py, import only public functions from new path_finder.py * Add find_nvidia_dynamic_library() and use from nvjitlink_linux.pyx, nvvm_linux.pyx * Fix oversight in _find_using_lib_dir() * Also look for versioned library in _find_using_nvidia_lib_dirs() * glob.glob() Python 3.9 compatibility * Reduce build-and-test.yml to Windows-only, Python 3.12 only. * Comment out `if: ${{ github.repository_owner == nvidia }}` * Revert "Comment out `if: ${{ github.repository_owner == nvidia }}`" This reverts commit b0db24f. * Add back `linux-64` `host-platform` * Rewrite load_library() in nvjitlink_windows.pyx to use path_finder.find_nvidia_dynamic_library() * Revert "Rewrite load_library() in nvjitlink_windows.pyx to use path_finder.find_nvidia_dynamic_library()" This reverts commit 1bb7151. * Add _inspect_environment() in find_nvidia_dynamic_library.py, call from nvjitlink_windows.pyx, nvvm_windows.pyx * Add & use _find_dll_using_nvidia_bin_dirs(), _find_dll_using_cudalib_dir() * Fix silly oversight: forgot to undo experimental change. * Also reduce test test-linux matrix. * Reimplement load_library() functions in nvjitlink_windows.pyx, nvvm_windows.pyx to actively use path_finder.find_nvidia_dynamic_library() * Factor out load_nvidia_dynamic_library() from _internal/nvjitlink_linux.pyx, nvvm_linux.pyx * Generalize load_nvidia_dynamic_library.py to also work under Windows. * Add `void*` return type to load_library() implementations in _internal/nvjitlink_windows.pyx, nvvm_windows.pyx * Resolve cython error: object handle vs `void*` handle ``` Error compiling Cython file: ------------------------------------------------------------ ... err = (<int (*)(int*) nogil>__cuDriverGetVersion)(&driver_ver) if err != 0: raise RuntimeError('something went wrong') # Load library handle = load_library(driver_ver) ^ ------------------------------------------------------------ cuda\bindings\_internal\nvjitlink.pyx:72:29: Cannot convert 'void *' to Python object ``` * Resolve another cython error: `void*` handle vs `intptr_t` handle ``` Error compiling Cython file: ------------------------------------------------------------ ... handle = load_library(driver_ver) # Load function global __nvJitLinkCreate try: __nvJitLinkCreate = <void*><intptr_t>win32api.GetProcAddress(handle, 'nvJitLinkCreate') ^ ------------------------------------------------------------ cuda\bindings\_internal\nvjitlink.pyx:78:73: Cannot convert 'void *' to Python object ``` * Resolve signed/unsigned runtime error. Use uintptr_t consistently. https://github.com/NVIDIA/cuda-python/actions/runs/14224673173/job/39861750852?pr=447#logs ``` =================================== ERRORS ==================================== _____________________ ERROR collecting test_nvjitlink.py ______________________ tests\test_nvjitlink.py:62: in <module> not check_nvjitlink_usable(), reason="nvJitLink not usable, maybe not installed or too old (<12.3)" tests\test_nvjitlink.py:58: in check_nvjitlink_usable return inner_nvjitlink._inspect_function_pointer("__nvJitLinkVersion") != 0 cuda\\bindings\\_internal\\nvjitlink.pyx:221: in cuda.bindings._internal.nvjitlink._inspect_function_pointer ??? cuda\\bindings\\_internal\\nvjitlink.pyx:224: in cuda.bindings._internal.nvjitlink._inspect_function_pointer ??? cuda\\bindings\\_internal\\nvjitlink.pyx:172: in cuda.bindings._internal.nvjitlink._inspect_function_pointers ??? cuda\\bindings\\_internal\\nvjitlink.pyx:73: in cuda.bindings._internal.nvjitlink._check_or_init_nvjitlink ??? cuda\\bindings\\_internal\\nvjitlink.pyx:46: in cuda.bindings._internal.nvjitlink.load_library ??? E OverflowError: can't convert negative value to size_t ``` * Change <void*><uintptr_t>win32api.GetProcAddress` back to `intptr_t`. Changing load_nvidia_dynamic_library() to also use to-`intptr_t` conversion, for compatibility with win32api.GetProcAddress. Document that CDLL behaves differently (it uses to-`uintptr_t`). * Use win32api.LoadLibrary() instead of ctypes.windll.kernel32.LoadLibraryW(), to be more similar to original (and working) cython code. Hoping to resolve this kind of error: ``` _ ERROR at setup of test_c_or_v_program_fail_bad_option[txt-compile_program] __ request = <SubRequest 'minimal_nvvmir' for <Function test_c_or_v_program_fail_bad_option[txt-compile_program]>> @pytest.fixture(params=MINIMAL_NVVMIR_FIXTURE_PARAMS) def minimal_nvvmir(request): for pass_counter in range(2): nvvmir = MINIMAL_NVVMIR_CACHE.get(request.param, -1) if nvvmir != -1: if nvvmir is None: pytest.skip(f"UNAVAILABLE: {request.param}") return nvvmir if pass_counter: raise AssertionError("This code path is meant to be unreachable.") # Build cache entries, then try again (above). > major, minor, debug_major, debug_minor = nvvm.ir_version() tests\test_nvvm.py:148: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ cuda\bindings\nvvm.pyx:95: in cuda.bindings.nvvm.ir_version cpdef tuple ir_version(): cuda\bindings\nvvm.pyx:113: in cuda.bindings.nvvm.ir_version status = nvvmIRVersion(&major_ir, &minor_ir, &major_dbg, &minor_dbg) cuda\bindings\cynvvm.pyx:19: in cuda.bindings.cynvvm.nvvmIRVersion return _nvvm._nvvmIRVersion(majorIR, minorIR, majorDbg, minorDbg) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > ??? E cuda.bindings._internal.utils.FunctionNotFoundError: function nvvmIRVersion is not found ``` * Remove debug print statements. * Remove some cruft. * Trivial renaming of variables. No functional changes. * Revert debug changes under .github/workflows * Rename _path_finder_utils → _path_finder * Remove LD_LIBRARY_PATH in fetch_ctk/action.yml * Linux: First try using the platform-specific dynamic loader search mechanisms * Add _windows_load_with_dll_basename() * Revert "Revert debug changes under .github/workflows" This reverts commit cc6113c. * Add debug prints in load_nvidia_dynamic_library() * Report dlopen error for libnvrtc.so.12 * print("\nLOOOK dlfcn.dlopen('libnvrtc.so.12', dlfcn.RTLD_NOW)", flush=True) * Revert "Remove LD_LIBRARY_PATH in fetch_ctk/action.yml" This reverts commit 1b1139c. * Only remove ${CUDA_PATH}/nvvm/lib64 from LD_LIBRARY_PATH * Use path_finder.load_nvidia_dynamic_library("nvrtc") from cuda/bindings/_bindings/cynvrtc.pyx.in * Somewhat ad hoc heuristics for nvidia_cuda_nvrtc wheels. * Remove LD_LIBRARY_PATH entirely from .github/actions/fetch_ctk/action.yml * Remove CUDA_PATH\nvvm\bin in .github/workflows/test-wheel-windows.yml * Revert "Remove LD_LIBRARY_PATH entirely from .github/actions/fetch_ctk/action.yml" This reverts commit bff8cf0. * Revert "Somewhat ad hoc heuristics for nvidia_cuda_nvrtc wheels." This reverts commit 43abec8. * Restore cuda/bindings/_bindings/cynvrtc.pyx.in as-is on main * Remove debug print from load_nvidia_dynamic_library.py * Reapply "Revert debug changes under .github/workflows" This reverts commit aaa6aff. * Make `path_finder` work for `"nvrtc"` (#553) * Revert "Restore cuda/bindings/_bindings/cynvrtc.pyx.in as-is on main" This reverts commit ba093f5. * Revert "Reapply "Revert debug changes under .github/workflows"" This reverts commit 8f69f83. * Also load nvrtc from cuda_bindings/tests/path_finder.py * Add heuristics for nvidia_cuda_nvrtc Windows wheels. Also fix a couple bugs discovered by ChatGPT: * `glob.glob()` in this code return absolute paths. * stray `error_messages = []` * Add debug prints, mostly for `os.add_dll_directory(bin_dir)` * Fix unfortunate silly oversight (import os missing under Windows) * Use `win32api.LoadLibraryEx()` with suitable `flags`; also update `os.environ["PATH"]` * Hard-wire WinBase.h constants (they are not exposed by win32con) * Remove debug prints * Reapply "Reapply "Revert debug changes under .github/workflows"" This reverts commit b002ff6. * Add `path_finder.SUPPORTED_LIBNAMES` (#558) * Revert "Reapply "Revert debug changes under .github/workflows"" This reverts commit 8f69f83. * Add names of all CTK 12.8.1 x86_64-linux libraries (.so) as `path_finder.SUPPORTED_LIBNAMES` https://chatgpt.com/share/67f98d0b-148c-8008-9951-9995cf5d860c * Add `SUPPORTED_WINDOWS_DLLS` * Add copyright notice * Move SUPPORTED_LIBNAMES, SUPPORTED_WINDOWS_DLLS to _path_finder/supported_libs.py * Use SUPPORTED_WINDOWS_DLLS in _windows_load_with_dll_basename() * Change "Set up mini CTK" to use `method: local`, remove `sub-packages` line. * Use Jimver/[email protected] also under Linux, `method: local`, no `sub-packages`. * Add more `nvidia-*-cu12` wheels to get as many of the supported shared libraries as possible. * Revert "Use Jimver/[email protected] also under Linux, `method: local`, no `sub-packages`." This reverts commit d499806. Problem observed: ``` /usr/bin/docker exec 1b42cd4ea3149ac3f2448eae830190ee62289b7304a73f8001e90cead5005102 sh -c "cat /etc/*release | grep ^ID" Warning: Failed to restore: Cache service responded with 422 /usr/bin/tar --posix -cf cache.tgz --exclude cache.tgz -P -C /__w/cuda-python/cuda-python --files-from manifest.txt -z Failed to save: Unable to reserve cache with key cuda_installer-linux-5.15.0-135-generic-x64-12.8.0, another job may be creating this cache. More details: This legacy service is shutting down, effective April 15, 2025. Migrate to the new service ASAP. For more information: https://gh.io/gha-cache-sunset Warning: Error during installation: Error: Unable to locate executable file: sudo. Please verify either the file path exists or the file can be found within a directory specified by the PATH environment variable. Also check the file mode to verify the file is executable. Error: Error: Unable to locate executable file: sudo. Please verify either the file path exists or the file can be found within a directory specified by the PATH environment variable. Also check the file mode to verify the file is executable. ``` * Change test_path_finder::test_find_and_load() to skip cufile on Windows, and report exceptions as failures, except for cudart * Add nvidia-cuda-runtime-cu12 to pyproject.toml (for libname cudart) * test_path_finder.py: before loading cusolver, load nvJitLink, cusparse, cublas (experiment to see if that resolves the only Windows failure) Test (win-64, Python 3.12, CUDA 12.8.0, Runner default, CTK wheels) / test ``` ================================== FAILURES =================================== ________________________ test_find_and_load[cusolver] _________________________ libname = 'cusolver' @pytest.mark.parametrize("libname", path_finder.SUPPORTED_LIBNAMES) def test_find_and_load(libname): if sys.platform == "win32" and libname == "cufile": pytest.skip(f'test_find_and_load("{libname}") not supported on this platform') print(f'\ntest_find_and_load("{libname}")') failures = [] for algo, func in ( ("find", path_finder.find_nvidia_dynamic_library), ("load", path_finder.load_nvidia_dynamic_library), ): try: out = func(libname) except Exception as e: out = f"EXCEPTION: {type(e)} {str(e)}" failures.append(algo) print(out) print() > assert not failures E AssertionError: assert not ['load'] tests\test_path_finder.py:29: AssertionError ``` * test_path_finder.py: load *only* nvJitLink before loading cusolver * Run each test_find_or_load_nvidia_dynamic_library() subtest in a subprocess * Add cublasLt to supported_libs.py and load deps for cusolver, cusolverMg, cusparse in test_path_finder.py. Also restrict test_path_finder.py to test load only for now. * Add supported_libs.DIRECT_DEPENDENCIES * Remove cufile_rdma from supported libs (comment out). https://chatgpt.com/share/68033a33-385c-8008-a293-4c8cc3ea23ae * Split out `PARTIALLY_SUPPORTED_LIBNAMES`. Fix up test code. * Reduce public API to only load_nvidia_dynamic_library, SUPPORTED_LIBNAMES * Set CUDA_BINDINGS_PATH_FINDER_TEST_ALL_LIBNAMES=1 to match expected availability of nvidia shared libraries. * Refactor as `class _find_nvidia_dynamic_library` * Strict wheel, conda, system rule: try using the platform-specific dynamic loader search mechanisms only last * Introduce _load_and_report_path_linux(), add supported_libs.EXPECTED_LIB_SYMBOLS * Plug in ctypes.windll.kernel32.GetModuleFileNameW() * Keep track of nvrtc-related GitHub comment * Factor out `_find_dll_under_dir(dirpath, file_wild)` and reuse from `_find_dll_using_nvidia_bin_dirs()`, `_find_dll_using_cudalib_dir()` (to fix loading nvrtc64_120_0.dll from local CTK) * Minimal "is already loaded" code. * Add THIS FILE NEEDS TO BE REVIEWED/UPDATED FOR EACH CTK RELEASE comment in _path_finder/supported_libs.py * Add SUPPORTED_LINUX_SONAMES in _path_finder/supported_libs.py * Update SUPPORTED_WINDOWS_DLLS in _path_finder/supported_libs.py based on DLLs found in cuda_*win*.exe files. * Remove `os.add_dll_directory()` and `os.environ["PATH"]` manipulations from find_nvidia_dynamic_library.py. Add `supported_libs.LIBNAMES_REQUIRING_OS_ADD_DLL_DIRECTORY` and use from `load_nvidia_dynamic_library()`. * Move nvrtc-specific code from find_nvidia_dynamic_library.py to `supported_libs.is_suppressed_dll_file()` * Introduce dataclass LoadedDL as return type for load_nvidia_dynamic_library() * Factor out _abs_path_for_dynamic_library_* and use on handle obtained through "is already loaded" checks * Factor out _load_nvidia_dynamic_library_no_cache() and use for exercising LoadedDL.was_already_loaded_from_elsewhere * _check_nvjitlink_usable() in test_path_finder.py * Undo changes in .github/workflows/ and cuda_bindings/pyproject.toml * Move cuda_bindings/tests/path_finder.py -> toolshed/run_cuda_bindings_path_finder.py * Add bandit suppressions in test_path_finder.py * Add pytest info_summary_append fixture and use from test_path_finder.py to report the absolute paths of the loaded libraries. * Fix tiny accident: a line in pyproject.toml got lost somehow. * Undo changes under .github (LD_LIBRARY_PATH, PATH manipulations for nvvm). * 2025-05-01 version of `cuda.bindings.path_finder` (#578) * Undo changes to the nvJitLink, nvrtc, nvvm bindings * Undo changes under .github, specific to nvvm, manipulating LD_LIBRARY_PATH or PATH * PARTIALLY_SUPPORTED_LIBNAMES_LINUX, PARTIALLY_SUPPORTED_LIBNAMES_WINDOWS * Update EXPECTED_LIB_SYMBOLS for nvJitLink to cleanly support CTK versions 12.0, 12.1, 12.2 * Save result of factoring out load_dl_common.py, load_dl_linux.py, load_dl_windows.py with the help of Cursor. * Fix an auto-generated docstring * first round of Cursor refactoring (about 4 iterations until all tests passed), followed by ruff auto-fixes * Revert "first round of Cursor refactoring (about 4 iterations until all tests passed), followed by ruff auto-fixes" This reverts commit 001a6a2. There were many GitHub Actions jobs that failed (all tests with 12.x): https://github.com/NVIDIA/cuda-python/actions/runs/14677553387 This is not worth spending time debugging. Especially because * Cursor has been unresponsive for at least half an hour: We're having trouble connecting to the model provider. This might be temporary - please try again in a moment. * The refactored code does not seem easier to read. * A couple trivial tweaks * Prefix the public API (just two items) with underscores for now. * Add SPDX-License-Identifier to all files under toolshed/ that don't have it already * Add SPDX-License-Identifier under cuda_bindings/tests/ * Respond to "Do these need to be run as subprocesses?" review question (#578 (comment)) * Respond to "dead code?" review questions (e.g. #578 (comment)) * Respond to "Do we need to implement a cache separately ..." review question (#578 (comment)) * Remove cuDriverGetVersion() function for now. * Move add_dll_directory() from load_dl_common.py to load_dl_windows.py (response to review question #578 (comment)) * Add SPDX-License-Identifier and # Forked from: URL in cuda_paths.py * Add Add SPDX-License-Identifier and Original LICENSE in findlib.py * Very first draft of README.md * Update README.md, mostly as revised by perplexity, with various manual edits. * Refork cuda_paths.py AS-IS: https://github.com/NVIDIA/numba-cuda/blob/8c9c9d0cb901c06774a9abea6d12b6a4b0287e5e/numba_cuda/numba/cuda/cuda_paths.py * ruff format cuda_paths.py (NO manual changes) * Add back _get_numba_CUDA_INCLUDE_PATH from 2279bda (i.e. cuda_paths.py as it was right before re-forking) * Remove cuda_paths.py dependency on numba.cuda.cudadrv.runtime * Add Forked from URLs, two SPDX-License-Identifier, Original Numba LICENSE * Temporarily restore debug changes under .github/workflows, for expanded path_finder test coverage * Restore cuda_path.py AS-IT-WAS at commit 2279bda * Revert "Restore cuda_path.py AS-IT-WAS at commit 2279bda" This reverts commit 1b88ec2. * Force compute-sanitizer off unconditionally * Revert "Force compute-sanitizer off unconditionally" This reverts commit 2bc7ef6. * Add timeout=10 seconds to test_path_finder.py subprocess.run() invocations. * Increase test_path_finder.py subprocess.run() timeout to 30 seconds: Under Windows, loading cublas or cusolver may exceed the 10 second timeout: #578 (comment) * Revert "Temporarily restore debug changes under .github/workflows, for expanded path_finder test coverage" This reverts commit 47ad79f. * Force compute-sanitizer off unconditionally * Add: Note that the search is done on a per-library basis. * Add Note for CUDA_HOME / CUDA_PATH * Add 0. **Check if a library was loaded into the process already by some other means.** * _find_dll_using_nvidia_bin_dirs(): reuse lib_searched_for in place of file_wild * Systematically replace all relative imports with absolute imports. * handle: int → ctypes.CDLL fix * Make load_dl_windows.py abs_path_for_dynamic_library() implementation maximally robust. * Change argument name → libname for self-consistency * Systematically replace previously overlooked relative imports with absolute imports. * Simplify code (also for self-consistency) * Expand the 3. **System Installations** section with information produced by perplexity * Pull out `**Environment variables**` into an added section, after manual inspection of cuda_paths.py. Minor additional edits. * Revert "Force compute-sanitizer off unconditionally" This reverts commit aeaf4f0. * Move _path_finder/sys_path_find_sub_dirs.py → find_sub_dirs.py, use find_sub_dirs_all_sitepackages() from find_nvidia_dynamic_library.py * WIP (search priority updated in README.md but not in code) * Revert "WIP (search priority updated in README.md but not in code)" This reverts commit bf9734c. * WIP (search priority updated in README.md but not in code) * Completely replace cuda_paths.py to achieve the desired Search Priority (see updated README.md). * Define `IS_WINDOWS = sys.platform == "win32"` in supported_libs.py * Use os.path.samefile() to resolve issues with doubled backslashes. * `load_in_subprocess(): Pass current environment * Add run_python_code_safely.py as generated by perplexity, plus ruff format, bandit nosec * Replace subprocess.run with run_python_code_safely * Factor out `class Worker` to fix pickle issue. * ChatGPT revisions based on Deep research: https://chatgpt.com/share/681914ce-f274-8008-9e9f-4538716b4ed7 * Fix race condition in result queue handling by using timeout-based get() The previous implementation checked result_queue.empty() before calling get(), which introduces a classic race condition: the queue may become non-empty immediately after the check, resulting in missed results or misleading errors. This patch replaces the empty() check with result_queue.get(timeout=1.0), allowing the parent process to robustly wait for results with a bounded delay. Also switches from ctx.SimpleQueue() to ctx.Queue() for compatibility with timeout-based get(), which SimpleQueue does not support on Python ≤3.12. Note: The race condition was discovered by Gemini 2.5 * Resolve SIM108 * Change to "nppc" as ANCHOR_LIBNAME * Implement CUDA_PYTHON_CUDA_HOME_PRIORITY first, last, with default first * Remove retry_with_anchor_abs_path() and make retry_with_cuda_home_priority_last() the default. * Update README.md to reflect new search priority * SUPPORTED_LINUX_SONAMES does not need updates for CTK 12.9.0 * The only addition to SUPPORTED_WINDOWS_DLLS for CTK 12.9.0 is nvvm70.dll * Make OSError in load_dl_windows.py abs_path_for_dynamic_library() more informative. * run_cuda_bindings_path_finder.py: optionally use args as libnames (to aid debugging) * Bug fix in load_dl_windows.py: ctypes.windll.kernel32.LoadLibraryW() returns an incompatible `handle`. Use win32api.LoadLibraryEx() instead to ensure self-consistency. * Remove _find_nvidia_dynamic_library.retry_with_anchor_abs_path() method. Move run_python_code_safely.py to test/ directory. * Add missing SPDX-License-Identifier

Undo changes to the nvJitLink, nvrtc, nvvm bindings

17478da

Undo changes under .github, specific to nvvm, manipulating LD_LIBRARY…

7da74bd

…_PATH or PATH

PARTIALLY_SUPPORTED_LIBNAMES_LINUX, PARTIALLY_SUPPORTED_LIBNAMES_WINDOWS

211164d

leofang assigned leofang and rwgk and unassigned leofang Apr 25, 2025

leofang self-requested a review April 25, 2025 19:34

leofang added cuda.bindings Everything related to the cuda.bindings module P0 High priority - Must do! feature New feature or request labels Apr 25, 2025

leofang added this to the cuda-python parking lot milestone Apr 25, 2025

Update EXPECTED_LIB_SYMBOLS for nvJitLink to cleanly support CTK vers…

a649e7d

…ions 12.0, 12.1, 12.2

Save result of factoring out load_dl_common.py, load_dl_linux.py, loa…

b5cef1b

…d_dl_windows.py with the help of Cursor.

rwgk force-pushed the path_finder_review1 branch from 2452fdb to b5cef1b Compare April 25, 2025 23:45

rwgk added 2 commits April 25, 2025 16:51

Fix an auto-generated docstring

bc0137a

first round of Cursor refactoring (about 4 iterations until all tests…

001a6a2

… passed), followed by ruff auto-fixes

rwgk added 3 commits April 25, 2025 22:21

A couple trivial tweaks

c409346

Merge branch 'path_finder_dev' into path_finder_review1

0cd20d8

rwgk requested a review from kkraus14 April 26, 2025 06:21

Pull out **Environment variables** into an added section, after man…

b910a6b

…ual inspection of cuda_paths.py. Minor additional edits.

rwgk mentioned this pull request May 1, 2025

CI: Always fetch the compute-sanitizer from the CTK 12.8.0 #593

Merged

2 tasks

rwgk force-pushed the path_finder_review1 branch from 4e39090 to e2d6682 Compare May 1, 2025 17:49

rwgk force-pushed the path_finder_review1 branch from e2d6682 to a7b0633 Compare May 1, 2025 17:50

rwgk added 2 commits May 1, 2025 10:51

Merge branch 'main' into path_finder_review1

0fa2c83

Revert "Force compute-sanitizer off unconditionally"

fc22b1d

This reverts commit aeaf4f0.

rwgk force-pushed the path_finder_review1 branch from a7b0633 to fc22b1d Compare May 1, 2025 17:51

leofang reviewed May 1, 2025

View reviewed changes

toolshed/build_path_finder_sonames.py Show resolved Hide resolved

toolshed/build_path_finder_dlls.py Show resolved Hide resolved

rwgk added 5 commits May 2, 2025 23:28

Merge branch 'main' into path_finder_review1

ad0d2f3

Move _path_finder/sys_path_find_sub_dirs.py → find_sub_dirs.py, use f…

5d970f2

…ind_sub_dirs_all_sitepackages() from find_nvidia_dynamic_library.py

WIP (search priority updated in README.md but not in code)

bf9734c

Merge branch 'main' into path_finder_review1

b8ab986

Revert "WIP (search priority updated in README.md but not in code)"

6aa1f13

This reverts commit bf9734c.

rwgk changed the title ~~First version of cuda.bindings.path_finder~~ 2025-05-01 version of cuda.bindings.path_finder May 4, 2025

rwgk changed the base branch from main to path_finder_dev May 4, 2025 05:48

Merge branch 'path_finder_dev' into path_finder_review1

0a21021

rwgk merged commit 2a6452d into NVIDIA:path_finder_dev May 4, 2025
1 check passed

rwgk mentioned this pull request May 4, 2025

[path_finder_dev] path_finder Search Priority v2 #604

Merged

rwgk deleted the path_finder_review1 branch May 4, 2025 15:45

rwgk mentioned this pull request May 6, 2025

Merge path_finder_dev into main #613

Merged

rwgk changed the title ~~2025-05-01 version of cuda.bindings.path_finder~~ [path_finder_dev] 2025-05-01 version of cuda.bindings.path_finder May 8, 2025

leofang modified the milestones: cuda-python 12-next, 11-next, cuda-python 12.9.0 & 11.8.7 Jul 21, 2025

[path_finder_dev] 2025-05-01 version of cuda.bindings.path_finder #578

[path_finder_dev] 2025-05-01 version of cuda.bindings.path_finder #578

Uh oh!

Conversation

rwgk commented Apr 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Uh oh!

copy-pr-bot bot commented Apr 25, 2025

Uh oh!

rwgk commented Apr 25, 2025

Uh oh!

github-actions bot commented Apr 25, 2025

Preview will be ready when the GitHub Pages deployment is complete.

Uh oh!

rwgk commented Apr 25, 2025

Uh oh!

rwgk commented Apr 25, 2025

Uh oh!

rwgk commented Apr 25, 2025

Uh oh!

rwgk commented Apr 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rwgk commented Apr 25, 2025

Uh oh!

rwgk commented Apr 26, 2025

Uh oh!

rwgk commented Apr 26, 2025

Uh oh!

leofang commented May 1, 2025

Uh oh!

rwgk commented May 1, 2025

Uh oh!

leofang commented May 1, 2025

Uh oh!

rwgk commented May 1, 2025

Uh oh!

kkraus14 commented May 1, 2025

Uh oh!

Uh oh!

Uh oh!

rwgk commented May 4, 2025

Uh oh!

Uh oh!

Uh oh!

[path_finder_dev] 2025-05-01 version of `cuda.bindings.path_finder` #578

[path_finder_dev] 2025-05-01 version of `cuda.bindings.path_finder` #578

rwgk commented Apr 25, 2025 •

edited

Loading

rwgk commented Apr 25, 2025 •

edited

Loading