cuda_fp16.h is not able to include from nvrtcCompileProgram #845
Type of Bug
Runtime Error

Component
cuda.bindings

Describe the bug
I am trying to compile an fp16 kernel with nvrtcCompileProgram, but compilation fails when I go through the cuda-python wrapper. I also compiled the same program from C++, and that works, so I believe the bug lies in the cuda-python wrapper.

How to Reproduce
My code repo: https://github.com/tigert1998/mytorch/blob/main/cuda_utils.py#L120

Expected behavior
"cuda_fp16.h" should be included correctly, with no errors.

Operating System
Windows 11

nvidia-smi output
Sun Aug 17 13:32:58 2025
+-----------------------------------------------------------------------------------------+
|
Replies: 1 comment 4 replies
Your C++ code is not equivalent to your Python code. Could you make sure:
My bad, sorry for wasting your time. I had an `import torch.nn as nn` somewhere in my code, and that caused `nvrtc` to behave strangely.
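
For anyone who lands here with a similar symptom, below is a minimal sketch of compiling a kernel that includes "cuda_fp16.h" through cuda-python with the toolkit include directory passed explicitly. It is not the code from the linked repo. The helper names `nvrtc_include_options` and `compile_fp16_kernel` are made up for illustration, and it assumes the headers live under `<CUDA_PATH or CUDA_HOME>/include`. In this thread the actual trigger turned out to be the stray torch import, so treat this only as a baseline to compare against.

```python
import os
from pathlib import Path


def nvrtc_include_options(cuda_home=None):
    """Build the NVRTC include-path option (as bytes) for the toolkit headers.

    Hypothetical helper: assumes headers are in <cuda_home>/include and that
    CUDA_PATH or CUDA_HOME points at the toolkit when cuda_home is not given.
    """
    cuda_home = cuda_home or os.environ.get("CUDA_PATH") or os.environ.get("CUDA_HOME")
    if cuda_home is None:
        raise RuntimeError("set CUDA_PATH or CUDA_HOME, or pass cuda_home explicitly")
    include_dir = Path(cuda_home) / "include"
    return [f"--include-path={include_dir}".encode()]


def compile_fp16_kernel(source, cuda_home=None):
    """Compile CUDA source to PTX, raising with the NVRTC log on failure."""
    # Imported lazily so the option helper above works without cuda-python.
    from cuda.bindings import nvrtc

    opts = nvrtc_include_options(cuda_home)
    err, prog = nvrtc.nvrtcCreateProgram(source.encode(), b"kernel.cu", 0, [], [])
    if err != nvrtc.nvrtcResult.NVRTC_SUCCESS:
        raise RuntimeError(f"nvrtcCreateProgram failed: {err}")

    (err,) = nvrtc.nvrtcCompileProgram(prog, len(opts), opts)
    if err != nvrtc.nvrtcResult.NVRTC_SUCCESS:
        # Fetch the compiler log; it names the header that could not be found.
        _, log_size = nvrtc.nvrtcGetProgramLogSize(prog)
        log = b" " * log_size
        nvrtc.nvrtcGetProgramLog(prog, log)
        raise RuntimeError(log.decode(errors="replace"))

    _, ptx_size = nvrtc.nvrtcGetPTXSize(prog)
    ptx = b" " * ptx_size
    nvrtc.nvrtcGetPTX(prog, ptx)
    return ptx


KERNEL = r"""
#include "cuda_fp16.h"
extern "C" __global__ void scale(__half *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = __hmul(x[i], __float2half(2.0f));
}
"""

if __name__ == "__main__":
    # Requires cuda-python and a CUDA toolkit installation.
    print(len(compile_fp16_kernel(KERNEL)), "bytes of PTX")
```

If this compiles in a fresh interpreter but fails inside the larger project, the difference is probably in what else the process has imported or configured, which matches what happened here.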