[CUDA][shared memory allocation]fix 'ptxas error : Entry function 'fu… #17267

Open
wants to merge 1 commit into
base: main

Conversation

AIYoungcino

I converted a ViT model from ONNX and then ran relay.build targeting an NVIDIA RTX 4090 for compilation.

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
The build then fails with the following compilation error:
ptxas error : Entry function 'tvmgen_default_fused_nn_conv2d_add_kernel' uses too much shared data (0x2ab44 bytes, 0x29000 max)
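
The sizes in the ptxas message are hexadecimal, so it helps to convert them before comparing against documented limits. The following is a minimal sketch that parses the message quoted above; the helper name `parse_shared_usage` and the regular expression are illustrative assumptions, not part of TVM or ptxas.

```python
import re

# Error text copied from the ptxas message above.
msg = (
    "ptxas error : Entry function 'tvmgen_default_fused_nn_conv2d_add_kernel' "
    "uses too much shared data (0x2ab44 bytes, 0x29000 max)"
)

def parse_shared_usage(message):
    """Return (used_bytes, max_bytes) from a 'too much shared data' message.

    Hypothetical helper: the pattern only matches the format shown above.
    """
    m = re.search(r"\((0x[0-9a-fA-F]+) bytes, (0x[0-9a-fA-F]+) max\)", message)
    if m is None:
        raise ValueError("not a shared-data size error")
    return int(m.group(1), 16), int(m.group(2), 16)

used, limit = parse_shared_usage(msg)
print(f"used  = {used} bytes ({used / 1024:.1f} KiB)")
print(f"limit = {limit} bytes ({limit / 1024:.1f} KiB)")
print(f"over by {used - limit} bytes")
```

For this message the kernel requests 174916 bytes against a limit of 167936 bytes (164 KiB), so it exceeds the limit by 6980 bytes.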

I apologize for resorting to a temporary workaround for this issue. I am posting it as a starting point and hope the maintainers can offer advice on resolving the problem more effectively. Thank you.

@AIYoungcino
Author

I looked up the information NVIDIA provides. The maximum shared memory sizes for each GPU architecture generation (by compute capability) are:
5.x: 64 KB
6.x: 64 KB
7.x: 96 KB
8.x: 164 KB
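
As a rough sketch, the figures listed above can be turned into a quick check of whether a requested allocation fits. The table below only encodes the numbers from this comment, keyed by major compute capability; the names `LISTED_LIMIT_KB` and `fits_in_shared` are hypothetical, and actual limits can differ between minor versions within a generation, so the CUDA programming guide remains the reference for a specific device.

```python
# Limits as listed in this comment, keyed by major compute capability.
# Assumption: one value per generation; real devices may vary by minor version.
LISTED_LIMIT_KB = {5: 64, 6: 64, 7: 96, 8: 164}

def fits_in_shared(requested_bytes, cc_major):
    """Return True if the request fits within the listed limit for cc_major."""
    limit_bytes = LISTED_LIMIT_KB[cc_major] * 1024
    return requested_bytes <= limit_bytes

requested = 0x2AB44  # size reported by ptxas for the conv2d kernel
for major in sorted(LISTED_LIMIT_KB):
    status = "fits" if fits_in_shared(requested, major) else "too large"
    print(f"{major}.x: {LISTED_LIMIT_KB[major]} KB -> {status}")
```

Under these figures the 0x2ab44-byte request is too large for every listed generation, and the 8.x entry (164 KB) matches the 0x29000 maximum in the ptxas message.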

@vinx13
Member

vinx13 commented Aug 16, 2024

Dynamic shared memory (shared.dyn scope) should be used in this case to bypass the size limit

@AIYoungcino
Author

> Dynamic shared memory (shared.dyn scope) should be used in this case to bypass the size limit

Thank you for your advice. If the size of the conv2d result exceeds the maximum shared memory limit, storing it in shared memory would overflow. Such a buffer is typically passed as a parameter to the kernel function, or allocated in GDRAM (global device memory) through static extern at compile time.
