Possible solution of using DLPack to handle TVM FFI #181

Open

haoran35-jpg wants to merge 3 commits into HazyResearch:main from haoran35-jpg:main

Conversation

@haoran35-jpg

Hey! I was researching the possibility of lowering JAX to ThunderKittens and ran into the same issue mentioned in [TVM FFI support #176]. After some searching I found that DLPack borrows much of the idea behind TVM FFI, though there are certain differences: https://tvm.apache.org/ffi/concepts/tensor.html. I implemented the DLPack methods and tested both routes, getting the following difference in VRAM use:

| Without DLPack | Using DLPack |
|---|---|
| 100.000 MB | 50.000 MB |

The two routes are respectively: `tensor.clone() -> TKParallelTensor(tensor, ...)` and JAX → `__dlpack__()` → `TKParallelTensor(capsule)`. The VRAM numbers come out as round integers because I'm only testing with a single fixed tensor size (50 MB), i.e. 50×1024×1024/4 float32 elements, so the copy path allocates exactly two buffers (100 MB) and the DLPack path only one (50 MB). Not sure whether this PR is in good shape yet? Thank you!
