Implement `__dlpack__` dunder for pylibcudf columns #18566
base: branch-25.06
Conversation
This implements the `__dlpack__` dunder (and `__dlpack_device__`), which could then also be forwarded to libcudf columns.

There is a bit of a clash with the old dlpack implementation. It is similar, but also different: the old one is table centric and always copies, while this is column centric and copies only if requested. Thus, I kept this as a separate, detailed API.

The `from_dlpack()` can/should be extended to support at least 1-D objects that implement `__dlpack__`.

One of the more complex things here is the stream synchronization; unfortunately, it seems very hard to test reliably in practice. (My tries didn't produce a failure when it should have failed.)

Marking as draft, since there is likely some discussion/thought needed. For one, I am not sure I believe in exposing this in C++ (at this time). And then it might work to create a helper/intermediate object rather than doing it all here?

(Also wondering if it wouldn't be easier to just vendor the dlpack header?)

CC @vyasr, since I think you were interested in this.

Signed-off-by: Sebastian Berg <[email protected]>
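For reference, a minimal sketch of the producer side of the protocol; the class and helper names (`DeviceColumn`, `_device_id`, `_sync_to_stream`, `_make_capsule`) are illustrative, not the actual pylibcudf implementation:

```python
# Minimal sketch of the DLPack producer protocol on a device column.
# `DeviceColumn`, `_device_id`, `_sync_to_stream`, and `_make_capsule`
# are illustrative names, not the actual pylibcudf API.
class DeviceColumn:
    def __dlpack_device__(self):
        # Returns (device_type, device_id); 2 is kDLCUDA in the DLPack spec.
        return (2, self._device_id)

    def __dlpack__(self, *, stream=None, max_version=None,
                   dl_device=None, copy=None):
        # The consumer passes the CUDA stream (as an integer handle) on
        # which it will read the data; the producer must make the data
        # valid on that stream, typically by recording an event on its
        # own stream and making the consumer stream wait on it.
        if stream is not None:
            self._sync_to_stream(stream)  # illustrative helper
        # Returns a PyCapsule wrapping a DLManagedTensor (or the
        # versioned struct, depending on max_version).
        return self._make_capsule(max_version=max_version, copy=copy)
```

A consumer such as CuPy then drives this via `cupy.from_dlpack(col)`, passing the stream it wants to read on.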
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
Just noting a relevant issue I've worked on recently.
Thanks Sebastian! Yes, this is definitely of interest. Probably the most relevant similar things that have happened recently are #18402 and #15370, the corresponding Arrow protocols.
Possibly because everything in cudf executes on the default stream anyway. You would have to explicitly request a different (non-blocking) stream to test this, I think. Not sure if you tried that.
I'm not sure that I understand. The dlpack spec is intended to support C-level interfacing as well as Python, so are you just asking whether libcudf wants to do that?
As in vendoring vs cloning as part of the build? We don't vendor too much in RAPIDS, but the dlpack header is small enough that it could probably be justified.
Yeah, unfortunately we don't have as clearly defined an interface for exchange in C/C++. So I think it makes sense to make it public (with a slightly different signature). But yeah, we could just expose it outside
Maybe actually safer to make it
Yeah, I was using a second non-blocking cupy stream. But I should try mixing host/device copies and kernel launches rather than two kernel launches, to improve the chance of seeing something, maybe.
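Roughly the kind of test being discussed, assuming a hypothetical `make_column` that wraps the device buffer in a pylibcudf column without copying. Since a missing sync only shows up if the producer's work is still in flight at export time, this is not a reliable reproducer:

```python
import cupy as cp

# Producer work happens on the (legacy) default stream, which cudf uses.
n = 50_000_000
src = cp.zeros(n)
src += 1  # kernel enqueued on the default stream
col = make_column(src)  # hypothetical zero-copy wrapper

# Consume on a second non-blocking stream, which does not implicitly
# synchronize with the legacy default stream.
s = cp.cuda.Stream(non_blocking=True)
with s:
    out = cp.from_dlpack(col)  # __dlpack__ receives this stream's handle
    host = cp.asnumpy(out)     # D2H copy races if the sync is missing
s.synchronize()
assert (host == 1).all()
```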
Yeah, most projects do, and I think it should be a build-time and not a runtime dependency in
I would be fine with vendoring the header. dlpack is simpler than Arrow and we are also not using a large helper library like nanoarrow that is a moving target. Vendoring a single header for this would probably be simpler and also address issues like #12175. Do we need to support multiple versions? Can we just go straight to the newer versioned dlpack structs?
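For what it's worth, the Python-level protocol already negotiates this: consumers that understand DLPack >= 1.0 pass `max_version`, and the producer picks the struct accordingly. A sketch, where `_legacy_capsule` and `_versioned_capsule` are hypothetical helpers:

```python
def __dlpack__(self, *, stream=None, max_version=None, dl_device=None,
               copy=None):
    # Per the DLPack Python spec, the legacy capsule is named "dltensor"
    # (wrapping DLManagedTensor) and the versioned one "dltensor_versioned"
    # (wrapping DLManagedTensorVersioned, which also carries a read-only flag).
    if max_version is None or max_version < (1, 0):
        # Old consumer that predates versioning.
        return self._legacy_capsule(stream=stream)    # hypothetical helper
    return self._versioned_capsule(stream=stream)     # hypothetical helper
```

Going straight to the versioned structs would mean raising `BufferError` in the first branch instead, which I believe the spec allows for producers that cannot export the legacy struct.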
Maybe we should, just to keep things simple and since nobody has complained much about the lack yet.
👍
Practically speaking, I think Arrow data interchange is more valuable for cudf than dlpack, especially on the export side. Since nested types involve multiple buffers, using dlpack for that data requires looping over each buffer, which really is only supportable from Python at the pylibcudf level since we don't expose those buffers in the public pandas- or polars-like APIs. We're still building out the pylibcudf API for real public consumption, so if we can get our dlpack ducks in a row in time for that I think that would be sufficient.