data-interchange description of DLPack
The implementation section of the data_interchange document contains some incorrect statements:
- "The `__dlpack__` method will produce a `PyCapsule` containing a `DLPackManagedTensor`, …"

  The `DLPackManagedTensor` should be `DLManagedTensor`, consistent with dlpack.h and the DLPack diagram.
  I also think it would be beneficial to make it more precise:

  "The `__dlpack__` method will produce a `PyCapsule` object containing a reference to a `DLManagedTensor` struct, …"
- "The consumer must set the `PyCapsule` name to `"used_dltensor"`, and call the `deleter` of the `DLPackManagedTensor` when it no longer needs the data."

  The `DLPackManagedTensor` here should also be `DLManagedTensor`, but more importantly, the consumer should not be calling the `deleter` function. The `DLManagedTensor.deleter` must be called by the `PyCapsule_Destructor` function provided by the producer.
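For reference, both corrections above concern the `DLManagedTensor` struct declared in dlpack.h. The following is a minimal `ctypes` sketch of that struct, written from the unversioned layout in recent dlpack.h releases; the exact field names and types are an assumption to verify against the header you build with (older headers, for example, name the device field differently).

```python
import ctypes


class DLDevice(ctypes.Structure):
    _fields_ = [
        ("device_type", ctypes.c_int),    # DLDeviceType enum in dlpack.h
        ("device_id", ctypes.c_int32),
    ]


class DLDataType(ctypes.Structure):
    _fields_ = [
        ("code", ctypes.c_uint8),
        ("bits", ctypes.c_uint8),
        ("lanes", ctypes.c_uint16),
    ]


class DLTensor(ctypes.Structure):
    _fields_ = [
        ("data", ctypes.c_void_p),
        ("device", DLDevice),
        ("ndim", ctypes.c_int32),
        ("dtype", DLDataType),
        ("shape", ctypes.POINTER(ctypes.c_int64)),
        ("strides", ctypes.POINTER(ctypes.c_int64)),
        ("byte_offset", ctypes.c_uint64),
    ]


class DLManagedTensor(ctypes.Structure):
    pass  # fields are set below because the deleter refers to this struct


# void (*deleter)(struct DLManagedTensor *self)
DLManagedTensorDeleter = ctypes.CFUNCTYPE(None, ctypes.POINTER(DLManagedTensor))

DLManagedTensor._fields_ = [
    ("dl_tensor", DLTensor),
    ("manager_ctx", ctypes.c_void_p),
    ("deleter", DLManagedTensorDeleter),
]
```

With such a mirror, an address obtained from a capsule could be viewed with `DLManagedTensor.from_address(address)`; whoever owns the struct at that point is the party responsible for invoking `deleter`.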
I would also like to call into question the mandatory nature of renaming the capsule from `"dltensor"` to `"used_dltensor"`. With proper memory management the capsule can be consumed more than once, creating multiple views into the same allocation. I would therefore like to propose changing that statement from a requirement to a recommendation.
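To make the renaming step concrete, here is a small CPython-only sketch that drives the real capsule C API (`PyCapsule_New`, `PyCapsule_IsValid`, `PyCapsule_GetPointer`, `PyCapsule_SetName`) through `ctypes`. The `consume` helper and the `payload` stand-in are hypothetical names for illustration, and no real `DLManagedTensor` or destructor is involved. It shows that once a consumer renames the capsule, a second consumer that checks for `"dltensor"` will refuse it, which is exactly what a rename-free scheme would give up.

```python
import ctypes

api = ctypes.pythonapi  # CPython only

api.PyCapsule_New.restype = ctypes.py_object
api.PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
api.PyCapsule_IsValid.restype = ctypes.c_int
api.PyCapsule_IsValid.argtypes = [ctypes.py_object, ctypes.c_char_p]
api.PyCapsule_GetPointer.restype = ctypes.c_void_p
api.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]
api.PyCapsule_SetName.restype = ctypes.c_int
api.PyCapsule_SetName.argtypes = [ctypes.py_object, ctypes.c_char_p]

# A capsule stores the name pointer without copying it, so these bytes
# objects must outlive every capsule that refers to them.
DLTENSOR = b"dltensor"
USED_DLTENSOR = b"used_dltensor"

# Hypothetical stand-in for a DLManagedTensor: any non-NULL address works here.
payload = ctypes.c_int64(42)

# No destructor (NULL) is installed in this sketch.
capsule = api.PyCapsule_New(ctypes.addressof(payload), DLTENSOR, None)


def consume(capsule):
    """Return the wrapped address and mark the capsule as consumed."""
    if not api.PyCapsule_IsValid(capsule, DLTENSOR):
        raise ValueError("not an unconsumed 'dltensor' capsule")
    address = api.PyCapsule_GetPointer(capsule, DLTENSOR)
    api.PyCapsule_SetName(capsule, USED_DLTENSOR)
    return address


first = consume(capsule)

try:
    consume(capsule)
    second_attempt_failed = False
except ValueError:
    second_attempt_failed = True
```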
Top GitHub Comments
That may be OK, but you have to be very clear about the fact that a producer must produce a fresh capsule every time, because consumers may rename the capsule and “steal” ownership of the `DLManagedTensor`! And there can only be one owner for it (either the capsule or the importer). Currently, everyone renames the capsule and this is not an issue (there is no reason to forget it). You may also have to be clear that this means it is only OK for an import function that calls `__dlpack__` explicitly (just in case).

With the above exception, what should be added is something like: The consumer must take ownership of the `DLManagedTensor` by renaming the capsule to “used_dltensor” (note that there can only be a single owner of the `DLManagedTensor`; renaming the capsule transfers that ownership from the capsule to your object). After taking ownership by renaming the capsule, the consumer must call the deleter function when it does not need the memory anymore.

I agree they do, but getting the mix right is a bit tricky, although probably possible.
It may help to split it out into what you have to do as a producer and as a consumer.
The producer must always export a new `DLManagedTensor` and a new capsule, because the consumer may rename the capsule and steal ownership. (To me, violating this seems tempting if you do not use the renaming option for consumers.)

Also, a producer must set up the capsule so that it calls `DLManagedTensor.deleter` if and only if its name is `dltensor`.

The consumer must either:

… `__dlpack__()`. (The “only” could be violated in certain cases.)

However, while I agree that the second thing should work and is simpler, it is tricky!
For example, if, say, `pytorch` accepts a bare capsule, and you expose `myobj.base` to be the capsule, then `pytorch.from_dlpack(myobj.base)` would steal your ownership!

And because I think it is legal to import from the capsule directly (as some libraries may well be doing), you would have to add another requirement: if you do not rename the capsule, you must ensure that the capsule is not available through public API. And that last part is tricky…

Of course you could also hope that nobody does `pytorch.from_dlpack(myobj.base)` or that nobody allows importing the capsules explicitly.
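The ownership rules discussed in this thread can be modelled in plain Python. This is only a toy model under stated assumptions: `ManagedTensor`, `Capsule`, `producer_destructor`, `export` and `Importer` are hypothetical stand-ins, not any library's API, and `Capsule.destroy()` stands in for the capsule being garbage collected. The point is that the deleter runs exactly once whether or not the capsule was consumed, and that a renamed capsule cannot be imported a second time.

```python
class ManagedTensor:
    """Toy stand-in for DLManagedTensor that counts deleter calls."""

    def __init__(self):
        self.deleter_calls = 0

    def deleter(self):
        self.deleter_calls += 1


class Capsule:
    """Toy stand-in for a PyCapsule: a name, a payload and a destructor."""

    def __init__(self, tensor, destructor):
        self.name = "dltensor"
        self.tensor = tensor
        self._destructor = destructor

    def destroy(self):
        # Stands in for the capsule being garbage collected.
        self._destructor(self)


def producer_destructor(capsule):
    # Producer rule: call the deleter if and only if the name is unchanged,
    # i.e. no consumer took ownership.
    if capsule.name == "dltensor":
        capsule.tensor.deleter()


def export():
    # Producer rule: a fresh tensor and a fresh capsule on every export.
    return Capsule(ManagedTensor(), producer_destructor)


class Importer:
    """Consumer that takes ownership by renaming the capsule."""

    def __init__(self, capsule):
        if capsule.name != "dltensor":
            raise ValueError("capsule was already consumed")
        capsule.name = "used_dltensor"
        self._tensor = capsule.tensor

    def release(self):
        # The consumer owns the tensor now, so it calls the deleter.
        self._tensor.deleter()


# Case 1: the capsule is never consumed, so its destructor frees the tensor.
unused = export()
unused_tensor = unused.tensor
unused.destroy()

# Case 2: the capsule is consumed, so its destructor must not free the tensor.
used = export()
used_tensor = used.tensor
importer = Importer(used)
used.destroy()
calls_after_capsule_destroyed = used_tensor.deleter_calls
importer.release()

# Case 3: a second import of the same capsule is rejected.
try:
    Importer(used)
    second_import_rejected = False
except ValueError:
    second_import_rejected = True
```

In this model the hazard described above corresponds to a consumer that skips the rename while the capsule stays reachable: any other importer that does rename would then take ownership away from it.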