From the ATen docs (https://pytorch.org/cppdocs/notes/tensor_basics.html#efficient-access-to-tensor-elements):

> When using Tensor-wide operations, the relative cost of dynamic dispatch is very small. However, there are cases, especially in your own kernels, where efficient element-wise access is needed, and the cost of dynamic dispatch inside the element-wise loop is very high. ATen provides accessors that are created with a single dynamic check that a Tensor is the type and number of dimensions. Accessors then expose an API for accessing the Tensor elements efficiently.
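For reference, the linked docs page illustrates the pattern with a trace computation over a 2-D float tensor; a sketch along those lines:

```cpp
#include <torch/torch.h>

float trace_of(const torch::Tensor& foo) {
  // One dynamic check here: asserts foo is 2-dimensional and holds floats.
  // After this, indexing is plain strided pointer arithmetic, no dispatch.
  auto foo_a = foo.accessor<float, 2>();
  float trace = 0;
  for (int64_t i = 0; i < foo_a.size(0); i++) {
    trace += foo_a[i][i];
  }
  return trace;
}
```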
We should use these accessors whenever possible, typically when we do stuff like `output.ptsSeconds[f] = singleOut.ptsSeconds`.
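A minimal sketch of what that could look like for us, assuming `output.ptsSeconds` is a 1-D float64 tensor (the `FrameOutput` / `FrameBatchOutput` structs below are hypothetical stand-ins, only the field names come from the snippet above):

```cpp
#include <torch/torch.h>
#include <vector>

// Hypothetical stand-ins for the actual torchcodec structs.
struct FrameOutput { double ptsSeconds; };
struct FrameBatchOutput { torch::Tensor ptsSeconds; };

void fillPts(FrameBatchOutput& output, const std::vector<FrameOutput>& frames) {
  // One dynamic check here (dtype == double, ndim == 1), instead of a
  // dispatcher round-trip per frame inside the loop.
  auto ptsAccessor = output.ptsSeconds.accessor<double, 1>();
  for (int64_t f = 0; f < static_cast<int64_t>(frames.size()); ++f) {
    ptsAccessor[f] = frames[f].ptsSeconds;  // plain strided write, no dispatch
  }
}
```

The point is just that the accessor is created once, outside the loop; each `ptsAccessor[f]` is then a pointer computation rather than a dispatched tensor indexing op.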
I don't think it's going to lead to some crazy speedup, because we're usually not decoding enough frames for the dispatcher cost to be visible (I suspect).
On previous work (torchvision image decoders, the torch.nn.interpolate C++ implementation), I did observe that using accessors instead of plain indexing led to crazy speedups (but those loops ran over every single pixel value, and pixel counts are orders of magnitude larger than frame counts).
Anyway, we should probably still do it, since it's good practice.