Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
operator_propertiesoperator_properties
miscco
left a comment
There was a problem hiding this comment.
I believe all of the assumptions here are only valid for arithmetic types
bernhardmgruber
left a comment
There was a problem hiding this comment.
Please merge this new functionality with the existing cub::detail::identity_v
operator_properties
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| template <class _Op, class _Tp, class Enable = void> | ||
| struct absorbing_element | ||
| {}; |
There was a problem hiding this comment.
Strong suggestion: I think we should really aim to make the APIs for absorbing and identity element variable templates. The API would look and feel just a lot better if I could write cuda::identity_element<cuda::maximum, int>. Then the query and the template to specialize for any user provided type would also be the same entity.
For the non-existing case, you could just add a marker type:
struct __no_element{};
template <class _Op, class _Tp, class Enable = void>
inline constexpr auto absorbing_element = __no_element;
... // all your specializations
template <class _Op, class _Tp>
inline constexpr bool has_absorbing_element_v = !::cuda::std::is_same_v<absorbing_element<_Op, _Tp>, __no_element>;There was a problem hiding this comment.
I would like to second that.
We already use a constexpr function for is_associative so we should do the same, that would always leave the inline variable defined and we can provide a clear error message
There was a problem hiding this comment.
I would prefer constexpr template variable too. The problem is that they are not supported on Windows for all types that have conversion operator, like __half. Please see the internal discussion with the compiler team.
There was a problem hiding this comment.
I tried a middle ground. Use inline constexpr for all traits, except for __half and __nv_bfloat16
| template <class _Op, class _Tp, class Enable = void> | ||
| struct absorbing_element | ||
| {}; |
There was a problem hiding this comment.
I would like to second that.
We already use a constexpr function for is_associative so we should do the same, that would always leave the inline variable defined and we can provide a clear error message
This comment has been minimized.
This comment has been minimized.
| template <class _Tp> | ||
| _CCCL_GLOBAL_CONSTANT auto | ||
| __absorbing_element_v<::cuda::maximum<>, | ||
| _Tp, | ||
| ::cuda::std::enable_if_t<::cuda::std::__is_cv_extended_floating_point_v<_Tp>>> = | ||
| __absorbing_element_v<::cuda::maximum<_Tp>, _Tp>; |
There was a problem hiding this comment.
Looking at this more, I am strongly considering to move the values to functions
That would make sure that they are available even for __half and friends and also allow us to avoid storing them somewhere
There was a problem hiding this comment.
Remark: As a user, I don't care how it's implemented, but I would like to spell the public API as a variable template: cuda::absorbing_element_v<Op, T>.
There was a problem hiding this comment.
we all agree that variable template is the best solution but it doesn't work in multiple cases, for example __half.
I'm also considering contexpr functions because it solves all issues
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
/ok to test d587bca |
1 similar comment
|
/ok to test d587bca |
This comment has been minimized.
This comment has been minimized.
🥳 CI Workflow Results🟩 Finished in 16h 05m: Pass: 100%/148 | Total: 9d 18h | Max: 5h 22m | Hits: 24%/318269See results here. |
Description
Address #7173
The PR provides a standardized way to query various properties of operators. For example,
cuda::std::plusis associative withunsignedbut not forfloat.