Why use floor divide in shape_div? #1770
Q: I am using the composition() function and encountered an unexpected result. My input was: lhs = (_256,(_32,_4)):(_32,(_1,_8192)). However, the result was (200,(4,3)):(_32,(_8,_8192)), whereas I expected (200,(4,4)):(_32,(_8,_8192)). I believe the issue arises because the domain<1> of rhs is 13 instead of 12. I also found that shape_div(), which is used in composition_impl(), uses floor division rather than ceiling division. Can you help clarify this behavior? Thanks
Replies: 1 comment 3 replies
Good catch. At the moment, this is considered a violation of the "divisibility condition" mentioned in the documentation. The static version fails. That said, I agree that the divisibility condition is actually too tight in cases like these and the above SHOULD work -- this is a known class of bugs. Fortunately, we have not found any applications that need this generalization but, unfortunately, supporting it is a bit more complex than simply rounding up. In the near future, I plan to release a much more formal treatment of CuTe in a whitepaper along with some non-critical code updates and generalizations like this one.
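To make the floor-versus-ceiling distinction concrete, here is a minimal Python sketch. It is not CuTe's `shape_div`; the helper names (`floor_div`, `ceil_div`, `divide_exact`) are hypothetical, and the values 13 and 4 are chosen only to echo the numbers in the question. The last helper shows one way a strict divisibility condition could be enforced instead of silently rounding.

```python
def floor_div(a: int, b: int) -> int:
    """Integer division rounding toward negative infinity."""
    return a // b


def ceil_div(a: int, b: int) -> int:
    """Integer division rounding toward positive infinity."""
    return -(-a // b)


def divide_exact(extent: int, divisor: int) -> int:
    """Hypothetical strict divide: reject inputs that do not divide evenly.

    This illustrates a divisibility condition in general terms; it is an
    assumption for this sketch, not the library's implementation.
    """
    if extent % divisor != 0:
        raise ValueError(f"{divisor} does not divide {extent}")
    return extent // divisor


print(floor_div(13, 4), ceil_div(13, 4))  # 3 4
print(divide_exact(12, 4))                # 3
```

When the divisor divides the extent evenly, the two roundings agree, which is why the choice between them only matters for inputs that violate the divisibility condition.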
After a cup of coffee on an actual workday, I realize that this particular `composition` case should fail and the current behavior is correct. One post-condition of `composition` is that the result is `compatible` with the rhs input, so `composition` can never perform any rounding at all. Because there is no possible output that satisfies all of the post-conditions of `composition`, it should fail on these inputs (perhaps with better runtime assertions, of course).

There is a related set of known artificial limitations around `composition` and `logical_divide` that can be loosened, but this problem is not an example of them.
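The post-condition argument above can be sketched in a few lines of Python. This is an illustrative model, not CuTe code: `size` and `is_compatible` are hypothetical helpers that follow the documented idea that an integer extent is compatible with any shape of the same size, while tuples must match rank and be compatible mode by mode. The extent 13 and the argument order (rhs shape first, result shape second) are assumptions made for the example.

```python
from math import prod


def size(shape) -> int:
    """Total number of elements in a (possibly nested) shape."""
    if isinstance(shape, tuple):
        return prod(size(s) for s in shape)
    return shape


def is_compatible(a, b) -> bool:
    """Sketch of shape compatibility: a may be refined into b.

    An integer a is compatible with b when a equals size(b); a tuple a
    needs a tuple b of the same rank, compatible mode by mode.
    """
    if isinstance(a, tuple):
        return (
            isinstance(b, tuple)
            and len(a) == len(b)
            and all(is_compatible(x, y) for x, y in zip(a, b))
        )
    return a == size(b)


# A mode of extent 13 cannot be refined into (4, 3) or (4, 4):
print(is_compatible(13, (4, 3)))  # False (size 12)
print(is_compatible(13, (4, 4)))  # False (size 16)
# When sizes match mode by mode, the refinement is compatible:
print(is_compatible((200, 12), (200, (4, 3))))  # True
```

Under these assumptions, rounding down yields a mode of size 12 and rounding up yields one of size 16, so neither choice preserves an extent of 13. That is the sense in which no rounded output can satisfy the post-condition.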