fix: trim padding_out in conv_transpose x-backward (#4845)#4916
Conversation
Signed-off-by: SAY-5 <say.apm35@gmail.com>
laggui
left a comment
There was a problem hiding this comment.
Thanks for addressing the issue!
| #[test] | ||
| fn test_conv_transpose1d_padding_out_stride1_backward_shape() { |
There was a problem hiding this comment.
The test should also validate the correctness, not just the shape. I think the current fix is incorrect (see my other comment).
You can use pytorch as a reference to compute the reference values.
Codecov Report✅ All modified and coverable lines are covered by tests. ❌ Your project check has failed because the head coverage (65.36%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #4916 +/- ##
==========================================
+ Coverage 65.34% 65.36% +0.01%
==========================================
Files 1170 1170
Lines 174264 174982 +718
==========================================
+ Hits 113869 114371 +502
- Misses 60395 60611 +216 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@laggui thanks for the close look — I tried implementing the "slice padding_out off output_grad before the inverse conv" approach and it produces correct values for the stride=1 / dilation=1 case the issue reports, but breaks A few directions I considered before pinging:
For the correctness test, I planned to add cases mirroring What's the intended shape of the fix here? Specifically: should the inverse conv1d use a different padding spec when padding_out > 0, or is there a different primitive (e.g. |
|
@SAY-5 Ahhhh thanks for following up 🙏 my initial assessment was wrong for the conv transpose backward. The slice actually needs to happen after the conv as you initially suggested. The After looking into it I made a small change to update the way the output size is computed based on the options as it can be derived to match the input size. Should only be missing a correctness test with reference values. We're planning to release 0.21 today so I might merge your fix first and the test could be added in a follow-up PR. |
Pull Request Template
Checklist
cargo run-checkscommand has been executed.Related Issues/PRs
Fixes #4845.
Changes
conv_transpose{1,2,3}d_x_backwardran the inverse conv on the fulloutput_grad, but the trailingpadding_outcells of a transpose-conv output do not depend on any input cell. The inverse conv therefore produced a result longer than the original input byfloor(padding_out / stride)on each spatial axis, which then panicked when the gradient was registered againstx.Slice off the trailing cells after the inverse conv to recover the original input shape. Weight and bias gradients are unaffected (weight grad already slices to
weight_shape; bias grad sums over spatial dims).Testing
cargo test -p burn-backend-tests --test autodiff conv(98 conv autodiff tests pass, including the new regression test)cargo test -p burn-backend-tests --test tensor conv_transpose(19 tests pass)test_conv_transpose1d_padding_out_stride1_backward_shapemirroring the issue reproducer (stride=1, padding_out=1).