Make FasterRCNN_mobilenet exportable via torch.export with dynamic shapes #9395
mergennachin wants to merge 5 commits into main
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9395
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit c713c38 with merge base d63f7ed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
torchvision/ops/boxes.py (Outdated)
# In eager mode, use the original performance-based dispatch.
# In export (is_compiling), always use coordinate_trick to avoid
# data-dependent branching on boxes.numel().
if not torch.compiler.is_compiling() and boxes.numel() > (4000 if boxes.device.type == "cpu" else 100_000) and not torchvision._is_tracing():
Is `_batched_nms_vanilla` not compile-friendly, or are we just avoiding the branches here?
If it is just to avoid a DDE on the branches, I would rather we do:

x = 4000 if boxes.device.type == "cpu" else 100_000
if not torch.compiler.is_compiling() and not torchvision._is_tracing() and guard_or_false(boxes.numel() > x):
    ...
`_batched_nms_vanilla` is not compile-friendly: it has `for class_id in torch.unique(idxs)`, which is a data-dependent loop. So we're avoiding both the DDE on the branch condition and the untraceable vanilla implementation.
Partially fixes pytorch/pytorch#146152

Fix data-dependent branches, symbolic shape issues, and unbacked symbol handling so that fasterrcnn_mobilenet_v3_large_fpn can be exported with dynamic H/W dimensions using torch.export.export() in both strict=False and strict=True modes. All changes preserve eager and JIT behavior exactly by gating export-specific code behind torch.jit.is_scripting() and torch.compiler.is_compiling() guards.

Source changes:
- boxes.py: batched_nms dispatch and _batched_nms_coordinate_trick empty check made export-safe via guard_or_false
- transform.py: batch_images uses symbolic-safe integer arithmetic instead of float ceil
- _utils.py: BoxCoder.decode uses guard_or_true and explicit reshape dims to avoid data-dependent guards and -1 ambiguity
- poolers.py: len(rois) -> rois.shape[0] to preserve symbolic ints
- _meta_registrations.py: NMS fake kernel uses new_dynamic_size(min=0) to allow empty NMS results at runtime

Tests: 32 new export tests covering NMS, batched_nms, MultiScaleRoIAlign, BoxCoder.decode, batch_images, and full model export with dynamic shapes, edge cases, and both strict modes.
- transform.py: remove unused `import math` (batch_images no longer uses math.ceil)
- boxes.py: wrap long line per black formatting
- test files: apply black formatting (line wrapping, trailing commas)
- Use _skip_resize=True in export tests to bypass resize guards
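The "symbolic-safe integer arithmetic instead of float ceil" change can be illustrated in isolation (the helper names below are mine, not torchvision's; `batch_images` rounds each spatial size up to a multiple of `size_divisible`, 32 by default):

```python
import math


def ceil_to_multiple_float(x: int, divisor: int) -> int:
    # Float-based ceil: x / divisor forces a concrete Python float, which
    # breaks symbolic ints under torch.export.
    return int(math.ceil(x / divisor) * divisor)


def ceil_to_multiple_int(x: int, divisor: int) -> int:
    # Pure integer ceil-division: equivalent result, but stays symbolic.
    return ((x + divisor - 1) // divisor) * divisor


# The two agree for all positive sizes.
for x in range(1, 1000):
    assert ceil_to_multiple_int(x, 32) == ceil_to_multiple_float(x, 32)
```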
if box_sum > 0:
    pred_boxes = pred_boxes.reshape(box_sum, -1, 4)
else:
if guard_or_true(box_sum > 0):
You could have stronger asserts for this function, right? Assert that the list of boxes is always greater than 0 (why would someone call decode on an empty list?) and that every image in the list has a box.
In general, why export this one over decode_single, though?
You can also have the assertions only on the unbacked path; those will translate to runtime assertions if not resolved at compile time. Something like:

if guard_or_false(box_sum == 0):
    ...
else:
    torch._check(box_sum > 0, lambda: "box_sum assumed to be > 0 due to unbacked semantics")
# Concat a zero sentinel so .max() is safe when boxes is empty
# (export traces the non-empty path, but at runtime boxes can be empty)
max_coordinate = torch.cat(
    [boxes.reshape(-1), torch.zeros(1, dtype=boxes.dtype, device=boxes.device)]
This doesn't seem equivalent to me. Does the downstream decoder handle a `[0]` tensor the same as `[]`?
It works for empty; let me add a unit test.
Yes, the downstream handles it correctly. Updated the comment to make this clearer. Also added test_export_zero_detections_structure
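To back up the "works for empty" claim, a small standalone check (a sketch in the spirit of the added unit test, not the PR's actual test) shows why the sentinel is needed and why it is harmless: plain `.max()` raises on an empty tensor, while the sentinel version returns 0, so the per-class offset `idxs * (max_coordinate + 1)` degenerates to a no-op when there are zero boxes.

```python
import torch

boxes = torch.empty(0, 4)  # zero detections at runtime

# Without the sentinel, reducing an empty tensor raises a RuntimeError.
raised = False
try:
    boxes.reshape(-1).max()
except RuntimeError:
    raised = True
assert raised

# With the zero sentinel, max() is well defined and returns 0, so the
# downstream class offsets are computed from 0 and nothing is shifted.
max_coordinate = torch.cat(
    [boxes.reshape(-1), torch.zeros(1, dtype=boxes.dtype, device=boxes.device)]
).max()
assert max_coordinate.item() == 0.0
```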
@@ -173,11 +174,23 @@ def decode(self, rel_codes: Tensor, boxes: list[Tensor]) -> Tensor:
    box_sum = 0
    for val in boxes_per_image:
IIUC, your change to the detection op will make this see a tensor of size 1 with value 0 here instead of 0, so the asserts below are safe, but it'll return a garbage prediction box, right?
- Move _skip_resize check before the training/eval branch so it works in eval mode (was only checked inside `if self.training`)
- Replace slice-copy in batch_images with F.pad to avoid shape guards about whether padding is needed
- Remove guard_or_true from BoxCoder.decode; unconditional reshape with explicit dims works for all box_sum including 0
- Constrain dynamic shapes to multiples of 64 for FPN compatibility (strided convolutions specialize on ceil_to_32(dim)/32 parity)
- Add test_export_zero_detections_structure verifying empty output shapes match between eager and export
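The F.pad replacement can be sketched like this for a single image (the helper name `pad_to_sketch` is mine; torchvision's actual `batch_images` pads and stacks a whole list). The point is that `F.pad` accepts pad amounts of 0, so export never has to branch on whether padding is needed:

```python
import torch
import torch.nn.functional as F


def pad_to_sketch(img: torch.Tensor, out_h: int, out_w: int) -> torch.Tensor:
    # Slice-copy style (before): allocate zeros and copy the image into a
    # corner slice, which makes export guard on whether the slice covers
    # the whole buffer. F.pad style (after): a single op, and pad amounts
    # of 0 are valid, so no data-dependent branching is traced.
    h, w = img.shape[-2:]
    return F.pad(img, (0, out_w - w, 0, out_h - h))  # (left, right, top, bottom)


img = torch.rand(3, 5, 7)
padded = pad_to_sketch(img, 8, 8)
```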