Make FasterRCNN_mobilenet exportable via torch.export with dynamic shapes#9395

Open
mergennachin wants to merge 5 commits into main from fix_export
Conversation

@mergennachin

Partially fixes pytorch/pytorch#146152

Fix data-dependent branches, symbolic shape issues, and unbacked
symbol handling so that fasterrcnn_mobilenet_v3_large_fpn can be
exported with dynamic H/W dimensions using torch.export.export()
in both strict=False and strict=True modes.

All changes preserve eager and JIT behavior exactly by gating
export-specific code behind torch.jit.is_scripting() and
torch.compiler.is_compiling() guards.

Source changes:

  • boxes.py: batched_nms dispatch and _batched_nms_coordinate_trick
    empty check made export-safe via guard_or_false
  • transform.py: batch_images uses symbolic-safe integer arithmetic
    instead of float ceil
  • _utils.py: BoxCoder.decode uses guard_or_true and explicit reshape
    dims to avoid data-dependent guards and -1 ambiguity
  • poolers.py: len(rois) -> rois.shape[0] to preserve symbolic ints
  • _meta_registrations.py: NMS fake kernel uses new_dynamic_size(min=0)
    to allow empty NMS results at runtime

Tests: 32 new export tests covering NMS, batched_nms,
MultiScaleRoIAlign, BoxCoder.decode, batch_images, and full model
export with dynamic shapes, edge cases, and both strict modes.
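The "symbolic-safe integer arithmetic instead of float ceil" change described for batch_images can be sketched in plain Python (the standalone helper and its name `round_up` are illustrative, not the torchvision code): rounding a dimension up to a multiple of `size_divisible` with pure integer ops avoids the float `ceil` round-trip that would force a guard on a symbolic int.

```python
def round_up(x: int, divisor: int) -> int:
    # Integer equivalent of int(math.ceil(x / divisor)) * divisor, but with
    # no float round-trip, so a symbolic int traces through without guards.
    return (x + divisor - 1) // divisor * divisor
```

For example, `round_up(33, 32)` is 64 and `round_up(64, 32)` stays 64, matching the float-ceil behavior for all non-negative inputs.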

@pytorch-bot

pytorch-bot bot commented Feb 19, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9395

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit c713c38 with merge base d63f7ed (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the cla signed label Feb 19, 2026
@mergennachin mergennachin changed the title Make FasterRCNN exportable via torch.export with dynamic shapes Make FasterRCNN_mobilenet exportable via torch.export with dynamic shapes Feb 19, 2026
# In eager mode, use the original performance-based dispatch.
# In export (is_compiling), always use coordinate_trick to avoid
# data-dependent branching on boxes.numel().
if not torch.compiler.is_compiling() and boxes.numel() > (4000 if boxes.device.type == "cpu" else 100_000) and not torchvision._is_tracing():

@laithsakka laithsakka Feb 19, 2026

Is _batched_nms_vanilla not compile-friendly, or are we just avoiding the branches here?
If it's just to avoid DDE on the branches, I would rather that we do

x = 4000 if boxes.device.type == "cpu" else 100_000
if not torch.compiler.is_compiling() and not torchvision._is_tracing() and guard_or_false(boxes.numel() > x):
    ...

Author

_batched_nms_vanilla is not compile-friendly. It has `for class_id in torch.unique(idxs)`, which is a data-dependent loop. So we're avoiding both the DDE on the branch condition AND the untraceable vanilla implementation.
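The coordinate trick the dispatch falls back to can be sketched without torch (a toy model, not the torchvision implementation): offset every box by `class_id * (max_coordinate + 1)` so boxes of different classes can never overlap, letting a single class-agnostic NMS pass replace the data-dependent per-class loop.

```python
def offset_boxes(boxes, class_ids):
    # boxes: list of [x1, y1, x2, y2]; class_ids: one int per box.
    # After offsetting, boxes of different classes live in disjoint
    # coordinate ranges, so one NMS call cannot suppress across classes.
    max_coord = max((c for box in boxes for c in box), default=0.0)
    return [
        [c + cls * (max_coord + 1.0) for c in box]
        for box, cls in zip(boxes, class_ids)
    ]
```

Two identical boxes with different class ids end up far apart, so neither suppresses the other.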

- transform.py: remove unused `import math` (batch_images no longer
  uses math.ceil)
- boxes.py: wrap long line per black formatting
- test files: apply black formatting (line wrapping, trailing commas)
- Use _skip_resize=True in export tests to bypass resize guards
-        if box_sum > 0:
+        if guard_or_true(box_sum > 0):
             pred_boxes = pred_boxes.reshape(box_sum, -1, 4)
         else:


You could have stronger asserts for this function, right? Assert that the list of boxes is always greater than 0 (why would someone call decode on an empty list?) and that every image in the list has a box.

In general, why export this one over decode_single, though?

Author
The reason will be displayed to describe this comment to others. Learn more.

Yes, good point


You can also have the assertions only on the unbacked path; those will translate to runtime assertions if not resolved at compile time.

Something like

if guard_or_false(box_sum == 0):
    ..
else:
    torch._check(box_sum > 0, "box_sum assumed to be >0 due to unbacked semantics")
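A toy model of this pattern (the names mirror the torch APIs, but this is illustrative pure Python, not the real symbolic-shapes machinery): at trace time an unbacked condition is assumed False by `guard_or_false`, and the check on the other path becomes a runtime assertion.

```python
UNKNOWN = object()  # stands in for an unbacked symbolic condition

def guard_or_false(cond):
    # Sketch of the real semantics: use the condition if it is statically
    # known to be True; otherwise (False or unbacked) assume False.
    return cond is True

def decode_sketch(is_empty):
    if guard_or_false(is_empty):
        return "empty path"
    # Models torch._check(box_sum > 0, ...): becomes a runtime assertion
    # when the condition was not resolved at compile time.
    assert is_empty is not True, "box_sum assumed to be >0"
    return "reshape path"
```

An unbacked condition (`UNKNOWN`) traces down the non-empty path, with the assertion guarding it at runtime.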

# Concat a zero sentinel so .max() is safe when boxes is empty
# (export traces the non-empty path, but at runtime boxes can be empty)
max_coordinate = torch.cat(
    [boxes.reshape(-1), torch.zeros(1, dtype=boxes.dtype, device=boxes.device)]
).max()


This doesn't seem equivalent to me. Does the downstream decoder handle a [0] tensor the same as []?

Author

It works for empty, let me add a unit test

Author

Yes, the downstream handles it correctly. Updated the comment to make this clearer. Also added test_export_zero_detections_structure
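The sentinel's effect is easy to see in a torch-free sketch (illustrative only, not the torchvision code): appending a single zero keeps `max()` well-defined when the box list is empty, and since box coordinates here are non-negative, the zero never changes the result when boxes exist.

```python
def safe_max_coordinate(flat_coords):
    # Mirrors torch.cat([boxes.reshape(-1), zeros(1)]).max(): the zero
    # sentinel keeps max() defined on an empty input, and for non-negative
    # coordinates it never alters the maximum of a non-empty input.
    return max(list(flat_coords) + [0.0])
```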

@@ -173,11 +174,23 @@ def decode(self, rel_codes: Tensor, boxes: list[Tensor]) -> Tensor:
box_sum = 0
for val in boxes_per_image:


IIUC, your change to the detection op will make this see a tensor of size 1 with value 0 here instead of 0, so the asserts below are safe, but it'll return a garbage prediction box, right?

mergennachin and others added 2 commits February 19, 2026 12:37
- Move _skip_resize check before training/eval branch so it works in
  eval mode (was only checked inside `if self.training`)
- Replace slice-copy in batch_images with F.pad to avoid shape guards
  about whether padding is needed
- Remove guard_or_true from BoxCoder.decode; unconditional reshape with
  explicit dims works for all box_sum including 0
- Constrain dynamic shapes to multiples of 64 for FPN compatibility
  (strided convolutions specialize on ceil_to_32(dim)/32 parity)
- Add test_export_zero_detections_structure verifying empty output
  shapes match between eager and export
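The F.pad change in batch_images can be sketched as follows (the helper and its exact shape are illustrative): instead of allocating a zero canvas and slice-copying each image in, which raises a "does this image need padding?" shape guard, compute per-side pad amounts and pad unconditionally, since padding by zero is a no-op.

```python
def pad_amounts(h, w, target_h, target_w):
    # Right/bottom padding only, in F.pad's (left, right, top, bottom)
    # order for the last two dims. Zero amounts are harmless no-ops, so
    # no branch on "needs padding" is required.
    return (0, target_w - w, 0, target_h - h)
```

Every image gets padded, even the largest one in the batch (by zero), so the traced graph has a single unconditional path.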


Successfully merging this pull request may close these issues.

[torch.export] Cannot export TorchVision fasterrcnn_mobilenet_v3_large_fpn
