Unified collision pipeline performance improvements #1355

nvtw · 2026-01-14T15:43:39Z

Description

Unified collision pipeline performance improvements

Newton Migration Guide

Please ensure the migration guide for warp.sim users is up-to-date with the changes made in this PR.

The migration guide in docs/migration.rst is up-to date

Before your PR is "Ready for review"

Necessary tests have been added and new examples are tested (see newton/tests/test_examples.py)
Documentation is up-to-date
Code passes formatting and linting checks with pre-commit run -a

Summary by CodeRabbit

New Features
- New benchmarking suite for broad‑phase and pipeline comparisons.
- Option to enable graph‑friendly contact debug info; viewer shows rigid/soft contact counts.
- Ellipsoid shape SDF support and multi‑contact capsule results exposed.
Performance Improvements
- Reduced contention in broad‑phase writes and tiled kernel launches.
- New narrow‑phase primitive path and margin‑based gating to avoid spurious contacts.
Tests
- Updated collision tests to validate multi‑contact capsule outputs.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

…for simple shapes like spheres or capsules

…hes in the scene

coderabbitai · 2026-01-14T15:43:53Z

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Walkthrough

Adds ASV benchmarks, a block-coordinated broad-phase write path and fast lower-triangular index, a staged primitive-first narrow-phase, multi-contact capsule returns, margin-based contact gating, graph-friendly contact debug counters and viewer plumbing, plus related example and test updates.

Changes

Cohort / File(s)	Summary
ASV benchmarks `asv/benchmarks/collision/benchmark_broad_phase.py`, `asv/benchmarks/collision/benchmark_collision_pipeline_comparison.py`, `asv/benchmarks/collision/benchmark_collision_pipeline_unified.py`	New benchmark modules: scene builders, NXN/SAP/explicit comparisons, CUDA-graph-aware timing, warmup/skip logic, and candidate/contact tracking.
Broad-phase core `newton/_src/geometry/broad_phase_common.py`, `newton/_src/geometry/broad_phase_nxn.py`	Add `write_pair_thread_block` (block-coordinated tile-scan writer) and a closed-form `_get_lower_triangular_indices_fast`; NXN kernel rewritten to tiled launch and uses block write helper.
Collision primitives & tests `newton/_src/geometry/collision_primitive.py`, `newton/tests/test_collision_primitives.py`	`collide_capsule_capsule` now returns up to 2 contacts (wp.vec2 distances, 2x3 positions); tests updated to consume multi-contact outputs and validate first contact where applicable.
Contact validation & kernels `newton/_src/geometry/contact_data.py`, `newton/_src/geometry/kernels.py`	Add `contact_passes_margin_check`; add ellipsoid SDF and gradient and integrate ellipsoid soft-contact path.
Contact reduction & narrow-phase refactor `newton/_src/geometry/contact_reduction_global.py`, `newton/_src/geometry/narrow_phase.py`	Early-exit fast-paths and duplicate-export optimization; major narrow-phase refactor introducing a primitive-only analytic kernel, staged routing (primitive → GJK/MPR → mesh), new buffers/counters, tile-based writes, and `has_meshes` gating.
Unified & standard collision pipeline `newton/_src/sim/collide.py`, `newton/_src/sim/collide_unified.py`, `newton/_src/sim/contacts.py`	Add `enable_contact_debug_info` to pipelines; unify contact write via `write_contact_no_margin_check`/`write_contact` with margin gating; `Contacts` gains `ContactDebugInfo` (pinned host counters), `clear` semantics changed, and debug-info update path for graph capture.
Viewer, examples, and example usage `newton/_src/viewer/viewer_gl.py`, `newton/examples/__init__.py`, `newton/examples/cable/example_cable_pile.py`	Viewer can display debug contact counts via `set_contacts`; `create_collision_pipeline` gains `enable_contact_debug_info`; cable example uses explicit pipeline and per-shape margin.
Tests & minor updates `newton/tests/test_collision_pipeline.py`	Adjusted SDF-vs-SDF tolerance and minor test expectation updates.

Sequence Diagram(s)

sequenceDiagram
  participant Scene as SceneBuilder / Model
  participant Broad as BroadPhase (NXN/SAP/Explicit)
  participant Narrow as NarrowPhase (Primitive → GJK/MPR → Mesh)
  participant Contacts as Contacts/Writer
  participant Debug as ContactDebugInfo / Viewer

  Scene->>Broad: build AABBs & emit candidate pairs
  Broad-->>Narrow: candidate pair buffers
  Note over Narrow: Stage 1: primitive analytic kernel\nroutes 0/1/2 contacts or buffers leftovers
  Narrow->>Narrow: route leftover pairs to GJK/MPR (stage 2)
  Narrow->>Narrow: run mesh-related kernels if has_meshes (stage 3)
  Narrow->>Contacts: write_contact (margin check → write_no_margin_check)
  Contacts->>Debug: update_debug_info (if enabled)
  Debug->>Viewer: expose counts for display

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Add particle collision support to unified pipeline #1328: Modifies unified pipeline collide/write paths — closely overlaps collide_unified integration here.
Contact margin improvements #1314: Changes margin combination and contact gating — directly related to the margin-based write gating added.
Broad phase collision API #286: Prior broad-phase NXN work — relates to the fast index and block write extensions in broad_phase_nxn.

Suggested reviewers

eric-heiden
mmacklin
adenzler-nvidia

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Unified collision pipeline performance improvements' accurately reflects the main objectives of the PR, which introduces performance enhancements for the unified collision pipeline across multiple files.
Docstring Coverage	✅ Passed	Docstring coverage is 81.18% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4e6c067 and b2eb056.

📒 Files selected for processing (1)

newton/_src/sim/collide_unified.py

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@newton/_src/geometry/broad_phase_nxn.py`:
- Around line 98-126: The parameter matrix_size is unused in
_get_lower_triangular_indices_fast and the function lacks bounds checking; add
the same validation as in _get_lower_triangular_indices_binary_search by
computing total = matrix_size * (matrix_size - 1) // 2 and raising/asserting (or
returning an error) if index >= total or index < 0 before computing c/r, keeping
the existing closed-form logic (use the function name
_get_lower_triangular_indices_fast and adjust the signature to still accept
matrix_size), or if callers guarantee correctness, remove the matrix_size
parameter and update all call sites to reflect the trimmed signature.

🧹 Nitpick comments (5)

newton/_src/geometry/contact_reduction_global.py (1)
712-721: Efficient duplicate detection with early exit.

The while-loop with early break improves performance by exiting as soon as a duplicate is found, rather than iterating through all previously exported contacts.

One minor nit from static analysis: int(0) on line 716 is redundant since 0 is already an integer. However, if this explicit typing is required for Warp's handling of mutated variables in while loops, it can be left as-is.
Optional: Remove redundant int() cast
-                j = int(0)
+                j = 0
newton/_src/viewer/viewer_gl.py (1)
1298-1305: Good defensive null checks for contact debug display.

The nested conditionals properly guard against missing data. However, the hasattr(self, '_contacts') check on line 1300 is redundant since _contacts is always initialized in __init__. This can be simplified.
Optional: Simplify redundant hasattr check
-                if hasattr(self, '_contacts') and self._contacts is not None and self._contacts.debug_info is not None:
+                if self._contacts is not None and self._contacts.debug_info is not None:
newton/tests/test_collision_primitives.py (1)
844-904: Test assertions correctly updated for multi-contact validation.

The test properly:

Validates the first contact distance against expected values (line 849)

Accesses the first contact position using positions_np[i][0] (line 877)

Uses the first contact's penetration depth for midpoint validation (line 879)

Consider adding test coverage for the second contact in parallel capsule cases. The current test validates only the first contact, but for parallel capsules (test cases 0-5), a second contact is generated. Adding assertions for distances_np[i][1] in parallel cases would improve coverage.
💡 Optional: Add second contact validation for parallel capsules
# After the existing assertions for first contact, add:
# For parallel capsule cases, also validate the second contact
for i in range(len(test_cases)):
    cap1_axis = np.array(test_cases[i][1])
    cap2_axis = np.array(test_cases[i][5])
    axis_alignment = abs(np.dot(cap1_axis, cap2_axis))
    
    # Parallel capsules should have two valid contacts
    if axis_alignment > 0.99:
        self.assertNotEqual(
            distances_np[i][1],
            float("inf"),
            msg=f"Test case {i}: Parallel capsules should have second contact"
        )
newton/_src/sim/collide_unified.py (1)
498-509: Allocation strategy selection logic is clear but could be simplified.

The condition use_model_allocation handles two cases:

rigid_contact_max_per_pair is None - user didn't specify, use model allocation

NXN/SAP modes - prefer model allocation over per-pair heuristics

Setting rigid_contact_max_per_pair = 0 as a signal is a bit implicit. Consider adding a comment explaining this convention.
💡 Optional: Add clarifying comment
         if use_model_allocation:
             rigid_contact_max = model.rigid_contact_max
-            rigid_contact_max_per_pair = 0  # Signal to not use per-pair heuristic
+            rigid_contact_max_per_pair = 0  # Signal __init__ to use rigid_contact_max directly
newton/_src/sim/contacts.py (1)

62-68: Consider clarifying potential staleness in docstring.

The "non-blocking read" is accurate for the host memory access, but the value may be from a previous frame if the async copy hasn't completed. This is typically acceptable for debug UI purposes, but the docstring could mention that values may lag by one frame.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 49a7645 and 6f9c7ac.

📒 Files selected for processing (18)

asv/benchmarks/collision/benchmark_broad_phase.py
asv/benchmarks/collision/benchmark_collision_pipeline_comparison.py
asv/benchmarks/collision/benchmark_collision_pipeline_unified.py
newton/_src/geometry/broad_phase_common.py
newton/_src/geometry/broad_phase_nxn.py
newton/_src/geometry/collision_primitive.py
newton/_src/geometry/contact_data.py
newton/_src/geometry/contact_reduction_global.py
newton/_src/geometry/kernels.py
newton/_src/geometry/narrow_phase.py
newton/_src/sim/collide.py
newton/_src/sim/collide_unified.py
newton/_src/sim/contacts.py
newton/_src/viewer/viewer_gl.py
newton/examples/__init__.py
newton/examples/cable/example_cable_pile.py
newton/tests/test_collision_pipeline.py
newton/tests/test_collision_primitives.py

🧰 Additional context used

📓 Path-based instructions (5)

newton/_src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

The newton/_src/ directory is internal library implementation only and must not be imported by user code (examples, documentation, or external users)

Files:

newton/_src/geometry/broad_phase_common.py
newton/_src/geometry/collision_primitive.py
newton/_src/geometry/contact_reduction_global.py
newton/_src/geometry/kernels.py
newton/_src/geometry/contact_data.py
newton/_src/viewer/viewer_gl.py
newton/_src/geometry/broad_phase_nxn.py
newton/_src/sim/collide_unified.py
newton/_src/sim/collide.py
newton/_src/geometry/narrow_phase.py
newton/_src/sim/contacts.py

newton/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Any user-facing class/function/object added under _src must be exposed via the public Newton API through re-exports

Files:

newton/_src/geometry/broad_phase_common.py
newton/_src/geometry/collision_primitive.py
newton/tests/test_collision_pipeline.py
newton/_src/geometry/contact_reduction_global.py
newton/_src/geometry/kernels.py
newton/examples/__init__.py
newton/examples/cable/example_cable_pile.py
newton/_src/geometry/contact_data.py
newton/_src/viewer/viewer_gl.py
newton/_src/geometry/broad_phase_nxn.py
newton/_src/sim/collide_unified.py
newton/_src/sim/collide.py
newton/_src/geometry/narrow_phase.py
newton/_src/sim/contacts.py
newton/tests/test_collision_primitives.py

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use prefix-first naming for classes: ActuatorPD, ActuatorPID (not PDActuator, PIDActuator)
Use prefix-first naming for methods: add_shape_sphere() (not add_sphere_shape())
Method names must use snake_case
Prefer nested classes when self-contained: if a helper type or enum is only meaningful inside one parent class and doesn't need a public identity, define it as a nested class instead of a top-level class/module
Follow PEP 8 for Python code
Use Google-style docstrings with clear, concise explanations of what functions do, their parameters, and return values

Files:

newton/_src/geometry/broad_phase_common.py
newton/_src/geometry/collision_primitive.py
newton/tests/test_collision_pipeline.py
newton/_src/geometry/contact_reduction_global.py
asv/benchmarks/collision/benchmark_collision_pipeline_unified.py
asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/geometry/kernels.py
newton/examples/__init__.py
asv/benchmarks/collision/benchmark_collision_pipeline_comparison.py
newton/examples/cable/example_cable_pile.py
newton/_src/geometry/contact_data.py
newton/_src/viewer/viewer_gl.py
newton/_src/geometry/broad_phase_nxn.py
newton/_src/sim/collide_unified.py
newton/_src/sim/collide.py
newton/_src/geometry/narrow_phase.py
newton/_src/sim/contacts.py
newton/tests/test_collision_primitives.py

**/*

📄 CodeRabbit inference engine (AGENTS.md)

CLI arguments must use kebab-case (e.g., --use-cuda-graph, not --use_cuda_graph)

Files:

newton/_src/geometry/broad_phase_common.py
newton/_src/geometry/collision_primitive.py
newton/tests/test_collision_pipeline.py
newton/_src/geometry/contact_reduction_global.py
asv/benchmarks/collision/benchmark_collision_pipeline_unified.py
asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/geometry/kernels.py
newton/examples/__init__.py
asv/benchmarks/collision/benchmark_collision_pipeline_comparison.py
newton/examples/cable/example_cable_pile.py
newton/_src/geometry/contact_data.py
newton/_src/viewer/viewer_gl.py
newton/_src/geometry/broad_phase_nxn.py
newton/_src/sim/collide_unified.py
newton/_src/sim/collide.py
newton/_src/geometry/narrow_phase.py
newton/_src/sim/contacts.py
newton/tests/test_collision_primitives.py

newton/examples/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

newton/examples/**/*.py: Examples must not import from newton._src, only from the public Newton API
Examples must follow the format with an Example class and must be registered in README.md with a uv run command and screenshot
Examples must implement a test_final() method that verifies the simulation state is valid after the example has run
Examples may optionally implement a test_post_step() method that is evaluated after every step() call

Files:

newton/examples/__init__.py
newton/examples/cable/example_cable_pile.py

🧠 Learnings (16)

📚 Learning: 2025-09-26T05:58:21.013Z

Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because they follow a write-first-read-later pattern: BVH query kernels in frame_begin() populate these buffers completely (write-only operations), and collision kernels only read from them afterward in accumulate_contact_force(), ensuring no undefined reads occur.

Applied to files:

newton/_src/geometry/broad_phase_common.py
newton/_src/geometry/contact_reduction_global.py
asv/benchmarks/collision/benchmark_broad_phase.py
newton/examples/cable/example_cable_pile.py
newton/_src/geometry/broad_phase_nxn.py
newton/_src/sim/collide_unified.py
newton/_src/geometry/narrow_phase.py
newton/tests/test_collision_primitives.py

📚 Learning: 2025-09-26T05:58:21.013Z

Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because the BVH query kernels that use them always write the count to query_results[0, tid] = 0 first before populating any data, ensuring no undefined reads occur.

Applied to files:

newton/_src/geometry/broad_phase_common.py
asv/benchmarks/collision/benchmark_broad_phase.py
newton/examples/cable/example_cable_pile.py
newton/_src/geometry/broad_phase_nxn.py
newton/_src/sim/collide_unified.py
newton/_src/geometry/narrow_phase.py

📚 Learning: 2025-12-12T17:45:26.847Z

Learnt from: vastsoun
Repo: newton-physics/newton PR: 1019
File: newton/_src/solvers/kamino/geometry/primitives.py:0-0
Timestamp: 2025-12-12T17:45:26.847Z
Learning: In newton/_src/solvers/kamino/geometry/primitives.py, the narrow-phase kernel `_primitive_narrowphase` does not need to handle symmetric shape-pair orderings (e.g., both SPHERE-BOX and BOX-SPHERE) because `kamino.core.builder.ModelBuilder` and `kamino.geometry.primitive.broadphase` guarantee that explicit candidate geometry pairs are always defined with GIDs in ascending order.

Applied to files:

newton/_src/geometry/broad_phase_common.py
newton/_src/geometry/contact_reduction_global.py
asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/geometry/broad_phase_nxn.py
newton/_src/geometry/narrow_phase.py
newton/tests/test_collision_primitives.py

📚 Learning: 2025-10-30T07:28:13.112Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 984
File: newton/_src/geometry/broad_phase_nxn.py:318-336
Timestamp: 2025-10-30T07:28:13.112Z
Learning: In Newton's BroadPhaseAllPairs class (newton/_src/geometry/broad_phase_nxn.py), device management is the caller's responsibility. The library intentionally does not perform automatic device transfers for precomputed arrays in the launch method, even if there's a device mismatch. Users must ensure that the device specified in __init__ matches the device used in launch.

Applied to files:

newton/_src/geometry/broad_phase_common.py
newton/_src/geometry/broad_phase_nxn.py
newton/_src/geometry/narrow_phase.py

📚 Learning: 2025-08-12T17:51:37.474Z

Learnt from: nvlukasz
Repo: newton-physics/newton PR: 519
File: newton/geometry.py:16-22
Timestamp: 2025-08-12T17:51:37.474Z
Learning: In the Newton public API refactor, common geometry symbols like ParticleFlags and ShapeFlags are now exposed at the top level in newton/__init__.py rather than through newton.geometry. The newton/geometry.py module is intentionally focused only on broad-phase collision detection classes (BroadPhaseAllPairs, BroadPhaseExplicit, BroadPhaseSAP).

Applied to files:

newton/_src/geometry/broad_phase_common.py
newton/_src/geometry/collision_primitive.py
newton/_src/geometry/contact_reduction_global.py
asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/geometry/kernels.py
newton/examples/__init__.py
newton/examples/cable/example_cable_pile.py
newton/_src/geometry/broad_phase_nxn.py
newton/_src/sim/collide_unified.py
newton/_src/geometry/narrow_phase.py
newton/tests/test_collision_primitives.py

📚 Learning: 2026-01-13T03:11:40.556Z

Learnt from: jumyungc
Repo: newton-physics/newton PR: 1333
File: newton/_src/sim/builder.py:0-0
Timestamp: 2026-01-13T03:11:40.556Z
Learning: Ensure all VBD damping semantics follow Rayleigh-style damping and are unitless. This convention, demonstrated in SolverVBD, should be applied consistently across all VBD-related constraints in the codebase. In newton/_src/sim/builder.py, validate that any damping terms use the same normalization and are treated as dimensionless; add comments and unit tests to enforce consistency across VBD components.

Applied to files:

newton/_src/geometry/broad_phase_common.py
newton/_src/geometry/collision_primitive.py
newton/tests/test_collision_pipeline.py
newton/_src/geometry/contact_reduction_global.py
newton/_src/geometry/kernels.py
newton/examples/__init__.py
newton/examples/cable/example_cable_pile.py
newton/_src/geometry/contact_data.py
newton/_src/viewer/viewer_gl.py
newton/_src/geometry/broad_phase_nxn.py
newton/_src/sim/collide_unified.py
newton/_src/sim/collide.py
newton/_src/geometry/narrow_phase.py
newton/_src/sim/contacts.py
newton/tests/test_collision_primitives.py

📚 Learning: 2025-10-24T07:56:14.792Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 981
File: newton/_src/geometry/collision_convex.py:180-289
Timestamp: 2025-10-24T07:56:14.792Z
Learning: In newton/_src/geometry/collision_convex.py, the single-contact solver (create_solve_convex_single_contact) intentionally returns count=1 even when shapes are separated (signed_distance > contact_threshold). The calling code is responsible for deciding whether to accept the contact based on the signed distance value. This design pushes filtering responsibility to the caller.

Applied to files:

newton/_src/geometry/collision_primitive.py
newton/_src/geometry/contact_data.py
newton/_src/sim/collide_unified.py
newton/tests/test_collision_primitives.py

📚 Learning: 2025-08-20T03:30:16.037Z

Learnt from: eric-heiden
Repo: newton-physics/newton PR: 584
File: newton/_src/utils/import_urdf.py:159-167
Timestamp: 2025-08-20T03:30:16.037Z
Learning: In the URDF importer (newton/_src/utils/import_urdf.py), cylinders are intentionally created using add_shape_capsule() instead of add_shape_cylinder() because cylinder collisions are not fully supported yet. This is a deliberate workaround until proper cylinder collision support is implemented.

Applied to files:

newton/_src/geometry/collision_primitive.py
newton/tests/test_collision_primitives.py

📚 Learning: 2025-10-10T10:47:41.082Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 899
File: newton/tests/test_rigid_contact.py:214-268
Timestamp: 2025-10-10T10:47:41.082Z
Learning: In `newton/tests/test_rigid_contact.py`, using `add_shape_plane(width=0, length=0)` for an infinite plane works correctly with the collision system and does not cause degenerate AABB issues in the test scenario `test_shape_collisions_gjk_mpr_multicontact`.

Applied to files:

newton/tests/test_collision_pipeline.py
newton/_src/geometry/narrow_phase.py

📚 Learning: 2025-12-12T08:45:51.908Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 1221
File: newton/examples/example_sdf.py:277-287
Timestamp: 2025-12-12T08:45:51.908Z
Learning: In Newton examples (e.g., example_sdf.py), computing contacts once per frame and reusing them across multiple substeps is an intentional design choice, not a bug. The contacts are computed before the substep loop and intentionally kept constant throughout all substeps within that frame.

Applied to files:

newton/_src/geometry/contact_reduction_global.py
newton/_src/viewer/viewer_gl.py
newton/_src/sim/contacts.py

📚 Learning: 2025-12-12T08:45:43.428Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 1221
File: newton/examples/example_sdf.py:277-287
Timestamp: 2025-12-12T08:45:43.428Z
Learning: In Newtown (Newton) example code, specifically files under newton/examples, computing contacts once per frame and reusing them across all substeps is an intentional design choice, not a bug. Reviewers should verify that contacts are computed before the substep loop and reused for every substep within the same frame. This pattern reduces redundant work and preserves frame-consistency; do not flag as a regression unless the behavior is changed for correctness or performance reasons.

Applied to files:

newton/examples/__init__.py
newton/examples/cable/example_cable_pile.py

📚 Learning: 2025-12-12T08:46:21.381Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 1221
File: newton/examples/example_sdf.py:189-191
Timestamp: 2025-12-12T08:46:21.381Z
Learning: In Newton, SDF data structures are shared across replicated world instances (e.g., when using ModelBuilder.replicate()). Therefore, the memory cost of SDF generation scales with sdf_max_resolution but does not scale with the number of worlds. When reviewing or modifying newton/examples/example_sdf.py (and similar example scripts), verify that SDF memory usage is controlled by sdf_max_resolution and not by the count of replicated worlds, and document or justify any changes to sdf_max_resolution to manage memory and performance.

Applied to files:

newton/examples/__init__.py

📚 Learning: 2025-07-23T14:36:42.182Z

Learnt from: shi-eric
Repo: newton-physics/newton PR: 461
File: asv/benchmarks/envs/example_humanoid.py:40-41
Timestamp: 2025-07-23T14:36:42.182Z
Learning: In Warp benchmarks, explicit wp.init() calls are not needed in most circumstances since the first Warp API call that requires initialization will automatically call wp.init(). Explicit wp.init() in setup() methods is helpful when the ASV benchmark is measuring a Warp API call, as wp.init() has non-trivial overhead that should be excluded from the benchmark timing.

Applied to files:

newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-10-06T18:14:18.582Z

Learnt from: daedalus5
Repo: newton-physics/newton PR: 868
File: newton/examples/diffsim/example_diffsim_bear.py:82-96
Timestamp: 2025-10-06T18:14:18.582Z
Learning: Warp's tile operations (`wp.tile_store()` and `wp.tile_load()`) have built-in boundary checking. Out-of-bounds writes are automatically skipped and out-of-bounds reads are zero-filled, so tiled kernels are safe even when the array size is not a multiple of the tile size.

Applied to files:

newton/_src/geometry/broad_phase_nxn.py

📚 Learning: 2025-09-15T13:13:37.976Z

Learnt from: gdaviet
Repo: newton-physics/newton PR: 750
File: newton/_src/solvers/implicit_mpm/solve_rheology.py:772-783
Timestamp: 2025-09-15T13:13:37.976Z
Learning: In Warp's tiled launch API, using a 2D launch dimension dim=(num_blocks, tile_size) is correct for tiled kernels. The wp.tid() function can either return a 2D tuple (block_index, lane) or just discard the last dimension to return only the block_index when lane information is not needed, making `block_index = wp.tid()` a valid shortcut in tiled kernels.

Applied to files:

newton/_src/geometry/broad_phase_nxn.py
newton/_src/geometry/narrow_phase.py

📚 Learning: 2025-09-10T13:03:58.591Z

Learnt from: gdaviet
Repo: newton-physics/newton PR: 750
File: newton/_src/solvers/implicit_mpm/solve_rheology.py:799-804
Timestamp: 2025-09-10T13:03:58.591Z
Learning: In Warp's tiled launch API, the launch dimension should include the tile size as the last dimension, matching the CUDA block dimension. For tiled operations, use dim=(num_blocks, tile_size) where tile_size equals block_dim, not just dim=num_blocks like regular kernel launches.

Applied to files:

newton/_src/geometry/broad_phase_nxn.py
newton/_src/geometry/narrow_phase.py

🧬 Code graph analysis (12)

newton/_src/geometry/broad_phase_common.py (1)

newton/_src/solvers/mujoco/solver_mujoco.py (1)

tile (2370-2390)

newton/_src/geometry/collision_primitive.py (1)

newton/_src/sim/builder.py (1)

dot (5249-5250)

asv/benchmarks/collision/benchmark_broad_phase.py (5)

newton/_src/geometry/broad_phase_nxn.py (2)

BroadPhaseAllPairs (265-419)

BroadPhaseExplicit (422-493)

newton/_src/geometry/broad_phase_sap.py (2)

BroadPhaseSAP (395-650)

SAPSortType (34-38)

newton/_src/viewer/viser/static/assets/WebsocketServerWorker-C6PJJ7Dx.js (1)

k (1-1)

newton/_src/sim/collide_unified.py (1)

compute_shape_aabbs (169-238)

newton/_src/utils/recorder.py (1)

append (61-70)

newton/_src/geometry/kernels.py (1)

newton/_src/geometry/types.py (1)

GeoType (25-67)

asv/benchmarks/collision/benchmark_collision_pipeline_comparison.py (2)

newton/_src/sim/collide.py (1)

CollisionPipeline (69-355)

newton/_src/sim/collide_unified.py (4)

from_model (448-550)

CollisionPipelineUnified (269-746)

BroadPhaseMode (70-80)

collide (552-736)

newton/_src/geometry/contact_data.py (1)

newton/_src/sim/builder.py (1)

dot (5249-5250)

newton/_src/viewer/viewer_gl.py (1)

newton/_src/sim/contacts.py (2)

get_rigid_count (62-64)

get_soft_count (66-68)

newton/_src/geometry/broad_phase_nxn.py (2)

newton/_src/geometry/broad_phase_common.py (3)

check_aabb_overlap (25-41)

precompute_world_map (179-296)

test_world_and_group_pair (150-176)

newton/tests/test_broad_phase.py (1)

test_world_and_group_pair (56-82)

newton/_src/sim/collide_unified.py (2)

newton/_src/geometry/contact_data.py (2)

ContactData (26-62)

contact_passes_margin_check (66-96)

newton/_src/geometry/types.py (1)

GeoType (25-67)

newton/_src/sim/collide.py (1)

newton/_src/sim/contacts.py (1)

update_debug_info (142-145)

newton/_src/geometry/narrow_phase.py (9)

newton/_src/geometry/collision_core.py (1)

check_infinite_plane_bsphere_overlap (504-556)

newton/_src/geometry/collision_primitive.py (4)

collide_capsule_capsule (199-293)

collide_plane_capsule (297-348)

collide_plane_sphere (119-125)

collide_sphere_sphere (129-158)

newton/_src/geometry/contact_data.py (1)

ContactData (26-62)

newton/_src/geometry/flags.py (1)

ShapeFlags (30-48)

newton/_src/geometry/types.py (2)

GeoType (25-67)

SDF (74-113)

newton/_src/geometry/support_function.py (1)

extract_shape_data (305-347)

newton/_src/geometry/broad_phase_nxn.py (2)

launch (348-419)

launch (436-493)

newton/_src/geometry/broad_phase_sap.py (1)

launch (506-650)

newton/_src/geometry/contact_reduction_global.py (4)

clear_active (351-385)

get_data_struct (387-404)

reduce_buffered_contacts_kernel (548-567)

export_reduced_contacts_kernel (658-760)

newton/tests/test_collision_primitives.py (1)

newton/_src/geometry/collision_primitive.py (1)

collide_capsule_capsule (199-293)

🪛 GitHub Actions: Pull Request

newton/_src/sim/contacts.py

[error] 46-46: CpuPinnedAllocator has no attribute 'device' when attempting to allocate pinned CPU memory. This caused an exception during test execution (likely due to Warp/CPU allocator API mismatch).

🪛 Ruff (0.14.11)

newton/_src/geometry/contact_reduction_global.py

716-716: Value being cast to int is already an integer

Remove unnecessary int call

(RUF046)

asv/benchmarks/collision/benchmark_collision_pipeline_unified.py

187-190: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

191-191: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

225-225: Unused method argument: grid_size

(ARG002)

225-225: Unused method argument: broad_phase_mode_str

(ARG002)

243-243: Unused method argument: grid_size

(ARG002)

243-243: Unused method argument: broad_phase_mode_str

(ARG002)

263-266: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

267-267: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

310-310: Unused method argument: grid_size

(ARG002)

310-310: Unused method argument: broad_phase_mode_str

(ARG002)

340-343: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

344-344: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

392-392: Unused method argument: grid_size

(ARG002)

392-392: Unused method argument: broad_phase_mode_str

(ARG002)

asv/benchmarks/collision/benchmark_broad_phase.py

127-131: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

132-132: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

247-247: Unused method argument: grid_size

(ARG002)

247-247: Unused method argument: spacing_factor

(ARG002)

247-247: Unused method argument: broad_phase_type

(ARG002)

262-262: Unused method argument: grid_size

(ARG002)

262-262: Unused method argument: spacing_factor

(ARG002)

262-262: Unused method argument: broad_phase_type

(ARG002)

271-271: Unused method argument: grid_size

(ARG002)

271-271: Unused method argument: spacing_factor

(ARG002)

271-271: Unused method argument: broad_phase_type

(ARG002)

276-276: Unused method argument: grid_size

(ARG002)

276-276: Unused method argument: spacing_factor

(ARG002)

276-276: Unused method argument: broad_phase_type

(ARG002)

asv/benchmarks/collision/benchmark_collision_pipeline_comparison.py

182-185: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

186-186: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

214-214: Unused method argument: pyramid_size

(ARG002)

214-214: Unused method argument: pipeline_type

(ARG002)

233-233: Unused method argument: pyramid_size

(ARG002)

233-233: Unused method argument: pipeline_type

(ARG002)

241-241: Unused method argument: pyramid_size

(ARG002)

241-241: Unused method argument: pipeline_type

(ARG002)

258-261: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

262-262: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

299-299: Unused method argument: pyramid_size

(ARG002)

299-299: Unused method argument: pipeline_type

(ARG002)

329-332: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

333-333: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

375-375: Unused method argument: pyramid_size

(ARG002)

375-375: Unused method argument: pipeline_type

(ARG002)

newton/_src/geometry/broad_phase_nxn.py

99-99: Unused function argument: matrix_size

(ARG001)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Run GPU Tests / Run GPU Unit Tests on AWS EC2
GitHub Check: Run GPU Benchmarks / Run GPU Benchmarks on AWS EC2

🔇 Additional comments (65)

newton/tests/test_collision_pipeline.py (1)

432-437: LGTM! Tolerance adjustment is well-documented.

The increased tolerance (0.1) for SDF-SDF hydroelastic contacts is reasonable given the documented variability in contact normal direction. This is consistent with the higher tolerances used in mixed SDF/BVH modes (0.2 and 0.5).

newton/_src/geometry/contact_reduction_global.py (2)

558-561: Good performance optimization with early exit.

The early return when num_contacts == 0 avoids unnecessary iteration and hashtable operations when there's no work to do.

688-691: Good performance optimization with early exit.

The early return when num_active == 0 efficiently skips processing when the hashtable has no active entries.

newton/_src/geometry/contact_data.py (1)

65-96: Well-structured margin check implementation.

The function correctly computes whether a contact should be written based on the separation distance and margin threshold. The math properly accounts for effective radii and thicknesses of both shapes.

The wp.normalize() call on line 83 is defensive—if contact_normal_a_to_b is already guaranteed to be a unit vector from upstream contact computation, this normalization is redundant but harmless. Consider documenting the expected invariant in the docstring if the input is always normalized.

newton/_src/viewer/viewer_gl.py (2)

81-83: Good initialization of contact debug state.

The _contacts field is properly initialized to None in the constructor.

166-176: Clean API for setting contacts debug info.

The set_contacts() method provides a clear interface for enabling contact count display in the stats overlay. The docstring adequately explains the purpose and usage.

newton/_src/geometry/kernels.py (3)

269-288: Good ellipsoid SDF implementation with numerical stability.

The approximate SDF formula k0 * (k0 - 1) / k1 is a well-established approach for ellipsoids. The implementation properly handles the degenerate case (near center) by checking k1 > eps and returning -min(r) as the fallback, which represents the maximum penetration depth inside the ellipsoid.

820-823: Good integration of ellipsoid support in soft contacts.

The ellipsoid case is correctly added to the shape type dispatch, calling both ellipsoid_sdf and ellipsoid_sdf_grad with geo_scale as the radii. This is consistent with how other shape types handle their scale parameters.

291-314: Good ellipsoid SDF approximation implementation.

The SDF formula k0*(k0-1)/k1 is the standard fast approximation for ellipsoid distance. The degenerate fallback returning wp.vec3(0.0, 0.0, 1.0) provides a valid unit vector when k1 < eps.

For highly eccentric ellipsoids (large aspect ratios), the approximation error grows noticeably, and iterative refinement (1–3 Newton steps on the exact distance equation) can improve accuracy when subpixel-level precision is needed.

The analytical gradient derivation should be verified against the exact mathematical form to ensure correctness for the use case.

newton/_src/geometry/collision_primitive.py (1)

228-293: Multi-contact capsule-capsule implementation looks correct.

The implementation correctly handles two cases:

Non-parallel axes (Lines 246-272): Uses a 2x2 linear system to find closest points, with proper clamping when solutions fall outside the capsule segments. Only one contact is generated with the second slot remaining at wp.inf.

Parallel axes (Lines 275-291): Generates two contacts at both ends of capsule 1, using the first contact's normal for both. This is physically correct for parallel capsules where contact can occur along the entire overlapping length.

The determinant threshold MINVAL = 1e-15 (line 246) is appropriate for detecting near-parallel configurations.

One minor observation: In the parallel case, the second contact (line 289) computes _normal but discards it, using the first contact's normal instead. This is intentional per the comment "Use first contact's normal for both" and is a reasonable design choice for consistency.

newton/_src/geometry/broad_phase_nxn.py (2)

186-262: Kernel correctly implements tiled launch with thread-block cooperative writes.

The kernel implementation is well-structured:

Thread ID calculation (lines 186-188): Correctly computes global thread ID from block and lane indices.

Bounds checking (line 199): Guards against out-of-range threads in the last block.

World/collision filtering (lines 225-251): Preserves all existing collision filtering logic including world indices, collision groups, and AABB overlap checks.

Dedicated -1 segment handling (lines 232-233): Correctly skips duplicate global-global pairs except in the dedicated segment.

Cooperative writing (lines 254-262): Uses write_pair_thread_block for all threads, passing the has_valid_pair flag to coordinate the write.

397-419: Tiled launch configuration is correct.

The launch configuration properly sets up the tiled kernel:

block_dim = 128 is a reasonable block size for tile operations

num_blocks correctly computes the ceiling division

wp.launch_tiled is the appropriate API for kernels using tile operations

One consideration: The hardcoded block_dim = 128 may not be optimal for all GPUs. Consider making this configurable or tuning based on device capabilities in future iterations.

newton/tests/test_collision_primitives.py (2)

195-212: Test kernel correctly updated for multi-contact API.

The kernel signature and body changes correctly align with the updated collide_capsule_capsule return type:

distances parameter type changed to wp.vec2

contact_positions parameter type changed to wp.types.matrix((2, 3), wp.float32)

Return values correctly assigned to output arrays

817-818: Array initialization correctly updated for multi-contact types.

The arrays are properly initialized with the new types matching the updated kernel signature.

newton/_src/geometry/narrow_phase.py (11)

132-146: Clear and well-documented kernel factory function.

The docstring accurately describes the purpose of this two-stage collision pipeline. The primitive kernel handles lightweight analytical collisions and routes remaining pairs appropriately.

185-195: Consider early return optimization for empty work.

The early return at line 191-192 is good for avoiding unnecessary computation. However, when num_work_items == 0, threads still participate in the tile scan below (lines 459-469), which could cause issues since upper would be 0 and the loop wouldn't execute.

Actually, looking more carefully, the early return happens before the loop, so this is correct. The logic handles the empty case properly.

231-234: Type sorting ensures consistent collision handling order.

This sorting guarantees type_a <= type_b, which is important since:

The primitive collision functions expect specific orderings (e.g., plane-sphere, not sphere-plane)

Per learnings, the narrow-phase kernel relies on GIDs being in ascending order

This aligns with the learned pattern from newton/_src/solvers/kamino/geometry/primitives.py.

274-300: Mesh routing logic handles all mesh cases correctly.

The routing properly handles:

Mesh-mesh → shape_pairs_mesh_mesh buffer

Infinite-plane + mesh → shape_pairs_mesh_plane buffer with vertex counting

Any remaining mesh pairs → shape_pairs_mesh buffer for triangle overlap detection

The order matters: mesh-mesh is checked first, then plane-mesh, then general mesh-convex. This prevents double-routing since is_valid is set to False after each routing.

457-490: Thread block cooperative allocation using tile scan is well-implemented.

The tile scan pattern correctly:

Computes inclusive prefix sum for contact counts across the block

Uses last thread to atomically reserve space for all contacts in the block

Broadcasts the offset to all threads

Each thread calculates its base index using offset + inclusive_scan[i] - num_valid

This reduces atomic contention compared to per-thread atomics.

504-516: Clear documentation of kernel responsibilities after refactor.

The docstring accurately describes the GJK/MPR kernel's role in the two-stage pipeline: it receives pre-sorted, pre-filtered pairs that couldn't be handled analytically.

1125-1164: New has_meshes parameter enables mesh-avoidance optimization.

The docstring clearly explains that when has_meshes=False, mesh-related kernels are skipped. This is a good performance optimization for primitive-only scenes.

The parameter is properly propagated through the class and used in launch_custom_write.

1246-1259: Consolidated counter array reduces kernel launch overhead.

Using a single array with sliced views for individual counters is an efficient pattern that:

Reduces memory fragmentation

Enables single zero_() call to clear all counters

Views share the same underlying memory

This is a good optimization.

1336-1337: Single zero call clears all counters efficiently.

The consolidated counter array allows clearing all 7 counters with a single kernel launch, which is more efficient than separate zero_() calls.

1400-1401: Mesh kernel skipping controlled by has_meshes flag.

The conditional properly gates all mesh-related processing, which includes:

Mesh-plane contacts

Mesh triangle overlap detection

Mesh triangle contact processing

Mesh-mesh contacts

This optimization can significantly improve performance for scenes with only primitive shapes.

262-264: [Your rewritten review comment text here]
[Exactly ONE classification tag]

newton/_src/sim/collide_unified.py (6)

83-146: Refactored contact writer with separated margin check.

The split into write_contact_no_margin_check and write_contact enables:

Callers that have already performed margin checks to avoid redundant computation

The margin check wrapper to filter contacts before allocation

The recomputation of contact points (lines 101-112) ensures consistency with the margin check logic in contact_passes_margin_check.

148-166: Margin check wrapper follows single-responsibility principle.

The write_contact function cleanly delegates to contact_passes_margin_check before writing, matching the pattern in write_contact_simple in narrow_phase.py.

390-409: Neighbor-based heuristic provides reasonable allocation estimates.

The heuristic parameters are well-documented:

12 neighbors per body (FCC packing)

2 contacts per neighbor (average across shape types)

1.2x safety margin

For a scene with 100 shapes: (100 * 12 * 2) / 2 * 1.2 = 1440 contacts, which is much more reasonable than O(N²) allocation of 100 * 99 / 2 * 10 = 49500 contacts.

493-496: Auto-detection of has_meshes from model shape types.

The implementation correctly scans model.shape_type for MESH shapes. Using numpy() to convert the array for iteration is appropriate here since this runs during initialization, not in the hot path.

572-573: Debug info enabled flag properly propagated to Contacts.

The enable_debug_info parameter is correctly passed to the Contacts constructor, enabling graph-compatible contact count tracking when requested.

732-735: Debug info update integrated at correct location.

update_debug_info() is called after all contact generation (rigid and soft) is complete, ensuring accurate counts are captured before returning.

newton/examples/__init__.py (2)

398-461: Standard pipeline now explicitly constructed to support debug info.

Previously, returning None for standard pipeline relied on implicit construction. Now, CollisionPipeline.from_model is called explicitly to support the enable_contact_debug_info parameter.

The docstring accurately reflects the new parameter and its purpose.

483-488: Debug info parameter correctly forwarded to unified pipeline.

The enable_contact_debug_info parameter is properly passed to CollisionPipelineUnified.from_model, maintaining consistency with the standard pipeline path.

asv/benchmarks/collision/benchmark_collision_pipeline_unified.py (7)

37-79: Flat mesh grid generation is correct.

The function correctly:

Creates a (grid_x+1) × (grid_y+1) vertex grid

Generates 2 triangles per quad with consistent winding

Returns appropriate numpy arrays

The triangle winding (v00→v10→v01 and v10→v11→v01) produces consistent normals pointing in +Z direction.

82-172: Scene building creates a good benchmark scenario.

The scene includes:

100×100 triangle mesh ground (20K triangles)

3D grid of mixed primitive shapes with random position jitter

Each body uses a free joint for articulation

The mix of shape types (sphere, box, capsule, cylinder, cone) provides comprehensive coverage of the collision pipeline.

153-156: Free joint + articulation per body may add overhead.

Each body gets its own articulation with a single free joint. This is valid but creates many small articulations. Consider whether this matches real-world usage patterns.

For benchmarking purposes, this is acceptable as it measures worst-case articulation overhead.

186-191: Static analysis: ClassVar annotations are optional for ASV benchmarks.

The Ruff hints about ClassVar annotations (RUF012) are technically correct but not required for ASV benchmark classes. ASV uses these class attributes for benchmark configuration, and the pattern is standard across all ASV benchmarks.

The unused parameter warnings (ARG002) are also expected—ASV requires these parameters in the signature for parameterization, but setup() stores the necessary data in instance attributes.

224-240: Benchmark timing methodology is sound.

The approach:

Warmup iterations (5) to stabilize GPU state

CUDA graph capture for consistent measurement

Multiple timed iterations (20) with wp.ScopedTimer

Median selection for robust statistics

Using synchronize=True in ScopedTimer ensures accurate GPU timing.

242-248: Contact count tracking provides useful auxiliary metric.

The track_contact_count method runs without the graph to get actual contact counts, which is useful for validating that the benchmark generates meaningful collision workloads.

410-434: CLI entry point follows established benchmark patterns.

The main block provides a convenient way to run individual benchmarks outside of ASV, using newton.utils.run_benchmark.

newton/examples/cable/example_cable_pile.py (6)

34-47: Configuration constants provide good flexibility for testing.

The three constants allow easy experimentation:

USE_UNIFIED_PIPELINE: Toggle between pipeline implementations

BROAD_PHASE_MODE: Select broad phase algorithm

RIGID_CONTACT_MAX_OVERRIDE: Override allocation when needed

The comments clearly explain each option's purpose.

163-163: Setting rigid_contact_margin on builder is good practice.

This sets a default per-shape collision margin (5mm), which is propagated to all shapes during model construction. This ensures consistent margin handling across the collision pipeline.

266-324: Pipeline creation logic handles all modes correctly.

The implementation:

Maps string to BroadPhaseMode enum

Uses RIGID_CONTACT_MAX_OVERRIDE path for explicit control

Falls back to NXN if EXPLICIT mode lacks shape_contact_pairs

Uses from_model for automatic allocation when no override

Enables debug info for both paths

The fallback from EXPLICIT to NXN with a warning is user-friendly.

326-329: Collision and viewer integration properly wired.

Initial collision is computed and contacts are passed to the viewer for visualization. This enables the debug contact display feature.

384-423: Test method remains compatible with pipeline changes.

The test_final() method tests physical correctness (positions, velocities, bounds) which is independent of which collision pipeline is used. The tests should pass regardless of USE_UNIFIED_PIPELINE setting.

352-355: Verify graph-compatibility of set_contacts call in CUDA graph capture context.

Calling viewer.set_contacts(self.contacts) after collision updates ensures the viewer displays current contact data for debug visualization. However, whether this call is compatible with CUDA graph capture (mentioned at lines 337-339) depends on set_contacts implementation—if it performs host-side CUDA API calls or non-device-memory operations, it cannot be safely included within a CUDA graph, violating the constraint that graph nodes must be limited to kernels, memcpy, memset, and child graphs.

asv/benchmarks/collision/benchmark_broad_phase.py (6)

1-41: LGTM on imports and module setup.

The module docstring clearly explains the benchmark purpose. Importing from _src modules is acceptable for internal benchmark code (not user-facing). The wp.config.quiet = True appropriately suppresses verbose output during benchmarks.

43-112: Well-structured scene builder function.

The function correctly builds a parameterized 3D grid of spheres and precomputes AABBs for broad-phase testing. The docstring clearly documents the spacing_factor semantics.

134-203: Robust setup with appropriate skip logic for shared memory limits.

The setup correctly handles the sap_tile shared memory constraint by skipping larger grid sizes. The CUDA graph capture pattern with warmup iterations is correctly implemented for reliable benchmarking.

205-244: Launch helpers correctly delegate to broad phase APIs.

Each launch method properly invokes its respective broad phase algorithm with appropriate parameters.

246-259: Timing method uses best practices for GPU benchmarking.

Using wp.ScopedTimer with synchronize=True ensures accurate GPU timing. Returning the median of multiple samples provides stable measurements. The float("nan") return for skipped cases is appropriate for ASV.

261-278: Track methods correctly implemented for ASV metrics.

The tracking methods properly report candidate pairs, shape counts, and theoretical pair bounds. The unused method arguments flagged by static analysis are required by ASV's parameterized benchmark framework.

newton/_src/sim/contacts.py (3)

51-60: Copy method correctly uses async transfer pattern.

The wp.copy function handles device-to-pinned-host transfers appropriately for graph-compatible async copies.

136-145: Clean opt-in pattern for debug info.

The conditional initialization ensures no overhead when debug info is disabled. The update_debug_info method correctly guards against None.

147-157: Good optimization: clearing only counts instead of full arrays.

The comment clearly explains the rationale—readers check counts first, so full array clearing is unnecessary. This eliminates expensive memset operations.

newton/_src/sim/collide.py (4)

78-146: Clean integration of enable_contact_debug_info parameter.

The flag is properly added to __init__ with a clear docstring explaining its purpose (graph-capture-compatible contact count tracking). The parameter is correctly stored for later use.

150-199: Factory method correctly propagates the new parameter.

The from_model classmethod properly forwards enable_contact_debug_info to the constructor, maintaining API consistency.

215-223: Debug info flag correctly passed to Contacts constructor.

The enable_debug_info parameter is properly forwarded when creating new Contacts objects.

341-344: Debug info update correctly placed after contact generation.

The update_debug_info() call is appropriately positioned after both soft and rigid contacts are generated, ensuring accurate counts are copied.

asv/benchmarks/collision/benchmark_collision_pipeline_comparison.py (6)

1-41: LGTM on imports and module setup.

The module docstring clearly explains the benchmark scope and intentionally excluded shapes. Imports are appropriate for benchmark code.

42-167: Comprehensive scene builder covering all supported pipeline types.

The function correctly builds a complex scene with multiple shape types supported by both pipelines. The pipeline type switching logic handles all four variants (standard, unified_nxn, unified_explicit, unified_sap) appropriately.

170-244: Collision detection benchmark correctly implemented.

The setup properly handles CUDA graph capture with warmup. The timing method uses median for stable results. Tracking methods correctly report contact and shape counts.

246-314: Step benchmark correctly captures full simulation substep.

The _do_step method properly implements the collide + solve pattern with state double-buffering. CUDA graph capture encompasses the entire step for accurate timing.

317-390: Frame benchmark provides realistic end-to-end timing.

The 10 substeps at 60fps configuration represents a realistic simulation workload. CUDA graph capture of the entire frame ensures consistent timing across runs.

393-417: Clean CLI entry point for manual benchmark execution.

The argparse integration allows running specific benchmarks individually or all benchmarks together. Using newton.utils.run_benchmark ensures consistent execution behavior.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

coderabbitai · 2026-01-14T16:02:08Z

newton/_src/geometry/broad_phase_nxn.py

+@wp.func
+def _get_lower_triangular_indices_fast(index: int, matrix_size: int) -> tuple[int, int]:
+    """Convert linear index to (row, col) pair using O(1) closed-form formula.
+
+    For a matrix of size N, pairs are stored as:
+      index 0 -> (0, 1)
+      index 1 -> (0, 2)
+      index 2 -> (1, 2)
+      index 3 -> (0, 3)
+      ...
+
+    The starting index for column c is c*(c-1)/2.
+    Given index k, we solve: c*(c-1)/2 <= k < c*(c+1)/2
+    This gives: c = floor((1 + sqrt(1 + 8k)) / 2)
+    Then: r = k - c*(c-1)/2
+    """
+    # Closed-form solution using quadratic formula
+    # c = floor((1 + sqrt(1 + 8*index)) / 2)
+    discriminant = float(1 + 8 * index)
+    c = int((1.0 + wp.sqrt(discriminant)) * 0.5)
+
+    # Handle edge case where floating point gives us c+1
+    start_c = (c * (c - 1)) >> 1
+    if start_c > index:
+        c = c - 1
+        start_c = (c * (c - 1)) >> 1
+
+    r = index - start_c
+    return r, c


⚠️ Potential issue | 🟡 Minor

Unused parameter matrix_size in _get_lower_triangular_indices_fast.

The static analysis tool correctly identified that matrix_size is unused in this function. Looking at the implementation, the closed-form formula computes (r, c) directly from the index without needing to know the matrix size.

However, checking the binary search version (_get_lower_triangular_indices_binary_search), it uses matrix_size for:

Bounds validation (line 79-81): if index >= total

Computing the row (lines 84-94)

The fast version lacks bounds validation. If index is out of range (>= n*(n-1)/2), it will return an invalid (r, c) pair without warning.

Consider either:

Adding bounds validation using matrix_size, or

Removing the parameter if validation is guaranteed by the caller

Option 1: Add bounds validation

`@wp.func` def _get_lower_triangular_indices_fast(index: int, matrix_size: int) -> tuple[int, int]: """Convert linear index to (row, col) pair using O(1) closed-form formula. ... """ + # Bounds check + total = (matrix_size * (matrix_size - 1)) >> 1 + if index >= total: + return -1, -1 + # Closed-form solution using quadratic formula

🧰 Tools

🪛 Ruff (0.14.11)

99-99: Unused function argument: matrix_size

(ARG001)

🤖 Prompt for AI Agents

In `@newton/_src/geometry/broad_phase_nxn.py` around lines 98 - 126, The parameter matrix_size is unused in _get_lower_triangular_indices_fast and the function lacks bounds checking; add the same validation as in _get_lower_triangular_indices_binary_search by computing total = matrix_size * (matrix_size - 1) // 2 and raising/asserting (or returning an error) if index >= total or index < 0 before computing c/r, keeping the existing closed-form logic (use the function name _get_lower_triangular_indices_fast and adjust the signature to still accept matrix_size), or if callers guarantee correctness, remove the matrix_size parameter and update all call sites to reflect the trimmed signature.

…rid sizes

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@asv/benchmarks/collision/benchmark_broad_phase.py`:
- Around line 187-193: The graph capture uses wp.get_device().is_cuda but the
benchmark stores the model/device in self.device; replace the get_device check
with self.device.is_cuda to ensure you check the intended device before starting
the CUDA capture of self._launch_func() within wp.ScopedCapture(), leaving
assignment to self.graph unchanged; update the conditional that currently
references wp.get_device().is_cuda to use self.device.is_cuda instead.

In `@newton/_src/sim/contacts.py`:
- Around line 61-67: The current get_rigid_count and get_soft_count read pinned
arrays with .numpy() without ensuring the async copy completed, which can return
stale data; update both methods (get_rigid_count and get_soft_count) to accept
an optional sync: bool = True parameter and, when sync is True, call
wp.synchronize() (or the project-preferred wp.synchronize_stream()) before
converting to numpy() so the pinned-memory copy has completed; alternatively, if
you prefer not to block by default, change the default to False and update the
docstrings to explicitly state callers must synchronize before calling if they
need up-to-date values.

🧹 Nitpick comments (3)

newton/_src/geometry/broad_phase_common.py (1)
94-98: Consider using wp.select() or ternary for cleaner boolean-to-int conversion.

The boolean-to-int conversion pattern works but could be more concise:
has_pair_int = 0
if thread_has_pair:
    has_pair_int = 1
Suggested simplification
-    # Convert boolean to int (1 if has pair, 0 otherwise)
-    has_pair_int = 0
-    if thread_has_pair:
-        has_pair_int = 1
+    # Convert boolean to int (1 if has pair, 0 otherwise)
+    has_pair_int = wp.select(thread_has_pair, 1, 0)
newton/_src/sim/contacts.py (1)
22-23: Remove stray comment.

This comment appears to be a leftover from development/debugging and doesn't add value to the code.
Suggested fix
-# No kernel needed - we just zero counts!
-
-
 class ContactDebugInfo:
newton/_src/sim/collide_unified.py (1)
394-427: Consider extracting magic numbers to named constants.

The neighbor-based heuristic is well-documented, but the magic numbers (12, 2, 1.2, 10) are scattered in the code. Consider defining them as class constants for easier tuning and maintainability:
# At class level
DEFAULT_MAX_NEIGHBORS_PER_BODY = 12
DEFAULT_CONTACTS_PER_NEIGHBOR = 2
ALLOCATION_SAFETY_MARGIN = 1.2
EXPLICIT_MODE_CLAMP_THRESHOLD = 10
This is optional since the inline comments already explain the rationale.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f9c7ac and 4e6c067.

📒 Files selected for processing (6)

asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/geometry/broad_phase_common.py
newton/_src/sim/collide_unified.py
newton/_src/sim/contacts.py
newton/_src/viewer/viewer_gl.py
newton/examples/cable/example_cable_pile.py

🚧 Files skipped from review as they are similar to previous changes (1)

newton/_src/viewer/viewer_gl.py

🧰 Additional context used

📓 Path-based instructions (5)

newton/_src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

The newton/_src/ directory is internal library implementation only and must not be imported by user code (examples, documentation, or external users)

Files:

newton/_src/geometry/broad_phase_common.py
newton/_src/sim/contacts.py
newton/_src/sim/collide_unified.py

newton/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Any user-facing class/function/object added under _src must be exposed via the public Newton API through re-exports

Files:

newton/_src/geometry/broad_phase_common.py
newton/_src/sim/contacts.py
newton/examples/cable/example_cable_pile.py
newton/_src/sim/collide_unified.py

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use prefix-first naming for classes: ActuatorPD, ActuatorPID (not PDActuator, PIDActuator)
Use prefix-first naming for methods: add_shape_sphere() (not add_sphere_shape())
Method names must use snake_case
Prefer nested classes when self-contained: if a helper type or enum is only meaningful inside one parent class and doesn't need a public identity, define it as a nested class instead of a top-level class/module
Follow PEP 8 for Python code
Use Google-style docstrings with clear, concise explanations of what functions do, their parameters, and return values

Files:

newton/_src/geometry/broad_phase_common.py
newton/_src/sim/contacts.py
asv/benchmarks/collision/benchmark_broad_phase.py
newton/examples/cable/example_cable_pile.py
newton/_src/sim/collide_unified.py

**/*

📄 CodeRabbit inference engine (AGENTS.md)

CLI arguments must use kebab-case (e.g., --use-cuda-graph, not --use_cuda_graph)

Files:

newton/_src/geometry/broad_phase_common.py
newton/_src/sim/contacts.py
asv/benchmarks/collision/benchmark_broad_phase.py
newton/examples/cable/example_cable_pile.py
newton/_src/sim/collide_unified.py

newton/examples/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

newton/examples/**/*.py: Examples must not import from newton._src, only from the public Newton API
Examples must follow the format with an Example class and must be registered in README.md with a uv run command and screenshot
Examples must implement a test_final() method that verifies the simulation state is valid after the example has run
Examples may optionally implement a test_post_step() method that is evaluated after every step() call

Files:

newton/examples/cable/example_cable_pile.py

🧠 Learnings (11)

📓 Common learnings

Learnt from: shi-eric
Repo: newton-physics/newton PR: 521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because they follow a write-first-read-later pattern: BVH query kernels in frame_begin() populate these buffers completely (write-only operations), and collision kernels only read from them afterward in accumulate_contact_force(), ensuring no undefined reads occur.

📚 Learning: 2025-09-26T05:58:21.013Z

Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because they follow a write-first-read-later pattern: BVH query kernels in frame_begin() populate these buffers completely (write-only operations), and collision kernels only read from them afterward in accumulate_contact_force(), ensuring no undefined reads occur.

Applied to files:

newton/_src/geometry/broad_phase_common.py
asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/sim/collide_unified.py

📚 Learning: 2025-09-26T05:58:21.013Z

Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because the BVH query kernels that use them always write the count to query_results[0, tid] = 0 first before populating any data, ensuring no undefined reads occur.

Applied to files:

newton/_src/geometry/broad_phase_common.py
asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/sim/collide_unified.py

📚 Learning: 2025-12-12T17:45:26.847Z

Learnt from: vastsoun
Repo: newton-physics/newton PR: 1019
File: newton/_src/solvers/kamino/geometry/primitives.py:0-0
Timestamp: 2025-12-12T17:45:26.847Z
Learning: In newton/_src/solvers/kamino/geometry/primitives.py, the narrow-phase kernel `_primitive_narrowphase` does not need to handle symmetric shape-pair orderings (e.g., both SPHERE-BOX and BOX-SPHERE) because `kamino.core.builder.ModelBuilder` and `kamino.geometry.primitive.broadphase` guarantee that explicit candidate geometry pairs are always defined with GIDs in ascending order.

Applied to files:

newton/_src/geometry/broad_phase_common.py

📚 Learning: 2025-10-30T07:28:13.112Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 984
File: newton/_src/geometry/broad_phase_nxn.py:318-336
Timestamp: 2025-10-30T07:28:13.112Z
Learning: In Newton's BroadPhaseAllPairs class (newton/_src/geometry/broad_phase_nxn.py), device management is the caller's responsibility. The library intentionally does not perform automatic device transfers for precomputed arrays in the launch method, even if there's a device mismatch. Users must ensure that the device specified in __init__ matches the device used in launch.

Applied to files:

newton/_src/geometry/broad_phase_common.py

📚 Learning: 2025-08-12T17:51:37.474Z

Learnt from: nvlukasz
Repo: newton-physics/newton PR: 519
File: newton/geometry.py:16-22
Timestamp: 2025-08-12T17:51:37.474Z
Learning: In the Newton public API refactor, common geometry symbols like ParticleFlags and ShapeFlags are now exposed at the top level in newton/__init__.py rather than through newton.geometry. The newton/geometry.py module is intentionally focused only on broad-phase collision detection classes (BroadPhaseAllPairs, BroadPhaseExplicit, BroadPhaseSAP).

Applied to files:

newton/_src/geometry/broad_phase_common.py
asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/sim/collide_unified.py

📚 Learning: 2026-01-13T03:11:40.556Z

Learnt from: jumyungc
Repo: newton-physics/newton PR: 1333
File: newton/_src/sim/builder.py:0-0
Timestamp: 2026-01-13T03:11:40.556Z
Learning: Ensure all VBD damping semantics follow Rayleigh-style damping and are unitless. This convention, demonstrated in SolverVBD, should be applied consistently across all VBD-related constraints in the codebase. In newton/_src/sim/builder.py, validate that any damping terms use the same normalization and are treated as dimensionless; add comments and unit tests to enforce consistency across VBD components.

Applied to files:

newton/_src/geometry/broad_phase_common.py
newton/_src/sim/contacts.py
newton/examples/cable/example_cable_pile.py
newton/_src/sim/collide_unified.py

📚 Learning: 2025-12-12T08:45:51.908Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 1221
File: newton/examples/example_sdf.py:277-287
Timestamp: 2025-12-12T08:45:51.908Z
Learning: In Newton examples (e.g., example_sdf.py), computing contacts once per frame and reusing them across multiple substeps is an intentional design choice, not a bug. The contacts are computed before the substep loop and intentionally kept constant throughout all substeps within that frame.

Applied to files:

newton/_src/sim/contacts.py

📚 Learning: 2025-10-29T14:55:48.171Z

Learnt from: gdaviet
Repo: newton-physics/newton PR: 940
File: newton/examples/mpm/example_mpm_granular.py:92-101
Timestamp: 2025-10-29T14:55:48.171Z
Learning: In newton/examples/mpm/example_mpm_granular.py, the capture() method intentionally advances the simulation state once during CUDA graph capture. This is expected behavior because CUDA graph capture records exact memory addresses, so the state arrays (state_0, state_1) cannot be reset or reallocated after capture without invalidating the graph.

Applied to files:

newton/examples/cable/example_cable_pile.py

📚 Learning: 2025-12-12T08:45:43.428Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 1221
File: newton/examples/example_sdf.py:277-287
Timestamp: 2025-12-12T08:45:43.428Z
Learning: In Newtown (Newton) example code, specifically files under newton/examples, computing contacts once per frame and reusing them across all substeps is an intentional design choice, not a bug. Reviewers should verify that contacts are computed before the substep loop and reused for every substep within the same frame. This pattern reduces redundant work and preserves frame-consistency; do not flag as a regression unless the behavior is changed for correctness or performance reasons.

Applied to files:

newton/examples/cable/example_cable_pile.py

📚 Learning: 2025-10-24T07:56:14.792Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 981
File: newton/_src/geometry/collision_convex.py:180-289
Timestamp: 2025-10-24T07:56:14.792Z
Learning: In newton/_src/geometry/collision_convex.py, the single-contact solver (create_solve_convex_single_contact) intentionally returns count=1 even when shapes are separated (signed_distance > contact_threshold). The calling code is responsible for deciding whether to accept the contact based on the signed distance value. This design pushes filtering responsibility to the caller.

Applied to files:

newton/_src/sim/collide_unified.py

🧬 Code graph analysis (4)

newton/_src/geometry/broad_phase_common.py (1)

newton/_src/solvers/mujoco/solver_mujoco.py (1)

tile (2370-2390)

asv/benchmarks/collision/benchmark_broad_phase.py (3)

newton/_src/geometry/broad_phase_nxn.py (2)

BroadPhaseAllPairs (265-419)

BroadPhaseExplicit (422-493)

newton/_src/geometry/broad_phase_sap.py (2)

BroadPhaseSAP (395-650)

SAPSortType (34-38)

newton/_src/sim/collide_unified.py (1)

compute_shape_aabbs (169-238)

newton/examples/cable/example_cable_pile.py (4)

newton/examples/__init__.py (1)

create_collision_pipeline (398-488)

newton/_src/sim/collide_unified.py (1)

collide (570-754)

newton/_src/sim/collide.py (1)

collide (201-345)

newton/_src/sim/model.py (1)

collide (658-719)

newton/_src/sim/collide_unified.py (4)

newton/_src/geometry/contact_data.py (2)

ContactData (26-62)

contact_passes_margin_check (66-96)

newton/_src/sim/builder.py (1)

shape_count (1135-1139)

newton/_src/sim/contacts.py (1)

update_debug_info (141-144)

newton/_src/geometry/types.py (1)

GeoType (25-67)

🪛 Ruff (0.14.11)

asv/benchmarks/collision/benchmark_broad_phase.py

124-128: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

129-129: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

244-244: Unused method argument: grid_size

(ARG002)

244-244: Unused method argument: spacing_factor

(ARG002)

244-244: Unused method argument: broad_phase_type

(ARG002)

259-259: Unused method argument: grid_size

(ARG002)

259-259: Unused method argument: spacing_factor

(ARG002)

259-259: Unused method argument: broad_phase_type

(ARG002)

268-268: Unused method argument: grid_size

(ARG002)

268-268: Unused method argument: spacing_factor

(ARG002)

268-268: Unused method argument: broad_phase_type

(ARG002)

273-273: Unused method argument: grid_size

(ARG002)

273-273: Unused method argument: spacing_factor

(ARG002)

273-273: Unused method argument: broad_phase_type

(ARG002)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Run GPU Benchmarks / Run GPU Benchmarks on AWS EC2
GitHub Check: Run GPU Tests / Run GPU Unit Tests on AWS EC2

🔇 Additional comments (14)

asv/benchmarks/collision/benchmark_broad_phase.py (5)

44-109: LGTM!

The scene builder function is well-structured for benchmark purposes. It creates a consistent test scenario with configurable grid density and properly computes AABBs for the broad phase tests. The docstring clearly explains the spacing_factor parameter semantics.

267-275: LGTM!

The tracking methods correctly return metrics without requiring the full benchmark setup. The shape_count is set in both the skip and non-skip branches of setup(), so these methods work correctly for all cases.

243-256: LGTM!

The timing methodology is sound: using wp.ScopedTimer with synchronize=True ensures accurate GPU timing, and taking the median of multiple samples provides robust measurements. The skip handling returns NaN appropriately for cases where the benchmark cannot run.

119-129: Static analysis hints are false positives for ASV benchmark conventions.

The Ruff warnings about:

RUF012 (mutable class attributes): ASV framework expects params, param_names, repeat, number, etc. as class attributes without ClassVar annotation

ARG002 (unused method arguments): ASV passes parameters to all benchmark methods even though setup() already processes them

These are standard ASV patterns and the warnings can be safely ignored.

202-241: LGTM!

The launch helper methods correctly wrap each broad phase variant's launch() call. Keeping _launch_nxn and _launch_sap separate despite identical signatures provides clarity and allows future divergence if needed. The comment # AABBs are pre-expanded correctly explains the None margin parameter.

newton/_src/geometry/broad_phase_common.py (1)

71-123: The offset broadcast pattern is correct. wp.tile() is a collective operation where each thread's value is placed at its corresponding index in the resulting tile. This identical pattern appears in three files (broad_phase_common.py, collision_core.py, and narrow_phase.py) and works as intended. When the last thread reads offset_broadcast_tile[wp.block_dim() - 1], it retrieves the atomic offset that it placed there, which is then broadcast to all threads for writing pairs to the array.

Likely an incorrect or invalid review comment.

newton/_src/sim/contacts.py (1)

141-156: LGTM!

The update_debug_info method cleanly encapsulates the debug info update, and the clear() optimization to only zero counts is well-documented and aligns with the write-first-read-later buffer pattern used throughout the collision system. Based on learnings, the broad_phase buffers follow this same pattern where BVH query kernels populate buffers completely before reads occur.

newton/_src/sim/collide_unified.py (4)

83-146: LGTM with note on code duplication.

The function correctly handles atomic index allocation with overflow rollback. The recomputation of contact points (lines 101-112) duplicates logic from contact_passes_margin_check in contact_data.py. While this is intentional to ensure consistency after index allocation, consider extracting shared computation to a helper if this pattern expands further.

148-166: LGTM!

Clean wrapper pattern that separates margin-check gating from the actual write logic. The early return on failed margin check avoids unnecessary atomic operations.

511-514: LGTM!

Auto-detection of has_meshes is a convenient default. Note that model.shape_type.numpy() triggers a device-to-host copy, but this is acceptable since from_model is typically called once during initialization.

750-753: LGTM!

Debug info update is correctly placed after all contact generation (both rigid and soft), ensuring counts reflect the complete collision pass.
newton/examples/cable/example_cable_pile.py (3)
148-148: LGTM!

The contact margin of 0.05m (~4x the cable radius of 0.012m) is reasonable for detecting contacts before shapes are in direct contact.

250-253: LGTM!

Correctly creates the collision pipeline after model finalization and reuses it for both initial collision and subsequent simulation steps. The use of newton.examples.create_collision_pipeline follows the public API pattern required by coding guidelines.

268-278: Verify: Per-substep collision detection.

This example computes contacts every substep rather than once per frame. While the learning indicates that "computing contacts once per frame and reusing them across all substeps" is the standard pattern in Newton examples, per-substep collision may be intentional here for accurate cable pile settling dynamics.

If performance is a concern, consider moving collision detection outside the substep loop (before line 270) and reusing contacts across substeps:
def simulate(self):
    """Execute all simulation substeps for one frame."""
    # Compute contacts once per frame
    self.contacts = self.model.collide(self.state_0, collision_pipeline=self.collision_pipeline)
    
    for _substep in range(self.sim_substeps):
        # ... rest of loop without collide call

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

coderabbitai · 2026-01-14T16:50:26Z

asv/benchmarks/collision/benchmark_broad_phase.py

+        # Capture CUDA graph
+        self.graph = None
+        if wp.get_device().is_cuda:
+            with wp.ScopedCapture() as capture:
+                self._launch_func()
+            self.graph = capture.graph
+


⚠️ Potential issue | 🟡 Minor

Use self.device.is_cuda instead of wp.get_device().is_cuda for consistency.

The graph capture check uses wp.get_device() which returns the current default device, but the benchmark already stores the model's device in self.device. These may differ if the default device changed.

Suggested fix

# Capture CUDA graph self.graph = None - if wp.get_device().is_cuda: + if self.device.is_cuda: with wp.ScopedCapture() as capture: self._launch_func() self.graph = capture.graph

🤖 Prompt for AI Agents

In `@asv/benchmarks/collision/benchmark_broad_phase.py` around lines 187 - 193, The graph capture uses wp.get_device().is_cuda but the benchmark stores the model/device in self.device; replace the get_device check with self.device.is_cuda to ensure you check the intended device before starting the CUDA capture of self._launch_func() within wp.ScopedCapture(), leaving assignment to self.graph unchanged; update the conditional that currently references wp.get_device().is_cuda to use self.device.is_cuda instead.

coderabbitai · 2026-01-14T16:50:26Z

newton/_src/sim/contacts.py

+    def get_rigid_count(self) -> int:
+        """Get the latest rigid contact count (non-blocking read from pinned memory)."""
+        return int(self.rigid_count_host.numpy()[0])
+
+    def get_soft_count(self) -> int:
+        """Get the latest soft contact count (non-blocking read from pinned memory)."""
+        return int(self.soft_count_host.numpy()[0])


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the actual code in the contacts.py file cd newton && git ls-files newton/_src/sim/contacts.py && head -100 newton/_src/sim/contacts.py | tail -80

Repository: newton-physics/newton

Length of output: 151

🏁 Script executed:

# Get more context around the ContactDebugInfo class sed -n '25,70p' newton/_src/sim/contacts.py

Repository: newton-physics/newton

Length of output: 1944

🏁 Script executed:

# Search for other usages of wp.copy in the codebase to see patterns rg "wp\.copy" -A 3 -B 3 --type py

Repository: newton-physics/newton

Length of output: 19367

🏁 Script executed:

# Look for synchronize() calls in the codebase to understand synchronization patterns rg "wp\.synchronize" --type py -B 2 -A 2

Repository: newton-physics/newton

Length of output: 38218

🏁 Script executed:

# Search for other pinned memory usage patterns rg "pinned" --type py -B 3 -A 3

Repository: newton-physics/newton

Length of output: 7726

Pinned memory reads require synchronization to ensure data validity.

The docstrings claim "non-blocking read from pinned memory", but calling .numpy() after wp.copy() without explicit synchronization may return stale data if the async copy hasn't completed. Similar code elsewhere in the codebase (e.g., solver_implicit_mpm.py) calls wp.synchronize_stream() before reading pinned arrays after a copy.

Consider adding an optional sync parameter that defaults to True, calling wp.synchronize() before returning the value. Alternatively, document that callers must synchronize before reading if they need the latest values.

🤖 Prompt for AI Agents

In `@newton/_src/sim/contacts.py` around lines 61 - 67, The current get_rigid_count and get_soft_count read pinned arrays with .numpy() without ensuring the async copy completed, which can return stale data; update both methods (get_rigid_count and get_soft_count) to accept an optional sync: bool = True parameter and, when sync is True, call wp.synchronize() (or the project-preferred wp.synchronize_stream()) before converting to numpy() so the pinned-memory copy has completed; alternatively, if you prefer not to block by default, change the default to False and update the docstrings to explicitly state callers must synchronize before calling if they need up-to-date values.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@newton/_src/sim/contacts.py`:
- Around line 61-67: get_rigid_count() and get_soft_count() read pinned-host
buffers via .numpy() immediately after an async wp.copy(), which can return
stale data; call wp.synchronize() at the start of each getter (before accessing
rigid_count_host.numpy()[0] / soft_count_host.numpy()[0]) to ensure the async
copies complete, or alternatively update the docstrings to explicitly require
callers to call wp.synchronize() after copy_counts()—prefer adding
wp.synchronize() inside both getters for safety.

🧹 Nitpick comments (5)

asv/benchmarks/collision/benchmark_broad_phase.py (1)
258-275: Inconsistent _skip attribute access pattern.

time_broad_phase uses getattr(self, "_skip", False) (line 246) for safe access, but track_candidate_pairs (line 261) accesses self._skip directly. While ASV guarantees setup() runs first, using a consistent pattern improves robustness.
♻️ Suggested fix for consistency
     `@skip_benchmark_if`(wp.get_cuda_device_count() == 0)
     def track_candidate_pairs(self, grid_size, spacing_factor, broad_phase_type):
         """Track the number of candidate pairs found."""
-        if self._skip:
+        if getattr(self, "_skip", False):
             return float("nan")
         self._launch_func()
         wp.synchronize()
         return self.num_candidate_pairs.numpy()[0]
newton/_src/geometry/broad_phase_common.py (1)
94-97: Consider simplifying boolean-to-int conversion.

The explicit if/else can be replaced with a conditional expression. However, if Warp's kernel semantics require this pattern, feel free to keep as-is.
Suggested simplification
-    # Convert boolean to int (1 if has pair, 0 otherwise)
-    has_pair_int = 0
-    if thread_has_pair:
-        has_pair_int = 1
+    # Convert boolean to int (1 if has pair, 0 otherwise)
+    has_pair_int = 1 if thread_has_pair else 0
newton/_src/sim/contacts.py (1)
22-23: Remove stale debug comment.

This comment appears to be a leftover from development and doesn't add value to the code.
Suggested fix
-# No kernel needed - we just zero counts!
-
-
 class ContactDebugInfo:
newton/_src/viewer/viewer_gl.py (1)
1299-1305: Simplify redundant hasattr check.

Since _contacts is always initialized to None in __init__ (line 83), the hasattr(self, "_contacts") check is redundant.
Suggested fix
                 # Contact stats (if debug info is available)
-                if hasattr(self, "_contacts") and self._contacts is not None and self._contacts.debug_info is not None:
+                if self._contacts is not None and self._contacts.debug_info is not None:
                     rigid_count = self._contacts.debug_info.get_rigid_count()
                     soft_count = self._contacts.debug_info.get_soft_count()
                     imgui.separator()
                     imgui.text(f"Rigid Contacts: {rigid_count}")
                     imgui.text(f"Soft Contacts: {soft_count}")
newton/_src/sim/collide_unified.py (1)

511-514: Auto-detection of has_meshes incurs a synchronization.

Calling model.shape_type.numpy() triggers a device-to-host copy and synchronization. For models with many shapes, this could add latency during pipeline construction.

Consider documenting this behavior or providing the option to skip auto-detection by always passing has_meshes explicitly in performance-critical scenarios.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f9c7ac and 4e6c067.

📒 Files selected for processing (6)

asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/geometry/broad_phase_common.py
newton/_src/sim/collide_unified.py
newton/_src/sim/contacts.py
newton/_src/viewer/viewer_gl.py
newton/examples/cable/example_cable_pile.py

🧰 Additional context used

📓 Path-based instructions (5)

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use prefix-first naming for classes: ActuatorPD, ActuatorPID (not PDActuator, PIDActuator)
Use prefix-first naming for methods: add_shape_sphere() (not add_sphere_shape())
Method names must use snake_case
Prefer nested classes when self-contained: if a helper type or enum is only meaningful inside one parent class and doesn't need a public identity, define it as a nested class instead of a top-level class/module
Follow PEP 8 for Python code
Use Google-style docstrings with clear, concise explanations of what functions do, their parameters, and return values

Files:

asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/sim/contacts.py
newton/_src/sim/collide_unified.py
newton/_src/viewer/viewer_gl.py
newton/examples/cable/example_cable_pile.py
newton/_src/geometry/broad_phase_common.py

**/*

📄 CodeRabbit inference engine (AGENTS.md)

CLI arguments must use kebab-case (e.g., --use-cuda-graph, not --use_cuda_graph)

Files:

asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/sim/contacts.py
newton/_src/sim/collide_unified.py
newton/_src/viewer/viewer_gl.py
newton/examples/cable/example_cable_pile.py
newton/_src/geometry/broad_phase_common.py

newton/_src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

The newton/_src/ directory is internal library implementation only and must not be imported by user code (examples, documentation, or external users)

Files:

newton/_src/sim/contacts.py
newton/_src/sim/collide_unified.py
newton/_src/viewer/viewer_gl.py
newton/_src/geometry/broad_phase_common.py

newton/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Any user-facing class/function/object added under _src must be exposed via the public Newton API through re-exports

Files:

newton/_src/sim/contacts.py
newton/_src/sim/collide_unified.py
newton/_src/viewer/viewer_gl.py
newton/examples/cable/example_cable_pile.py
newton/_src/geometry/broad_phase_common.py

newton/examples/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

newton/examples/**/*.py: Examples must not import from newton._src, only from the public Newton API
Examples must follow the format with an Example class and must be registered in README.md with a uv run command and screenshot
Examples must implement a test_final() method that verifies the simulation state is valid after the example has run
Examples may optionally implement a test_post_step() method that is evaluated after every step() call

Files:

newton/examples/cable/example_cable_pile.py

🧠 Learnings (15)

📓 Common learnings

Learnt from: shi-eric
Repo: newton-physics/newton PR: 521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because they follow a write-first-read-later pattern: BVH query kernels in frame_begin() populate these buffers completely (write-only operations), and collision kernels only read from them afterward in accumulate_contact_force(), ensuring no undefined reads occur.

📚 Learning: 2025-09-26T05:58:21.013Z

Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because they follow a write-first-read-later pattern: BVH query kernels in frame_begin() populate these buffers completely (write-only operations), and collision kernels only read from them afterward in accumulate_contact_force(), ensuring no undefined reads occur.

Applied to files:

asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/sim/collide_unified.py
newton/_src/geometry/broad_phase_common.py

📚 Learning: 2025-09-26T05:58:21.013Z

Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because the BVH query kernels that use them always write the count to query_results[0, tid] = 0 first before populating any data, ensuring no undefined reads occur.

Applied to files:

asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/sim/collide_unified.py
newton/_src/geometry/broad_phase_common.py

📚 Learning: 2025-08-12T17:51:37.474Z

Learnt from: nvlukasz
Repo: newton-physics/newton PR: 519
File: newton/geometry.py:16-22
Timestamp: 2025-08-12T17:51:37.474Z
Learning: In the Newton public API refactor, common geometry symbols like ParticleFlags and ShapeFlags are now exposed at the top level in newton/__init__.py rather than through newton.geometry. The newton/geometry.py module is intentionally focused only on broad-phase collision detection classes (BroadPhaseAllPairs, BroadPhaseExplicit, BroadPhaseSAP).

Applied to files:

asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/sim/collide_unified.py
newton/_src/geometry/broad_phase_common.py

📚 Learning: 2025-10-30T07:28:13.112Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 984
File: newton/_src/geometry/broad_phase_nxn.py:318-336
Timestamp: 2025-10-30T07:28:13.112Z
Learning: In Newton's BroadPhaseAllPairs class (newton/_src/geometry/broad_phase_nxn.py), device management is the caller's responsibility. The library intentionally does not perform automatic device transfers for precomputed arrays in the launch method, even if there's a device mismatch. Users must ensure that the device specified in __init__ matches the device used in launch.

Applied to files:

asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/geometry/broad_phase_common.py

📚 Learning: 2025-12-12T17:45:26.847Z

Learnt from: vastsoun
Repo: newton-physics/newton PR: 1019
File: newton/_src/solvers/kamino/geometry/primitives.py:0-0
Timestamp: 2025-12-12T17:45:26.847Z
Learning: In newton/_src/solvers/kamino/geometry/primitives.py, the narrow-phase kernel `_primitive_narrowphase` does not need to handle symmetric shape-pair orderings (e.g., both SPHERE-BOX and BOX-SPHERE) because `kamino.core.builder.ModelBuilder` and `kamino.geometry.primitive.broadphase` guarantee that explicit candidate geometry pairs are always defined with GIDs in ascending order.

Applied to files:

asv/benchmarks/collision/benchmark_broad_phase.py
newton/_src/geometry/broad_phase_common.py

📚 Learning: 2025-12-12T08:45:51.908Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 1221
File: newton/examples/example_sdf.py:277-287
Timestamp: 2025-12-12T08:45:51.908Z
Learning: In Newton examples (e.g., example_sdf.py), computing contacts once per frame and reusing them across multiple substeps is an intentional design choice, not a bug. The contacts are computed before the substep loop and intentionally kept constant throughout all substeps within that frame.

Applied to files:

newton/_src/sim/contacts.py
newton/_src/viewer/viewer_gl.py

📚 Learning: 2026-01-13T03:11:40.556Z

Learnt from: jumyungc
Repo: newton-physics/newton PR: 1333
File: newton/_src/sim/builder.py:0-0
Timestamp: 2026-01-13T03:11:40.556Z
Learning: Ensure all VBD damping semantics follow Rayleigh-style damping and are unitless. This convention, demonstrated in SolverVBD, should be applied consistently across all VBD-related constraints in the codebase. In newton/_src/sim/builder.py, validate that any damping terms use the same normalization and are treated as dimensionless; add comments and unit tests to enforce consistency across VBD components.

Applied to files:

newton/_src/sim/contacts.py
newton/_src/sim/collide_unified.py
newton/_src/viewer/viewer_gl.py
newton/examples/cable/example_cable_pile.py
newton/_src/geometry/broad_phase_common.py

📚 Learning: 2025-10-24T07:56:14.792Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 981
File: newton/_src/geometry/collision_convex.py:180-289
Timestamp: 2025-10-24T07:56:14.792Z
Learning: In newton/_src/geometry/collision_convex.py, the single-contact solver (create_solve_convex_single_contact) intentionally returns count=1 even when shapes are separated (signed_distance > contact_threshold). The calling code is responsible for deciding whether to accept the contact based on the signed distance value. This design pushes filtering responsibility to the caller.

Applied to files:

newton/_src/sim/collide_unified.py

📚 Learning: 2025-08-20T03:30:16.037Z

Learnt from: eric-heiden
Repo: newton-physics/newton PR: 584
File: newton/_src/utils/import_urdf.py:159-167
Timestamp: 2025-08-20T03:30:16.037Z
Learning: In the URDF importer (newton/_src/utils/import_urdf.py), cylinders are intentionally created using add_shape_capsule() instead of add_shape_cylinder() because cylinder collisions are not fully supported yet. This is a deliberate workaround until proper cylinder collision support is implemented.

Applied to files:

newton/examples/cable/example_cable_pile.py

📚 Learning: 2025-08-20T18:02:36.099Z

Learnt from: eric-heiden
Repo: newton-physics/newton PR: 587
File: newton/_src/sim/builder.py:656-661
Timestamp: 2025-08-20T18:02:36.099Z
Learning: In Newton physics, collisions between shapes with body index -1 (world-attached/global shapes) are automatically skipped by the collision system, so no manual collision filter pairs need to be added between global shapes from different builders.

Applied to files:

newton/examples/cable/example_cable_pile.py

📚 Learning: 2025-10-29T14:55:48.171Z

Learnt from: gdaviet
Repo: newton-physics/newton PR: 940
File: newton/examples/mpm/example_mpm_granular.py:92-101
Timestamp: 2025-10-29T14:55:48.171Z
Learning: In newton/examples/mpm/example_mpm_granular.py, the capture() method intentionally advances the simulation state once during CUDA graph capture. This is expected behavior because CUDA graph capture records exact memory addresses, so the state arrays (state_0, state_1) cannot be reset or reallocated after capture without invalidating the graph.

Applied to files:

newton/examples/cable/example_cable_pile.py

📚 Learning: 2025-12-12T08:45:43.428Z

Learnt from: nvtw
Repo: newton-physics/newton PR: 1221
File: newton/examples/example_sdf.py:277-287
Timestamp: 2025-12-12T08:45:43.428Z
Learning: In Newtown (Newton) example code, specifically files under newton/examples, computing contacts once per frame and reusing them across all substeps is an intentional design choice, not a bug. Reviewers should verify that contacts are computed before the substep loop and reused for every substep within the same frame. This pattern reduces redundant work and preserves frame-consistency; do not flag as a regression unless the behavior is changed for correctness or performance reasons.

Applied to files:

newton/examples/cable/example_cable_pile.py

📚 Learning: 2025-08-12T18:04:06.577Z

Learnt from: nvlukasz
Repo: newton-physics/newton PR: 519
File: newton/_src/solvers/featherstone/kernels.py:75-75
Timestamp: 2025-08-12T18:04:06.577Z
Learning: The Newton physics framework requires nightly Warp builds, which means compatibility concerns with older stable Warp versions (like missing functions such as wp.spatial_adjoint) are not relevant for this project.

Applied to files:

newton/_src/geometry/broad_phase_common.py

📚 Learning: 2025-08-12T05:17:34.423Z

Learnt from: shi-eric
Repo: newton-physics/newton PR: 521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

Applied to files:

newton/_src/geometry/broad_phase_common.py

🧬 Code graph analysis (5)

asv/benchmarks/collision/benchmark_broad_phase.py (4)

newton/_src/geometry/broad_phase_nxn.py (2)

BroadPhaseAllPairs (265-419)

BroadPhaseExplicit (422-493)

newton/_src/geometry/broad_phase_sap.py (2)

BroadPhaseSAP (395-650)

SAPSortType (34-38)

newton/_src/geometry/narrow_phase.py (1)

launch (1568-1649)

newton/_src/geometry/contact_reduction.py (1)

synchronize (34-34)

newton/_src/sim/collide_unified.py (3)

newton/_src/geometry/contact_data.py (2)

ContactData (26-62)

contact_passes_margin_check (66-96)

newton/_src/solvers/mujoco/kernels.py (1)

write_contact (65-107)

newton/_src/geometry/types.py (1)

GeoType (25-67)

newton/_src/viewer/viewer_gl.py (1)

newton/_src/sim/contacts.py (2)

get_rigid_count (61-63)

get_soft_count (65-67)

newton/examples/cable/example_cable_pile.py (4)

newton/examples/__init__.py (1)

create_collision_pipeline (398-488)

newton/_src/sim/collide_unified.py (1)

collide (570-754)

newton/_src/sim/collide.py (1)

collide (201-345)

newton/_src/sim/model.py (1)

collide (658-719)

newton/_src/geometry/broad_phase_common.py (1)

newton/_src/solvers/mujoco/solver_mujoco.py (1)

tile (2370-2390)

🪛 Ruff (0.14.11)

asv/benchmarks/collision/benchmark_broad_phase.py

124-128: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

129-129: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

244-244: Unused method argument: grid_size

(ARG002)

244-244: Unused method argument: spacing_factor

(ARG002)

244-244: Unused method argument: broad_phase_type

(ARG002)

259-259: Unused method argument: grid_size

(ARG002)

259-259: Unused method argument: spacing_factor

(ARG002)

259-259: Unused method argument: broad_phase_type

(ARG002)

268-268: Unused method argument: grid_size

(ARG002)

268-268: Unused method argument: spacing_factor

(ARG002)

268-268: Unused method argument: broad_phase_type

(ARG002)

273-273: Unused method argument: grid_size

(ARG002)

273-273: Unused method argument: spacing_factor

(ARG002)

273-273: Unused method argument: broad_phase_type

(ARG002)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Run GPU Benchmarks / Run GPU Benchmarks on AWS EC2

🔇 Additional comments (16)

asv/benchmarks/collision/benchmark_broad_phase.py (4)

1-42: LGTM!

The imports are well-organized with wp.config.quiet = True correctly placed before the newton import to suppress verbose output during benchmarks. Using internal _src modules is appropriate for benchmark code that needs direct access to specific implementations.

44-109: LGTM!

The scene builder is well-structured with clear documentation. The workflow correctly:

Creates a 3D grid of articulated sphere bodies

Finalizes the model and computes forward kinematics

Pre-computes AABBs using the unified pipeline's kernel

The AABB arrays use wp.zeros() which is fine since they follow a write-first-read-later pattern (per the retrieved learnings about broad_phase buffers).

202-241: LGTM!

The launch helper methods correctly match their respective broad phase class APIs. The comment "AABBs are pre-expanded" clearly explains why None is passed for shape_contact_margin.

243-275: LGTM!

The benchmark methods follow standard ASV patterns. The unused parameter warnings from static analysis (ARG002) are false positives—ASV requires these parameters for parameterized benchmarks even though values come from setup(). The timing logic using ScopedTimer with median sampling provides reliable measurements.

newton/_src/geometry/broad_phase_common.py (1)

71-123: Tile-cooperative write pattern with correct synchronization.

The algorithm is correct: inclusive scan computes per-thread write offsets (1-based, converted to 0-based indices), only the last thread atomically reserves space, and the subsequent tile broadcast ensures all threads see the base offset before writing. The tile operations (wp.tile_scan_inclusive and wp.tile()) are block-wide collectives in Warp that provide implicit synchronization—the broadcast on line 114 is guaranteed to see the atomic_add result from line 110 before any thread proceeds to write.

This cooperative approach significantly reduces atomic contention compared to per-pair atomic reservations.

newton/_src/sim/contacts.py (2)

146-156: LGTM! Performance optimization for clear() is well-documented.

The change to only zero counts instead of clearing all arrays is a valid optimization. The comment clearly explains the rationale (write-first-read-later pattern), which aligns with learned patterns from similar broad_phase buffers.

25-48: LGTM! Graph-capture-compatible debug info implementation.

The ContactDebugInfo class correctly uses pinned host memory (pinned=True) for async copies that won't break CUDA graph capture. The device-side arrays are properly scoped.

newton/_src/viewer/viewer_gl.py (2)

82-83: LGTM! Contact debug info field initialization.

The _contacts field is properly initialized to None and will be set via set_contacts() when debug info display is needed.

166-176: LGTM! Clean implementation of set_contacts method.

The method has a clear docstring explaining the purpose and the optional nature of debug info display.

newton/_src/sim/collide_unified.py (4)

83-146: LGTM! Well-structured contact write path with margin gating.

The separation of write_contact_no_margin_check (core write logic) and write_contact (margin-gated wrapper) is clean. The atomic counter management with overflow protection (lines 92-96) correctly handles buffer limits.

148-166: LGTM! Clean wrapper function with margin check.

The write_contact function properly gates contact writes using contact_passes_margin_check before delegating to the core write logic.

388-427: Well-documented heuristic for contact buffer allocation.

The neighbor-based heuristic (12 neighbors × 2 contacts ÷ 2 with 20% safety margin) is a reasonable estimate for dense 3D scenarios and is well-documented with the rationale. The 10x threshold for clamping large EXPLICIT mode allocations prevents memory waste from O(N²) explosion.

One minor suggestion: the magic number 10 on line 424 could be extracted to a named constant for clarity.

750-753: LGTM! Debug info update properly gated.

The update_debug_info() call is correctly guarded by enable_contact_debug_info flag, ensuring no overhead when debug info is disabled.

newton/examples/cable/example_cable_pile.py (3)

148-148: LGTM! Per-shape contact margin default.

Setting builder.rigid_contact_margin = 0.05 as a default for all shapes is a reasonable configuration for this cable pile scenario.

250-253: LGTM! Proper collision pipeline initialization.

The example correctly uses newton.examples.create_collision_pipeline() from the public API and stores the pipeline for reuse. The initial contacts are computed with the pipeline, ensuring consistent behavior.

277-277: LGTM! Consistent collision pipeline usage.

The simulation loop correctly reuses the stored collision_pipeline for all subsequent collide() calls, which is the expected pattern.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

coderabbitai · 2026-01-14T17:05:43Z

newton/_src/sim/contacts.py

+    def get_rigid_count(self) -> int:
+        """Get the latest rigid contact count (non-blocking read from pinned memory)."""
+        return int(self.rigid_count_host.numpy()[0])
+
+    def get_soft_count(self) -> int:
+        """Get the latest soft contact count (non-blocking read from pinned memory)."""
+        return int(self.soft_count_host.numpy()[0])


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Warp array numpy() method synchronization CUDA

💡 Result:

Short answer: calling a Warp array's numpy() will synchronize and copy GPU data to the host so the returned ndarray is safe to use on the CPU. For CUDA arrays Warp performs a device-to-host copy and will synchronize the Warp CUDA stream (and with consumer frameworks where applicable) before returning the NumPy view. If you perform explicit asynchronous wp.copy() calls you still need to call wp.synchronize() yourself before calling numpy() on the CPU-side buffer. [1][2][3]

Key refs:

FAQ: implicit synchronization when converting to NumPy; manual wp.synchronize() needed only for async wp.copy() patterns. [1]

Interoperability: numpy() copies CUDA arrays back and synchronizes the consumer framework stream with the Warp stream. [2]

API docs: array.numpy() description — synchronous device→host copy for GPU arrays. [3]

If you want, I can show a minimal code example illustrating both the automatic sync path and the case requiring wp.synchronize().

🏁 Script executed:

cat -n newton/_src/sim/contacts.py | head -70

Repository: newton-physics/newton

Length of output: 3218

Add wp.synchronize() before reading the copied counts in get_rigid_count() and get_soft_count().

The code uses asynchronous wp.copy() (lines 58-59) to transfer counts to pinned host memory, but then reads directly without synchronization. According to Warp documentation, explicit asynchronous wp.copy() calls require wp.synchronize() before accessing the CPU-side buffer to ensure the copy has completed. Without this, the reads may return stale or undefined data.

Add wp.synchronize() before .numpy() calls, or update the docstrings to clarify that callers must synchronize after copy_counts() before calling these getters.

🤖 Prompt for AI Agents

In `@newton/_src/sim/contacts.py` around lines 61 - 67, get_rigid_count() and get_soft_count() read pinned-host buffers via .numpy() immediately after an async wp.copy(), which can return stale data; call wp.synchronize() at the start of each getter (before accessing rigid_count_host.numpy()[0] / soft_count_host.numpy()[0]) to ensure the async copies complete, or alternatively update the docstrings to explicitly require callers to call wp.synchronize() after copy_counts()—prefer adding wp.synchronize() inside both getters for safety.

eric-heiden · 2026-01-15T06:37:00Z

Let's wait with merging this until #1335 has been addressed so we have a benchmark to compare performance impacts on cable sim with the previous pipeline and the unified pipeline.
I will review this soon.

nvtw added 16 commits January 13, 2026 13:21

Attempt to improve the performance of the unified collision pipeline …

283f840

…for simple shapes like spheres or capsules

Properly use effective radius

0368db7

Ran ruff

2bef800

Improve the handling of parallel capsules

f13b1e3

Add benchmark comparing unified vs standard collision pipeline

8ce7266

Improve unit test stability

b46d5c6

Improve the benchmarks

1c68618

Add an option to not run all mesh related kernels if there are no mes…

70c3334

…hes in the scene

Improve performance of the unified pipeline a bit

038c73b

Add broad phase benchmark

95321ef

experimental nxn broad phase improvements

4a74b4a

Merge branch 'main' into dev/tw/add_collision_benchmarks

3fb0fee

Use thread block write in nxn broad phase

19edcda

Improve contact writing

133968b

Backup commit

af9c163

Next backup commit

6f9c7ac

nvtw self-assigned this Jan 14, 2026

nvtw marked this pull request as draft January 14, 2026 15:43

nvtw added 2 commits January 14, 2026 16:53

Ran ruff

2f9f530

Revert the changes in example_cable_pile.py

116a26e

coderabbitai bot reviewed Jan 14, 2026

View reviewed changes

nvtw and others added 3 commits January 14, 2026 17:14

Fix contact buffer overallocation leading to unreasonably high cuda g…

3e82782

…rid sizes

Merge branch 'main' into dev/tw/add_collision_benchmarks

7fa35b9

Merge branch 'main' into dev/tw/add_collision_benchmarks

4e6c067

coderabbitai bot reviewed Jan 14, 2026

View reviewed changes

nvtw and others added 2 commits January 15, 2026 08:39

Merge branch 'main' into dev/tw/add_collision_benchmarks

d2bd585

Clean up the allocation logic in the unified pipeline

b2eb056

nvtw closed this Jan 15, 2026

Unified collision pipeline performance improvements #1355

Unified collision pipeline performance improvements #1355

Uh oh!

Conversation

nvtw commented Jan 14, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Newton Migration Guide

Before your PR is "Ready for review"

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

eric-heiden commented Jan 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

nvtw commented Jan 14, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 14, 2026 •

edited

Loading