Skip to content

Conversation

@Cucchi01
Copy link
Contributor

@Cucchi01 Cucchi01 commented Jan 12, 2026

Description

  • Problem: sp.bsr_get_diag leaves missing diagonal blocks untouched, which can propagate stale values; in my case this led to NaNs in some environments.
  • Fix: explicitly zero compliance_mat_diagonal before calling bsr_get_diag in the implicit MPM rheology solve.

I opened NVIDIA/warp#1170 to fix this at the source. In the meantime, I’m opening this local change in Newton to avoid the NaNs or other undesired behavior.

I’m not sure whether it’s better to wait for the Warp PR to be merged or to keep this safeguard here as well.
@gdaviet let me know what you prefer

Newton Migration Guide

Please ensure the migration guide for warp.sim users is up-to-date with the changes made in this PR.

  • [] The migration guide in docs/migration.rst is up-to date

Before your PR is "Ready for review"

  • Necessary tests have been added and new examples are tested (see newton/tests/test_examples.py)
  • Documentation is up-to-date
  • Code passes formatting and linting checks with pre-commit run -a

Summary by CodeRabbit

  • Bug Fixes
    • Improved numerical stability by explicitly zero-initializing intermediate matrix data before use, preventing uninitialized values from affecting diagonal extraction.
    • No public APIs or exported interfaces were changed.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 12, 2026

📝 Walkthrough

Walkthrough

Added an explicit zero_() call to compliance_mat_diagonal before calling sp.bsr_get_diag when compliance_mat.nnz != 0, ensuring the allocated diagonal block is zero-initialized prior to extraction.

Changes

Cohort / File(s) Summary
Defensive initialization
newton/_src/solvers/implicit_mpm/solve_rheology.py
Inserted zero_() to explicitly zero-initialize compliance_mat_diagonal before extracting the BSR diagonal with sp.bsr_get_diag in the non-empty compliance_mat.nnz path.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main change: zeroing the compliance diagonal buffer before calling bsr_get_diag to fix NaN propagation.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f2d9d51 and 6327fb4.

📒 Files selected for processing (1)
  • newton/_src/solvers/implicit_mpm/solve_rheology.py
🧰 Additional context used
📓 Path-based instructions (4)
newton/_src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

The newton/_src/ directory is internal library implementation only and must not be imported by user code (examples, documentation, or external users)

Files:

  • newton/_src/solvers/implicit_mpm/solve_rheology.py
newton/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Any user-facing class/function/object added under _src must be exposed via the public Newton API through re-exports

Files:

  • newton/_src/solvers/implicit_mpm/solve_rheology.py
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use prefix-first naming for classes: ActuatorPD, ActuatorPID (not PDActuator, PIDActuator)
Use prefix-first naming for methods: add_shape_sphere() (not add_sphere_shape())
Method names must use snake_case
Prefer nested classes when self-contained: if a helper type or enum is only meaningful inside one parent class and doesn't need a public identity, define it as a nested class instead of a top-level class/module
Follow PEP 8 for Python code
Use Google-style docstrings with clear, concise explanations of what functions do, their parameters, and return values

Files:

  • newton/_src/solvers/implicit_mpm/solve_rheology.py
**/*

📄 CodeRabbit inference engine (AGENTS.md)

CLI arguments must use kebab-case (e.g., --use-cuda-graph, not --use_cuda_graph)

Files:

  • newton/_src/solvers/implicit_mpm/solve_rheology.py
🧠 Learnings (8)
📓 Common learnings
Learnt from: gdaviet
Repo: newton-physics/newton PR: 750
File: newton/_src/solvers/implicit_mpm/solve_rheology.py:892-895
Timestamp: 2025-09-09T11:03:41.928Z
Learning: In newton/_src/solvers/implicit_mpm/solve_rheology.py, passing None for compliance_mat_diagonal is intentional to avoid loading zeros from global memory when compliance_mat is None, rather than creating a zero matrix and extracting its diagonal.
Learnt from: nvlukasz
Repo: newton-physics/newton PR: 519
File: newton/_src/solvers/featherstone/kernels.py:75-75
Timestamp: 2025-08-12T18:04:06.577Z
Learning: The Newton physics framework requires nightly Warp builds, which means compatibility concerns with older stable Warp versions (like missing functions such as wp.spatial_adjoint) are not relevant for this project.
📚 Learning: 2025-09-09T11:03:41.928Z
Learnt from: gdaviet
Repo: newton-physics/newton PR: 750
File: newton/_src/solvers/implicit_mpm/solve_rheology.py:892-895
Timestamp: 2025-09-09T11:03:41.928Z
Learning: In newton/_src/solvers/implicit_mpm/solve_rheology.py, passing None for compliance_mat_diagonal is intentional to avoid loading zeros from global memory when compliance_mat is None, rather than creating a zero matrix and extracting its diagonal.

Applied to files:

  • newton/_src/solvers/implicit_mpm/solve_rheology.py
📚 Learning: 2025-09-18T07:05:56.836Z
Learnt from: gdaviet
Repo: newton-physics/newton PR: 750
File: newton/_src/solvers/implicit_mpm/solve_rheology.py:969-986
Timestamp: 2025-09-18T07:05:56.836Z
Learning: In newton/_src/solvers/implicit_mpm/solve_rheology.py, transposed_strain_mat parameter cannot be None - the type signature was corrected to reflect this guarantee, eliminating the need for None checks when accessing transposed_strain_mat.offsets.

Applied to files:

  • newton/_src/solvers/implicit_mpm/solve_rheology.py
📚 Learning: 2025-12-13T17:26:39.791Z
Learnt from: lenroe-nv
Repo: newton-physics/newton PR: 1248
File: newton/_src/geometry/contact_data.py:47-49
Timestamp: 2025-12-13T17:26:39.791Z
Learning: In ContactData (newton/_src/geometry/contact_data.py), the fields contact_stiffness, contact_damping, and contact_friction_scale use 0.0 as the "unset" sentinel value. This is intentional: Warp struct fields initialize to 0.0 by default, and solver-side code interprets 0.0 as "use default/unscaled" rather than "disable." This design avoids explicit initialization in contact generation kernels.

Applied to files:

  • newton/_src/solvers/implicit_mpm/solve_rheology.py
📚 Learning: 2025-08-12T05:17:34.423Z
Learnt from: shi-eric
Repo: newton-physics/newton PR: 521
File: newton/examples/example_cloth_hanging.py:36-36
Timestamp: 2025-08-12T05:17:34.423Z
Learning: The Newton migration guide (docs/migration.rst) is specifically for documenting how to migrate existing warp.sim functionality to Newton equivalents. New Newton-only features that didn't exist in warp.sim do not need migration documentation.

Applied to files:

  • newton/_src/solvers/implicit_mpm/solve_rheology.py
📚 Learning: 2025-09-26T05:58:21.013Z
Learnt from: WenchaoHuang
Repo: newton-physics/newton PR: 835
File: newton/_src/solvers/style3d/collision/collision.py:49-53
Timestamp: 2025-09-26T05:58:21.013Z
Learning: In newton/_src/solvers/style3d/collision/collision.py, the broad_phase buffers (broad_phase_ee, broad_phase_ef, broad_phase_vf) created with wp.array() don't need zero-initialization because they follow a write-first-read-later pattern: BVH query kernels in frame_begin() populate these buffers completely (write-only operations), and collision kernels only read from them afterward in accumulate_contact_force(), ensuring no undefined reads occur.

Applied to files:

  • newton/_src/solvers/implicit_mpm/solve_rheology.py
📚 Learning: 2025-08-12T18:04:06.577Z
Learnt from: nvlukasz
Repo: newton-physics/newton PR: 519
File: newton/_src/solvers/featherstone/kernels.py:75-75
Timestamp: 2025-08-12T18:04:06.577Z
Learning: The Newton physics framework requires nightly Warp builds, which means compatibility concerns with older stable Warp versions (like missing functions such as wp.spatial_adjoint) are not relevant for this project.

Applied to files:

  • newton/_src/solvers/implicit_mpm/solve_rheology.py
📚 Learning: 2026-01-13T03:11:40.556Z
Learnt from: jumyungc
Repo: newton-physics/newton PR: 1333
File: newton/_src/sim/builder.py:0-0
Timestamp: 2026-01-13T03:11:40.556Z
Learning: Ensure all VBD damping semantics follow Rayleigh-style damping and are unitless. This convention, demonstrated in SolverVBD, should be applied consistently across all VBD-related constraints in the codebase. In newton/_src/sim/builder.py, validate that any damping terms use the same normalization and are treated as dimensionless; add comments and unit tests to enforce consistency across VBD components.

Applied to files:

  • newton/_src/solvers/implicit_mpm/solve_rheology.py
🔇 Additional comments (1)
newton/_src/solvers/implicit_mpm/solve_rheology.py (1)

1125-1128: LGTM! Well-documented workaround for the upstream Warp issue.

The explicit .zero_() before bsr_get_diag ensures missing diagonal blocks default to zeros rather than propagating stale memory, preventing the NaN issues described in the PR. The TODO comment properly tracks the upstream fix version for future cleanup.

Based on learnings, this correctly preserves the existing optimization where compliance_mat_diagonal = None is passed when compliance_mat.nnz == 0 (handled by the if branch at lines 1122-1123), avoiding unnecessary global memory loads of zeros in that case.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@shi-eric
Copy link
Member

@Cucchi01's fix is merged in the Warp dev branch and will be available in the next nightly build (~3 hours from now), but we can't simply update Newton to point at the new dev build due to some new deprecation warnings that will cause various tests to fail since they look at test output (see #1305 from @christophercrouzet). I think it makes sense to take this change, but there should be a # TODO reminding us to remove the workaround after Newton is updated to warp_lang-1.12.0.dev20260113 or newer.

Signed-off-by: Cucchi01 <[email protected]>
@Cucchi01 Cucchi01 temporarily deployed to external-pr-approval January 13, 2026 08:16 — with GitHub Actions Inactive
@Cucchi01 Cucchi01 temporarily deployed to external-pr-approval January 13, 2026 08:16 — with GitHub Actions Inactive
@Cucchi01
Copy link
Contributor Author

@shi-eric Thank you for the quick feedback and for approving the Warp PR.
I’ve added the TODO comment to this PR.

@Cucchi01 Cucchi01 temporarily deployed to external-pr-approval January 13, 2026 15:39 — with GitHub Actions Inactive
@codecov
Copy link

codecov bot commented Jan 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!

@shi-eric shi-eric added this pull request to the merge queue Jan 13, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 13, 2026
@shi-eric shi-eric added this pull request to the merge queue Jan 13, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 13, 2026
@shi-eric shi-eric added this pull request to the merge queue Jan 13, 2026
Merged via the queue into newton-physics:main with commit 9012541 Jan 13, 2026
34 of 37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants