Skip to content

Conversation

@aliceb-nv
Copy link
Contributor

@aliceb-nv aliceb-nv commented Oct 22, 2025

This PR adds support for the RINS heuristic, which is run every N nodes to search for feasible solutions in a neighborhood computed from the equal integers between the best integer incumbent and the node's LP optimal.
RINS provides a ~1% improvement over the main branch: ~15.5% vs ~14.3-14.7%.

This PR also fixes a bug in the CPUFJ external solution bookkeeping.

closes #531

Summary by CodeRabbit

  • New Features

    • Integrated multi-threaded RINS local search and CPU worker scaffold for stronger MIP diversification.
    • Configurable per-run node limit and node-processed callback for live node-level reporting.
    • Log-prefix propagation for clearer, consistent logs.
  • Bug Fixes

    • Safer solution copying and presolve bounds/flag consistency checks.
    • Improved thread-safety and synchronization across population/diversity components.
    • More robust incumbent handling and explicit node-limit status reporting.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (3)
cpp/src/mip/diversity/lns/rins.cuh (1)

105-106: Initialize seed in the declaration or constructor to avoid indeterminate state.

The seed member is declared without initialization and is not set in the rins_t constructor (lines 33-34 in rins.cu). It's only initialized later in enable() (line 145 in rins.cu), leaving a window where seed has an indeterminate value. If any code path accesses seed between construction and enable(), undefined behavior results.

Apply this diff to initialize seed in the header:

-  i_t seed;
+  i_t seed{0};

Or initialize it in the constructor (rins.cu line 34):

   fixrate    = settings.default_fixrate;
   time_limit = settings.default_time_limit;
+  seed       = 0;
cpp/src/mip/diversity/lns/rins.cu (2)

98-101: Implement preemption mechanism in stop_cpu_solver to enable timely interruption.

The empty stop_cpu_solver prevents mid-run preemption of run_rins(). If RINS takes a long time, there's no way to interrupt it early for shutdown or timeout. The FJ implementation sets a halted flag (line 1120 in fj_cpu.cu) that the solver checks periodically.

Consider adding a similar mechanism:

  1. Add std::atomic<bool> halted{false}; to rins_t or rins_thread_t.
  2. Set halted = true in stop_cpu_solver().
  3. Clear halted = false in start_cpu_solver() or at the start of run_rins().
  4. Check halted periodically in run_rins() (e.g., before/after major operations) and return early if set.

227-234: Use explicit atomic load and consistent integer types.

Two issues:

  1. Line 227: node_count is atomic but used without explicit .load(), while line 252 uses explicit .load() for the same variable. Be consistent.

  2. Line 234: Mixing int and i_t with an intermediate cast. vars_to_fix.size() returns size_t, multiplying by rins_ratio could yield a large value, casting to int risks overflow, then assigning to i_t. Use i_t throughout.

Apply this diff:

-  thrust::default_random_engine g(seed + node_count);
+  thrust::default_random_engine g(seed + node_count.load());
  
   // shuffle fixing order
   thrust::shuffle(rins_handle.get_thrust_policy(), vars_to_fix.begin(), vars_to_fix.end(), g);
  
   // fix n first according to fractional ratio
   f_t rins_ratio = fixrate;
-  i_t n_to_fix   = std::max((int)(vars_to_fix.size() * rins_ratio), 0);
+  i_t n_to_fix   = std::max<i_t>(static_cast<i_t>(vars_to_fix.size() * rins_ratio), i_t{0});
🧹 Nitpick comments (2)
cpp/src/mip/relaxed_lp/relaxed_lp.cu (1)

65-65: Consider removing or clarifying the trailing comment.

The trailing comment // 3 doesn't provide meaningful context. If it's indicating a version or mode number, that's already clear from Stable3. Consider removing it or replacing it with a comment explaining why Stable3 is appropriate for RINS.

-  pdlp_settings.pdlp_solver_mode                     = pdlp_solver_mode_t::Stable3;  // 3
+  pdlp_settings.pdlp_solver_mode                     = pdlp_solver_mode_t::Stable3;
cpp/src/mip/problem/problem.cu (1)

201-201: Consider renaming deep_copy for clarity.

The parameter name deep_copy is somewhat misleading since it controls which fields are copied (partial vs. full) rather than how they are copied (shallow vs. deep). All copied fields use deep copying via copy constructors. Consider names like full_copy or include_mip_data to better reflect the actual behavior.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e438989 and 95b0eb9.

📒 Files selected for processing (5)
  • cpp/src/mip/diversity/lns/rins.cu (1 hunks)
  • cpp/src/mip/diversity/lns/rins.cuh (1 hunks)
  • cpp/src/mip/problem/problem.cu (2 hunks)
  • cpp/src/mip/problem/problem.cuh (1 hunks)
  • cpp/src/mip/relaxed_lp/relaxed_lp.cu (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-22T14:25:22.899Z
Learnt from: aliceb-nv
PR: NVIDIA/cuopt#527
File: cpp/src/mip/diversity/lns/rins.cu:167-175
Timestamp: 2025-10-22T14:25:22.899Z
Learning: In MIP (Mixed Integer Programming) problems in the cuOPT codebase, `n_integer_vars == 0` is impossible by definition—MIP problems must have at least one integer variable. If there are no integer variables, it would be a pure Linear Programming (LP) problem, not a MIP problem.

Applied to files:

  • cpp/src/mip/diversity/lns/rins.cu
🧬 Code graph analysis (1)
cpp/src/mip/diversity/lns/rins.cu (2)
cpp/src/mip/feasibility_jump/fj_cpu.cu (14)
  • cpu_worker_thread (1070-1091)
  • cpu_worker_thread (1070-1070)
  • kill_cpu_solver (1094-1103)
  • kill_cpu_solver (1094-1094)
  • lock (1074-1074)
  • lock (1086-1086)
  • lock (1097-1097)
  • lock (1110-1110)
  • start_cpu_solver (1106-1116)
  • start_cpu_solver (1106-1106)
  • stop_cpu_solver (1119-1122)
  • stop_cpu_solver (1119-1119)
  • wait_for_cpu_solver (1125-1132)
  • wait_for_cpu_solver (1125-1125)
cpp/src/mip/solver.cu (8)
  • solution (74-78)
  • solution (74-74)
  • solution (80-85)
  • solution (80-82)
  • solution (87-90)
  • solution (87-87)
  • dm (92-92)
  • dm (120-120)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: conda-cpp-build / 12.9.1, 3.10, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.13, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.11, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.11, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.12, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.10, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.12, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.13, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.10, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-sh-client / 13.0.1, 3.10, amd64, rockylinux8
🔇 Additional comments (3)
cpp/src/mip/problem/problem.cu (2)

216-221: LGTM: Presolve data initialization logic is correct.

The conditional initialization appropriately creates presolve_data from the original problem when deep_copy = false (PDLP use case), and copies from the existing presolve_data when deep_copy = true.


264-275: LGTM: Conditional MIP indices initialization is correct.

The logic appropriately skips copying MIP-specific index arrays when deep_copy = false, which is correct for the PDLP use case that doesn't require these fields.

cpp/src/mip/problem/problem.cuh (1)

65-65: Remove this review comment—no call sites with the boolean parameter exist.

The codebase contains no calls to the two-parameter problem_t constructor. All existing calls use the single-argument version, which invokes the default copy constructor (line 64) that performs a complete deep copy. The old parameter name no_deep_copy has been completely removed and is not referenced anywhere. The two-parameter constructor with deep_copy parameter is present but appears to be either newly added or internal, with no active call sites requiring updates.

Likely an incorrect or invalid review comment.

Comment on lines 224 to 263
reverse_coefficients(
(!no_deep_copy)
(!deep_copy)
? rmm::device_uvector<f_t>(problem_.reverse_coefficients, handle_ptr->get_stream())
: rmm::device_uvector<f_t>(problem_.reverse_coefficients.size(), handle_ptr->get_stream())),
: rmm::device_uvector<f_t>(problem_.reverse_coefficients, handle_ptr->get_stream())),
reverse_constraints(
(!no_deep_copy)
(!deep_copy)
? rmm::device_uvector<i_t>(problem_.reverse_constraints, handle_ptr->get_stream())
: rmm::device_uvector<i_t>(problem_.reverse_constraints.size(), handle_ptr->get_stream())),
: rmm::device_uvector<i_t>(problem_.reverse_constraints, handle_ptr->get_stream())),
reverse_offsets(
(!no_deep_copy)
? rmm::device_uvector<i_t>(problem_.reverse_offsets, handle_ptr->get_stream())
: rmm::device_uvector<i_t>(problem_.reverse_offsets.size(), handle_ptr->get_stream())),
coefficients(
(!no_deep_copy)
? rmm::device_uvector<f_t>(problem_.coefficients, handle_ptr->get_stream())
: rmm::device_uvector<f_t>(problem_.coefficients.size(), handle_ptr->get_stream())),
variables((!no_deep_copy)
(!deep_copy) ? rmm::device_uvector<i_t>(problem_.reverse_offsets, handle_ptr->get_stream())
: rmm::device_uvector<i_t>(problem_.reverse_offsets, handle_ptr->get_stream())),
coefficients((!deep_copy)
? rmm::device_uvector<f_t>(problem_.coefficients, handle_ptr->get_stream())
: rmm::device_uvector<f_t>(problem_.coefficients, handle_ptr->get_stream())),
variables((!deep_copy)
? rmm::device_uvector<i_t>(problem_.variables, handle_ptr->get_stream())
: rmm::device_uvector<i_t>(problem_.variables.size(), handle_ptr->get_stream())),
offsets((!no_deep_copy)
? rmm::device_uvector<i_t>(problem_.offsets, handle_ptr->get_stream())
: rmm::device_uvector<i_t>(problem_.offsets.size(), handle_ptr->get_stream())),
: rmm::device_uvector<i_t>(problem_.variables, handle_ptr->get_stream())),
offsets((!deep_copy) ? rmm::device_uvector<i_t>(problem_.offsets, handle_ptr->get_stream())
: rmm::device_uvector<i_t>(problem_.offsets, handle_ptr->get_stream())),
objective_coefficients(
(!no_deep_copy)
(!deep_copy)
? rmm::device_uvector<f_t>(problem_.objective_coefficients, handle_ptr->get_stream())
: rmm::device_uvector<f_t>(problem_.objective_coefficients.size(),
handle_ptr->get_stream())),
: rmm::device_uvector<f_t>(problem_.objective_coefficients, handle_ptr->get_stream())),
variable_bounds(
(!no_deep_copy)
? rmm::device_uvector<f_t2>(problem_.variable_bounds, handle_ptr->get_stream())
: rmm::device_uvector<f_t2>(problem_.variable_bounds.size(), handle_ptr->get_stream())),
(!deep_copy) ? rmm::device_uvector<f_t2>(problem_.variable_bounds, handle_ptr->get_stream())
: rmm::device_uvector<f_t2>(problem_.variable_bounds, handle_ptr->get_stream())),
constraint_lower_bounds(
(!no_deep_copy)
(!deep_copy)
? rmm::device_uvector<f_t>(problem_.constraint_lower_bounds, handle_ptr->get_stream())
: rmm::device_uvector<f_t>(problem_.constraint_lower_bounds.size(),
handle_ptr->get_stream())),
: rmm::device_uvector<f_t>(problem_.constraint_lower_bounds, handle_ptr->get_stream())),
constraint_upper_bounds(
(!no_deep_copy)
(!deep_copy)
? rmm::device_uvector<f_t>(problem_.constraint_upper_bounds, handle_ptr->get_stream())
: rmm::device_uvector<f_t>(problem_.constraint_upper_bounds.size(),
handle_ptr->get_stream())),
: rmm::device_uvector<f_t>(problem_.constraint_upper_bounds, handle_ptr->get_stream())),
combined_bounds(
(!no_deep_copy)
? rmm::device_uvector<f_t>(problem_.combined_bounds, handle_ptr->get_stream())
: rmm::device_uvector<f_t>(problem_.combined_bounds.size(), handle_ptr->get_stream())),
(!deep_copy) ? rmm::device_uvector<f_t>(problem_.combined_bounds, handle_ptr->get_stream())
: rmm::device_uvector<f_t>(problem_.combined_bounds, handle_ptr->get_stream())),
variable_types(
(!no_deep_copy)
? rmm::device_uvector<var_t>(problem_.variable_types, handle_ptr->get_stream())
: rmm::device_uvector<var_t>(problem_.variable_types.size(), handle_ptr->get_stream())),
integer_indices((!no_deep_copy) ? 0 : problem_.integer_indices.size(),
handle_ptr->get_stream()),
binary_indices((!no_deep_copy) ? 0 : problem_.binary_indices.size(), handle_ptr->get_stream()),
nonbinary_indices((!no_deep_copy) ? 0 : problem_.nonbinary_indices.size(),
handle_ptr->get_stream()),
is_binary_variable((!no_deep_copy) ? 0 : problem_.is_binary_variable.size(),
handle_ptr->get_stream()),
(!deep_copy) ? rmm::device_uvector<var_t>(problem_.variable_types, handle_ptr->get_stream())
: rmm::device_uvector<var_t>(problem_.variable_types, handle_ptr->get_stream())),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Simplify redundant ternary operators with identical branches.

All ternary operators in this range have identical branches regardless of the deep_copy flag. These can be simplified to a single constructor call, improving code clarity.

For example, lines 224-227:

-  reverse_coefficients(
-    (!deep_copy)
-      ? rmm::device_uvector<f_t>(problem_.reverse_coefficients, handle_ptr->get_stream())
-      : rmm::device_uvector<f_t>(problem_.reverse_coefficients, handle_ptr->get_stream())),
+  reverse_coefficients(problem_.reverse_coefficients, handle_ptr->get_stream()),

Apply similar simplifications to:

  • reverse_constraints (lines 228-231)
  • reverse_offsets (lines 232-234)
  • coefficients (lines 235-237)
  • variables (lines 238-240)
  • offsets (lines 241-242)
  • objective_coefficients (lines 243-246)
  • variable_bounds (lines 247-249)
  • constraint_lower_bounds (lines 250-253)
  • constraint_upper_bounds (lines 254-257)
  • combined_bounds (lines 258-260)
  • variable_types (lines 261-263)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In cpp/src/mip/problem/problem.cu around lines 224-263 the constructor
initializers use redundant ternary expressions where both branches are
identical; replace each expression of the form (!deep_copy) ?
rmm::device_uvector<T>(problem_.X, handle_ptr->get_stream()) :
rmm::device_uvector<T>(problem_.X, handle_ptr->get_stream()) with a single call
rmm::device_uvector<T>(problem_.X, handle_ptr->get_stream()) for each listed
member (reverse_coefficients, reverse_constraints, reverse_offsets,
coefficients, variables, offsets, objective_coefficients, variable_bounds,
constraint_lower_bounds, constraint_upper_bounds, combined_bounds,
variable_types) to remove the unnecessary conditional and improve clarity.

pdlp_settings.pdlp_solver_mode = pdlp_solver_mode_t::Stable2;
pdlp_settings.pdlp_solver_mode = pdlp_solver_mode_t::Stable3; // 3
set_pdlp_solver_mode(pdlp_settings);
// TODO: set Stable3 here?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove the stale TODO comment.

The TODO comment asks whether to set Stable3, but the code on line 65 already does so. This comment should be removed since the task is complete, or clarified if there's remaining uncertainty about the correctness of using Stable3 for RINS.

   pdlp_settings.pdlp_solver_mode                     = pdlp_solver_mode_t::Stable3;  // 3
   set_pdlp_solver_mode(pdlp_settings);
-  // TODO: set Stable3 here?
   pdlp_solver_t<i_t, f_t> lp_solver(op_problem, pdlp_settings);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// TODO: set Stable3 here?
pdlp_settings.pdlp_solver_mode = pdlp_solver_mode_t::Stable3; // 3
set_pdlp_solver_mode(pdlp_settings);
pdlp_solver_t<i_t, f_t> lp_solver(op_problem, pdlp_settings);
🤖 Prompt for AI Agents
In cpp/src/mip/relaxed_lp/relaxed_lp.cu around line 67, remove the stale TODO
comment "// TODO: set Stable3 here?" because Stable3 is already set on line 65;
if there remains uncertainty about using Stable3 for RINS, replace the TODO with
a short clarifying comment describing the decision (e.g., "Stable3 set for RINS
as above") or add a proper code comment explaining why Stable3 is correct.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
cpp/src/mip/diversity/lns/rins.cu (1)

156-156: Use consistent integer type for n_to_fix.

The calculation mixes size_t, f_t, int, and i_t, which can cause narrowing conversions and type inconsistency.

Apply this diff to use i_t consistently:

-  i_t n_to_fix   = std::max((int)(vars_to_fix.size() * rins_ratio), 0);
+  i_t n_to_fix   = std::max<i_t>(static_cast<i_t>(vars_to_fix.size() * rins_ratio), i_t{0});
cpp/src/mip/diversity/lns/rins.cuh (1)

98-98: Initialize seed member variable.

seed is declared without initialization. Although enable() assigns it a value (line 82 in the .cu file), accessing it before enable() would be undefined behavior.

Apply this diff to initialize at declaration:

-  i_t seed;
+  i_t seed{0};
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 95b0eb9 and 9c89559.

📒 Files selected for processing (12)
  • cpp/src/dual_simplex/logger.hpp (3 hunks)
  • cpp/src/mip/diversity/diversity_manager.cu (3 hunks)
  • cpp/src/mip/diversity/lns/rins.cu (1 hunks)
  • cpp/src/mip/diversity/lns/rins.cuh (1 hunks)
  • cpp/src/mip/diversity/population.cuh (3 hunks)
  • cpp/src/mip/feasibility_jump/fj_cpu.cu (1 hunks)
  • cpp/src/mip/feasibility_jump/fj_cpu.cuh (2 hunks)
  • cpp/src/mip/local_search/local_search.cu (1 hunks)
  • cpp/src/mip/problem/problem.cu (5 hunks)
  • cpp/src/mip/solver.cu (5 hunks)
  • cpp/src/mip/solver_context.cuh (2 hunks)
  • cpp/src/mip/utilities/cpu_worker_thread.cuh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • cpp/src/mip/diversity/population.cuh
  • cpp/src/mip/problem/problem.cu
  • cpp/src/mip/feasibility_jump/fj_cpu.cuh
  • cpp/src/dual_simplex/logger.hpp
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-22T14:25:22.899Z
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 527
File: cpp/src/mip/diversity/lns/rins.cu:167-175
Timestamp: 2025-10-22T14:25:22.899Z
Learning: In MIP (Mixed Integer Programming) problems in the cuOPT codebase, `n_integer_vars == 0` is impossible by definition—MIP problems must have at least one integer variable. If there are no integer variables, it would be a pure Linear Programming (LP) problem, not a MIP problem.

Applied to files:

  • cpp/src/mip/diversity/lns/rins.cu
🧬 Code graph analysis (3)
cpp/src/mip/solver.cu (1)
cpp/src/dual_simplex/branch_and_bound.hpp (3)
  • solution (132-132)
  • solution (145-145)
  • solution (217-217)
cpp/src/mip/feasibility_jump/fj_cpu.cu (1)
cpp/src/mip/diversity/lns/rins.cu (2)
  • run_worker (44-48)
  • run_worker (44-44)
cpp/src/mip/diversity/lns/rins.cu (2)
cpp/src/mip/feasibility_jump/fj_cpu.cu (2)
  • run_worker (1064-1068)
  • run_worker (1064-1064)
cpp/src/mip/solver.cu (8)
  • solution (74-78)
  • solution (74-74)
  • solution (80-85)
  • solution (80-82)
  • solution (87-90)
  • solution (87-87)
  • dm (92-92)
  • dm (120-120)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: wheel-build-cuopt-sh-client / 13.0.1, 3.10, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.10, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.11, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.11, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.12, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.12, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.13, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.10, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.13, amd64, rockylinux8
  • GitHub Check: checks / check-style
🔇 Additional comments (4)
cpp/src/mip/diversity/diversity_manager.cu (1)

420-422: RINS activation timing looks good.

Enabling RINS right after skipping the FJ-only fast path keeps the heuristics wiring straightforward and avoids double toggles later.

cpp/src/mip/feasibility_jump/fj_cpu.cu (1)

1058-1086: Thread shutdown path looks solid.

The new destructor/on_start/on_terminate plumbing cleanly signals fj_cpu->halted and prevents leaks.

cpp/src/mip/local_search/local_search.cu (1)

149-152: Good call on using request_termination().

Switching the scratch threads to the shared termination helper keeps the lifecycle consistent with the new base class.

cpp/src/mip/utilities/cpu_worker_thread.cuh (1)

27-140: Reusable worker scaffold appreciated.

The base class encapsulates the start/stop/join logic cleanly and gives derived classes well-defined hooks.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
cpp/src/mip/presolve/trivial_presolve.cuh (1)

105-113: Resize operation is correct.

The var_flags resize follows the same pattern as the other vector resizes and is placed appropriately. One minor observation: the resize ordering (bounds → types → var_flags → binary → obj) differs slightly from the iterator declaration order (bounds → types → binary → obj → var_flags), though this doesn't affect correctness.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 285b4c7 and 5c71aab.

📒 Files selected for processing (1)
  • cpp/src/mip/presolve/trivial_presolve.cuh (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.13, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.12, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.11, arm64, rockylinux8
🔇 Additional comments (1)
cpp/src/mip/presolve/trivial_presolve.cuh (1)

80-104: LGTM! Consistent variable cleanup pattern.

The addition of var_flags removal follows the same pattern as the other variable-related vector cleanups. The formatting alignment also improves readability across all the removal operations.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
cpp/src/mip/diversity/diversity_manager.cu (2)

251-260: Consider eliminating the redundant call.

The call to add_external_solutions_to_population() appears twice: once inside the if-block (line 255) and once unconditionally after it (line 258). While the early-exit optimization in add_external_solutions_to_population() (line 193 in population.cu) makes the second call a no-op when executed twice, removing line 255 would simplify the logic without changing behavior.

Apply this diff to eliminate the redundancy:

 if (population.preempt_heuristic_solver_.load()) {
   if (population.current_size() == 0) { population.allocate_solutions(); }
-  population.add_external_solutions_to_population();
   return true;
 }
 population.add_external_solutions_to_population();
 return false;

422-423: Add idempotency guard to rins.enable() or document single-call assumption.

The enable() function creates new thread and problem resources without checking if already enabled. While unique_ptr prevents leaks, calling enable() multiple times during an active solve could disrupt the previously spawned thread. Add a guard (if (enabled) return; at the start) or document that run_solver() is called only once per solver instance.

Note: The original concern about resource leaks is unfounded (unique_ptr handles cleanup), but the missing idempotency guard is valid.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5c71aab and 54136ae.

📒 Files selected for processing (5)
  • cpp/src/mip/diversity/diversity_manager.cu (4 hunks)
  • cpp/src/mip/diversity/lns/rins.cu (1 hunks)
  • cpp/src/mip/diversity/population.cu (9 hunks)
  • cpp/src/mip/diversity/population.cuh (3 hunks)
  • cpp/src/mip/local_search/local_search.cu (9 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-22T14:25:22.899Z
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 527
File: cpp/src/mip/diversity/lns/rins.cu:167-175
Timestamp: 2025-10-22T14:25:22.899Z
Learning: In MIP (Mixed Integer Programming) problems in the cuOPT codebase, `n_integer_vars == 0` is impossible by definition—MIP problems must have at least one integer variable. If there are no integer variables, it would be a pure Linear Programming (LP) problem, not a MIP problem.

Applied to files:

  • cpp/src/mip/diversity/lns/rins.cu
🧬 Code graph analysis (2)
cpp/src/mip/diversity/lns/rins.cu (2)
cpp/src/mip/feasibility_jump/fj_cpu.cu (2)
  • run_worker (1064-1068)
  • run_worker (1064-1064)
cpp/src/mip/solver.cu (8)
  • solution (74-78)
  • solution (74-74)
  • solution (80-85)
  • solution (80-82)
  • solution (87-90)
  • solution (87-87)
  • dm (92-92)
  • dm (120-120)
cpp/src/mip/local_search/local_search.cu (1)
cpp/src/mip/solver.cu (6)
  • solution (74-78)
  • solution (74-74)
  • solution (80-85)
  • solution (80-82)
  • solution (87-90)
  • solution (87-87)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.13, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.12, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.13, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.10, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.12, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.11, amd64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.10, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-mps-parser / 13.0.1, 3.11, arm64, rockylinux8
  • GitHub Check: wheel-build-cuopt-sh-client / 13.0.1, 3.10, amd64, rockylinux8
🔇 Additional comments (22)
cpp/src/mip/diversity/diversity_manager.cu (2)

60-60: LGTM! RINS member initialization follows existing patterns.

The initialization of the rins member in the constructor is consistent with other member initializations (e.g., ls on line 59).


738-738: LGTM! Improved error handling for device setting.

Wrapping cudaSetDevice with RAFT_CUDA_TRY ensures proper error handling and matches the error-handling patterns used elsewhere in the codebase.

cpp/src/mip/diversity/population.cu (6)

167-175: LGTM! Queue capping prevents CPUFJ solution flooding.

The logic correctly identifies and removes the worst (maximum objective) solution from the CPUFJ queue when it exceeds 10 entries, preventing unbounded growth while retaining the best solutions.


192-193: LGTM! Effective optimization to reduce lock contention.

The early-exit check using the atomic solutions_in_external_queue_ flag avoids acquiring the write lock when the external queue is empty, improving performance under contention.


215-215: LGTM! Wait time tracking improves observability.

The longest wait time tracking provides useful debugging information about queue latency. The logic correctly initializes the timer in external_solution_t (population.cuh line 200), tracks the maximum across all queued solutions, and logs it when solutions are consumed.

Also applies to: 228-228, 258-261


401-401: LGTM! Mutex protects population from concurrent RINS access.

The write_mutex lock guards against race conditions when RINS (running on a separate thread) attempts to read the population while the main thread is modifying it. The use of recursive_mutex allows the same thread to acquire the lock multiple times if needed.


571-571: LGTM! Mutex guards concurrent modifications to indices.

The lock protects the indices vector during quality updates and sorting, preventing data races when RINS reads the population on another thread.


670-671: LGTM! Proper synchronization for population reads and modifications.

Both population_to_vector and halve_the_population correctly acquire the write lock to prevent race conditions:

  • population_to_vector protects the read operation with an early exit optimization for empty populations
  • halve_the_population protects extensive modifications (clearing and re-adding solutions)

Also applies to: 693-693

cpp/src/mip/diversity/population.cuh (4)

38-38: LGTM! RINS origin properly integrated.

The new RINS enum value and its string mapping follow the existing pattern for solution origins.

Also applies to: 45-45


66-66: LGTM! Guards prevent undefined behavior on empty populations.

The empty checks prevent:

  • Integer underflow in current_size() when indices.size() is 0
  • Out-of-bounds access in is_feasible() when solutions is empty

These guards are essential for robustness during initialization.

Also applies to: 69-69


196-207: LGTM! Timer enables queue latency tracking.

The timer member tracks how long solutions wait in the external queue. Initializing to infinity serves as a clear sentinel value, and the elapsed time is logged for observability (population.cu line 260).


213-213: LGTM! Synchronization primitives enable thread-safe RINS integration.

  • write_mutex: A recursive mutex that protects population data structures from concurrent modifications
  • solutions_in_external_queue_: An atomic flag enabling lock-free early-exit optimization when the external queue is empty

Both are essential for safe multi-threaded operation with RINS.

Also applies to: 217-217

cpp/src/mip/local_search/local_search.cu (2)

149-151: LGTM! Cleaner termination pattern.

The change from kill_cpu_solver() to request_termination() aligns with modern thread management patterns and is likely part of the CPU worker thread refactor mentioned in the PR summary.


466-466: LGTM! Frequent integration ensures external solutions are not lost.

The strategic placement of add_external_solutions_to_population() calls:

  • Before preemption checks to enable responsive control flow
  • Throughout long-running FP/FJ loops to prevent solution loss
  • After potential improvements to capture all progress

This addresses the concern raised in past reviews about FP loops running too long without integrating external solutions from CPUFJ, B&B, and now RINS.

Based on learnings

Also applies to: 476-476, 586-586, 620-620, 674-674, 680-680, 705-705, 760-760, 781-781

cpp/src/mip/diversity/lns/rins.cu (8)

27-35: LGTM! Constructor properly initializes all members.

The initialization of time_limit from settings.default_time_limit (line 34) addresses the past review concern about using time_limit before initialization. All members are now properly initialized before use.


38-48: LGTM! Clean lifecycle management.

The destructor properly requests termination, and run_worker() provides a clean thread entry point for the RINS logic.


51-75: LGTM! Well-structured callback gating logic.

The callbacks correctly implement the RINS triggering logic:

  • new_best_incumbent_callback: Updates the improvement tracker
  • node_callback:
    • Multi-level gating (enabled check, node cadence, improvement frequency)
    • Opportunistic lock-free check (line 67) to reduce contention
    • Proper locking for state modification (lines 68-73)

The atomic counters provide sufficient memory ordering guarantees as noted in past reviews.


89-120: LGTM! Thorough setup and validation.

The setup logic correctly:

  • Sets device affinity on first call (line 91)
  • Validates problem dimensions and handle consistency (lines 95-102)
  • Safely copies the best feasible solution under lock (line 107)

The assertions provide strong guarantees about preconditions.


127-178: LGTM! Variable fixing logic correctly implements RINS neighborhood.

The logic properly:

  • Identifies variables where incumbent and LP solution agree (lines 127-137)
  • Validates fractional ratio and aborts if too few candidates (lines 138-144)
  • Randomly selects subset to fix based on fixrate (lines 146-155)
  • Validates all selected variables are integers (lines 157-163)

Minor note: Line 153 mixes int and i_t types, but the defensive max(., 0) prevents negative values, so impact is minimal.

Based on learnings


180-205: LGTM! Proper subproblem construction and preprocessing.

The logic correctly:

  • Creates fixed-variable subproblem (line 182)
  • Constructs solution in fixed space (lines 187-193)
  • Optionally adds objective cut to focus search (lines 195-200)
  • Applies trivial presolve to tighten the subproblem (lines 202-205)

This follows standard RINS methodology.


257-262: Verify assertion safety: B&B objective vs solution callback.

The assertion on line 261 assumes that if branch_and_bound_solution.objective is not NaN, then the solution callback was invoked at least once (making rins_solution_queue.size() > 0).

Please confirm that the B&B solver guarantees the solution callback is invoked for every feasible solution before returning a non-NaN objective. If the callback can be skipped in some edge cases (e.g., initial guess is optimal), the assertion could fail.


207-327: LGTM! Sophisticated parallel search with adaptive parameters.

The implementation effectively:

  • Runs B&B and FJ in parallel to explore the RINS neighborhood (lines 209-255)
  • Uses a "Goldilocks" strategy to adaptively tune fixrate and time_limit based on sub-MIP outcome (lines 263-282)
  • Thoroughly processes and validates all solutions before adding to population (lines 294-323)
  • Properly tracks statistics for tuning (lines 325-326)

The parallel approach maximizes the value of each RINS run.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEA] Recombiner with RINS

4 participants