-
Notifications
You must be signed in to change notification settings - Fork 91
Improve sorting for the diving queue via pseudocost estimate and randomness #564
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ing basis_update_mpf_t.
…s when detaching the node from the tree. add backtracking argument to diving.
…d_strenghtening to persist some variables between calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
cpp/src/utilities/pcg.hpp (1)
20-20: Prefer C++ header<cstdint>over C header.Use
<cstdint>instead of<stdint.h>for idiomatic C++. The C++ version is equivalent but follows C++ conventions.Apply this diff:
-#include <stdint.h> +#include <cstdint>
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cpp/src/utilities/pcg.hpp(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cpp/src/utilities/pcg.hpp (1)
cpp/src/dual_simplex/diving_queue.hpp (4)
seed(62-62)seed(62-62)a(42-45)a(42-42)
🪛 Clang (14.0.6)
cpp/src/utilities/pcg.hpp
[error] 20-20: 'stdint.h' file not found
(clang-diagnostic-error)
🔇 Additional comments (1)
cpp/src/utilities/pcg.hpp (1)
25-154: Solid PCG implementation.The PCG random number generator is correctly implemented with proper initialization, state updates, output permutation, and skip-ahead functionality. The previous discussion about
uniform()semantics has been appropriately resolved with the semi-closed interval[low, high)behavior.
|
/ok to test 899d2d3 |
| template <typename T> | ||
| T uniform(T low, T high) | ||
| { | ||
| double val = next_double(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| double val = next_double(); | |
| T val; next(val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even for integer, we need to generate a floating point number between 0 and 1 for this algorithm to work. See this article: https://www.pcg-random.org/posts/bounded-rands.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is slightly bias and do not cover the full 64-bit range, but should be enough for our purposes. In the article, its called FP Multiply.
|
|
||
| #include <stdint.h> | ||
|
|
||
| // Copied from raft/PCGenerator (rng_device.cuh). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to copy? we can use raft directly right?
Also, there is random generator within cuopt source code (https://github.com/NVIDIA/cuopt/blob/a7df1219fa8f96f1f749efb36fd073bd47090615/cpp/src/utilities/seed_generator.cuh)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PCGenerator header from raft includes CUDA code, such that the compilation will fail when called in a cpp file, like branch_and_bound.cpp. We need to change to a .cu file to reuse the code from raft
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::random_device is non-deterministic because it uses the entropy from the hardware (according to cppreference). We need to use a pseudorandom generator here so we can set the seed and have a reproducible result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also PCG is a higher quality and faster random generator than the ones offer by the STL.
|
/ok to test be79001 |
| std::vector<char> row_sense; | ||
| // Limit the number of simplex iterations when diving. | ||
| if (thread_type == thread_type_t::DIVING) { | ||
| lp_settings.iteration_limit = 0.05 * stats_.total_lp_iters - stats.total_lp_iters; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having trouble understanding what iteration limit is being set here. There is a stats_ and a stats. Maybe a comment would clarify this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you pass stats_ in for stats could this set a negative iteration limit?
| stats_.total_lp_solve_time += toc(lp_start_time); | ||
| stats_.total_lp_iters += node_iter; | ||
| stats.total_lp_solve_time += toc(lp_start_time); | ||
| stats.total_lp_iters += node_iter; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the local stats represent?
| true, | ||
| original_lp_.lower, | ||
| original_lp_.upper, | ||
| stats_, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing stats_ in for stats?
| std::deque<mip_node_t<i_t, f_t>*> stack; | ||
| stack.push_front(&subtree.root); | ||
|
|
||
| stats_t lp_stats; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see so this tracking how much work we have done on the diving thread? Maybe we could call this dive_stats for clarity?
| min_diving_queue_size_ = 4 * settings_.num_diving_threads; | ||
| status_ = mip_exploration_status_t::RUNNING; | ||
| lower_bound_ceiling_ = inf; | ||
| diving_queue_.set_rng_seed(settings_.random_seed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you see an improvement by using randomness? If so, by how much?
| f_t leaf_objective) | ||
| { | ||
| mutex.lock(); | ||
| std::lock_guard<omp_mutex_t> lock(mutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did we need to change the mutex here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this suggestion to align with much of the rest of the codebase, which uses scoped locks to ensure mutexes are properly released at scope exit. Threading/deadlock errors are a nightmare to debug, so might as well ease our task a little by taking advantage of RAII :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear: there is no functional change. This is a code style/best practices suggestion
| } | ||
|
|
||
| template <typename i_t, typename f_t> | ||
| i_t pseudo_costs_t<i_t, f_t>::variable_selection(const std::vector<i_t>& fractional, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of unnecessary changes to this function. Can we revert back to the original function here?
The logs have been removed which are useful for debugging. The variables have been renamed unnecessarily.
| f_t pc_up_avg; | ||
| initialized(num_initialized_down, num_initialized_up, pc_down_avg, pc_up_avg); | ||
|
|
||
| for (auto j : fractional) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be helpful to include a reference or even add the math and the intuition behind this estimate in the comments.
|
|
||
| f_t pc_up = | ||
| pseudo_cost_num_up[j] != 0 ? pseudo_cost_sum_up[j] / pseudo_cost_num_up[j] : pc_up_avg; | ||
| estimate += std::min(std::max(pc_down * f_down, eps), std::max(pc_up * f_up, eps)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be easier to read if you introduced some named variables
const f_t change_in_obj_down = pc_down * f_down;
const f_t change_in_obj_up = pc_up * f_up;
chris-maes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing this @nguidotti
Please revert the changes to the pseudocost code before merging.
Let's also get the other PRs merged so the diff of this one is smaller and easier to understand.
chris-maes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing this @nguidotti
Please revert the changes to the pseudocost code before merging.
Let's also get the other PRs merged so the diff of this one is smaller and easier to understand.
|
I still working on this PR, so I move it back to being draft. I saw some regressions in the number of node explored, which I want to solve that before merging this PR. |
In the previous version, there was 2 loops and temporary buffers, which was not needed in this case. I merge to be a single loop. I will revert back some changes to be more similar to the previous version. |
|
Closing this PR for now as the benefits are small with the current implementation. |
This PR improves the sorting and selection of the nodes for diving. More specifically, it sorts the nodes on the diving queue based on the pseudocost estimation of the objective value (see [1, 2]). Every time a thread consume a node from the diving queue, it has a
0.1chance in selecting a random node. The randomness is generated via PCG (see [3], same as RAFT). This also opens the doors for implementing the hybrid best-estimate/best-first search for the main thread in the future.Some fixes for the incumbent in B&B is also included and was taken from #527.
This PR includes the changes from #492 and #473 and may only be merge after those PRs.
Performance (MIPLIB2017):
reuse-basis-factorization(#492):~14.26with225feasible solutionsthis PR:
~14.09with225feasible solutions. Improves the consistency in solving thecmflsp50-24-8-8instance.References:
[1] J. J. H. Forrest, J. P. H. Hirst, and J. A. Tomlin, “Practical Solution of Large Mixed Integer Programming Problems with Umpire,” Management Science, vol. 20, no. 5, pp. 736–773, Jan. 1974, doi: 10.1287/mnsc.20.5.736.
[2] T. Achterberg, “Constraint Integer Programming,” PhD, Technischen Universität Berlin, Berlin, 2007. doi: 10.14279/depositonce-1634. (Section 6.4)
[3] https://www.pcg-random.org/index.html
Checklist
Summary by CodeRabbit
New Features
Optimization