Skip to content

Conversation

@BhoomishGupta
Copy link
Contributor

Fixes #6647

Proposed Changes

Created a helper function that reduces the partition value without the need of initial value..
Added a test case that was mentioned in the issue #6647.
Corrected a typo in documentation.

I only did changes in :

  1. hpx\libs\core\algorithms\include\hpx\parallel\algorithms\reduce.hpp
  2. hpx\libs\core\algorithms\tests\regressions\CMakeLists.txt
  3. the rest were changes due to clang and editor.config

And created this :

  1. hpx\libs\core\algorithms\tests\regressions\reduce_6647.cpp

Any background context you want to provide?

Previously, the implementation assumed that the *first was convertible to T. This assumption does not hold for the actual function logic required here. This PR removes that dependency, ensuring the type handling is correct.

Checklist

Not all points below apply to all pull requests.

  • I have added a new feature and have added tests to go along with it.
  • I have fixed a bug and have added a regression test.
  • I have added a test using random numbers; I have made sure it uses a seed, and that random numbers generated are valid inputs for the tests.

@StellarBot
Copy link

Can one of the admins verify this patch?

Copy link
Member

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please separate the formatting changes into a different PR? It's close to impossible to review your actual changes here.

Also, for the formatting - are you sure you're using clang-format V20?

@BhoomishGupta
Copy link
Contributor Author

Yeah, I was not using V20.
So, should I close this PR and open a new one, that is only with the changes?

@hkaiser
Copy link
Member

hkaiser commented Feb 4, 2026

Yeah, I was not using V20. So, should I close this PR and open a new one, that is only with the changes?

Simply keep (force) pushing to your branch, this wil update the PR.

{
if (part_size == 1)
{
return HPX_INVOKE(r, *part_begin, *part_begin);
Copy link
Member

@hkaiser hkaiser Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not doing the correct thing as it would apply the reduction to the element of a single-element partition 'twice'. We have to avoid creating single-element partitions to begin with.

Currently, (almost) all algorithms use this functionality to determine the chunk size to use:

HPX_CXX_EXPORT template <typename ExPolicy, typename Future, typename F1,
typename IterOrR, typename Stride = std::size_t>
hpx::util::iterator_range<chunk_size_iterator<IterOrR>>
get_bulk_iteration_shape(ExPolicy& policy, std::vector<Future>& workitems,
F1&& f1, IterOrR& it_or_r, std::size_t& count, std::size_t& cores,
Stride s = Stride(1))
{
if (count == 0)
{
cores = 1;
auto it = chunk_size_iterator(it_or_r, 1);
return hpx::util::iterator_range(it, it);
}
Stride stride = parallel::detail::abs(s);
auto test_function = [&](std::size_t test_chunk_size) -> std::size_t {
if (test_chunk_size == 0)
return 0;
if (stride != 1)
{
// rounding up
test_chunk_size = (std::max) (static_cast<std::size_t>(stride),
(test_chunk_size + stride - 1) / stride * stride);
}
add_ready_future(workitems, f1, it_or_r, test_chunk_size);
test_chunk_size = (std::min) (count, test_chunk_size);
count -= test_chunk_size;
it_or_r = next_or_subrange(it_or_r, test_chunk_size, count);
return test_chunk_size;
};
// note: running the test function will modify 'count'
auto iteration_duration =
hpx::execution::experimental::measure_iteration(
policy.parameters(), policy.executor(), test_function, count);
cores = hpx::execution::experimental::processing_units_count(
policy.parameters(), policy.executor(), iteration_duration, count);
std::size_t max_chunks =
hpx::execution::experimental::maximal_number_of_chunks(
policy.parameters(), policy.executor(), cores, count);
std::size_t chunk_size =
hpx::execution::experimental::get_chunk_size(policy.parameters(),
policy.executor(), iteration_duration, cores, count);
// make sure, chunk size and max_chunks are consistent
adjust_chunk_size_and_max_chunks(cores, count, max_chunks, chunk_size);
auto last = next_or_subrange(it_or_r, count, 0);
if (stride != 1)
{
chunk_size = (std::max) (static_cast<std::size_t>(stride),
(chunk_size + stride - 1) / stride * stride);
}
// Report the calculated parameters to the corresponding parameters
// object.
hpx::execution::experimental::collect_execution_parameters(
policy.parameters(), policy.executor(), count, cores, max_chunks,
chunk_size);
// update executor with new values
policy = hpx::experimental::prefer(
hpx::execution::experimental::with_processing_units_count, policy,
cores);
auto shape_begin = chunk_size_iterator(it_or_r, chunk_size, count);
auto shape_end = chunk_size_iterator(last, chunk_size, count, count);
return hpx::util::iterator_range(shape_begin, shape_end);
}

Especially the invocation of the customization point here is responsible for computing the chunk size:

std::size_t chunk_size =
hpx::execution::experimental::get_chunk_size(policy.parameters(),
policy.executor(), iteration_duration, cores, count);

I'm inclined to think that we should introduce a new customization point replacing the final chunk size adjustments here

adjust_chunk_size_and_max_chunks(cores, count, max_chunks, chunk_size);

That would allow to install a custom implementation of said customization point for the reduce algorithm without interfering with any other, possibly user-supplied customizations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrect reduce implementation

3 participants