Integrating thrust into HPX by adityacodes30 · Pull Request #6744 · STEllAR-GROUP/hpx

adityacodes30 · 2025-07-17T03:47:34Z

Integrate HPX algorithms with NVIDIA CCCL (Thrust)

This PR implements integration between NVIDIA Thrust and HPX, enabling HPX algorithms to dispatch to GPU-accelerated Thrust implementations through HPX's execution policy system.

Google Summer of Code 2025 Project: HPX-Thrust Integration

Completed Milestones

Phase 1: Foundation Architecture

Execution Policy System - Complete policy hierarchy with sync/async support
- thrust_policy - Host execution with thrust::host
- thrust_device_policy - Device execution with thrust::device
- thrust_task_policy - Asynchronous GPU execution with HPX futures

Phase 2: Algorithm Integration

Universal Algorithm Dispatch - Single tag_invoke overload supporting all algorithms

Algorithm Mapping - 50+ HPX algorithms mapped to Thrust equivalents:

Core: fill, copy, transform, for_each, generate
Reductions: reduce, count, count_if, all_of, any_of, none_of
Search: find, find_if, find_first_of, equal, mismatch
Sorting: sort, stable_sort, partial_sort, is_sorted
Modifying: reverse, unique, remove_if, replace
Numeric: inclusive_scan, exclusive_scan, transform_reduce
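The dispatch-plus-mapping idea described above can be sketched in plain C++. This is a minimal illustration, not the code from this PR: the tag types `fill_t` and `reduce_t`, the `backend_policy` type, and the `dispatch` function are hypothetical stand-ins, and the standard library algorithms take the place of the Thrust calls so the sketch compiles without CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <algorithm>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-ins for HPX algorithm tag types such as hpx::fill_t.
struct fill_t {};
struct reduce_t {};

// Hypothetical stand-in for a Thrust-backed execution policy.
struct backend_policy {};

// Primary template is intentionally undefined: a tag without a mapping
// produces a compile-time error instead of a silent fallback.
template <typename Tag>
struct algorithm_map;

template <>
struct algorithm_map<fill_t> {
    template <typename Policy, typename... Args>
    static decltype(auto) invoke(Policy&&, Args&&... args) {
        // Assumption: a real mapping would forward to thrust::fill here.
        return std::fill(std::forward<Args>(args)...);
    }
};

template <>
struct algorithm_map<reduce_t> {
    template <typename Policy, typename... Args>
    static decltype(auto) invoke(Policy&&, Args&&... args) {
        // Assumption: a real mapping would forward to thrust::reduce here.
        return std::accumulate(std::forward<Args>(args)...);
    }
};

// One generic entry point serves every mapped algorithm.
template <typename Tag, typename... Args>
decltype(auto) dispatch(Tag, backend_policy const& policy, Args&&... args) {
    return algorithm_map<Tag>::invoke(policy, std::forward<Args>(args)...);
}

// Fills n elements with value, then sums them through the same entry point.
inline int fill_then_sum(std::size_t n, int value) {
    std::vector<int> v(n);
    dispatch(fill_t{}, backend_policy{}, v.begin(), v.end(), value);
    return dispatch(reduce_t{}, backend_policy{}, v.begin(), v.end(), 0);
}

inline void example() {
    assert(fill_then_sum(4, 42) == 168);
}

}    // namespace sketch
```

Adding another algorithm in this scheme only requires one more `algorithm_map` specialization; the entry point itself stays unchanged.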

Phase 3: Asynchronous Execution

HPX Future Integration - Async policies return hpx::future<T>
CUDA Stream Management - Proper stream handling via hpx::cuda::experimental::target
Non-blocking Execution - Using thrust::cuda::par_nosync for async operations
Event-based Synchronization - HPX future completion tied to CUDA events
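The relationship between non-blocking launches and future completion can be modelled without a GPU. The following is a single-threaded sketch under stated assumptions: `fake_stream`, `fill_async`, and `is_ready` are hypothetical names, the queue plays the role of a CUDA stream, and the second queued item plays the role of a recorded event. It does not use HPX, Thrust, or CUDA.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <queue>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a CUDA stream: work is queued in order and
// only runs when drain() drives the "device".
class fake_stream {
public:
    void enqueue(std::function<void()> work) { queue_.push(std::move(work)); }

    void drain() {
        while (!queue_.empty()) {
            std::function<void()> work = std::move(queue_.front());
            queue_.pop();
            work();
        }
    }

private:
    std::queue<std::function<void()>> queue_;
};

// Queues a fill, then a completion marker behind it. Returns immediately;
// the future becomes ready only after the marker has been processed.
inline std::future<void> fill_async(
    fake_stream& stream, std::vector<int>& data, int value) {
    auto done = std::make_shared<std::promise<void>>();
    std::future<void> result = done->get_future();
    stream.enqueue([&data, value] {
        for (int& x : data) x = value;
    });
    stream.enqueue([done] { done->set_value(); });
    return result;
}

inline bool is_ready(std::future<void> const& f) {
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

inline void example() {
    fake_stream stream;
    std::vector<int> data(4, 0);
    std::future<void> f = fill_async(stream, data, 42);
    assert(!is_ready(f));    // nothing has executed yet
    stream.drain();
    assert(is_ready(f));
    assert(data[0] == 42 && data[3] == 42);
}

}    // namespace sketch
```

Because the marker is queued after the work, stream ordering guarantees the future cannot become ready before the fill has finished.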

Phase 4: Testing & Validation

Test Suite (9 tests):

Test 1-2: Synchronous host/device execution
Test 3-5: Multiple algorithms (fill, transform, reduce)
Test 6-8: Asynchronous execution with explicit/default targets

Build System Integration - CMake support with CUDA compilation
Memory Management - thrust::device_vector usage patterns

Phase 5: Documentation & Examples

Documentation - Usage patterns, API reference, integration guide
Examples - 7 example programs demonstrating features

Technical Architecture

Core Components

// 1. Execution Policies (policy.hpp)
hpx::thrust::thrust_policy policy;           // Host execution
hpx::thrust::thrust_device_policy dev_policy; // Device execution  
hpx::thrust::thrust_task_policy task_policy;  // Async execution

// 2. Universal Algorithm Dispatch (algorithms.hpp)
template<typename HPXTag, typename ThrustPolicy, typename... Args>
auto tag_invoke(HPXTag, ThrustPolicy&&, Args&&...);

// 3. Algorithm Mapping (algorithm_map.hpp) 
template<> struct algorithm_map<hpx::fill_t> {
    template<typename Policy, typename... Args>
    static constexpr decltype(auto) invoke(Policy&&, Args&&...);
};

Usage Examples

// Synchronous GPU execution
thrust::device_vector<int> data(1000);
hpx::fill(hpx::thrust::thrust_device_policy{}, data.begin(), data.end(), 42);

// Asynchronous GPU execution with futures
auto task_policy = hpx::thrust::thrust_task_policy{};
auto future = hpx::transform(task_policy, input.begin(), input.end(), 
                            output.begin(), [](int x) { return x * 2; });
auto result = future.get(); // Synchronize with GPU completion

Files Added

source/hpx/libs/core/thrust/
├── include/hpx/thrust/
│   ├── policy.hpp                     Execution policies
│   ├── algorithms.hpp               Universal dispatch
│   └── detail/algorithm_map.hpp     Algorithm mappings
├── tests/unit/
│   └── thrust_policy_test.cu         Comprehensive tests
├── examples/                         example programs
├── docs/index.rst                    Documentation
└── CMakeLists.txt                    Build configuration

Any background context you want to provide?

Implementation of the HPX-Thrust integration. It provides the foundation for running Thrust's GPU-accelerated parallel algorithms through HPX. This project was part of Google Summer of Code 2025.

Checklist

I have added a new feature and have added tests to go along with it.
I have fixed a bug and have added a regression test.
I have added a test using random numbers; I have made sure it uses a seed, and that random numbers generated are valid inputs for the tests.

StellarBot · 2025-07-17T03:50:03Z

Can one of the admins verify this patch?

codacy-production · 2025-07-17T06:34:23Z

Coverage summary from Codacy

Coverage variation	Diff coverage
✅ -0.70%	✅ ∅

Coverage variation details

	Coverable lines	Covered lines	Coverage
Common ancestor commit (`3746bd5`)	263487	227324	86.28%
Head commit (`8e2b2a7`)	216318 (-47169)	185106 (-42218)	85.57% (-0.70%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

	Coverable lines	Covered lines	Diff coverage
Pull request (#6744)	0	0	∅ (not applicable)

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

hkaiser

Thank you for your work on this!

libs/core/async_cuda/include/hpx/async_cuda/thrust/algorithms.hpp

libs/core/async_cuda/tests/unit/thrust_policy_test.cu

libs/core/async_cuda/tests/unit/Testing/Temporary/LastTest.log

libs/core/async_cuda/include/hpx/async_cuda/thrust/algorithms.hpp

libs/core/algorithms/include/hpx/parallel/algorithms/fill.hpp

libs/core/async_cuda/include/hpx/async_cuda/thrust/detail/algorithm_map.hpp

libs/core/async_cuda/include/hpx/async_cuda/thrust/algorithms.hpp

adityacodes30 · 2025-08-20T17:53:20Z

Thank you for the review, @hkaiser. I pushed another commit that is much more representative of the current state and contains a lot of changes; the previous commit was outdated and only an initial proof-of-concept draft. I am also still adding algorithms to algorithm_map, along with CMake checks.

I will incorporate the general feedback mentioned above in the forthcoming commit. If you could also look at the implementation logic and suggest any improvements, I will include those as well.

Also, @Pansysk75 and I were pondering whether we should move the Thrust implementation out of async_cuda into its own directory, since the current location might be confusing given that not all of Thrust is 'async'. Would love your opinion here.

hkaiser · 2025-08-20T18:28:29Z

Also, @Pansysk75 and I were pondering whether we should move the Thrust implementation out of async_cuda into its own directory, since the current location might be confusing given that not all of Thrust is 'async'. Would love your opinion here.

Yes, turning this into a separate HPX module (e.g., libs/core/thrust, or similar) would be appreciated. Most importantly, this would avoid having to add an unneeded module dependency to async_cuda. You can use the script create_module_skeleton.py to generate an initial skeleton for this.

libs/core/async_cuda/include/hpx/async_cuda/thrust/algorithms.hpp

libs/core/async_cuda/include/hpx/async_cuda/thrust/policy.hpp

adityacodes30 · 2025-08-20T19:16:19Z

The Thrust implementation depends on async_cuda for the async Thrust policy, since we use the target from there to get the stream and the future while dispatching with par_nosync.

hkaiser

Great work! Thanks!

libs/core/thrust/examples/async_fill_n.cu

libs/core/thrust/examples/device_copy.cu

libs/core/thrust/examples/device_fill.cu

libs/core/thrust/examples/device_transform.cu

libs/core/thrust/examples/host_fill.cu

libs/core/thrust/include/hpx/thrust/algorithms.hpp

hkaiser · 2025-08-29T22:27:27Z

libs/core/thrust/include/hpx/thrust/policy.hpp

@@ -0,0 +1,603 @@
+//  Copyright (c)      2025 Aditya Sapra


I would have preferred for the policies to be derived from the HPX policy base class as this reduces the amount of code duplication considerably. For now, your code is fine. Please consider changing it if there is sufficient time left, however.

adityacodes30 · 2025-08-30

Okay, let me make another branch with that approach and get it validated.

Changes are at https://github.com/adityacodes30/hpx/tree/policyRefactoring
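The reviewer's point about deriving the policies from a shared base can be illustrated with a small sketch. Everything here is hypothetical (`policy_base`, `host_policy`, `device_policy`, `default_executor`, `stream_executor`); it does not reproduce HPX's actual policy base class or the code on the policyRefactoring branch. It only shows how shared members such as executor storage and rebinding can be written once instead of once per policy.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch {

struct default_executor {
    int id = 0;
};

// Hypothetical shared base: stores the executor and implements rebinding
// once, for every concrete policy family that derives from it.
template <template <typename> class Derived, typename Executor>
class policy_base {
public:
    policy_base() = default;
    explicit policy_base(Executor exec) : exec_(std::move(exec)) {}

    Executor const& executor() const { return exec_; }

    // Rebind to a different executor while keeping the policy family.
    template <typename NewExecutor>
    Derived<std::decay_t<NewExecutor>> on(NewExecutor&& exec) const {
        return Derived<std::decay_t<NewExecutor>>(
            std::forward<NewExecutor>(exec));
    }

private:
    Executor exec_{};
};

// Each concrete policy now only names its base and inherits constructors.
template <typename Executor = default_executor>
struct host_policy : policy_base<host_policy, Executor> {
    using base_type = policy_base<sketch::host_policy, Executor>;
    using base_type::base_type;
};

template <typename Executor = default_executor>
struct device_policy : policy_base<device_policy, Executor> {
    using base_type = policy_base<sketch::device_policy, Executor>;
    using base_type::base_type;
};

// Hypothetical executor carrying a stream identifier.
struct stream_executor {
    int stream_id;
};

inline int example() {
    device_policy<> p;
    assert(p.executor().id == 0);
    auto q = p.on(stream_executor{3});
    static_assert(
        std::is_same<decltype(q), device_policy<stream_executor>>::value,
        "rebinding keeps the policy family");
    return q.executor().stream_id;
}

}    // namespace sketch
```

With this layout, adding a third policy type costs a few lines, whereas independent policy classes would each repeat the storage and rebinding members.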

libs/core/thrust/tests/unit/thrust_policy_test.cu

CMakeLists.txt

adityacodes30 · 2025-08-30T22:50:05Z

@hkaiser I made the requested changes; the refactoring that derives the policies from the base class is at https://github.com/adityacodes30/hpx/tree/policyRefactoring. Let me know if it looks fine and I will merge it into this branch. I also had to add a guard in any_sender, since I was getting errors during device compilation through nvcc (inline constexpr empty_vtable_t<T> empty_vtable{};).

Pansysk75 · 2025-09-17T05:09:44Z

@adityacodes30 Could you bring this branch up-to-speed with master? Also, could you take care of the clang_format and cmake_format issues? Thanks!

adityacodes30 · 2025-10-03T20:58:55Z

Sure!

hkaiser · 2025-11-11T13:12:31Z

@adityacodes30 Could you please fix the inspect issues as well?

/libs/core/thrust/include/hpx/thrust/algorithms.hpp: *I* missing #include (utility) for symbol std::declval on line [28](https://github.com/STEllAR-GROUP/hpx/blob/eea2195ffbd026e0c666ed3d3f32a7812361b4aa/libs/core/thrust/include/hpx/thrust/algorithms.hpp#L28)
/libs/core/thrust/include/hpx/thrust/policy.hpp: *I* missing #include (utility) for symbol std::move on line [209](https://github.com/STEllAR-GROUP/hpx/blob/eea2195ffbd026e0c666ed3d3f32a7812361b4aa/libs/core/thrust/include/hpx/thrust/policy.hpp#L209)
/libs/core/thrust/tests/unit/thrust_policy_test.cu: *I* missing #include (cstddef) for symbol std::size_t on line [36](https://github.com/STEllAR-GROUP/hpx/blob/eea2195ffbd026e0c666ed3d3f32a7812361b4aa/libs/core/thrust/tests/unit/thrust_policy_test.cu#L36), *I* missing #include (type_traits) for symbol std::is_same on line [195](https://github.com/STEllAR-GROUP/hpx/blob/eea2195ffbd026e0c666ed3d3f32a7812361b4aa/libs/core/thrust/tests/unit/thrust_policy_test.cu#L195)
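For reference, the inspect findings above are resolved by including the standard header that declares each symbol directly, rather than relying on transitive includes. A minimal sketch, where `doubler`, `call_result_t`, and `take_and_count` are hypothetical names used only to exercise the four symbols:

```cpp
#include <cassert>
#include <cstddef>        // std::size_t
#include <type_traits>    // std::is_same
#include <utility>        // std::declval, std::move
#include <vector>

namespace sketch {

struct doubler {
    int operator()(int x) const { return 2 * x; }
};

// std::declval is declared in <utility>.
template <typename F, typename Arg>
using call_result_t = decltype(std::declval<F>()(std::declval<Arg>()));

// std::is_same is declared in <type_traits>.
static_assert(std::is_same<call_result_t<doubler, int>, int>::value,
    "doubler called with int should return int");

// std::move is declared in <utility>; std::size_t in <cstddef>.
inline std::size_t take_and_count(std::vector<int> v) {
    std::vector<int> owned = std::move(v);
    return owned.size();
}

inline void example() {
    assert(take_and_count(std::vector<int>(5)) == 5);
}

}    // namespace sketch
```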

adityacodes30 · 2025-11-12T19:23:41Z

@hkaiser Fixed. Also, a lot of the CI checks seem to be passing.

codacy-production · 2025-11-12T21:27:25Z

Coverage summary from Codacy

Coverage variation	Diff coverage
✅ -0.92%	✅ ∅

Coverage variation details

	Coverable lines	Covered lines	Coverage
Common ancestor commit (`afcbb5b`)	273067	235146	86.11%
Head commit (`cb1b869`)	195039 (-78028)	166166 (-68980)	85.20% (-0.92%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

	Coverable lines	Covered lines	Diff coverage
Pull request (#6744)	0	0	∅ (not applicable)

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

adityacodes30 · 2025-11-27T18:49:27Z

Are there any blockers to getting this merged?

hkaiser

LGTM, thanks!

hkaiser · 2025-11-27T23:08:14Z

@adityacodes30 Thank you for your contribution! Much appreciated!

hkaiser added the labels type: enhancement, category: algorithms, project: GSoC on Jul 25, 2025

Pansysk75 marked this pull request as ready for review August 20, 2025 10:25

Pansysk75 requested a review from hkaiser as a code owner August 20, 2025 10:25

hkaiser reviewed Aug 20, 2025

adityacodes30 requested review from Pansysk75 and hkaiser August 29, 2025 21:53

hkaiser reviewed Aug 29, 2025

adityacodes30 requested a review from hkaiser August 30, 2025 22:50

adityacodes30 force-pushed the master branch 2 times, most recently from 249d010 to eea2195, November 9, 2025 19:50

adityacodes30 added 8 commits November 12, 2025 11:27

thrust policy and algorithm dipsatch

4caba0c

This commit adds a basic poc implementation for thrust policy and algorithm dispatch based on it
Signed-off-by: Aditya Sapra <adityaework@gmail.com>

add algorithm map to molularise code

a5ccc42

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

Async execution policy and refactor

e4f15b2

Add par_nosync execution policy support which allows async execution , leveraging exiting async_cuda infrastructure and add tag invoke policy branching
Signed-off-by: Aditya Sapra <adityaework@gmail.com>

Move to independent directory and address feedback

5a372c3

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

Add docs, examples , algorithms

8fd3a2a

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

refactor : address feedback

74b4be4

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

cleanup and refactoring

42e096a

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

fix formatting and add cmake option

7f35b7d

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

adityacodes30 and others added 9 commits November 12, 2025 11:27

restore unintended changes

002d72d

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

Update libs/core/thrust/include/hpx/thrust/algorithms.hpp

83e537f

Co-authored-by: Hartmut Kaiser <hartmut.kaiser@gmail.com>
Signed-off-by: Aditya Sapra <adityaework@gmail.com>

Update libs/core/thrust/include/hpx/thrust/algorithms.hpp

f1bf78b

Co-authored-by: Hartmut Kaiser <hartmut.kaiser@gmail.com>
Signed-off-by: Aditya Sapra <adityaework@gmail.com>

add copyright headers

234deb0

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

use bool-cpp20 concepts in tag_invoke

3eabcb8

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

Apply clang-format to algorithms.hpp

fbf282c

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

fix formatting issues

86486d1

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

cmake files format

0653418

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

include headers

cb1b869

Signed-off-by: Aditya Sapra <adityaework@gmail.com>

adityacodes30 force-pushed the master branch from 81149e2 to cb1b869, November 12, 2025 17:27

hkaiser added this to the 2.0.0 milestone Nov 12, 2025

hkaiser approved these changes Nov 27, 2025

hkaiser merged commit 6ead071 into STEllAR-GROUP:master Nov 27, 2025
84 of 101 checks passed
