@breyerml breyerml commented Oct 9, 2024

Implements the cg_streaming solver_type using USM.
The cg_explicit kernels are used, i.e., no special performance tuning has been performed.

The logic of the OpenCL backend had to be changed to allow multi-GPU support with OpenCL's SVM:
instead of one context containing all devices, we now create a separate context for each device.
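
The context change described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `make_context_per_device` and the `create_context` callable are hypothetical names, and the device and context types are left generic so that the sketch compiles without the OpenCL headers.

```cpp
#include <cstddef>
#include <vector>

// Sketch: build one context per device instead of a single context spanning all devices.
// `create_context(const Device *devices, std::size_t count)` is a hypothetical callable.
// With the OpenCL C API it could wrap a call such as
//   clCreateContext(nullptr, 1, device, nullptr, nullptr, &err)
// so that every context contains exactly one device.
template <typename Device, typename CreateContext>
auto make_context_per_device(const std::vector<Device> &devices, CreateContext create_context) {
    using context_type = decltype(create_context(devices.data(), std::size_t{ 1 }));

    std::vector<context_type> contexts;
    contexts.reserve(devices.size());
    for (const Device &device : devices) {
        // always pass a single device -> one separate context per device
        contexts.push_back(create_context(&device, std::size_t{ 1 }));
    }
    return contexts;
}
```

With this layout, per-device resources such as command queues and allocations would be created against the context that belongs to the respective device.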

…d improve simplicity of implementation by using a std::variant<cl_mem, T*> as device_pointer_type.
…lightly such that the backends are more similar.
…tially initializes all values to zero.

Instead, use a std::unique_ptr together with a C++17-conformant make_unique_for_overwrite implementation, followed by an OpenMP-parallel zero initialization of all values, which drastically reduces the overhead.
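
A minimal sketch of the approach this commit message describes, assuming only the array form is needed. Both function names are illustrative: `make_unique_for_overwrite` here is a hand-written C++17 stand-in for the C++20 standard function, and `make_zeroed` is a hypothetical helper. Neither is taken from the PR.

```cpp
#include <cstddef>
#include <memory>

// C++17 stand-in for C++20's std::make_unique_for_overwrite (array form only).
// `new T[n]` default-initializes the elements, so trivial types such as double are
// left uninitialized instead of being zeroed as with std::make_unique<T[]>(n).
template <typename T>
std::unique_ptr<T[]> make_unique_for_overwrite(const std::size_t n) {
    return std::unique_ptr<T[]>(new T[n]);
}

// Hypothetical helper: allocate n values without initialization, then zero them in parallel.
template <typename T>
std::unique_ptr<T[]> make_zeroed(const std::size_t n) {
    std::unique_ptr<T[]> data = make_unique_for_overwrite<T>(n);
    T *ptr = data.get();
    const std::ptrdiff_t size = static_cast<std::ptrdiff_t>(n);
    // signed loop index for compatibility with older OpenMP implementations;
    // the pragma is ignored if the code is compiled without OpenMP support
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < size; ++i) {
        ptr[i] = T{};
    }
    return data;
}
```

For example, `auto values = make_zeroed<double>(num_entries);` yields a zero-filled buffer whose initialization work is spread over all OpenMP threads.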
…bly and BLAS implementation. Align names more to the ones used in the other backends.
…bly + BLAS implementation. Align names more to the ones used in the other backends.
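
The `std::variant<cl_mem, T*>` used as `device_pointer_type` in one of the commits above could look roughly like the following sketch. It assumes that the `T*` alternative holds an SVM pointer and the `cl_mem` alternative a regular buffer handle. The `sketch` namespace, `opaque_buffer`, `is_svm_pointer`, and `is_null` are hypothetical names, and `cl_mem` is replaced by a stand-in so that the example compiles without `<CL/cl.h>`.

```cpp
#include <variant>

namespace sketch {

// Stand-in for the opaque OpenCL buffer handle; real code would use cl_mem from <CL/cl.h>.
struct opaque_buffer;
using cl_mem = opaque_buffer *;

// Either a classic buffer handle or a raw pointer (assumed here to be an SVM allocation).
template <typename T>
using device_pointer_type = std::variant<cl_mem, T *>;

// True if the variant currently holds the raw pointer alternative.
template <typename T>
bool is_svm_pointer(const device_pointer_type<T> &ptr) {
    return std::holds_alternative<T *>(ptr);
}

// True if whichever alternative is active is a null handle or null pointer.
template <typename T>
bool is_null(const device_pointer_type<T> &ptr) {
    return std::visit([](auto p) { return p == nullptr; }, ptr);
}

}  // namespace sketch
```

Code that only needs the active alternative can dispatch with `std::visit`, which would avoid maintaining two separate pointer wrapper classes.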
breyerml added 30 commits July 5, 2025 16:31
Now: some parts of the kernels are specialized for the CPU for better performance.
…e the HPX runtime before a call to Kokkos::initialize, otherwise the HPX specific command line options are ignored.
…ode duplication.

Add the possibility to filter out some command line options (mainly from third party libraries HPX and Kokkos).
…s by forwarding them to the respective initialization functions.
…tom kernels since the previous version using clEnqueueFillBuffer failed for SOME data sets on NVIDIA GPUs.
# Conflicts:
#	include/plssvm/backends/CUDA/csvm.hpp
#	include/plssvm/backends/gpu_device_ptr.hpp
#	include/plssvm/csvm.hpp
#	include/plssvm/detail/data_distribution.hpp
#	include/plssvm/detail/type_traits.hpp
#	src/plssvm/backends/OpenCL/csvm.cpp
#	src/plssvm/backends/OpenCL/detail/context.cpp
#	src/plssvm/backends/OpenCL/detail/device_ptr.cpp
#	src/plssvm/backends/OpenCL/detail/utility.cpp
#	src/plssvm/backends/OpenMP/csvm.cpp
#	src/plssvm/backends/stdpar/csvm.cpp
#	src/plssvm/detail/data_distribution.cpp
#	tests/backends/CUDA/detail/device_ptr.cpp
#	tests/backends/HIP/detail/device_ptr.hip
#	tests/backends/OpenCL/detail/device_ptr.cpp
#	tests/backends/generic_csvm_tests.hpp
#	tests/backends/generic_device_ptr_tests.hpp
#	tests/types_to_test.hpp
…r type.

Note: functionality currently not implemented!