@multitalentloes
Member

NOT intended for merging.

This PR shows the current state of a proof of concept (POC) for performing matrix assembly on the GPU. Many shortcuts and simplifications have been made, and very little effort has been spent on code quality. Performance results will be shared later, once all the elements required for a proper SPE11C simulation case have been implemented and some effort has been spent on optimization. Currently there is a separate executable that always uses GPU assembly.

I will create separate PRs for subsets of the files so that this work can be merged piece by piece over time.

@multitalentloes added the manual:new-feature label ("This is a new feature and should be described in the manual") on Nov 17, 2025
Have to amend the GpuBuffer class because the vector<bool> that a few functions depend on does not support the .data() member function, due to how it stores its data.
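To illustrate the limitation mentioned in this commit message: std::vector<bool> is a specialization that may pack its elements as bits, so it provides no .data() member and cannot hand out a contiguous pointer for a host-to-device copy. The sketch below shows one possible workaround, which is to copy the flags into a byte vector first. The helper name toByteVector is hypothetical and is not taken from this PR; the actual change made to GpuBuffer may look different.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (an assumption, not code from this PR):
// std::vector<bool> has no .data(), so convert it to one byte per flag.
// The resulting vector is contiguous and exposes .data(), which is what
// a raw memory copy to the GPU would need.
std::vector<std::uint8_t> toByteVector(const std::vector<bool>& bits)
{
    std::vector<std::uint8_t> bytes(bits.size());
    for (std::size_t i = 0; i < bits.size(); ++i) {
        bytes[i] = bits[i] ? 1 : 0;
    }
    return bytes;
}
```

The byte vector costs eight times the memory of a packed bit representation, but it keeps indexing trivial on the device side, which seems a reasonable trade-off for a proof of concept.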
Realized that the preprocessor statements were too strict. Have now created a separate implementation file that we can link against instead.
Figured out that the new structure was bad
Created a standalone executable to get the GPU compiler to see the content of tpfalinearizer
No longer needed, as GpuBuffer is header-only
@multitalentloes force-pushed the start_porting_assembly_to_gpu branch from 1c3c202 to 1f483ec on November 20, 2025 07:14
@multitalentloes
Member Author

multitalentloes commented Nov 20, 2025

I have yet to rebase now that #6508 has been merged; doing so will reduce the diff quite a bit. #6543 will also merge part of this.
