Gpu code #58

Open
wants to merge 170 commits into
base: master
Conversation

abouzied-nasar

No description provided.

… b to stop an error related to prototype overloading in dummy.c in src/cuda
… task.c and h queue.h and removed unused vars from runner_doiact_functions_hydro_gpu.h
… tasks. Code now hangs so there must be some issue with task activation. problem for tomorrow!
@abouzied-nasar
Author

This is the current state of the GPU code as implemented in the latest SWIFT version. It is not working properly yet: I think there is an issue with GPU task activation (the code hangs before the first time step), and I still need to implement a mechanism for creating deps for unpack tasks which do not belong to a cell.
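
For readers unfamiliar with how a task-activation problem can turn into a hang, here is a minimal sketch in C. It is a toy model, not SWIFT's scheduler: `toy_task`, `toy_add_unlock`, `toy_set_waits`, `toy_run` and `toy_demo` are all hypothetical names, and it assumes a simple scheme in which each task carries a counter of unresolved dependencies. If a skipped (never activated) "pack" task is still counted as a dependency of an active "unpack" task, the counter never reaches zero and the unpack task can never start.

```c
#include <assert.h>

#define TOY_MAX_UNLOCKS 4

/* Hypothetical, simplified task; NOT SWIFT's actual struct task. */
struct toy_task {
  int wait; /* number of unresolved dependencies */
  int skip; /* 1 if the task was never activated */
  int done; /* 1 once the task has run */
  int nr_unlocks;
  struct toy_task *unlocks[TOY_MAX_UNLOCKS];
};

/* Record that task b may only start after task a has finished. */
static void toy_add_unlock(struct toy_task *a, struct toy_task *b) {
  assert(a->nr_unlocks < TOY_MAX_UNLOCKS);
  a->unlocks[a->nr_unlocks++] = b;
}

/* Recompute the wait counters. If count_skipped is non-zero, skipped tasks
 * are (wrongly) counted as pending dependencies. */
static void toy_set_waits(struct toy_task **tasks, int n, int count_skipped) {
  for (int i = 0; i < n; i++) tasks[i]->wait = 0;
  for (int i = 0; i < n; i++) {
    if (tasks[i]->skip && !count_skipped) continue;
    for (int j = 0; j < tasks[i]->nr_unlocks; j++) tasks[i]->unlocks[j]->wait++;
  }
}

/* Run every task that becomes ready. Instead of hanging, return the number
 * of active tasks that could never start. */
static int toy_run(struct toy_task **tasks, int n) {
  int progress = 1;
  while (progress) {
    progress = 0;
    for (int i = 0; i < n; i++) {
      struct toy_task *t = tasks[i];
      if (t->skip || t->done || t->wait > 0) continue;
      t->done = 1;
      progress = 1;
      for (int j = 0; j < t->nr_unlocks; j++) t->unlocks[j]->wait--;
    }
  }
  int stuck = 0;
  for (int i = 0; i < n; i++)
    if (!tasks[i]->skip && !tasks[i]->done) stuck++;
  return stuck;
}

/* A skipped "pack" task unlocking an active "unpack" task. */
static int toy_demo(int count_skipped) {
  struct toy_task pack = {0, 1, 0, 0, {0}};   /* skipped */
  struct toy_task unpack = {0, 0, 0, 0, {0}}; /* active */
  struct toy_task *tasks[2] = {&pack, &unpack};
  toy_add_unlock(&pack, &unpack);
  toy_set_waits(tasks, 2, count_skipped);
  return toy_run(tasks, 2);
}
```

In this toy, `toy_demo(1)` leaves the unpack task stuck, while `toy_demo(0)` lets it run. Whether the hang reported above has this cause is not established here; the sketch only illustrates the general failure mode.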

…ave been source of hanging bug. Not the case :(
… deps for unpack tasks. CPU version works perfectly but GPU code now hangs for some reason
…gpu and re-worked runner_main_clean.cu but code still hangs
…gpu and re-worked runner_main_clean.cu but code still hangs
…. Removed them all and code still hangs. Will copy back previous files in next commit
…ing to with not unskipping tasks properly. IFDEFs are totally correct and code does not hang when GPU code commented out
…ng. Code is doing something but I don't think it's actually progressing through time steps
abouzied-nasar and others added 30 commits January 27, 2025 14:48
…ixed counter decrementation in scheduler_enqueue for the case when cell->hydro.count == 0
….yml set the max time step to 1e-6 to prevent code crashing when there are only a few particle updates. Need to modify GPU code to work for this case. Also set count_max_parts_tmp in runner_main2 to 10 times target to prevent running out of bounds for unsplit cells
…only gpu tasks are not activated if they have zero particles
…sity tasks in previous commit. I am now extending the revised splitting strategy to gradient and force subtypes
…dency graphs are looking good. Split domain with 8^3 cells into 16^3 cells with no issues
…son to with ideal size of ~64 parts per cell and 2 million parts. Will now test on G-H Isambard-AI
2 participants