@daviesje

This is the draft PR for the GPU build, thanks to ADACS.

I'm opening this to track my progress and compare files more easily, but we are still some way off being able to merge.

The key changes are:

  • A redesign of the build system, as CFFI could not support CUDA. This involves using Meson to compile the C code, plus a C++ layer that defines the Python bindings using nanobind.
  • CUDA alternatives for the 2LPT, interpolation table evaluation, grid filtering, ionising emissivity/SFRD evaluation (the HMF integral setting), and halo sampling functions.

Current Status (23/07/25):

  • The CUDA files still use the older parameterisation.
  • The CPU version was running as of the v4 Beta release, but at a 10% slower speed. This may be due to differences in optimisation flags between Meson and CFFI; the slowdown does not appear to come from either the wrapper layer or any specific function (everything just seems to take a little longer).
  • I am now incorporating updates that have occurred since the Beta, and the branch is not 100% functional even on CPU.
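
For anyone wanting to check whether the slowdown is spread evenly across steps or concentrated in one function, a minimal sketch follows. It is not the benchmarking used for this PR. The helper names (`time_call`, `slowdown_ratios`, `is_uniform`) and the 5% tolerance are assumptions made for illustration, and the step names in the timing dictionaries would be whatever labels you choose.

```python
import time
from statistics import median


def time_call(func, *args, repeats=5, **kwargs):
    """Return the median wall-clock time in seconds of func over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        samples.append(time.perf_counter() - start)
    return median(samples)


def slowdown_ratios(baseline, candidate):
    """Per-step ratio candidate/baseline for steps timed in both builds.

    Both arguments are dicts mapping a (hypothetical) step label to seconds.
    """
    return {
        name: candidate[name] / baseline[name]
        for name in baseline
        if name in candidate and baseline[name] > 0
    }


def is_uniform(ratios, tolerance=0.05):
    """True if every ratio lies within `tolerance` (relative) of the median ratio."""
    if not ratios:
        raise ValueError("no ratios to compare")
    centre = median(ratios.values())
    return all(abs(r - centre) <= tolerance * centre for r in ratios.values())
```

To use it, time each step with `time_call` under the CFFI build and again under the Meson build, collect the results into two dicts, and pass them to `slowdown_ratios`. If `is_uniform` returns `True`, the extra cost is shared across steps, which would point towards compiler flags rather than a single regressed function.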
