Ewald: Performance optimization of various routines

For this library to be remotely competitive, almost every routine will need to be heavily optimized. We'll start with the single-threaded version. 

Right now, the potential evaluators are reasonably well optimized, but there is more work to do everywhere else. We should write some basic benchmarking code, similar to what's already being done with `nanobench` in [examples/ewald_total.cpp](../tree/ewald/examples/ewald_total.cpp). We should avoid benchmarking pure library calls like `ducc0`'s FFT, but any routine we wrote ourselves that looks computationally intensive should be optimized. A good place to start might be the application of `G(k)` after the first Fourier transform.
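As a rough illustration of what a baseline measurement for the `G(k)` step could look like, here is a minimal sketch. It uses `std::chrono` instead of `nanobench` so that it stays self-contained. `apply_kernel` and `best_time_seconds` are hypothetical names, and the pointwise multiply is only an assumed stand-in for the real routine, not the library's code.

```cpp
#include <cassert>
#include <chrono>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the G(k) step: scale each Fourier coefficient
// by a precomputed real-valued kernel sample. Not the library's routine.
void apply_kernel(std::vector<std::complex<double>>& spectrum,
                  const std::vector<double>& g) {
    assert(spectrum.size() == g.size());
    for (std::size_t i = 0; i < spectrum.size(); ++i) {
        spectrum[i] *= g[i];
    }
}

// Run `fn` `reps` times and return the fastest wall-clock time in seconds.
// Taking the minimum reduces noise from other processes on the machine.
template <typename Fn>
double best_time_seconds(Fn&& fn, int reps) {
    double best = -1.0;
    for (int r = 0; r < reps; ++r) {
        const auto t0 = std::chrono::steady_clock::now();
        fn();
        const auto t1 = std::chrono::steady_clock::now();
        const double dt = std::chrono::duration<double>(t1 - t0).count();
        if (best < 0.0 || dt < best) {
            best = dt;
        }
    }
    return best;
}
```

A call such as `best_time_seconds([&] { apply_kernel(spectrum, g); }, 20)` on a grid of realistic size would give the number to beat. In the repository itself, the existing `nanobench` setup is probably the better harness.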

Profile the code, find the hotspots, benchmark them for a baseline, and then refactor or SIMD-optimize them to be as efficient as possible. We're currently using SCTL for SIMD vectorization, so you can look at the code we're using for the potential calculation as a reference for manually vectorizing some loops.
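When refactoring a hotspot, it helps to keep the original scalar loop around as a reference and check the new variant against it before comparing timings. The sketch below shows that pattern on an assumed pointwise kernel multiply. `apply_kernel_ref` and `apply_kernel_flat` are hypothetical names, and no SCTL calls are shown. The flat variant only rewrites the loop over the interleaved real and imaginary parts, which may or may not help the compiler auto-vectorize, so it would need to be measured.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Scalar reference: multiply each complex coefficient by a real factor.
void apply_kernel_ref(std::complex<double>* spectrum, const double* g,
                      std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        spectrum[i] *= g[i];
    }
}

// Candidate variant: treat the array as interleaved (re, im) doubles.
// The C++ standard guarantees this layout for std::complex<double> arrays,
// so the reinterpret_cast below is well defined.
void apply_kernel_flat(std::complex<double>* spectrum, const double* g,
                       std::size_t n) {
    double* s = reinterpret_cast<double*>(spectrum);
    for (std::size_t i = 0; i < n; ++i) {
        const double gi = g[i];
        s[2 * i] *= gi;      // real part
        s[2 * i + 1] *= gi;  // imaginary part
    }
}
```

Both functions perform the same floating-point operations per element, so their results can be compared for exact equality. A hand-vectorized version that reorders arithmetic would need a tolerance instead.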
