Best sigma choice based on algorithmic complexity by magisterbrown · Pull Request #868 · flatironinstitute/finufft

magisterbrown · 2026-06-11T21:10:20Z

This PR adds a new best_upsampling_factor_complexity function that sets sigma when the user does not specify it. The function loops over all sigma values in the range [1.25, 2.00] with a step size of 0.05 and estimates the number of flops for each. The total is the sum of the spreading and FFT costs. The spreading cost, whose prediction is quite accurate according to empirical results, equals the number of input points times the nspread of each point. The following images show the predictions and measurements for two different transforms.

FFT complexity is estimated as n log(n), where n is the grid size. Because backends and FFT algorithms differ, this estimate does not predict the measured cost well. The total complexity equals spreadinterp + 1000*FFT, where 1000 is a somewhat arbitrary constant that still requires tuning.
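As a rough illustration of the cost model described above, here is a minimal Python sketch of the sigma search. It is not the PR's code: the names candidate_sigmas, toy_nspread, estimate_flops and best_sigma are hypothetical, the kernel-width formula is an assumption made only so the example runs, and the grid is 1D with n = ceil(sigma * N) and no rounding to FFT-friendly sizes.

```python
import math


def candidate_sigmas(lo=1.25, hi=2.00, step=0.05):
    """Sigma values in [lo, hi], built from integer indices to avoid float drift."""
    count = round((hi - lo) / step) + 1
    return [round(lo + i * step, 2) for i in range(count)]


def toy_nspread(sigma, tol):
    """ASSUMPTION: illustrative kernel-width model, not taken from the PR.

    The width shrinks as sigma grows and as the tolerance loosens.
    """
    return math.ceil(-math.log(tol) / (math.pi * math.sqrt(1.0 - 1.0 / sigma)))


def estimate_flops(sigma, n_modes, n_points, tol, fft_weight=1000.0):
    """Spreading cost plus weighted FFT cost for a 1D transform."""
    n_grid = math.ceil(sigma * n_modes)          # upsampled grid size
    spread = n_points * toy_nspread(sigma, tol)  # points times kernel width
    fft = n_grid * math.log(n_grid)              # n log(n) estimate
    return spread + fft_weight * fft


def best_sigma(n_modes, n_points, tol, fft_weight=1000.0):
    """Return the candidate sigma with the lowest estimated cost."""
    return min(
        candidate_sigmas(),
        key=lambda s: estimate_flops(s, n_modes, n_points, tol, fft_weight),
    )


if __name__ == "__main__":
    # Sizes loosely modeled on the 1D double-precision perftest case.
    print(best_sigma(n_modes=10_000, n_points=10_000_000, tol=1e-9))
```

In this sketch the fft_weight argument plays the role of the constant 1000: raising it pushes the choice toward smaller sigma (smaller grids), while lowering it favors larger sigma (narrower kernels).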

Type 3 transforms use a different best-sigma heuristic, which suggests the lowest sigma that satisfies the check_sigma function.
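The type 3 rule can be pictured as a first-match search. The sketch below is hypothetical: the criteria applied by check_sigma are not described here, so the check is passed in as a placeholder predicate called is_valid, and reusing the [1.25, 2.00] range with step 0.05 for type 3 is an assumption.

```python
def lowest_valid_sigma(is_valid, lo=1.25, hi=2.00, step=0.05):
    """Return the smallest candidate sigma accepted by is_valid, or None.

    is_valid is a stand-in for check_sigma; its real criteria are not
    modeled here. The candidate range is assumed, not taken from the PR.
    """
    count = round((hi - lo) / step) + 1
    for i in range(count):
        sigma = round(lo + i * step, 2)  # index-based to avoid float drift
        if is_valid(sigma):
            return sigma
    return None


if __name__ == "__main__":
    # Placeholder predicate: accept any sigma of at least 1.5.
    print(lowest_valid_sigma(lambda s: s >= 1.5))
```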

github-actions · 2026-06-11T21:20:55Z

FFT backend: DUCC

Perftest plot

Numbers are advisory: GitHub-hosted runners have variable hardware. Treat <1.10× as noise.

CPU and compiler configuration

CPU name: AMD EPYC 7763 64-Core Processor.

Arch: X86_64.

Core count: 2.

ISA extensions: 3dnowext, 3dnowprefetch, abm, adx, aes, aperfmperf, apic, arat, avx, avx2, bmi1, bmi2, clflush, clflushopt, clwb, clzero, cmov, cmp_legacy, constant_tsc, cpuid, cr8_legacy, cx16, cx8, de, decodeassists, erms, extd_apicid, f16c, flushbyasid, fma, fpu, fsgsbase, fsrm, fxsr, fxsr_opt, ht, hypervisor, invpcid, lahf_lm, lm, mca, mce, misalignsse, mmx, mmxext, movbe, msr, mtrr, nonstop_tsc, nopl, npt, nrip_save, nx, osvw, osxsave, pae, pat, pausefilter, pcid, pclmulqdq, pdpe1gb, pfthreshold, pge, pni, popcnt, pse, pse36, rdpid, rdpru, rdrand, rdrnd, rdseed, rdtscp, rep_good, sep, sha, sha_ni, smap, smep, sse, sse2, sse4_1, sse4_2, sse4a, ssse3, svm, syscall, topoext, tsc, tsc_known_freq, tsc_reliable, tsc_scale, umip, user_shstk, v_vmsave_vmload, vaes, vmcb_clean, vme, vmmcall, vpclmulqdq, xgetbv1, xsave, xsavec, xsaveerptr, xsaveopt, xsaves.

Compiler version: c++ (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0.

Compiler flags: -march=native.

perftest commands

taskset -c 0 /home/runner/work/finufft/finufft/builds/master/perftest/perftest --arg --prec=f --N1=10000.0 --N2=1 --N3=1 --ntransf=1 --threads=1 --M=10000000.0 --tol=0.002 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=1
taskset -c 0 /home/runner/work/finufft/finufft/builds/master/perftest/perftest --arg --prec=f --N1=10000.0 --N2=1 --N3=1 --ntransf=1 --threads=1 --M=10000000.0 --tol=0.002 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=2
taskset -c 0 /home/runner/work/finufft/finufft/builds/master/perftest/perftest --arg --prec=f --N1=10000.0 --N2=1 --N3=1 --ntransf=1 --threads=1 --M=10000000.0 --tol=0.002 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=3
taskset -c 0 /home/runner/work/finufft/finufft/builds/master/perftest/perftest --arg --prec=d --N1=10000.0 --N2=1 --N3=1 --ntransf=1 --threads=1 --M=10000000.0 --tol=1e-09 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=1
taskset -c 0 /home/runner/work/finufft/finufft/builds/master/perftest/perftest --arg --prec=d --N1=10000.0 --N2=1 --N3=1 --ntransf=1 --threads=1 --M=10000000.0 --tol=1e-09 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=2
taskset -c 0 /home/runner/work/finufft/finufft/builds/master/perftest/perftest --arg --prec=d --N1=10000.0 --N2=1 --N3=1 --ntransf=1 --threads=1 --M=10000000.0 --tol=1e-09 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=3
taskset -c 0 /home/runner/work/finufft/finufft/builds/master/perftest/perftest --arg --prec=f --N1=320 --N2=320 --N3=1 --ntransf=1 --threads=1 --M=10000000.0 --tol=0.0001 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=1
taskset -c 0 /home/runner/work/finufft/finufft/builds/master/perftest/perftest --arg --prec=f --N1=320 --N2=320 --N3=1 --ntransf=1 --threads=1 --M=10000000.0 --tol=0.0001 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=2
taskset -c 0 /home/runner/work/finufft/finufft/builds/master/perftest/perftest --arg --prec=f --N1=320 --N2=320 --N3=1 --ntransf=1 --threads=1 --M=10000000.0 --tol=0.0001 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=3
taskset -c 0 /home/runner/work/finufft/finufft/builds/master/perftest/perftest --arg --prec=d --N1=320 --N2=320 --N3=1 --ntransf=1 --threads=1 --M=10000000.0 --tol=1e-09 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=1
taskset -c 0 /home/runner/work/finufft/finufft/builds/master/perftest/perftest --arg --prec=d --N1=320 --N2=320 --N3=1 --ntransf=1 --threads=1 --M=10000000.0 --tol=1e-09 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=2
taskset -c 0 /home/runner/work/finufft/finufft/builds/master/perftest/perftest --arg --prec=d --N1=320 --N2=320 --N3=1 --ntransf=1 --threads=1 --M=10000000.0 --tol=1e-09 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=3
/home/runner/work/finufft/finufft/builds/master/perftest/perftest --prec=f --N1=320 --N2=320 --N3=1 --ntransf=1 --threads=0 --M=10000000.0 --tol=0.0001 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=1
/home/runner/work/finufft/finufft/builds/master/perftest/perftest --prec=f --N1=320 --N2=320 --N3=1 --ntransf=1 --threads=0 --M=10000000.0 --tol=0.0001 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=2
/home/runner/work/finufft/finufft/builds/master/perftest/perftest --prec=f --N1=320 --N2=320 --N3=1 --ntransf=1 --threads=0 --M=10000000.0 --tol=0.0001 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=3
/home/runner/work/finufft/finufft/builds/master/perftest/perftest --prec=d --N1=192 --N2=192 --N3=128 --ntransf=1 --threads=0 --M=10000000.0 --tol=1e-07 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=1
/home/runner/work/finufft/finufft/builds/master/perftest/perftest --prec=d --N1=192 --N2=192 --N3=128 --ntransf=1 --threads=0 --M=10000000.0 --tol=1e-07 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=2
/home/runner/work/finufft/finufft/builds/master/perftest/perftest --prec=d --N1=192 --N2=192 --N3=128 --ntransf=1 --threads=0 --M=10000000.0 --tol=1e-07 --n_runs=15 --sort=1 --upsampfact=0 --kerevalmethod=1 --debug=0 --bandwidth=1.0 --type=3

DiamonDinoia · 2026-06-12T16:05:30Z

It is good practice to open a PR only once ctest and the other unit tests at least pass locally. If they break, is the test wrong or the code wrong?

Best sigma derived from complexity.

8d55f43

github-actions Bot added a commit that referenced this pull request Jun 11, 2026

Perftest image PR #868 @ 8d55f43 [no ci]

9962219

magisterbrown wants to merge 1 commit into flatironinstitute:master from magisterbrown:best-complexity-simd
