Conversation

@davidnwobi commented Aug 29, 2024

Updated the models that require 1D integration to use the new integration system. The new models have "_high_res" appended to their names.

The updated models are:

  • Core-Shell Bicelle
  • Core-Shell Cylinder
  • Core-Shell Ellipsoid
  • Cylinder
  • Ellipsoid
  • Flexible Cylinder with Elliptical Cross-section
  • Hollow Cylinder

Details of the method can be found in this repo.

@pkienzle (Contributor) commented Sep 6, 2024

(1) If you use Δq = 2 π / d_max as the period spacing and select, say, k=5 sample points per period, then you get a ballpark estimate for the number of integration points by looking at the arc length at radius |q|. That is, Δq = |q| k Δθ, so the step size in theta would be Δθ = Δq/(k |q|), and the number of steps would be n = (π/2) / Δθ = k |q| d_max / 4.

How does this value compare to the n estimated by Gauss-Kronrod rule? If it is good enough, then we can apply it more easily across all the shape models without a bespoke spline fit for each model type.
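
For a concrete feel for this estimate, here is a minimal sketch (my own illustration, not part of the PR; the d_max and q values are arbitrary):

import numpy as np

# n = (pi/2)/dtheta with dtheta = dq/(k|q|) and dq = 2*pi/d_max,
# which reduces to n = k*|q|*d_max/4
def n_estimate(q, d_max, k=5):
    return np.ceil(k*np.abs(q)*d_max/4).astype(int)

q = np.logspace(-3, 0, 5)             # 1/Ang
print(n_estimate(q, d_max=2000))      # e.g. d_max ~ 2000 Ang for a long rod
# roughly 3 points at q=1e-3 rising to 2500 points at q=1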

(2) Rather than generic Gauss-Legendre, we could look at the structure of the function and select more points where it is larger (importance sampling). The problematic shapes have high eccentricity, which will show up near θ=0° or θ=90°.
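
One cheap way to concentrate points near the ends of the interval is a change of variables whose Jacobian vanishes at θ=0 and θ=π/2, so that ordinary Gauss-Legendre points cluster there. A minimal sketch of the idea (my own, not necessarily what the final implementation would use):

import numpy as np

def clustered_average(f, n):
    # integrate f(theta) over [0, pi/2] with theta(u) = (pi/4)*(1 - cos(pi*u));
    # the Jacobian (pi^2/4)*sin(pi*u) vanishes at both ends, clustering points there
    z, w = np.polynomial.legendre.leggauss(n)
    u = 0.5*(z + 1.0)                     # map [-1, 1] onto [0, 1]
    theta = (np.pi/4)*(1.0 - np.cos(np.pi*u))
    jac = (np.pi**2/4)*np.sin(np.pi*u)
    return 0.5*np.sum(w*f(theta)*jac)     # extra 0.5 from the [-1,1] -> [0,1] map

print(clustered_average(np.sin, 64))      # sanity check: integral of sin over [0, pi/2] is 1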

(3) There are analytic approximations for large disks and long rods (#109). Can these be used for large q? Can they be extended to other models?

(4) The current code cannot run in OpenCL since the arrays are too large to put in constant memory. Not sure how to allocate them in main memory and make them accessible to the kernels.

(5) We could unroll the loops, computing the different (q, θ) values in parallel, then summing afterward. This would speed up the kernels a lot on high end graphics cards.

(6) Ideally we would update all models to use dynamic integration rather than having normal and high-resolution versions. It would be nice to have a mechanism to set the precision.

(7) What is the target precision on your models? Do you really need 32000 angles for them? You may need to be more careful about summing the terms to avoid loss in precision when adding many small numbers to a large total.
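
On the summation point, compensated (Kahan) summation is the usual trick for adding many small terms to a large total; a minimal sketch of the idea, independent of the sasmodels kernels:

import math

def kahan_sum(values):
    # carry the rounding error of each addition forward into the next one
    total, c = 0.0, 0.0
    for v in values:
        y = v - c
        t = total + y
        c = (t - total) - y
        total = t
    return total

values = [1e8] + [1e-8]*1_000_000     # a million tiny terms after one large one
print(sum(values) - 1e8)              # plain sum drifts (roughly 0.0149 instead of 0.01)
print(kahan_sum(values) - 1e8)        # compensated sum stays near 0.01
print(math.fsum(values) - 1e8)        # exactly rounded, for comparison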

@pkienzle (Contributor) commented Sep 6, 2024

Note that you can adjust the number of points in the loop for an existing model. You can see this with

python -m sasmodels.compare -ngauss=512 ellipsoid

This is roughly equivalent to the following [untested]:

from sasmodels.core import load_model_info
from sasmodels.generate import set_integration_size
model_info = load_model_info('ellipsoid')
set_integration_size(model_info, 4096)

SasView could do this now, giving the number of theta points as a control parameter and sending that to sasmodels before evaluating. Not as good as having the model set the number of theta points as done in this PR, but it immediately applies to all models.
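
In the same untested spirit, the resized model could then be built and evaluated through the usual sasmodels direct-model calls (the parameter values below are placeholders):

import numpy as np
from sasmodels.core import load_model_info, build_model
from sasmodels.generate import set_integration_size
from sasmodels.direct_model import call_kernel

model_info = load_model_info('ellipsoid')
set_integration_size(model_info, 4096)   # bump the number of theta points
model = build_model(model_info)          # recompile with the larger table
q = np.logspace(-3, 0, 200)
kernel = model.make_kernel([q])
Iq = call_kernel(kernel, dict(radius_polar=100, radius_equatorial=2000))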

@davidnwobi (Author) commented Sep 6, 2024

Thank you for your feedback; there are several aspects here that I should have considered earlier.

  1. Comparison of integration points: I'll take a closer look and compare your method to the Gauss-Kronrod rule. While I'm no longer working with STFC, I do have some free time, so I'll run the comparisons and see how they stack up.

  2. Importance sampling: I completely agree. For the cylinder model, peaks do occur near the ends of the interval, so targeting those areas would definitely improve efficiency. While it might involve more work, the payoff could be significant.

  3. Analytical approximations: They can be used for large values of q, but it also depends on the values of the other parameters. For example, for the cylinder model there are good approximations when either $q R$ or $\frac{q L}{2}$ is small. For other models, we might be able to find regions where certain parameters are small and see if the remaining part of the function can be integrated analytically.

  4. OpenCL compatibility: I've had similar issues with OpenCL. I haven't been able to get either the updated or default models running with OpenCL, whether in sasmodels or SasView. This might be a bug. Interestingly, CUDA works fine, and AMD's parallel processing works but runs out of memory for the updated models (integrated graphics, though). OpenCL definitely needs further investigation.

  5. Loop unrolling: I could check whether this is at least being done for theta at the moment.

  6. Having precision control would be ideal, especially if the tolerance can vary based on the model or region. This looks doable, but it would require some more work.

  7. What is the target precision on your models?

The target is a relative tolerance of $1 \times 10^{-3}$.
As for needing 32,000 angles: no, not for the entire parameter space. Gauss quadrature struggles with highly oscillatory functions; its convergence becomes very slow as the integrand becomes more oscillatory. A quick example for a decaying oscillatory function (a short sketch reproducing these numbers follows the error lists below):
$e^{- \sqrt{k} x} \sin(k x)$

With $k = 4000$:

  • $n = 512$; Rel error = $3.547640 \times 10^{-1}$
  • $n = 600$; Rel error = $4.000811 \times 10^{-2}$
  • $n = 700$; Rel error = $4.470038 \times 10^{-4}$
  • $n = 800$; Rel error = $1.042314 \times 10^{-4}$
  • $n = 900$; Rel error = $5.589838 \times 10^{-7}$
  • $n = 1024$; Rel error = $8.715261 \times 10^{-12}$

With $k = 400000$:

  • $n = 16384$; Rel error = $2.155721 \times 10^{0}$
  • $n = 20768$; Rel error = $1.492304 \times 10^{-1}$
  • $n = 22768$; Rel error = $5.929450 \times 10^{-2}$
  • $n = 24768$; Rel error = $7.977614 \times 10^{-3}$
  • $n = 26768$; Rel error = $2.309563 \times 10^{-4}$
  • $n = 28768$; Rel error = $4.160178 \times 10^{-4}$
  • $n = 30768$; Rel error = $3.771680 \times 10^{-5}$
  • $n = 32768$; Rel error = $5.919487 \times 10^{-6}$
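
For reference, here is a minimal sketch of this kind of convergence check (the integration interval $[0, 1]$ is my assumption, so the exact error values may differ from the lists above):

import numpy as np

def gauss_rel_error(k, n, upper=1.0):
    # relative error of n-point Gauss-Legendre for exp(-sqrt(k)*x)*sin(k*x) on [0, upper]
    a = np.sqrt(k)
    # analytic value from the antiderivative of exp(-a x) sin(k x)
    exact = (k - np.exp(-a*upper)*(a*np.sin(k*upper) + k*np.cos(k*upper)))/(a*a + k*k)
    z, w = np.polynomial.legendre.leggauss(n)
    x = 0.5*upper*(z + 1.0)              # map [-1, 1] onto [0, upper]
    approx = 0.5*upper*np.sum(w*np.exp(-a*x)*np.sin(k*x))
    return abs(approx - exact)/abs(exact)

for n in (512, 600, 700, 800, 900, 1024):
    print(n, gauss_rel_error(4000, n))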

Due to memory/space limitations, the number of points is selected from a discrete set (powers of 2, $2^1$ through $2^{15}$), so that only those point sets need to be stored and loaded into memory. As a result, if it selects 32,768 points, that only means 16,384 wouldn't have been enough; it doesn't check any of the possibilities in between. It could have gotten away with fewer points, but it can't figure that out. A bit crude, but it helps. Do you have any suggestions on a better way to handle this?

I'll look into some of these issues:

  • Comparing with the Gauss-Kronrod rule
  • A dynamic way of setting the tolerance

@wpotrzebowski commented

This PR is pointing to SasView:master. Shouldn't it be main?

@wpotrzebowski commented

This PR is pointing to SasView:master. Shouldn't it be main?

Sorry, my bad, it is sasmodels...

@pkienzle (Contributor) commented Feb 1, 2025

By hacking the cylinder high res model so that it returns $n = 2^w$, where $w$ is the value returned by the poly_eval function, I can display the number of quadrature points $n$ as a function of Q.

Here is a plot of $n$ for radius R=1000, length L=20. The polynomial uses QR and QL, so this is showing the behaviour for radius up to 1000000 when $Q<1$. It is not linear in Q as I was predicting.

[plot: number of quadrature points $n$ as a function of Q for R=1000, L=20]

Changing to R=1000, L=2000 so that the aspect ratio is larger changes the shape, but generally is within a factor of two:
[plot: $n$ as a function of Q for R=1000, L=2000]

Core shell bicelle $n$ scales similarly with the outer radius and cylinder length.

Hollow cylinder gives strange results, cutting off at $n=300$. This happens even for radius=0, which means it should have the same behaviour as cylinder. I didn't look into the polynomial generator to see if it has artificial limits applied. [Note: I used the Integral-Fitting-Tools repo for my tests.]

Flexible cylinder elliptical $n$ is driven primarily by radius. Curiously, within existing sasmodels the flexible cylinder elliptical with one segment (i.e., kuhn length = length) is more accurate than the cylinder model when given a very long cylinder length. Maybe something special in Sk_WR?

Before we consider merging, we should spot check some of these results using sympy and mpmath. This can be added to explore/symint.py. For now symint is only checking whether increasing $n$ changes the answer, not whether the answer it gives is correct.
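
For example, a spot check along these lines could go into explore/symint.py (a sketch only: the kernel below is the bare cylinder orientation average without volume or contrast factors, and the parameter values are arbitrary):

from mpmath import mp, besselj, sin, cos, pi, quad
import numpy as np

mp.dps = 30
q, R, L = 0.05, 20.0, 400.0            # arbitrary test point

def kernel(theta):
    # cylinder amplitude 2 J1(q R sin t)/(q R sin t) * sinc(q L cos t / 2), squared, times sin t
    a, b = q*R*sin(theta), q*L*cos(theta)/2
    j1c = 2*besselj(1, a)/a if a != 0 else mp.mpf(1)
    sinc = sin(b)/b if b != 0 else mp.mpf(1)
    return (j1c*sinc)**2*sin(theta)

exact = quad(kernel, [0, pi/2])        # adaptive, high-precision reference
z, w = np.polynomial.legendre.leggauss(256)
theta = (np.pi/4)*(z + 1.0)            # map [-1, 1] onto [0, pi/2]
approx = (np.pi/4)*sum(wi*float(kernel(ti)) for wi, ti in zip(w, theta))
print(float(exact), approx, abs(approx - float(exact))/abs(float(exact)))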

@pkienzle (Contributor) commented

I implemented a proof-of-concept adaptive integration for cylinder using a qr heuristic:

https://github.com/SasView/sasmodels/blob/c0eb28eebd9387a5182859e568019d65b5a81cbf/sasmodels/models/lib/adaptive.c#L11203C1-L11213C2

Here's the updated cylinderp model (p for precision):

double qr = q*(radius > length ? radius : length);
double *w, *z;
int n = gauss_weights(qr, &w, &z);
for (int i=0; i<n; i++) {
    const double theta = z[i]*zm + zb;
    double sin_theta, cos_theta; // slots to hold sincos function output
    // theta (theta,phi) is the projection of the cylinder on the detector plane
    SINCOS(theta, sin_theta, cos_theta);
    const double form = _fq(q*sin_theta, q*cos_theta, radius, length);
    total_F1 += w[i] * form * sin_theta;
    total_F2 += w[i] * form * form * sin_theta;
}

These are available on the ticket-535-adaptive-integration branch.

I called the adaptive model cylinderp and left cylinder in place so that it is easy to compare accuracy and performance.

It has similar run speed for well behaved models:

$ python -m sasmodels.compare -exq cylinder,cylinderp -ngauss=0,0 radius=10 length=20 -exq -nq=1000                                              
DLL[64] t=3.65 ms, intensity=2151.   <==== cost using 76 gauss points everywhere
DLL[64] t=4.40 ms, intensity=2151.   <==== cost for adaptive model
|DLL[64]-DLL[64]|            max:3.494e-11  median:0.000e+00  98%:1.552e-11  rms:3.981e-12  zero-offset:+8.329e-13
|(DLL[64]-DLL[64])/DLL[64]|  max:3.490e-08  median:0.000e+00  98%:1.551e-08  rms:3.979e-09  zero-offset:+8.324e-10

It is a lot slower for pathological models:

$ python -m sasmodels.compare -exq cylinder,cylinderp -ngauss=10000,0 radius=10 length=20000 -exq -nq=1000
DLL[64] t=227.83 ms, intensity=11740.    <==== cost using 10000 gauss points everywhere
DLL[64] t=97.11 ms, intensity=11740        <==== cost for adaptive model
|DLL[64]-DLL[64]|            max:3.803e-08  median:7.744e-10  98%:1.406e-08  rms:4.119e-09  zero-offset:+1.507e-09
|(DLL[64]-DLL[64])/DLL[64]|  max:1.018e-05  median:5.512e-10  98%:5.591e-06  rms:1.395e-06  zero-offset:+4.587e-07

but much more accurate:

$ python -m sasmodels.compare -exq cylinder,cylinder -ngauss=10000,0 radius=10 length=20000 -exq -nq=1000 
DLL[64] t=221.55 ms, intensity=11740   <==== cost using 10000 gauss points everywhere
DLL[64] t=4.54 ms, intensity=11739      <==== cost using 76 gauss points everywhere
|DLL[64]-DLL[64]|            max:7.673e-01  median:8.364e-03  98%:6.608e-01  rms:2.538e-01  zero-offset:+1.487e-01
|(DLL[64]-DLL[64])/DLL[64]|  *** max:3.545e+00 *** median:2.664e-02  98%:2.524e+00  rms:5.524e-01  zero-offset:+1.938e-01

It should be pretty easy to go through all our shapes and update them to adaptive.

For shapes where we are integrating over θ and φ we should consider using specialized spherical point schemes, such as Lebedev quadrature (Wikipedia).

@pkienzle (Contributor) commented Aug 1, 2025

With the repo at https://github.com/davidnwobi/Integral-Fitting-Tool and the sasmodels.compare tool we can compare model values and execution time.

Here is an example showing a 22% deviation between a 10000 point gaussian integration scheme and ellipsoid_high_res:

$ python -m sasmodels.compare -double! background=0 ellipsoid,ellipsoid_high_res.py -ngauss=10000,0 -random=656333 -pars -nq=1000
Randomize using -random=656333
==== ellipsoid =====
scale: 0.00173375
background: 0
sld: 11.8516
sld_solvent: 6.02864
radius_polar: 659.823
radius_equatorial: 33071.7
==== ellipsoid_high_res =====
scale: 0.00173375
background: 0
sld: 11.8516
sld_solvent: 6.02864
radius_polar: 659.823
radius_equatorial: 33071.7
DLL[64] t=84.29 ms, intensity=1160661912
DLL[64] t=4.17 ms, intensity=1160661682
|DLL[64]-DLL[64]|            max:1.220e+01  median:1.630e-06  98%:3.440e+00  rms:9.596e-01  zero-offset:+2.304e-01
|(DLL[64]-DLL[64])/DLL[64]|  max:2.223e-01  median:4.301e-10  98%:8.113e-03  rms:1.311e-02  zero-offset:+1.434e-03

Speed is generally comparable to the 76 point non-adaptive integration, being faster for small shapes but slower for large shapes. For example:

$ python -m sasmodels.compare -double! background=0 ellipsoid,ellipsoid_high_res.py -ngauss=76,0 -random=65012 -pars -neval=10
Randomize using -random=65012
==== ellipsoid =====
scale: 0.027392
background: 0
sld: 6.25461
sld_solvent: 5.21939
radius_polar: 96.5405
radius_equatorial: 83119.2
...
DLL[64] t=0.15 ms, intensity=10960877
DLL[64] t=1.16 ms, intensity=10968737
...

The simple adaptive integration branch, which uses qr as a heuristic to decide the number of integration points, is more accurate but generally slower than the high res branch. When running the simple adaptive branch on the GPU with polydispersity (Apple M2), the speed comparison is mixed: sometimes 2x faster and sometimes 2x slower than the high res branch on CPU.

Here's a shape with a large difference between the high res integration and the 10000 point integration (85x):

$ python -m sasmodels.compare background=0 core_shell_cylinder,core_shell_cylinder_high_res.py -double! -ngauss=10000,0 -random=83174 -nq=1000 -pars
Randomize using -random=83174
...
==== core_shell_cylinder_high_res =====
scale: 0.291548
background: 0
sld_core: 8.92736
sld_shell: 11.2834
sld_solvent: 10.7719
radius: 291.224
thickness: 33528.1
length: 1.53983
DLL[64] t=41.60 ms, intensity=36122229857
DLL[64] t=172.93 ms, intensity=28338819779
|DLL[64]-DLL[64]|            max:1.717e+08  median:7.214e+01  98%:1.290e+08  rms:2.878e+07  zero-offset:+7.783e+06
|(DLL[64]-DLL[64])/DLL[64]|  max:8.495e+01  median:3.732e-01  98%:2.028e+01  rms:6.405e+00  zero-offset:+2.910e+00

The simple adaptive scheme is also out of spec for this shape (3.5% difference).
