Use randomized sobol sampling #167

Vaibhavdixit02 · 2024-05-17T23:17:19Z

Checklist

Appropriate tests were added
Any code changes were done in a way that does not break public API
All documentation related to code changes were updated
The new code follows the contributor guidelines, in particular the SciML Style Guide and COLPRAC.
Any new documentation only uses public API

Additional context

Fixes #166

src/sobol_sensitivity.jl

Co-authored-by: David Widmann <[email protected]>

codecov · 2024-05-29T23:41:26Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.83%. Comparing base (71fcc16) to head (facbac8).

❗ Current head facbac8 differs from pull request most recent head 73c74eb

Please upload reports for the commit 73c74eb to get more accurate results.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #167      +/-   ##
==========================================
+ Coverage   92.78%   92.83%   +0.04%     
==========================================
  Files          11       11              
  Lines         832      837       +5     
==========================================
+ Hits          772      777       +5     
  Misses         60       60

devmotion · 2024-05-30T01:23:33Z

test/sobol_method.jl

@@ -22,7 +22,7 @@ function linear(X)
    A * X[1] + B * X[2]
 end

-n = 600000
+n = 524288
 lb = -ones(4) * π
 ub = ones(4) * π
 sampler = SobolSample()


This should be changed as well, shouldn't it?
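The diff above replaces n = 600000 with n = 524288, which matches the commit message "Change number of samples in sobol to 2^n", since 524288 = 2^19. A minimal sketch, using only Julia Base functions, of how such a sample count can be checked or derived from an arbitrary budget (the variable names are illustrative and not taken from the PR):

```julia
# Sample counts from the diff above
n_old = 600000
n_new = 524288

# Check whether each count is a power of two
old_is_pow2 = ispow2(n_old)
new_is_pow2 = ispow2(n_new)

# Largest power of two that does not exceed the old budget
n_rounded = prevpow(2, n_old)

println("n_old is a power of two: ", old_is_pow2)
println("n_new is a power of two: ", new_is_pow2)
println("prevpow(2, n_old) = ", n_rounded)
```

Rounding down with prevpow keeps the cost within the original budget; nextpow(2, n_old) would round up instead.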

Vaibhavdixit02 · 2024-05-30

I read the comment from the PR that added RQMC, which you had put in the issue. The OP mentions:

"To nuance my words, for the use case M=2 of sensitivity analysis this is probably not as bad as I show for small enough cases."

So it makes sense to enforce the randomization only for bootstrap cases. What do you think? Changing this significantly affects the results of these trivial tests, and the index values for them are pretty commonly accepted. To be honest, I guess the new values could be justified a bit better; I am just worried about users coming from R who try to match the values, and about then having to defend the difference.

devmotion · 2024-05-30 (edited)

Regardless of what this PR ends up doing, I think the tests, examples, and recommendations should consistently do the same thing as the gsa implementations with the samples kwarg. (Is this function tested at all if the changes do not affect the tests?)

My main concern is that doing non-standard things with low-discrepancy sequences is generally a bad idea. E.g., even just omitting the initial point of zeros (as done in Sobol.jl) is a bad idea, as discussed in the issue in Sobol.jl and shown in https://arxiv.org/pdf/2008.08051. What's currently done in GlobalSensitivity, and what IMO rightfully shows a warning to users, is that the samples * num_mats samples are generated by taking a single consecutive subset of samples * num_mats points and partitioning it: https://github.com/SciML/QuasiMonteCarlo.jl/blob/5c5483565d5b6a083256861bcf7e00cf7075c5f4/src/RandomizedQuasiMonteCarlo/iterators.jl#L268-L271 In general this won't preserve any of the distribution properties of the original sequence.

Switching to a proper RQMC approach would be much more sound. However, as far as I know, generally it shouldn't lead to (much) different/worse estimates of integrals, so something seems off or some other part of the algorithm seems to interact badly with the randomized sequences. A quick search reveals quite a few papers though regarding global sensitivity and RQMC, so I think it should work and be an improvement in general.

Regardless of the mathematical details, IMO just sticking with the current implementation is not a good idea for another reason: Users might get the same results as with other (R) packages - but every time they run gsa without pre-computed design matrices they will see a warning.

So if it is still unclear how to fix the RQMC issues, I suggest a temporary workaround: do a single sample call (as in the implementation of generate_design_matrices that's currently used) in tests, examples, and gsa. That at least makes it apparent how the samples are generated and fixes the warning. It should even lead to a more efficient gsa implementation, since some of the partitioning and hcat-ing can be avoided and only a single split is needed. It would probably still be good to add a comment about this choice, and links to RQMC, in the docs.
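One possible reading of the "single sample call" workaround is sketched below. It assumes the QuasiMonteCarlo.jl package is installed and that QuasiMonteCarlo.sample(n, lb, ub, SobolSample()) returns a d × n matrix. The doubling of the bounds with repeat, the split by rows, and the names d, X, A, and B are illustrative assumptions, not code from this PR.

```julia
using QuasiMonteCarlo

d = 4                  # number of model inputs, as in the test above
n = 2^10               # power-of-two sample count (smaller than the test's, for speed)
lb = -ones(d) * π
ub = ones(d) * π

# One sample call in 2d dimensions instead of two separate calls in d dimensions
X = QuasiMonteCarlo.sample(n, repeat(lb, 2), repeat(ub, 2), SobolSample())

# A single split along the dimension axis yields the two design matrices
A = X[1:d, :]
B = X[(d + 1):(2d), :]

println(size(A), " ", size(B))
```

The sampler here is the unrandomized SobolSample(), so this only makes the sample generation explicit; it does not address the RQMC question discussed above.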

Vaibhavdixit02 · 2024-06-11

This was pretty helpful. I still need to get to it though, @devmotion; sorry for the delay.

Is this something blocking or making your work harder?

Update sobol_sensitivity.jl

52a924d

Vaibhavdixit02 mentioned this pull request May 17, 2024

Warnings because design matrices for Sobol's method are not randomized #166

Open

devmotion reviewed May 18, 2024

src/sobol_sensitivity.jl (outdated, resolved)

Vaibhavdixit02 and others added 2 commits May 18, 2024 12:49

Update src/sobol_sensitivity.jl

8142a18

Co-authored-by: David Widmann <[email protected]>

Update sobol_sensitivity.jl

846f306

devmotion reviewed May 18, 2024

src/sobol_sensitivity.jl (outdated, resolved)

Vaibhavdixit02 and others added 2 commits May 20, 2024 11:45

Update src/sobol_sensitivity.jl

a0d9364

Co-authored-by: David Widmann <[email protected]>

Change number of samples in sobol to 2^n and fix shapley indexing

04ab43b

rem old qmc support

d060c3c

Vaibhavdixit02 force-pushed the Vaibhavdixit02-patch-1 branch from e12a571 to b49608a, May 30, 2024 00:08

format

8fa44eb

Vaibhavdixit02 force-pushed the Vaibhavdixit02-patch-1 branch from b49608a to 8fa44eb, May 30, 2024 00:14

devmotion reviewed May 30, 2024

src/sobol_sensitivity.jl (outdated, resolved)

src/sobol_sensitivity.jl (resolved)

Vaibhavdixit02 and others added 3 commits May 29, 2024 20:23

Update src/sobol_sensitivity.jl

ff2dc3c

Co-authored-by: David Widmann <[email protected]>

Docs and tests

65f5d8d

sobol test value update

facbac8

devmotion reviewed May 30, 2024

missed updates

3b7cf99

Vaibhavdixit02 force-pushed the Vaibhavdixit02-patch-1 branch from 52c3ddf to 3b7cf99, May 30, 2024 01:39

Vaibhavdixit02 added a commit that referenced this pull request May 30, 2024

Move Shapley fix from #167 (#168)

5a389f3

* Update shapley_sensitivity.jl * Update shapley_method.jl

Merge branch 'master' into Vaibhavdixit02-patch-1

19e83ad

Vaibhavdixit02 commented May 30, 2024

src/shapley_sensitivity.jl (outdated, resolved)

Update src/shapley_sensitivity.jl

73c74eb
