
Convergence of bootstrapping results, nan values for small sample sizes #7

@privong

Description

If the sample size is small (or the number of bootstraps is large), some bootstrap iterations can produce an undefined correlation coefficient, which is returned as nan. Because np.percentile() propagates nan, pymccorrelation() then returns nan as well. If there are many nan values, this probably suggests the bootstrapping is not well-converged. When looking at the mock dataset to check recovery (#4), the convergence of the bootstrapping would be good to consider.
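A minimal illustration of the behaviour described above, using a made-up array in place of real bootstrap output (the values and the 16/50/84 percentile choice are assumptions for the example, not taken from pymccorrelation()):

```python
import numpy as np

# Hypothetical bootstrap output: six coefficients, two of them undefined.
coeffs = np.array([0.2, 0.5, np.nan, 0.7, np.nan, 0.4])

# np.percentile propagates nan: every requested percentile comes back nan.
plain = np.percentile(coeffs, [16, 50, 84])

# np.nanpercentile ignores the nan entries and uses the remaining four values.
masked = np.nanpercentile(coeffs, [16, 50, 84])

# Fraction of undefined iterations, a rough indicator of how
# trustworthy the bootstrap distribution is.
nan_fraction = np.mean(np.isnan(coeffs))

print(plain, masked, nan_fraction)
```

The nan fraction is the quantity that would need a threshold if a warning were added.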

Ultimately, decide whether np.nanpercentile() should be used instead, optionally with a warning if the dataset is too small for reliable bootstrap error estimation.
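One possible shape for that option, as a sketch only. The helper name summarize_bootstrap, the max_nan_fraction parameter, and its default of 0.1 are all hypothetical and would need to be chosen based on the convergence checks:

```python
import warnings

import numpy as np


def summarize_bootstrap(coeffs, percentiles=(16, 50, 84), max_nan_fraction=0.1):
    """Percentiles of bootstrap coefficients, ignoring nan entries.

    Hypothetical helper: warns when the fraction of nan entries exceeds
    max_nan_fraction (an arbitrary placeholder threshold).
    """
    coeffs = np.asarray(coeffs, dtype=float)
    if coeffs.size == 0:
        raise ValueError("no bootstrap coefficients supplied")

    nan_fraction = float(np.mean(np.isnan(coeffs)))
    if nan_fraction > max_nan_fraction:
        warnings.warn(
            f"{nan_fraction:.0%} of bootstrap coefficients are nan; "
            "the sample may be too small for reliable error estimation",
            RuntimeWarning,
        )

    # Nothing usable left: return nan for each percentile explicitly,
    # which avoids numpy's own all-nan warning.
    if nan_fraction == 1.0:
        return np.full(len(percentiles), np.nan)

    return np.nanpercentile(coeffs, percentiles)


result = summarize_bootstrap([0.2, np.nan, 0.4, np.nan])
print(result)
```

Returning the percentiles and warning, rather than raising, keeps existing callers working while still flagging poorly converged results.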

There is probably statistics literature about this too...
