Comparing Networks of different sample size #136

Open
nmacknight opened this issue Feb 12, 2025 · 0 comments

Hi!

Loving NetCoMi!

I have datasets of different sizes and would like to compare network metrics across them. However, from my own troubleshooting I can see that the number of replicates (samples) affects the network metrics, so ideally every network being compared would be built from the same number of replicates.

My question is: how do people handle different replicate sizes when constructing networks they intend to compare?

Here I have four datasets: Resistant has 39 samples and the others have 150, 170, and 200. I ran a loop that subsamples each dataset at 10 to 200 replicates, in steps of 10, to see whether there is a saturation point beyond which adding more samples no longer affects the network metrics. By eye, anything over 50 replicates isn't adding much clarity. The red vertical line marks 39 samples. This is the same idea as checking read depth with a rarefaction curve.
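For anyone who wants to reproduce the idea without my data, here is a minimal Python sketch of that loop. It is not my actual code and does not call NetCoMi: `edge_density` is a hypothetical toy stand-in for a real network metric, `saturation_curve` and its parameters are names made up for illustration, and the data are simulated noise.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def edge_density(samples, threshold=0.3):
    """Toy stand-in metric: fraction of taxon pairs with |r| >= threshold."""
    n_taxa = len(samples[0])
    cols = [[row[j] for row in samples] for j in range(n_taxa)]
    pairs = [(i, j) for i in range(n_taxa) for j in range(i + 1, n_taxa)]
    edges = sum(abs(pearson(cols[i], cols[j])) >= threshold for i, j in pairs)
    return edges / len(pairs)


def saturation_curve(samples, metric, sizes, n_repeats=20, seed=0):
    """Mean and SD of `metric` over random subsamples (without replacement) per size."""
    rng = random.Random(seed)
    curve = {}
    for size in sizes:
        if size > len(samples):
            continue  # cannot draw more samples than the dataset holds
        values = [metric(rng.sample(samples, size)) for _ in range(n_repeats)]
        curve[size] = (statistics.fmean(values), statistics.pstdev(values))
    return curve


# Simulated data only (assumption): 200 samples x 8 taxa of independent noise.
rng = random.Random(1)
data = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
curve = saturation_curve(data, edge_density, sizes=range(10, 201, 10))
for size, (mean, sd) in curve.items():
    print(f"{size:>3} samples: density = {mean:.3f} (SD {sd:.3f})")
```

Plotting the mean (with the SD as error bars) against the subsample size gives the kind of saturation curve I am describing. In a real analysis the metric function would wrap the network construction and analysis steps instead of this toy density.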

Should I just subsample to the lowest replicate number among the datasets I plan to compare, so that each dataset randomly draws 39 replicates? The catch is that the 39 samples drawn from the larger datasets will differ slightly each time. If I wanted to capture that variation I could resample iteratively, say 1000 times, but then my Resistant dataset would show no variation, because all of its samples are drawn every time, which is a bias. So I should consider sampling fewer than 39 samples, but how can that number be selected in a formal way? This is just my train of thought; I am looking for a formal decision-making method for comparing networks of different replicate sizes. Thank you for your time!
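To make the concern concrete, here is a rough Python sketch of the repeated-subsampling scheme. Everything except the group sizes and the name Resistant is an assumption for illustration: the other group names, the function `subsample_to_common_size`, the simulated per-sample values, and the use of a plain mean as a stand-in for a network metric.

```python
import random
import statistics


def subsample_to_common_size(groups, metric, m, n_repeats=1000, seed=0):
    """For each group, compute `metric` on n_repeats random subsets of size m."""
    rng = random.Random(seed)
    results = {}
    for name, samples in groups.items():
        if m > len(samples):
            raise ValueError(f"m={m} exceeds the {len(samples)} samples in {name!r}")
        # Draw without replacement, so m == len(samples) always returns the full set.
        values = [metric(rng.sample(samples, m)) for _ in range(n_repeats)]
        results[name] = (statistics.fmean(values), statistics.pstdev(values))
    return results


# Simulated data only: one number per sample, group sizes as in this issue.
# "Group B", "Group C", and "Group D" are placeholder names.
rng = random.Random(42)
sizes = {"Resistant": 39, "Group B": 150, "Group C": 170, "Group D": 200}
groups = {name: [rng.gauss(0, 1) for _ in range(n)] for name, n in sizes.items()}

toy_metric = statistics.fmean  # hypothetical stand-in for a network metric

at_min = subsample_to_common_size(groups, toy_metric, m=39)
below_min = subsample_to_common_size(groups, toy_metric, m=30)

for label, res in (("m=39", at_min), ("m=30", below_min)):
    for name, (mean, sd) in res.items():
        print(f"{label} {name:<10} mean = {mean:+.3f}  SD = {sd:.3f}")
```

With `m=39` the Resistant group has an SD of zero across all 1000 draws, while the larger groups vary; with `m=30` every group varies. The value 30 here is arbitrary, which is exactly the choice I would like a formal rule for.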

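[Image: network metrics plotted against the number of sampled replicates; a red vertical line marks 39 samples]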