Scaden simulate ValueError and choosing input parameters #124

@elise-smith

Hi Kevin,

Thank you for the great package.

I am trying to run scaden simulate on a .h5ad object with ~21,000 cells and 25 cell types. I previously ran this successfully on another .h5ad object.

I am using the following command:
scaden simulate --out /data/Deconvolution/Scaden/Output/ --cells 200 --n_samples 1000 --data /data/Deconvolution/Scaden/Input/ --data-format h5ad --pattern *.h5ad

However, I receive the following error:

INFO     Datasets: ['data']                            bulk_simulator.py:84
INFO     Simulating data from data                     bulk_simulator.py:89
INFO     Loading data dataset ...                     bulk_simulator.py:141
INFO     Merging unknown cell types: ['unknown']           bulk_simulator.py:107
INFO     Subsampling data ...                         bulk_simulator.py:110
Traceback (most recent call last):
  File "/data/anaconda/envs/scaden/bin/scaden", line 8, in <module>
    sys.exit(main())
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/__main__.py", line 48, in main
    cli()
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/__main__.py", line 207, in simulate
    simulation(
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/simulate.py", line 22, in simulation
    bulk_simulator.simulate()
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/simulation/bulk_simulator.py", line 90, in simulate
    self.simulate_dataset(dataset)
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/simulation/bulk_simulator.py", line 114, in simulate_dataset
    tmp_x, tmp_y = self.create_subsample_dataset(
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/simulation/bulk_simulator.py", line 253, in create_subsample_dataset
    sample, label = self.create_subsample(x, y, celltypes)
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/simulation/bulk_simulator.py", line 305, in create_subsample
    cells_fraction = np.random.randint(0, cells_sub.shape[0], samp_fracs[i])
  File "mtrand.pyx", line 748, in numpy.random.mtrand.RandomState.randint
  File "_bounded_integers.pyx", line 1247, in numpy.random._bounded_integers._rand_int64
ValueError: high <= 0
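
For reference, the last Scaden frame in the traceback calls np.random.randint(0, cells_sub.shape[0], samp_fracs[i]). The following is a minimal sketch that reproduces the same kind of failure in isolation. It assumes (this is not confirmed from the Scaden source) that cells_sub had zero rows for at least one cell type while a non-zero number of cells was requested from it. The helper name draw_cell_indices is hypothetical and only illustrates the call.

```python
import numpy as np


def draw_cell_indices(n_available, n_requested):
    """Hypothetical stand-in for the failing call in the traceback:
    draw n_requested random row indices from n_available cells."""
    return np.random.randint(0, n_available, n_requested)


# A cell type that has cells: indices are drawn from [0, 10).
idx = draw_cell_indices(n_available=10, n_requested=5)

# Assumed failure case: a cell type with zero cells, but cells requested.
# The upper bound is then 0, which np.random.randint rejects.
try:
    draw_cell_indices(n_available=0, n_requested=5)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```

If this is indeed the cause, counting the cells per cell-type label in the input would show which label ends up empty.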

My matrix of the input .h5ad looks like this:
adata[0:5,0:5].X.todense()
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0.9539254, 1.9078507, 0. ],
[0. , 0. , 0. , 4.070004 , 0. ]

Could you please let me know how I might be able to fix this?

Additionally, do you have any advice on how to choose the --cells and --n_samples parameters, or can these generally be left at their default values?

Many thanks,
Elise
