Translational efficiency analysis doesn't work when there are two conditions with less than three replicates #66

Open
Xinlei-Gao opened this issue Aug 2, 2024 · 0 comments
Labels
bug Something isn't working

Description of the bug

I ran this pipeline on a dataset with matched Ribo-seq and RNA-seq data to perform differential translational efficiency analysis. The dataset contains three conditions (two knockouts and a control), each with two biological replicates.

First, I provided a contrasts file with two comparisons:

id,variable,reference,target,batch,pair
EIF5A_KO_vs_control,treatment,control,EIF5A_KO,,pair
DOHH_KO_vs_control,treatment,control,DOHH_KO,,pair

This failed with the following error:

Error in anota2seqCheckInput(dataP = Anota2seqDataSet@dataP, dataT = Anota2seqDataSet@dataT, :
Too few custom contrasts supplied.
Please check your contrast matrix.
Calls: anota2seqRun -> anota2seqCheckInput
Execution halted

Then I ran the two comparisons separately, one at a time, using two contrasts files:

contrast_1.csv:

id,variable,reference,target,batch,pair
EIF5A_KO_vs_control,treatment,control,EIF5A_KO,,pair

and contrast_2.csv:

id,variable,reference,target,batch,pair
DOHH_KO_vs_control,treatment,control,DOHH_KO,,pair
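
For anyone wanting to reproduce the one-comparison-at-a-time runs, the split can be scripted. The following is a minimal sketch, not part of the pipeline: `split_contrasts` is a hypothetical helper, and naming each output after the contrast `id` is an assumption (the files above were named `contrast_1.csv` and `contrast_2.csv` by hand).

```python
import csv
import io


def split_contrasts(text: str) -> dict[str, str]:
    """Return one single-row contrasts CSV (as text) per contrast id.

    Hypothetical helper for illustration; it only assumes the contrasts
    file has a header row containing an `id` column, as shown above.
    """
    reader = csv.DictReader(io.StringIO(text))
    per_contrast = {}
    for row in reader:
        buf = io.StringIO()
        writer = csv.DictWriter(
            buf, fieldnames=reader.fieldnames, lineterminator="\n"
        )
        writer.writeheader()  # repeat the original header in every file
        writer.writerow(row)  # keep exactly one comparison per file
        per_contrast[row["id"]] = buf.getvalue()
    return per_contrast


contrasts = """id,variable,reference,target,batch,pair
EIF5A_KO_vs_control,treatment,control,EIF5A_KO,,pair
DOHH_KO_vs_control,treatment,control,DOHH_KO,,pair
"""

per_contrast = split_contrasts(contrasts)
for contrast_id, body in per_contrast.items():
    print(f"--- {contrast_id}.csv ---")
    print(body, end="")
```

Each value can then be written to disk (for example with `pathlib.Path.write_text`) and passed to `--contrasts` in a separate run.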

Each run worked fine until it reached the translational efficiency analysis with 'anota2seq', which reported an error:

Error in anota2seqCheckInput(dataP, dataT, phenoVec, batchVec, NULL, "BH", :
Sample class control has less than three samples.
anota2seq needs at least 3 samples per sample class if there are only 2 sample classes.
Calls: do.call -> <Anonymous> -> anota2seqCheckInput
Execution halted

I checked the 'anota2seq' documentation, and it turns out that when there are only two conditions, 'anota2seq' requires at least three replicates per condition.

I think it would help to state in the pipeline documentation that "the step of translational efficiency analysis can only deal with two conditions with at least three replicates in each condition", so that users can skip this step if their data do not meet the requirement.
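
Until that is documented, the requirement from the error message can be checked before launching a run. The sketch below is only an illustration under stated assumptions: `replicate_problems` is a hypothetical helper, the sample sheet is reduced to three columns, and the `type` column with values `riboseq` and `rnaseq` is assumed rather than taken from the pipeline's schema. The `treatment` column and the class names come from the contrasts file above, and the threshold of 3 comes from the 'anota2seq' message.

```python
import csv
import io
from collections import Counter

# Threshold quoted in the anota2seq error: at least 3 samples per sample
# class when a comparison involves only 2 sample classes.
MIN_REPLICATES = 3


def replicate_problems(samplesheet_text, reference, target,
                       variable="treatment", type_column="type"):
    """List the classes of one two-class contrast that have too few samples.

    Hypothetical pre-flight check; the column names are assumptions.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(samplesheet_text)):
        counts[(row[type_column], row[variable])] += 1

    problems = []
    for data_type in sorted({t for t, _ in counts}):
        for sample_class in (reference, target):
            n = counts[(data_type, sample_class)]  # 0 if the class is absent
            if n < MIN_REPLICATES:
                problems.append(
                    f"{data_type}: class '{sample_class}' has {n} sample(s), "
                    f"need {MIN_REPLICATES}"
                )
    return problems


# Reduced, made-up sample sheet mirroring the design described in this issue:
# two replicates per condition for both data types.
samplesheet = """sample,type,treatment
ctrl_ribo_1,riboseq,control
ctrl_ribo_2,riboseq,control
ctrl_rna_1,rnaseq,control
ctrl_rna_2,rnaseq,control
eif5a_ribo_1,riboseq,EIF5A_KO
eif5a_ribo_2,riboseq,EIF5A_KO
eif5a_rna_1,rnaseq,EIF5A_KO
eif5a_rna_2,rnaseq,EIF5A_KO
"""

problems = replicate_problems(samplesheet, "control", "EIF5A_KO")
for line in problems:
    print(line)
```

An empty list would mean the contrast satisfies the two-class rule; with the design in this issue every class is flagged.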

I also saw that 'anota2seq' can perform the analysis when there are more than two conditions, each with fewer than three replicates, but this requires special parameter settings. It would be ideal to support that in the pipeline, if applicable.

Thank you for your contributions!

Command used and terminal output

nextflow run nf-core/riboseq \
   -profile singularity \
   --input samplesheet.csv \
   --contrasts contrasts.csv \
   --multiqc_title 'multiQCReport' \
   --fasta /nfs/genomes/human_hg38_dec13_no_random/fasta_whole_genome/hg38.fa \
   --gtf /nfs/genomes/human_hg38_dec13_no_random/gtf/Homo_sapiens.GRCh38.106.canonical.gtf \
   --outdir ./nextflow_RiboSeq

Relevant files

No response

System information

No response
