Translational efficiency analysis doesn't work when there are two conditions with less than three replicates #66

Xinlei-Gao · 2024-08-02T15:43:44Z

Description of the bug

I tried to run this pipeline on a dataset with matched Ribo-seq and RNA-seq data to perform differential translational efficiency analysis. My dataset contains three conditions (two knockouts and a control), each with two biological replicates.

First I tried to provide a contrasts file with two comparisons as below:

id,variable,reference,target,batch,pair
EIF5A_KO_vs_control,treatment,control,EIF5A_KO,,pair
DOHH_KO_vs_control,treatment,control,DOHH_KO,,pair

But it reported an error as follows:

Error in anota2seqCheckInput(dataP = Anota2seqDataSet@dataP, dataT = Anota2seqDataSet@dataT, :
Too few custom contrasts supplied.
Please check your contrast matrix.
Calls: anota2seqRun -> anota2seqCheckInput
Execution halted

Then I tried to do two comparisons separately, one at a time.
I created two contrasts files as below:

contrast_1.csv:

id,variable,reference,target,batch,pair
EIF5A_KO_vs_control,treatment,control,EIF5A_KO,,pair

and contrast_2.csv:

id,variable,reference,target,batch,pair
DOHH_KO_vs_control,treatment,control,DOHH_KO,,pair

The pipeline ran fine until it reached the translational efficiency analysis with 'anota2seq', which reported an error:

Error in anota2seqCheckInput(dataP, dataT, phenoVec, batchVec, NULL, "BH", :
Sample class control has less than three samples.
anota2seq needs at least 3 samples per sample class if there are only 2 sample classes.
Calls: do.call -> -> anota2seqCheckInput
Execution halted

I checked the 'anota2seq' documentation, and it turns out that when there are only two conditions, 'anota2seq' requires at least three replicates per condition.

I think it would help to state in the pipeline documentation that "the translational efficiency analysis step requires at least three replicates per condition when only two conditions are compared", so that users can skip this step if their data doesn't meet the requirement.
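A minimal sketch of the kind of pre-flight check a user could run before launching the pipeline. It only mirrors the rule quoted in the error message above (at least 3 samples per sample class when there are only 2 sample classes) and is not part of the pipeline or of 'anota2seq'. The function name `classes_below_minimum`, the sample names, and the `type` column are assumptions made for illustration; `treatment` is the variable named in the contrasts file.

```python
import csv
import io
from collections import Counter


def classes_below_minimum(rows, variable, levels, min_reps=3):
    """Return {level: count} for each level that has fewer than min_reps rows.

    Hypothetical helper: it only counts rows, it does not call anota2seq.
    """
    counts = Counter(row[variable] for row in rows if row[variable] in levels)
    return {level: counts[level] for level in levels if counts[level] < min_reps}


# Hypothetical samplesheet matching the design described in this issue:
# three conditions, two biological replicates each, for both data types.
SAMPLESHEET = """sample,type,treatment
ribo_ctrl_1,riboseq,control
ribo_ctrl_2,riboseq,control
ribo_eif5a_1,riboseq,EIF5A_KO
ribo_eif5a_2,riboseq,EIF5A_KO
ribo_dohh_1,riboseq,DOHH_KO
ribo_dohh_2,riboseq,DOHH_KO
rna_ctrl_1,rnaseq,control
rna_ctrl_2,rnaseq,control
rna_eif5a_1,rnaseq,EIF5A_KO
rna_eif5a_2,rnaseq,EIF5A_KO
rna_dohh_1,rnaseq,DOHH_KO
rna_dohh_2,rnaseq,DOHH_KO
"""

rows = list(csv.DictReader(io.StringIO(SAMPLESHEET)))

# Check one data type at a time (the "type" column name is an assumption).
ribo = [row for row in rows if row["type"] == "riboseq"]

# A single contrast compares exactly two sample classes.
short = classes_below_minimum(ribo, "treatment", ["control", "EIF5A_KO"])

if short:
    print("Too few replicates for a two-class comparison:", short)
else:
    print("Replicate counts meet the two-class minimum.")
```

With the two-replicate design from this report, both classes of each contrast fall below the minimum, which matches the second error above.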

I also saw that 'anota2seq' can perform the analysis when there are more than two conditions, each with fewer than three replicates; however, this needs special parameter settings. It would be ideal to support that in the pipeline, if applicable.

Thank you for your contributions!

Command used and terminal output

nextflow run nf-core/riboseq \
   -profile singularity \
   --input samplesheet.csv \
   --contrasts contrasts.csv \
   --multiqc_title 'multiQCReport' \
   --fasta /nfs/genomes/human_hg38_dec13_no_random/fasta_whole_genome/hg38.fa \
   --gtf /nfs/genomes/human_hg38_dec13_no_random/gtf/Homo_sapiens.GRCh38.106.canonical.gtf \
   --outdir ./nextflow_RiboSeq

Relevant files

No response

System information

No response

Xinlei-Gao added the bug label Aug 2, 2024

FelixKrueger mentioned this issue Jan 30, 2025

anota2seq fails with more than 2 levels in the samplesheet phenotype / contrast variable #89

Closed
