`NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:FASTQC (SRR15731653)` terminated with an error exit status (1) #1348

Dokmen · 2024-07-25T14:22:05Z

Description of the bug

Hi,
I have been using nf-core/rnaseq for a long time, but since updating Nextflow I receive the error messages below. I downloaded my FASTQ files with Aspera and the transfers completed without interruption, so I don't understand why this is happening; I have never encountered this error before. I have re-downloaded the FASTQ files several times and reinstalled Nextflow with conda, and I get the same error on every run. Is this error related to the new update? I am running on an HPC cluster whose nodes have a maximum of 40 or 56 CPUs and 190 or 380 GB of RAM. I submit the job to the SLURM queue and use 3 nodes. Can you help with this issue?

Command used and terminal output

$
#SBATCH -N 3
#SBATCH --ntasks-per-node=56
#SBATCH --time=72:00:00
#SBATCH --output=/truba_scratch/dokmen/StarRsem/test.log
#SBATCH --error=/truba_scratch/dokmen/StarRsem/test.err 
echo "SLURM_NODELIST $SLURM_NODELIST"
echo "NUMBER OF CORES $SLURM_NTASKS"
echo "SLURM_CPUS_PER_TASK $SLURM_CPUS_PER_TASK"
eval "$(/truba/home/dokmen/miniconda3/bin/conda shell.bash hook)"
conda activate nf-core
export NXF_CLUSTER_SEED=$(shuf -i 0-16777216 -n 1)
export NXF_OPTS="-Xms500M -Xmx2G"
wdir=/truba_scratch/dokmen/StarRsem/ 
cd $wdir
nextflow run \
   nf-core/rnaseq -r 3.14.0\
     -profile singularity \
     -params-file nf-params.json \
     -c GSE.config        
exit

config file: 

params {
    config_profile_name        = 'GSE183533 profile'
    // Limit resources so that this can run on GitHub Actions
    max_cpus   = 56
    max_memory = '190.GB'
    max_time   = '72.h'
}

process {
    errorStrategy = { task.exitStatus in [143,137,104,134,139,140,247] ? 'retry' : 'finish' }
    maxRetries    = 2

    // process labels

}


singularity {
  enabled = true
  autoMounts = true
  cacheDir = '/truba/home/dokmen/.singularity'
}

process {
executer ='slurm'
scratch = true
submitRateLimit = '10 sec'
queueSize = 50
}

// When using RSEM, remove warning from STAR whilst building tiny indices
process {
    withName: 'RSEM_PREPAREREFERENCE_GENOME' {
        ext.args2 = "--genomeSAindexNbases 7"
    }
}

Output:

executor >  local (5)
[-        ] NFC…REPARE_GENOME:GUNZIP_FASTA | 0 of 1
[-        ] NFC…:PREPARE_GENOME:GUNZIP_GTF | 0 of 1
[-        ] NFC…:PREPARE_GENOME:GTF_FILTER -
[-        ] NFC…ME:GUNZIP_TRANSCRIPT_FASTA | 0 of 1
[-        ] NFC…_TRANSCRIPTS_FASTA_GENCODE -
[-        ] NFC…ENOME:CUSTOM_GETCHROMSIZES -
[-        ] NFC…EM_PREPAREREFERENCE_GENOME -
[-        ] NFCORE_RNASEQ:RNASEQ:CAT_FASTQ -
[7e/0ad9d7] NFC…ALORE:FASTQC (SRR15731644) | 4 of 41, failed: 4
[fb/bc2726] NFC…E:TRIMGALORE (SRR15731658) | 1 of 41, failed: 1
[-        ] NFC…PLE_FQ_SALMON:SALMON_INDEX -
[-        ] NFC…PLE_FQ_SALMON:FQ_SUBSAMPLE -
[-        ] NFC…PLE_FQ_SALMON:SALMON_QUANT -
[-        ] NFC…M:RSEM_CALCULATEEXPRESSION -
[-        ] NFC…ATS_SAMTOOLS:SAMTOOLS_SORT -
[-        ] NFC…TS_SAMTOOLS:SAMTOOLS_INDEX -
[-        ] NFC…TS_SAMTOOLS:SAMTOOLS_STATS -
[-        ] NFC…SAMTOOLS:SAMTOOLS_FLAGSTAT -
[-        ] NFC…SAMTOOLS:SAMTOOLS_IDXSTATS -
[-        ] NFC…IFY_RSEM:RSEM_MERGE_COUNTS -
Plus 25 more processes waiting for tasks…
-[nf-core/rnaseq] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:FASTQC (SRR15731653)'

Caused by:
  Process `NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:FASTQC (SRR15731653)` terminated with an error exit status (1)


Command executed:

  printf "%s %s\n" SRR15731653_1.fastq.gz SRR15731653_1.gz SRR15731653_2.fastq.gz SRR15731653_2.gz | while read old_name new_name; do
      [ -f "${new_name}" ] || ln -s $old_name $new_name
  done
  
  fastqc \
      --quiet \
      --threads 6 \
      SRR15731653_1.gz SRR15731653_2.gz
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:FASTQC":
      fastqc: $( fastqc --version | sed '/FastQC v/!d; s/.*v//' )
  END_VERSIONS

Command exit status:
  1

Command output:
  application/gzip
  application/gzip

Command error:
  INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
  INFO:    Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
  INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
  application/gzip
  application/gzip
  Failed to process file SRR15731653_2.gz
  uk.ac.babraham.FastQC.Sequence.SequenceFormatException: Midline 'F,FFFFFFCCCATGGA868:186:H2G5KDSX2:2:1105:26865:8343/2' didn't start with '+' at 885703
  	at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:179)
  	at uk.ac.babraham.FastQC.Sequence.FastQFile.next(FastQFile.java:129)
  	at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:77)
  	at java.base/java.lang.Thread.run(Thread.java:833)
  Failed to process file SRR15731653_1.gz
  uk.ac.babraham.FastQC.Sequence.SequenceFormatException: Midline 'FFFFFACTCTTC111:29957:19617/1' didn't start with '+' at 2218703
  	at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:179)
  	at uk.ac.babraham.FastQC.Sequence.FastQFile.next(FastQFile.java:129)
  	at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:77)
  	at java.base/java.lang.Thread.run(Thread.java:833)

Relevant files

.nextflow.log

System information

No response

MatthiasZepper · 2024-08-19T16:03:26Z

I don't think this is a pipeline error:

  Failed to process file SRR15731653_2.gz
  uk.ac.babraham.FastQC.Sequence.SequenceFormatException: Midline 'F,FFFFFFCCCATGGA868:186:H2G5KDSX2:2:1105:26865:8343/2' didn't start with '+' at 885703

  Failed to process file SRR15731653_1.gz
  uk.ac.babraham.FastQC.Sequence.SequenceFormatException: Midline 'FFFFFACTCTTC111:29957:19617/1' didn't start with '+' at 2218703

To me, it looks as though the input FASTQ files are not formatted correctly. The second error in particular looks as if the sequence runs directly into the ID of the next read, without quality scores.

If you have already downloaded the files multiple times, they may have been corrupted at the time of submission. I recommend running some additional data integrity checks first, e.g. with seqfu check or fq lint. You can then use, for example, seqkit sana to fix errors and drop malformed reads.
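
As a rough illustration of the kind of structural check such tools perform, here is a minimal sketch using only gzip and awk. The function name `check_fastq` is hypothetical and is not part of seqfu, fq, or seqkit. It assumes strict four-line FASTQ records (no wrapped sequence lines) and reports only the first problem it finds, so it is no substitute for the dedicated tools.

```shell
# check_fastq: hypothetical helper, not part of seqfu, fq, or seqkit.
# Assumes strict four-line FASTQ records (no wrapped sequence lines).
# Returns 0 if the file looks well formed, 1 on the first malformed
# record, and 2 if the gzip stream itself is damaged.
check_fastq() {
    # Fail early if the gzip stream is corrupted or truncated.
    gzip -t "$1" || return 2

    gzip -cd "$1" | awk '
        BEGIN { bad = 0 }
        # Line 1 of each record: header, must start with "@".
        NR % 4 == 1 && substr($0, 1, 1) != "@" {
            printf "line %d: header does not start with @\n", NR; bad = 1; exit
        }
        # Line 2: sequence, remember its length.
        NR % 4 == 2 { seqlen = length($0) }
        # Line 3: separator, must start with "+" (the check FastQC failed on).
        NR % 4 == 3 && substr($0, 1, 1) != "+" {
            printf "line %d: separator does not start with +\n", NR; bad = 1; exit
        }
        # Line 4: quality string, must be as long as the sequence.
        NR % 4 == 0 && length($0) != seqlen {
            printf "line %d: quality and sequence lengths differ\n", NR; bad = 1; exit
        }
        END {
            if (!bad && NR % 4 != 0) {
                printf "file ends inside a record (%d lines)\n", NR; bad = 1
            }
            exit bad
        }
    '
}
```

Calling `check_fastq SRR15731653_2.fastq.gz`, for instance, should print the first offending line number and return a non-zero status if the file is malformed in one of these ways.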

PS: We also have an #rnaseq channel in the nf-core Slack space where you could get faster help on issues like this.

pinin4fjords · 2024-10-16T08:52:54Z

I think @MatthiasZepper was correct here, and since there's been no follow-up I'm going to consider this issue resolved. Feel free to reopen if you have convincing examples showing that this is not due to malformed FASTQs.

Dokmen added the "bug" label (Something isn't working) on Jul 25, 2024

pinin4fjords closed this as completed on Oct 16, 2024
