[mira_talk] Re: error: Read is present more than once

  • From: lenis vasilis <val1@xxxxxxxxxx>
  • To: <mira_talk@xxxxxxxxxxxxx>
  • Date: Fri, 8 May 2015 15:48:28 +0100

Thank you very much Andrej,

It seems that SPAdes, for a reason I cannot understand, strips the pair
identifier from the read names.

Vasilis.

On May 8, 2015, at 3:35 PM, Andrej Benjak <abenjak@xxxxxxxxx>
wrote:

Hi Vasilis,

If these are PEs, the read names for each pair should look like this:
@HWI-D00173:80:C42H0ACXX:8:1101:1295:2161/1
@HWI-D00173:80:C42H0ACXX:8:1101:1295:2161/2

or:
@HWI-D00173:80:C42H0ACXX:8:1101:1295:2161 1:<whatever...>
@HWI-D00173:80:C42H0ACXX:8:1101:1295:2161 2:<whatever...>

And make sure you define them as PEs in the manifest.
(I am not sure if MIRA would accept PEs without the above naming scheme.)
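
The renaming described above could be sketched in Python as follows. This is only an illustration, not part of MIRA or SPAdes: the function name `add_pair_suffix` is hypothetical, the sequence and quality strings in the example are toy data, and the code assumes plain 4-line FASTQ records (no wrapped sequence or quality lines). It keeps only the read name before the first space, appends /1 or /2, and resets the "+" line so that it cannot disagree with the new name.

```python
from typing import Iterable, Iterator


def add_pair_suffix(lines: Iterable[str], mate: int) -> Iterator[str]:
    """Yield FASTQ lines with /1 or /2 appended to each read name.

    Assumption: every record is exactly 4 lines long. The record position
    is used instead of a leading '@', because quality lines may also
    start with '@'.
    """
    suffix = f"/{mate}"
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 0:
            # Header line: keep the name only (text before the first space).
            name = line.split(" ", 1)[0]
            if not name.endswith(suffix):
                name += suffix
            yield name
        elif i % 4 == 2:
            # Separator line: drop any repeated read name after '+'.
            yield "+"
        else:
            # Sequence and quality lines are passed through unchanged.
            yield line


# Toy record; the sequence and quality strings are made up for illustration.
record = [
    "@HWI-D00173:80:C42H0ACXX:8:1101:1295:2161\n",
    "ACGT\n",
    "+HWI-D00173:80:C42H0ACXX:8:1101:1295:2161\n",
    "IIII\n",
]
fixed = list(add_pair_suffix(record, mate=1))
```

To use it on real data, one would open each of the two files, pass the file object as `lines` with `mate=1` or `mate=2`, and write the yielded lines (each followed by a newline) to a new file.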

Andrej



On 05/08/2015 04:24 PM, lenis vasilis wrote:
Hello everyone,

I'm running MIRA in reference-based mode and I am facing the following problem:

Read HWI-D00173:80:C42H0ACXX:8:1101:1295:2161 is present more than once in
the data set. Did you load a file twice in the manifest? Is a read present
more than once in your file(s)?
……..

More than 2000 cases like the above, will not report more. Fix your input!


This problem occurred when I tried to run MIRA with FASTQ files that had been
corrected by SPAdes.
I searched for some of the reads, but each one is unique within its file (I'm
using paired-end reads).
The only difference that I noticed from the "original" read files is the
following:
Original file:

@HWI-D00173:80:C42H0ACXX:8:1101:1295:2161
CCCCTGTCCACTTCTCTGTGAAAATGAGGGTAATTGACATGATTTCTGTCCCTTTCTATGTGCTTCCATACGTTTATAGAAACCTGTACAGAAG
+

Corrected reads file:

@HWI-D00173:80:C42H0ACXX:8:1101:1295:2161
CCCCTGTCCACTTCTCTGTGAAAATGAGGGTAATTGACATGATTTCTGTCCCTTTCTATGTGCTTCCATACGTTTATAGAAACCTGTACAGAAG
+HWI-D00173:80:C42H0ACXX:8:1101:1295:2161

The only difference is that, after the "+", the SPAdes corrector has copied the
name of the read.
Could this be the reason that MIRA complains?
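
One way to check whether the duplicates come from the two mate files sharing identical read names, rather than from the "+" line, is to compare the names in both files. The sketch below is only an illustration: the function names `read_names` and `shared_names` are hypothetical, the toy records are made up, and the code assumes plain 4-line FASTQ records.

```python
from typing import Iterable, Set


def read_names(lines: Iterable[str]) -> Set[str]:
    """Return the set of read names found in 4-line FASTQ records.

    The name is the header text after '@' and before the first space.
    """
    names = set()
    for i, line in enumerate(lines):
        if i % 4 == 0:
            names.add(line.rstrip("\n").split(" ", 1)[0][1:])
    return names


def shared_names(lines_a: Iterable[str], lines_b: Iterable[str]) -> Set[str]:
    """Return the read names that occur in both inputs."""
    return read_names(lines_a) & read_names(lines_b)


# Toy mate files; sequence and quality strings are made up for illustration.
name = "HWI-D00173:80:C42H0ACXX:8:1101:1295:2161"
mates_1 = [f"@{name}\n", "ACGT\n", f"+{name}\n", "IIII\n"]
mates_2 = [f"@{name}\n", "TTGA\n", f"+{name}\n", "IIII\n"]

# Without /1 and /2 suffixes, the same name appears in both files.
clashes = shared_names(mates_1, mates_2)
```

If `clashes` is non-empty for the real files, every pair is seen as the same read name twice, which would match the error message above.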

Thank you very much,
Vasilis.



