[mira_talk] Re: duplicate read names not allowed ?

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 17 Sep 2012 20:41:54 +0200

On Sep 17, 2012, at 15:55 , Laurent MANCHON wrote:
> [...]
> In the log file I have: MIRA found duplicate read names in your data,
> This should never, never be!
> 
> but in the manual we can read: "the names of the reads can be either the very 
> same in both files or already have a /1 or /2 appended."

The full sentence in the manual is:
  "Depending on the preprocessing pipeline of your sequencing provider, the 
names of the reads can ..."

I do concede that the sentence is ambiguous; I will change it to something 
like:

The FASTQ naming must follow one of the known Illumina schemes: either have /1 
and /2 appended to the read names (old Illumina pipeline), or have the read 
names without /1 and /2 but then supplemented with the standard Illumina 
comment field, which contains the pair information. See also: 
http://en.wikipedia.org/wiki/FASTQ_format#Illumina_sequence_identifiers


I suspect your files do not adhere to the above :-)
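
If you want to check this quickly, here is a rough sketch of how one could 
classify FASTQ header lines by naming scheme. This is not code from MIRA; the 
function name classify_header and both regular expressions are my own 
assumptions, based only on the two header layouts described on the Wikipedia 
page linked above.

```python
import re

# Assumption: old Illumina pipeline, read name ends in /1 or /2.
SLASH_RE = re.compile(r"^(?P<name>\S+)/(?P<mate>[12])$")
# Assumption: newer pipeline, comment field of the form
# "<mate>:<filtered Y/N>:<control number>:<index>".
COMMENT_RE = re.compile(r"^(?P<mate>[12]):[YN]:\d+:\S*$")


def classify_header(header):
    """Return (scheme, base_name, mate) for one FASTQ header line.

    scheme is "slash", "comment", or "unknown" (hypothetical labels).
    """
    line = header.rstrip("\r\n")
    if line.startswith("@"):
        line = line[1:]
    fields = line.split(None, 1)  # read name, then optional comment
    if not fields:
        return ("unknown", "", None)
    name = fields[0]
    comment = fields[1].strip() if len(fields) > 1 else ""

    match = SLASH_RE.match(name)
    if match:
        return ("slash", match.group("name"), int(match.group("mate")))
    match = COMMENT_RE.match(comment)
    if match:
        return ("comment", name, int(match.group("mate")))
    # Neither scheme: nothing in the header says which mate this is.
    return ("unknown", name, None)
```

Running this over the header lines of both files and counting the "unknown" 
results should show whether the pair information is missing. Reads with the 
very same name in both files and no comment field would fall into that class.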

B.
