[mira_talk] Re: duplicate read names not allowed ?

  • From: "Langhorst, Brad" <Langhorst@xxxxxxx>
  • To: "mira_talk@xxxxxxxxxxxxx" <mira_talk@xxxxxxxxxxxxx>, Laurent MANCHON <lmanchon@xxxxxxxxxxxxxx>
  • Date: Tue, 18 Sep 2012 12:09:37 +0000

No spaces before the /1 suffix:
@D61655M1:276:D10YJACXX:8:1101:1456:1955 /1
should be
@D61655M1:276:D10YJACXX:8:1101:1456:1955/1
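
A minimal Python sketch of that fix, for files that already carry the " /1" or " /2" form. It assumes strict 4-line FASTQ records (header, sequence, "+", quality), so only every fourth line is touched; the function name and the shortened sequence and quality strings are illustrative, not taken from the real files.

```python
import re

# Whitespace sitting directly before a trailing /1 or /2 mate suffix.
_SPACE_BEFORE_SUFFIX = re.compile(r"\s+(/[12])$")

def strip_space_before_suffix(lines):
    """Rewrite '@name /1' headers as '@name/1'.

    Assumption: strict 4-line FASTQ records, so headers sit at
    line indices 0, 4, 8, ... Quality lines may also start with '@',
    which is why position is used instead of the first character.
    """
    fixed = []
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 0:
            line = _SPACE_BEFORE_SUFFIX.sub(r"\1", line)
        fixed.append(line)
    return fixed

# One record with a header from this thread; sequence and quality
# are shortened placeholders.
record = [
    "@D61655M1:276:D10YJACXX:8:1101:1456:1955 /1",
    "ACGT",
    "+",
    "IIII",
]
fixed_record = strip_space_before_suffix(record)
print(fixed_record[0])
```

Working by line position keeps quality strings untouched even when they happen to end in something that looks like a suffix.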

Brad

On Sep 18, 2012, at 4:18 AM, Laurent MANCHON <lmanchon@xxxxxxxxxxxxxx> wrote:


Something is still wrong: after adding /1 and /2 to the end of the headers in
the respective files, I get the same error:
"Read ...... is present more than once in the data set. Did you load a file 
twice in the manifest? Is a read present more than once in your file(s)?"
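
One way to see why this error can persist is to compare read names after cutting each header at the first whitespace. The sketch below assumes that is how the name is derived (an assumption that fits the error, not something confirmed in this thread); the helper names are hypothetical.

```python
from collections import Counter

def name_up_to_whitespace(header):
    """Header without the leading '@', cut at the first whitespace.

    Assumption: everything after the first whitespace is ignored
    when the read name is determined.
    """
    return header[1:].split(None, 1)[0]

def duplicated_names(headers):
    """Return the sorted names that occur more than once."""
    counts = Counter(name_up_to_whitespace(h) for h in headers)
    return sorted(name for name, n in counts.items() if n > 1)

# First header of each file, as shown in the 'head' output.
headers = [
    "@D61655M1:276:D10YJACXX:8:1101:1456:1955 /1",
    "@D61655M1:276:D10YJACXX:8:1101:1456:1955 /2",
]
dupes = duplicated_names(headers)
print(dupes)
```

Under that assumption the two mates collapse to the same name, whereas headers ending in "/1" and "/2" with no space stay distinct.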

head GAMMA-1.solexa.fastq
@D61655M1:276:D10YJACXX:8:1101:1456:1955 /1
NCTGNAGTGTNNGATNCGGGGTTNNNNNNNNNNACNTTNNNNNNNNNNGNNNGNNNTTGAANNNNNNNNTNACNGTNTNATNAANNNNTTCGGACNACAT
+
#0;@#2@=@?##32@#2=>@@<@#############################################################################
@D61655M1:276:D10YJACXX:8:1101:1494:1967 /1
NTTTNGGCTCTGTACCTTTTGTATCAGGGGAACCTAAAAGTGTAAAAAGTATACAATCGAGTGGGTTAGATCTTTCTAAGACTAGTTTAAAGTTAAATTG
+
#0;@#28=????9@?<=?<>;3@?==?><@???<?>????6><9=?<>;0=?=??;???;?);/=36>>=9>?====??>><>?<>???=>?7::====:
@D61655M1:276:D10YJACXX:8:1101:1551:1953 /1
NGAGNACATGAGACGGACGTTGTAGAGCATTCGAACTTTGCCAGCAAGCATACTCCACCAGTTTTTTTGAGCTAGAGTGATAATGGTCAGATCGGAAGAG

head GAMMA-2.solexa.fastq
@D61655M1:276:D10YJACXX:8:1101:1456:1955 /2
CACCAATCCAAGCTCCACAACTTGATGTAGTCCGAACAGTTTCATCATACGGTCAAAATCAAATTCAACGTCCATCATTTGACTTAAACGTTCTTCCCTC
+
C@CFFFFFHHGHHJJJIJJCHIJJIJJHIIHIIIIJJEIHIIIIJGIIJJIIHIHHJJJIJIJJJIJIHHEEFFFEEEE@CDEDDDDDCD<C?@CDCCDD
@D61655M1:276:D10YJACXX:8:1101:1494:1967 /2
TAAAAATCCTTTAAAACAATATGCTAACTTAGCTAATTTAAAGCCTAGTTTTTGTAAAAAACAACCCGTGTAGCACTGCAATAACTTAAACTTTTGCGTA
+
?@BFFFFFDDHGFGGIGGIIEHJGGGIGGIHGIJJIJJJIIJGGGGEHHGGHIJGJJJJIJJIGIGIHHGEFFFFEEEEEEDDDDDDDCCCCDDDCDB>@
@D61655M1:276:D10YJACXX:8:1101:1551:1953 /2
GACCATTATCACTCTAGCTCAAAAAAACTGGTGGAGTATGCTTGCTGGCAAAGTTCGAATGCTCTACAACGTCCGTCTCATGTTCTCCAGATCGGAAGAG


manifest-file:

project = GAMMA
job = denovo,est,accurate
parameters = -DI:cwd=/scratch/piquemald,trt=/scratch/piquemald
readgroup
data = GAMMA-1.solexa.fastq GAMMA-2.solexa.fastq
technology = solexa
template_size = 250 550
segment_placement = ---> <---

parameters = COMMON_SETTINGS -GE:not=80 -SK:mnr=yes,mhpr=10,pr=98 
-AS:nop=7:sd=yes:rbl=6 -MI:somrnl=0 SOLEXA_SETTINGS --noclipping=all -LR:wqf=no 
-CO:rodirs=5,msr=no -OUT:sssip=no -AL:mrs=95:shme=0:egp=yes 
-AS:mrl=15,epoq=no



On 17/09/2012 22:06, Bastien Chevreux wrote:
On Sep 17, 2012, at 20:57 , Laurent MANCHON wrote:
I suspect your files do not adhere to the above :-)
You are right, this is what I have:
[...]
It seems to be Casava 1.8 format, but the pair member is missing.

Well, renaming the reads to have /1 and /2 will do the trick. A sed one-liner 
should be able to take care of that.
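
As an alternative to sed, a rough Python sketch of such a renaming step. It assumes strict 4-line FASTQ records and that the read name is the first whitespace-delimited token of the header; anything after that token is dropped. The function name and the example header are hypothetical.

```python
def add_mate_suffix(lines, mate):
    """Append '/<mate>' to the read name of every FASTQ header.

    Assumptions: 4-line records (headers at indices 0, 4, 8, ...)
    and a non-empty header whose first token is the read name.
    Text after the first whitespace is discarded, so no space can
    end up in front of the suffix.
    """
    suffix = f"/{mate}"
    out = []
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 0:
            name = line.split(None, 1)[0]
            if not name.endswith(suffix):
                name += suffix
            line = name
        out.append(line)
    return out

# Hypothetical record; not taken from the files in this thread.
record = ["@read_0001 extra comment", "ACGT", "+", "IIII"]
renamed = add_mate_suffix(record, 2)
print(renamed[0])
```

Running it with mate=1 on the first file and mate=2 on the second gives names that differ only in the suffix, and a second pass leaves already-suffixed names unchanged.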

Yes, adding /1 or /2 at the end of each header should be enough, shouldn't it?


B.






--
Brad Langhorst
langhorst@xxxxxxx
978-380-7564



