[mira_talk] contigs built only by one sequence

  • From: Laurent Manchon <lmanchon@xxxxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 17 Dec 2009 22:30:16 +0100

Hi,

I ran an assembly of 454 reads (~600,000 reads) using MIRA 2.9.43 with this command line:

bin/mira -project=planaire -fasta=planaire.fa -job=denovo,est,normal,454 -notraceinfo -GENERAL:kcim=yes,not=3 -SK:rt=4 -CO:fnicpst=yes,rodirs=10,asir=yes -CL:cpat=no,pec=no,pvlc=no,qc=no,bsqc=no,mbc=no,emlc=yes,mlcr=0,smlc=0,emrc=yes,mrcr=0,smrc=0 -AL:bip=5,bmin=10,mrs=90:egp=yes:egpl=reject_codongaps:megpp=100,mo=60 -AS:mrl=30,bdq=30 -SK:mnr=yes,rt=8,mmhr=6

I obtained 53867 contigs, but some of them (~18163) are built from only one sequence. Why?
I don't understand this.
I thought that contigs consisting of a single sequence were singlets and should appear in the debris list. Is that not the case?

Laurent


planaire_info_assembly.txt file:

Assembly information:
=====================

Num. reads assembled: 469768
Num. singlets: 18163

Large contigs:
--------------
With    Contig size             >= 500
       AND (Total avg. Cov     >= 5
            OR Cov(san)        >= 0
            OR Cov(454)        >= 5
            OR Cov(sxa)        >= 0
            OR Cov(sid)        >= 0
           )

 Length assessment:
 ------------------
 Number of contigs:    6322
 Total consensus:      6868966
 Largest contig:       7814
 N50 contig size:      1170
 N90 contig size:      701
 N95 contig size:      614

 Coverage assessment:
 --------------------
 Max coverage (total): 2681
 Max coverage
       Sanger: 0
       454:    4285
       Solexa: 0
       Solid:  0
 Avg. total coverage (size >= 5000): 14.27
 Avg. coverage (contig size >= 5000)
       Sanger: 0.00
       454:    14.27
       Solexa: 0.00
       Solid:  0.00

 Quality assessment:
 -------------------
 Average consensus quality:                    19
 Consensus bases with IUPAC (IUPc):            11449   (you might want to check these)
 Strong unresolved repeat positions (SRMc):    73      (you might want to check these)
 Weak unresolved repeat positions (WRMc):      0       (excellent)
 Sequencing Type Mismatch Unsolved (STMU):     0       (excellent)
 Contigs having only reads wo qual:            0       (excellent)
 Contigs with reads wo qual values:            0       (excellent)


All contigs:
------------
 Length assessment:
 ------------------
 Number of contigs:    35715
 Total consensus:      30892000
 Largest contig:       7814
 N50 contig size:      681
 N90 contig size:      349
 N95 contig size:      283

 Coverage assessment:
 --------------------
 Max coverage (total): 2681
 Max coverage
       Sanger: 0
       454:    4285
       Solexa: 0
       Solid:  0
 Avg. total coverage (size >= 5000): 14.27
 Avg. coverage (contig size >= 5000)
       Sanger: 0.00
       454:    14.27
       Solexa: 0.00
       Solid:  0.00

 Quality assessment:
 -------------------
 Average consensus quality:                    11
 Consensus bases with IUPAC (IUPc):            53965   (you might want to check these)
 Strong unresolved repeat positions (SRMc):    73      (you might want to check these)
 Weak unresolved repeat positions (WRMc):      0       (excellent)
 Sequencing Type Mismatch Unsolved (STMU):     0       (excellent)
 Contigs having only reads wo qual:            0       (excellent)
 Contigs with reads wo qual values:            0       (excellent)
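
To compare these figures with the number of contigs actually written to the result files, a minimal Python sketch like the one below pulls the read, singlet and contig counts out of the report. It assumes the report layout shown above; the function name `parse_info_assembly` and the shortened `sample` excerpt are illustrative only and not part of MIRA.

```python
import re

# Keys to extract; taken from the labels that appear in the report above.
KEY_PATTERN = re.compile(
    r"^(Num\. reads assembled|Num\. singlets|Number of contigs|Total consensus):\s+(\d+)$"
)


def parse_info_assembly(text):
    """Collect selected 'label: integer' pairs, grouped by report section."""
    stats = {}
    section = "Assembly information"  # counts before the first sub-heading
    for line in text.splitlines():
        stripped = line.strip()
        # Sub-headings switch the section the following counts belong to.
        if stripped in ("Large contigs:", "All contigs:"):
            section = stripped.rstrip(":")
            continue
        match = KEY_PATTERN.match(stripped)
        if match:
            stats.setdefault(section, {})[match.group(1)] = int(match.group(2))
    return stats


# Shortened excerpt of the report above, used here as test input.
sample = """\
Assembly information:
=====================

Num. reads assembled: 469768
Num. singlets: 18163

Large contigs:
--------------
 Number of contigs:    6322
 Total consensus:      6868966

All contigs:
------------
 Number of contigs:    35715
 Total consensus:      30892000
"""

stats = parse_info_assembly(sample)
singlets = stats["Assembly information"]["Num. singlets"]
contigs = stats["All contigs"]["Number of contigs"]
print("singlets:", singlets)
print("contigs (all):", contigs)
```

In practice the whole file would be read with `open("planaire_info_assembly.txt").read()` and passed to the function instead of `sample`.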





