[mira_talk] Problem with Paired 454, MP Illumina, MIRA and Bambus

  • From: Nestor Zaburannyi <nestor@xxxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 6 Jul 2011 23:19:34 +0200

Dear all, please excuse the long letter.

It seems I am following many of the same steps as Juan in the thread below, but 
I am opening a new one, as suggested:
//www.freelists.org/post/mira_talk/fn-in-pairedend-reads,4


I have 454 paired-end data and Illumina mate-pair data. It took me some time to 
figure out that Bambus is not the one at fault... Long story short:

When I use all the data and both pairing schemes, the MIRA result is 
outstanding, but Bambus, depending on several factors which I can describe, 
either crashes or chokes for DAYS on the contigs constructed by MIRA.

These are the ridiculous results from Bambus when using both pairing schemes. 
Note the number 203415 (incorrect-length links for lib_illumina_4kb).

Library:        lib_roche_4kb
no. valid links:        2124
no. incorrect len. links:       281
no. incorrect ori. links:       13
no. unchecked links:    715

Library:        lib_illumina_4kb
no. valid links:        613
no. incorrect len. links:       203415
no. incorrect ori. links:       9
no. unchecked links:    408


Most of the contigs are not joined properly, and we get hundreds of thousands 
of "Invalid" links with a consistent distance between the reads. Something like:

contig_c46 (260871, 277262) ====> v:93 l:1382 o:0 ====> contig_c33 (277800, 308449)
...
Invalid length:
  library lib_illumina_4kb:
    5:117:14385:3543/1   16063   16113 <---  ...  4435 ...  --->    3619    3669 5:117:14385:3543/2
    5:33:7647:14602/1   15968   16018 <---  ...  4702 ...  --->    3791    3841 5:33:7647:14602/2
    5:33:8278:18410/1   16134   16184 <---  ...  4736 ...  --->    3991    4041 5:33:8278:18410/2
    5:47:1673:4256/1   15946   15996 <---  ...  4660 ...  --->    3727    3777 5:47:1673:4256/2
    5:11:1902:7439/2   14782   14832 <---  ...  4721 ...  --->    2624    2663 5:11:1902:7439/1
    5:35:18689:20790/1   14758   14808 <---  ...  4323 ...  --->    2202    2244 5:35:18689:20790/2
    5:9:8606:11124/1   15673   15723 <---  ...  5317 ...  --->    4111    4161 5:9:8606:11124/2
    5:50:8824:11116/2   15738   15788 <---  ...  4299 ...  --->    3158    3208 5:50:8824:11116/1
    ...
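
To check how consistent these distances really are, the listing can be 
summarized with a few lines of Python. This is only a minimal sketch: it 
assumes the number between the two arrows is the distance Bambus reports for 
the pair, and the names RECORD, SAMPLE, invalid_link_distances and summarize 
are made up for illustration, not part of Bambus or MIRA.

```python
import re
from statistics import mean, median

# Assumed layout of one "Invalid length" record (taken from the listing above):
#   <read/1> <start> <end> <--- ... <distance> ... ---> <start> <end> <read/2>
# \s+ also matches newlines, so records wrapped by a mail client still parse.
RECORD = re.compile(
    r"(\S+/[12])\s+(\d+)\s+(\d+)\s+<---\s+\.\.\.\s+(\d+)\s+\.\.\.\s+--->"
    r"\s+(\d+)\s+(\d+)\s+(\S+/[12])"
)

# Three records copied from the listing, including the line wrapping.
SAMPLE = """\
    5:117:14385:3543/1   16063   16113 <---  ...  4435 ...  --->    3619
3669 5:117:14385:3543/2
    5:33:7647:14602/1   15968   16018 <---  ...  4702 ...  --->    3791    3841
5:33:7647:14602/2
    5:33:8278:18410/1   16134   16184 <---  ...  4736 ...  --->    3991    4041
5:33:8278:18410/2
"""


def invalid_link_distances(text):
    """Return the reported distance of every invalid-length link in text."""
    return [int(m.group(4)) for m in RECORD.finditer(text)]


def summarize(distances):
    """Hypothetical helper: basic statistics over the reported distances."""
    return {
        "count": len(distances),
        "min": min(distances),
        "max": max(distances),
        "mean": mean(distances),
        "median": median(distances),
    }


if __name__ == "__main__":
    print(summarize(invalid_link_distances(SAMPLE)))
```

A narrow spread between min and max would support the impression that the 
"invalid" links agree with each other and only disagree with the expected 
library size.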
    
Now, I tried assembling a subset of the data in three combinations:
1- only PE (454 paired-end) information used
2- only MP (Illumina mate-pair) information used
3- no template information used

The MIRA assembly results are entirely comparable, but take a look at the 
Bambus results:

1:

Library:        lib_roche_4kb
no. valid links:        5746
no. incorrect len. links:       984
no. incorrect ori. links:       4
no. unchecked links:    284

Library:        lib_illumina_4kb
no. valid links:        479
no. incorrect len. links:       19346
no. incorrect ori. links:       1
no. unchecked links:    115

2:

Library:        lib_roche_4kb
no. valid links:        569
no. incorrect len. links:       23
no. incorrect ori. links:       16
no. unchecked links:    87

Library:        lib_illumina_4kb
no. valid links:        5303
no. incorrect len. links:       404
no. incorrect ori. links:       21
no. unchecked links:    231

3:

Library:        lib_roche_4kb
no. valid links:        632
no. incorrect len. links:       13
no. incorrect ori. links:       16
no. unchecked links:    75

Library:        lib_illumina_4kb
no. valid links:        5785
no. incorrect len. links:       274
no. incorrect ori. links:       32
no. unchecked links:    271
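
For comparing the runs, the summaries can be reduced to one number per library. 
The following is a rough sketch, assuming the summary layout shown above; the 
function names are hypothetical, and the ratio used here (incorrect-length 
links divided by valid plus incorrect-length links) is just one possible choice 
of metric, not something Bambus itself reports.

```python
import re


def parse_link_summary(text):
    """Parse 'Library:' blocks into {library: {category: count}}."""
    libs = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Library:\s+(\S+)$", line)
        if m:
            current = libs.setdefault(m.group(1), {})
            continue
        # e.g. "no. incorrect len. links:   404" -> key "incorrect len."
        m = re.match(r"no\.\s+(.+?)\s+links:\s+(\d+)$", line)
        if m and current is not None:
            current[m.group(1)] = int(m.group(2))
    return libs


def bad_length_fraction(counts):
    """Share of incorrect-length links among valid + incorrect-length links."""
    valid = counts.get("valid", 0)
    bad = counts.get("incorrect len.", 0)
    total = valid + bad
    return bad / total if total else 0.0


# Illumina library numbers copied from runs 1 (PE only) and 2 (MP only).
RUN_1 = """\
Library:        lib_illumina_4kb
no. valid links:        479
no. incorrect len. links:       19346
no. incorrect ori. links:       1
no. unchecked links:    115
"""
RUN_2 = """\
Library:        lib_illumina_4kb
no. valid links:        5303
no. incorrect len. links:       404
no. incorrect ori. links:       21
no. unchecked links:    231
"""

if __name__ == "__main__":
    for name, text in (("run 1", RUN_1), ("run 2", RUN_2)):
        for lib, counts in parse_link_summary(text).items():
            print(name, lib, round(bad_length_fraction(counts), 3))
```

With this metric, the Illumina library goes from mostly invalid links in run 1 
to mostly valid links in run 2.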


As you can see, using the 454 paired-end information in the MIRA assembly leads 
to a disaster downstream. Please correct me if I am wrong.


Sincerely yours
Nestor


--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
