[mira_talk] Re: Problem with Paired 454, MP Illumina, MIRA and Bambus

And Bambus 2?


Robin Kramer <kodream@xxxxxxxxx> wrote:


I have had a very difficult time getting Bambus to scale to Illumina paired-end 
data volumes.  I was determined enough to rewrite all of the Perl conversion 
scripts, only to find that the core C++ program gromit crashes very 
unexpectedly. If you debug the C++ code, you will find that the results are 
very suspect, possibly due to a design that does not scale, or maybe to some 
implementation flaw. The project has been inactive for six years, and no one 
is answering the help messages for Bambus, though the AMOS project is still 
active.

Sincerely yours,

Robin


2011/7/6 Nestor Zaburannyi <nestor@xxxxxxxxxxxxx>
Dear all. Excuse me for the long letter.

It seems I am following many of Juan's steps, specifically from this thread, 
but I am opening a new one as suggested:
http://www.freelists.org/post/mira_talk/fn-in-pairedend-reads,4


I have 454 paired-end data and Illumina mate-pair data. It took me some time to 
figure out that it is not Bambus that is at fault... Long story short:

When I use all the data and both pairing schemes, the MIRA result is 
outstanding, but Bambus, depending on several factors which I can present, 
either crashes or chokes for DAYS on the contigs constructed by MIRA.

These are the ridiculous results from Bambus when using both pairing schemes. 
Note the "203415" number.

Library:        lib_roche_4kb
no. valid links:        2124
no. incorrect len. links:       281
no. incorrect ori. links:       13
no. unchecked links:    715

Library:        lib_illumina_4kb
no. valid links:        613
no. incorrect len. links:       203415
no. incorrect ori. links:       9
no. unchecked links:    408
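
To put a number on this, here is a minimal Python sketch that parses a link 
summary in the layout shown above and reports the share of links flagged with 
an incorrect length. The function names are hypothetical, and the parser 
assumes only the "Library:" / "no. ... links:" layout visible in this message, 
not any documented Bambus output format.

```python
import re


def parse_link_report(text):
    """Return {library: {category: count}} from a link summary like the one above."""
    stats = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        lib = re.match(r"Library:\s+(\S+)", line)
        if lib:
            current = lib.group(1)
            stats[current] = {}
            continue
        # e.g. "no. incorrect len. links:       281" -> ("incorrect len.", 281)
        row = re.match(r"no\. (.+?) links:\s+(\d+)", line)
        if row and current is not None:
            stats[current][row.group(1)] = int(row.group(2))
    return stats


def incorrect_length_fraction(counts):
    """Share of all links in one library that were flagged 'incorrect len.'."""
    total = sum(counts.values())
    return counts.get("incorrect len.", 0) / total if total else 0.0


# Summary copied from the message above.
REPORT = """\
Library:        lib_roche_4kb
no. valid links:        2124
no. incorrect len. links:       281
no. incorrect ori. links:       13
no. unchecked links:    715

Library:        lib_illumina_4kb
no. valid links:        613
no. incorrect len. links:       203415
no. incorrect ori. links:       9
no. unchecked links:    408
"""

if __name__ == "__main__":
    for library, counts in parse_link_report(REPORT).items():
        share = incorrect_length_fraction(counts)
        print(f"{library}: {share:.1%} of links have an incorrect length")
```

On these numbers, more than 99% of the lib_illumina_4kb links are rejected for 
length, against roughly 9% for lib_roche_4kb.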


Most of the contigs are not joined properly, and we get hundreds of thousands 
of "Invalid" links with a consistent distance between them. Something like:

contig_c46 (260871, 277262) ====> v:93 l:1382 o:0 ====> contig_c33 (277800, 308449)
...
Invalid length:
 library lib_illumina_4kb:
   5:117:14385:3543/1   16063   16113 <---  ...  4435 ...  --->    3619    3669 5:117:14385:3543/2
   5:33:7647:14602/1   15968   16018 <---  ...  4702 ...  --->    3791    3841 5:33:7647:14602/2
   5:33:8278:18410/1   16134   16184 <---  ...  4736 ...  --->    3991    4041 5:33:8278:18410/2
   5:47:1673:4256/1   15946   15996 <---  ...  4660 ...  --->    3727    3777 5:47:1673:4256/2
   5:11:1902:7439/2   14782   14832 <---  ...  4721 ...  --->    2624    2663 5:11:1902:7439/1
   5:35:18689:20790/1   14758   14808 <---  ...  4323 ...  --->    2202    2244 5:35:18689:20790/2
   5:9:8606:11124/1   15673   15723 <---  ...  5317 ...  --->    4111    4161 5:9:8606:11124/2
   5:50:8824:11116/2   15738   15788 <---  ...  4299 ...  --->    3158    3208 5:50:8824:11116/1
   ...
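
The "consistent distance" claim can be checked with a few lines of Python. 
This is only a sketch over the eight distances visible in the excerpt above, 
not over the full set of rejected links, and the function name is 
hypothetical. Which lengths Bambus accepts depends on the library size range 
in its configuration, which is not shown in this message.

```python
from statistics import mean, stdev

# Distances (bp) copied from the "Invalid length" excerpt above.
DISTANCES = [4435, 4702, 4736, 4660, 4721, 4323, 5317, 4299]


def summarize(values):
    """Return (mean, sample standard deviation) of the reported distances."""
    return mean(values), stdev(values)


if __name__ == "__main__":
    avg, sd = summarize(DISTANCES)
    print(f"n={len(DISTANCES)}  mean={avg:.0f} bp  sd={sd:.0f} bp")
```

For these eight pairs the mean is about 4650 bp with a standard deviation of a 
little over 300 bp, so the rejected links do cluster tightly instead of being 
scattered at random.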

Now, I tried to assemble a subset of the data in three combinations:
1- only PE information used
2- only MP information used
3- without any template information

The results of the MIRA assembly are absolutely comparable, but take a look at 
the Bambus results:

1:

Library:        lib_roche_4kb
no. valid links:        5746
no. incorrect len. links:       984
no. incorrect ori. links:       4
no. unchecked links:    284

Library:        lib_illumina_4kb
no. valid links:        479
no. incorrect len. links:       19346
no. incorrect ori. links:       1
no. unchecked links:    115

2:

Library:        lib_roche_4kb
no. valid links:        569
no. incorrect len. links:       23
no. incorrect ori. links:       16
no. unchecked links:    87

Library:        lib_illumina_4kb
no. valid links:        5303
no. incorrect len. links:       404
no. incorrect ori. links:       21
no. unchecked links:    231

3:

Library:        lib_roche_4kb
no. valid links:        632
no. incorrect len. links:       13
no. incorrect ori. links:       16
no. unchecked links:    75

Library:        lib_illumina_4kb
no. valid links:        5785
no. incorrect len. links:       274
no. incorrect ori. links:       32
no. unchecked links:    271
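
For a side-by-side view of the three subset runs, the sketch below computes 
the share of incorrect-length links per library and run. The counts are copied 
from the summaries in this message; the dictionary layout, run labels, and 
function name are illustrative assumptions.

```python
# Counts per library: (valid, incorrect length, incorrect orientation, unchecked),
# copied from the three subset summaries in this message.
RUNS = {
    "1: only PE information": {
        "lib_roche_4kb": (5746, 984, 4, 284),
        "lib_illumina_4kb": (479, 19346, 1, 115),
    },
    "2: only MP information": {
        "lib_roche_4kb": (569, 23, 16, 87),
        "lib_illumina_4kb": (5303, 404, 21, 231),
    },
    "3: no template information": {
        "lib_roche_4kb": (632, 13, 16, 75),
        "lib_illumina_4kb": (5785, 274, 32, 271),
    },
}


def incorrect_length_share(counts):
    """Fraction of a library's links that were flagged as incorrect length."""
    total = sum(counts)
    return counts[1] / total if total else 0.0


if __name__ == "__main__":
    for run, libraries in RUNS.items():
        for library, counts in libraries.items():
            share = incorrect_length_share(counts)
            print(f"{run:28s} {library:18s} {share:6.1%}")
```

Only run 1 pushes the lib_illumina_4kb share above 95%; in runs 2 and 3 it 
stays below 10%.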


As you can see, using 454 paired-end information for the MIRA assembly results 
in disaster downstream. Please correct me if I am wrong.


Sincerely yours
Nestor


--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

