[mira_talk] Re: megahubs

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 25 Jun 2014 21:08:59 +0200

On 25 Jun 2014, at 16:34 , Jose Huguet Tapia <jch63@xxxxxxxxxxx> wrote:
> I am using MIRA 4 to assemble an Oomycete (eukaryotic). I believe that the 
> organism has a highly heterozygous genome.

One warning: a few weeks ago I had the unpleasant surprise of discovering that 
the HGAP3 pipeline totally fails to give reliable corrected reads for a diploid 
genome. It simply mixed differences from different ploidies into single reads. 
Of course, assemblers which recognise this barf and will create many more 
contigs than you’d expect (MIRA certainly does, and so does the one from the 
HGAP pipeline).
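
To make that failure mode concrete, here is a toy sketch. It is not HGAP3's 
actual algorithm: the two haplotype sequences, the read layout and the function 
names are all made up for illustration. It only assumes that correction is a 
column-wise majority vote which knows nothing about ploidy.

```python
from collections import Counter

# Two hypothetical haplotypes of one diploid locus (they differ at
# positions 2, 10 and 17). These sequences are invented for the example.
HAP_A = "ACGTACGTACGTACGTACGT"
HAP_B = "ACCTACGTACTTACGTAGGT"


def majority_consensus(pileup, length):
    """Ploidy-unaware correction: per-column majority vote over
    (offset, sequence) pairs that are assumed to be already aligned."""
    columns = [Counter() for _ in range(length)]
    for offset, seq in pileup:
        for i, base in enumerate(seq):
            columns[offset + i][base] += 1
    return "".join(col.most_common(1)[0][0] for col in columns)


def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


# The seed read comes from haplotype A, but the shorter reads used to
# correct it come from whichever haplotype happened to be sampled there.
pileup = [
    (0, HAP_A),            # seed read, haplotype A
    (0, HAP_B[0:8]),       # left end: haplotype B outvotes the seed
    (0, HAP_B[0:8]),
    (6, HAP_A[6:14]),      # middle: haplotype A agrees with the seed
    (6, HAP_A[6:14]),
    (12, HAP_B[12:20]),    # right end: haplotype B outvotes the seed
    (12, HAP_B[12:20]),
]

corrected = majority_consensus(pileup, len(HAP_A))
print(corrected)
print("differences to A:", mismatches(corrected, HAP_A))
print("differences to B:", mismatches(corrected, HAP_B))
```

The "corrected" read carries haplotype B's alleles at both ends and haplotype 
A's allele in the middle, so it matches neither haplotype. An assembler that 
takes such differences seriously has good reason to refuse to join reads like 
this.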

>   I ran a "first test" in MIRA with PacBio corrected long reads. The assembly 
> was going OK until I got the message about megahubs. From previous discussions I 
> learned that megahubs are quite common in eukaryotes. 
> My concern is the total level of megahubs. It is more than 90%.

Errrm, yes. 99% megahubs points to a hefty problem. I know that MIRA 4.0.2 has 
some trouble correctly defining megahubs for long read data, but 99% is just 
ridiculous. Something feels fishy, either regarding MIRA or the data.

> The message says that I set a maximum allowed ratio of 90. I believe 
> that I did not set this parameter, though. Does MIRA 4 have a default for this?

Yes, MIRA sets its defaults according to the data you feed it. Am I correct in 
assuming you used a pretty standard manifest to launch this assembly?

B.



