[mira_talk] Re: assembly parameters and more
- From: Bastien Chevreux <bach@xxxxxxxxxxxx>
- To: mira_talk@xxxxxxxxxxxxx
- Date: Wed, 18 Mar 2009 19:37:16 +0100
On Sunday 15 March 2009 Davide Sassera wrote:
> I hope I understood correctly what you need and that the file I'm
> sending is the right one.
Hi Davide,
thanks for the files you sent. It took me some time to have a look at them,
but they were basically what I needed.
Looks like there is a combination of a few things going on. I made some
tests and discovered that massive coverage together with highly repetitive
data and turning on masking of nasty sequences gave the behaviour you saw.
First: the coverage (and megahubs)
100x is already quite massive for de-novo. If the genome you have contains
some nasty surprises (say, a repetitive element, longer than a read, is
contained 15x in the genome), then this explains the 1500x coverages you saw
in some passes of MIRA. In earlier times one saw this kind of coverage only
for EST sequencing projects ... in non-normalised libraries.
Therefore, these things tend to trigger the megahub detector ... and I cannot
really blame it for that :-)
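To make the arithmetic above explicit, here is a minimal sketch in Python. It
is not MIRA code; the function name is made up for illustration, and it
assumes that reads from every copy of a repeat longer than a read pile up on
the same place during assembly.

```python
def expected_repeat_coverage(average_coverage, copies_in_genome):
    """Approximate coverage seen on a repeat whose copies collapse together.

    Hypothetical helper for illustration only. Assumes each copy of the
    repeat is sequenced at the project's average coverage and that the
    assembler cannot tell the copies apart.
    """
    return average_coverage * copies_in_genome


# 100x average coverage, repeat present 15 times in the genome
print(expected_repeat_coverage(100, 15))  # 1500
```
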
Second: the long runtime and insane memory requirements
These were triggered by the massive coverage in conjunction with the masking
of nasty repeats ... and a "security feature" I built into MIRA that backfired.
Short story: So as not to lose alignments when parts of them (those with nasty
sequences) are masked, I told MIRA to take every alignment there. Which in
your case led to "a lot" of reads having *all* their alignments analysed and
stored.
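A rough way to see why keeping every alignment hurts at this coverage: the
number of read pairs that can overlap at one position grows quadratically with
the depth there. The sketch below is only an upper-bound illustration in
Python, not what MIRA actually computes or stores; the function name is an
assumption made for this example.

```python
def candidate_pairs(depth):
    """Upper bound on read pairs overlapping at a position of given depth.

    Hypothetical helper: simply counts all unordered pairs among the
    `depth` reads covering the position.
    """
    return depth * (depth - 1) // 2


normal = candidate_pairs(100)    # depth at ordinary 100x coverage
repeat = candidate_pairs(1500)   # depth on a collapsed 15-copy repeat

print(normal)  # 4950
print(repeat)  # 1124250
```

Under these assumptions a 15-fold increase in depth means more than 200 times
as many candidate pairs at that position, which is consistent with runtime and
memory growing far faster than the coverage itself.
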
> [...]
> I'm currently using version 2.39, I suppose I should update to 2.42
I've removed said "security feature" from the code, have a try at 2.9.43 :-)
Also, if you have time, please try with and without masking nasty repeats. I'm
curious about how it behaves in your real world case and what you think is
better. To get MIRA going when masking is off, increase -SK:mmhr (5 should be
enough).
Regards,
Bastien
--
You have received this mail because you are subscribed to the mira_talk mailing
list. For information on how to subscribe or unsubscribe, please visit
http://www.chevreux.org/mira_mailinglists.html