[mira_talk] Re: 454/Solexa hybrid assembly of a 35Mbp genome?
- From: Bastien Chevreux <bach@xxxxxxxxxxxx>
- To: mira_talk@xxxxxxxxxxxxx
- Date: Wed, 3 Jun 2009 00:08:03 +0200
On Tuesday, 2 June 2009, Jan Paces wrote:
> [...]
> I think Mira keeps in memory few bytes about each read, which makes it
> impossible to use it with huge amount of SOLiD or SOLEXA reads.
> [...]
Yep, this is currently a real problem. Unfortunately, it is not only a few
bytes. On 64 bit machines, I have ~280 bytes of overhead per read just for all
the empty pointers, strings, vectors, clipping points and lists. Then around 10
bytes per read base (nucleotide, quality, adjustment positions etc.). And then
MIRA uses an assembly strategy that keeps copies of reads untouched until the
contig they were put in has been accepted as good.
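To make the arithmetic concrete, here is a rough back-of-the-envelope sketch
using only the two figures above (~280 bytes per read, ~10 bytes per base). The
read counts and lengths in the example are illustrative assumptions, not
measurements, and the estimate covers a single copy of the reads only: it
ignores the untouched copies and every other data structure MIRA allocates.

```python
# Rough memory estimate from the two figures quoted in this post.
# Everything else (function names, example data sets) is hypothetical.
PER_READ_OVERHEAD_BYTES = 280  # ~280 bytes of fixed overhead per read (64 bit)
PER_BASE_BYTES = 10            # ~10 bytes per read base


def estimate_bytes(num_reads: int, read_length: int) -> int:
    """Approximate bytes needed to hold one copy of all reads."""
    return num_reads * (PER_READ_OVERHEAD_BYTES + PER_BASE_BYTES * read_length)


def overhead_fraction(read_length: int) -> float:
    """Share of the per-read cost that is fixed overhead, not sequence data."""
    per_read = PER_READ_OVERHEAD_BYTES + PER_BASE_BYTES * read_length
    return PER_READ_OVERHEAD_BYTES / per_read


if __name__ == "__main__":
    # Assumed example data sets: (label, number of reads, read length in bases).
    examples = [
        ("Sanger", 1_000_000, 800),
        ("short reads", 100_000_000, 36),
    ]
    for label, num_reads, read_length in examples:
        gib = estimate_bytes(num_reads, read_length) / 2**30
        share = overhead_fraction(read_length)
        print(f"{label}: {gib:.1f} GiB, {share:.0%} of it fixed overhead")
```

The point of the second function: for long reads the fixed 280 bytes hardly
matter, but for very short reads they become a large share of the total.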
This was perfectly reasonable for Sanger projects up to medium eukaryote size
(say, 80 megabases with 1m Sanger reads) and is still manageable now for 454
Titanium, but the small reads break my neck there. I sometimes wish I could
squeeze in more work in the evening hours and weekends to get around this ;-)
Regards,
Bastien
--
You have received this mail because you are subscribed to the mira_talk mailing
list. For information on how to subscribe or unsubscribe, please visit
http://www.chevreux.org/mira_mailinglists.html