[mira_talk] Re: not=4 does not appear to be working on my Mac OS; system freezes after long read name lengths

  • From: John DeFilippo <defilippo.john@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Sun, 31 Aug 2014 13:21:24 -0400

Hi Bastien,

> Huh … 800? 8-0-0?

Yup, a sea urchin, about 1/4 the size of the human genome.

> I’m not sure whether you should try to assemble such a large genome with MIRA.

A bioinformatician at Ion Torrent who was familiar with our PGM and Proton 
sequencing results had suggested either MIRA or Newbler as Ion Torrent-friendly 
commercial assembly tools. Since I’m attempting a hybrid de novo assembly using 
long PacBio reads to supplement the short Ion Torrent reads, some research I did 
indicated MIRA was a good candidate for such an assembly. I hoped the size of 
the genome would be more of a time-to-run issue, not a make-or-break issue for 
the assembler.

> I know I wouldn’t.

Keeping in mind that I’m a biologist, not a bioinformatician or computer 
scientist, that my bioinformatics experience is limited to running 
command-line BLAST, and that I don’t mind devoting the time to teach myself 
new skills, what would you recommend? (BTW, I am the entire 'bioinformatics 
department' in our tiny underfunded university lab.)

> You’d probably need a couple of dozen GiB (if not in the hundreds) to 
> assemble such a genome with MIRA. 

I do have access to a group HPCC that our university is part of. I’ve been 
working on my Mac because, being such a newbie at all of this, I like to work at 
home: it takes me all day to figure out how to do things, and they don’t 
like to hand out VPN access to reach the cluster from home. But I can access it 
from our lab. So on a high performance computing cluster, is MIRA a viable 
choice for doing the kind of large genome hybrid de novo assembly I’m attempting?

Thanks.

JD

On Aug 31, 2014, at 2:52 AM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:

> On 31 Aug 2014, at 4:56 , John DeFilippo <defilippo.john@xxxxxxxxx> wrote:
>> This is my first time using MIRA, and my first attempt at an assembly.
>> It’s an ~ 800 MB genome, and I’m attempting a de novo assembly using Ion 
>> Torrent PGM (FASTQ ~ 3 GB), Proton (FASTQ ~ 9 GB), and PacBio (FASTQ ~ 78 
>> MB) reads.
> 
> Huh … 800? 8-0-0? I’m not sure whether you should try to assemble such a 
> large genome with MIRA. I know I wouldn’t.
> 
>> 1. parameter set to not=4, but CPU usage shows only using 1 thread
> 
> Not all parts of MIRA run multithreaded: some are not worth it, others 
> cannot be multithreaded.
> 
>> 2. After about 10-20 minutes of CPU time my system freezes and I have to 
>> reboot.
> 
> I suspect a RAM problem coupled with an OS X memory management weirdness. 
> You’d probably need a couple of dozen GiB (if not in the hundreds) to 
> assemble such a genome with MIRA. There’s no way your Mac has that. Normally 
> the OS should, at one point, simply return a memory allocation failure and 
> that would be the end of the story … I have no idea why it decides to freeze 
> instead.
> 
> B.
> 
> 
> 
> --
> You have received this mail because you are subscribed to the mira_talk 
> mailing list. For information on how to subscribe or unsubscribe, please 
> visit http://www.chevreux.org/mira_mailinglists.html
