[mira_talk] Re: not=4 does not appear to be working on my Mac OS; system freezes after long read name lengths

  • From: Rick Westerman <westerman@xxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Sun, 31 Aug 2014 13:35:10 -0400

Did you read chapter "3.14.1.  Estimating needed memory for an assembly 
project" and then run miramem?  That should give you a *rough* idea of how much 
memory you will need, and thus whether you should even attempt to use your 
HPCC resource.

Also, it is unclear to me whether your sizes refer to the number of bases or 
the actual file size.  In other words, when you say, “Ion Torrent PGM (FASTQ ~ 3 
GB), Proton (FASTQ ~ 9 GB), and PacBio (FASTQ ~ 78 MB) reads”, does the 3 GB 
mean 3 billion bases or 3 billion bytes in the file?  And if the latter, is the 
file zipped (compressed) or not?  I am trying to figure out whether you have 
enough depth of coverage to assemble a 0.8 Gbase genome.
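
To make the coverage question concrete, here is a rough back-of-the-envelope sketch. It assumes the quoted sizes are bytes of *uncompressed* FASTQ and that about half of those bytes are sequence (the rest being quality values and read names). The 0.5 fraction, the function name, and the variable names are illustrative assumptions, not anything MIRA provides; long read names would push the real fraction lower.

```python
# Rough depth-of-coverage estimate from uncompressed FASTQ file sizes.
# ASSUMPTION: about half of an uncompressed FASTQ file is sequence;
# the other half is quality values plus read names and separators.

def estimate_coverage(fastq_bytes, genome_bases, base_fraction=0.5):
    """Return approximate fold coverage for a list of FASTQ file sizes."""
    total_bases = sum(fastq_bytes) * base_fraction
    return total_bases / genome_bases

GB = 10**9
MB = 10**6

# File sizes as quoted in the thread, read here as bytes (an assumption).
sizes = {"PGM": 3 * GB, "Proton": 9 * GB, "PacBio": 78 * MB}
genome = 800 * MB  # 0.8 Gbase genome

for name, size in sizes.items():
    print(f"{name}: ~{estimate_coverage([size], genome):.2f}x")
print(f"all reads: ~{estimate_coverage(sizes.values(), genome):.2f}x")
```

If the files are compressed, or if the sizes are already base counts, the estimate changes by a factor of two or more, which is why the distinction matters.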

--
Rick Westerman
westerman@xxxxxxxxxx




On Aug 31, 2014, at 1:21 PM, John DeFilippo <defilippo.john@xxxxxxxxx> wrote:

> Hi Bastien,
> 
>> Huh … 800? 8-0-0?
> 
> yup, a sea urchin, about 1/4 the human genome
> 
>> I’m not sure whether you should try to assemble such a large genome with 
>> MIRA.
> 
> A bioinformatician at IonTorrent who was familiar with our PGM and Proton 
> sequencing results had suggested either MIRA or Newbler as 
> IonTorrent-friendly commercial assembly tools. Since I’m attempting a hybrid 
> de novo assembly using long PacBio reads to supplement the short IonTorrent 
> reads, some research I did indicated MIRA was a good candidate for such an 
> assembly. I hoped the size of the genome would be more of a time-to-run 
> issue, not a make or break issue for the assembler.
> 
>> I know I wouldn’t.
> 
> Keeping in mind that I’m a biologist, not a bioinformatician or computer 
> scientist, whose sole bioinformatics experience is limited to running command 
> line BLAST, but who doesn’t mind devoting the time to teach myself new 
> skills, what would you recommend? (BTW, I am the entire 'bioinformatics 
> department' in our tiny underfunded university lab).
> 
>> You’d probably need a couple of dozen GiB (if not in the hundreds) to 
>> assemble such a genome with MIRA. 
> 
> I do have access to a group HPCC that our university is part of. I’ve been 
> working on my Mac because being such a newbie at all of this I like to work 
> at home, as it takes me all day to figure out how to do things, and they 
> don’t like to hand out VPNs to access it from home. But I can access it from 
> our lab. So on a high performance computing cluster, is MIRA a viable choice 
> for doing the kind of large genome hybrid de novo assembly I’m attempting?
> 
> Thanks.
> 
> JD
> 
> On Aug 31, 2014, at 2:52 AM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
> 
>> On 31 Aug 2014, at 4:56 , John DeFilippo <defilippo.john@xxxxxxxxx> wrote:
>>> This is my first time using MIRA, and my first attempt at an assembly.
>>> It’s an ~ 800 MB genome, and I’m attempting a de novo assembly using Ion 
>>> Torrent PGM (FASTQ ~ 3 GB), Proton (FASTQ ~ 9 GB), and PacBio (FASTQ ~ 78 
>>> MB) reads.
>> 
>> Huh … 800? 8-0-0? I’m not sure whether you should try to assemble such a 
>> large genome with MIRA. I know I wouldn’t.
>> 
>>> 1. parameter set to not=4, but CPU usage shows only using 1 thread
>> 
>> Not all parts of MIRA run multithreaded: some are not worth it, others 
>> cannot be multithreaded.
>> 
>>> 2. After about 10-20 minutes of CPU time my system freezes and I have to 
>>> reboot.
>> 
>> I suspect a RAM problem coupled with some OS X memory management weirdness. 
>> You’d probably need a couple of dozen GiB (if not in the hundreds) to 
>> assemble such a genome with MIRA. There’s no way your Mac has that. Normally 
>> the OS should, at one point, simply return a memory allocation failure and 
>> that would be the end of the story … I have no idea why it decides to freeze 
>> instead.
>> 
>> B.
>> 
>> 
>> 
>> --
>> You have received this mail because you are subscribed to the mira_talk 
>> mailing list. For information on how to subscribe or unsubscribe, please 
>> visit http://www.chevreux.org/mira_mailinglists.html
> 
> 

