[mira_talk] Re: should this assembly be taking several days?

From: Bharat Patel <b.patel@xxxxxxxxxxxxxxx>
To: mira_talk@xxxxxxxxxxxxx
Date: Sun, 04 Jul 2010 09:45:44 +1000

Hi Corey

After posting my earlier response to your question on running mira on small microbial genomes, I saw that the thread has moved on to gsAssembler and mira.

I had run gsAssembler on my small genome (8 kb paired-end and fragment libraries) and, depending on the changes to the default settings, obtained between 30 and 40 scaffolds. After performing BLAST analysis on these scaffolds and then comparing the organisation of the genes with that of its nearest neighbours, I realised that the break points were always around tRNA and rRNA genes. There are many rRNA operons repeated in microbial genomes, but in my analysis I could only find a single 16S rRNA and a partial 23S rRNA gene, and the break points were around the tRNA genes. I remembered reading posts from users who suggested that repeats may not be handled well by gsAssembler. From my understanding, an update of gsAssembler will be out to address the issue of repeats.

Mira, on the other hand, produced 1245 contigs and, with Bambus, one large scaffold (plus small assemblages of less than 1000 bp) with the expected genome size and 4 rRNA gene operons. I have selected PCR primer pairs from the mira scaffold and compared the primers picked with those from consed, which I had run with the ace output from gsAssembler, and with the exception of some primers, most matched. So in essence, I like what mira does, at least for small microbial genomes which do not have many repeats (not sure about genomes with many repeats though). Once you have a handle on mira, it is a breeze to use and understand (mostly). I found Bambus a bit problematic, especially the structure of the mates file, but after many tries got the mates file to work correctly.

Thanks to the mira users who respond to questions and motivate first-time users. My sincere thanks and appreciation to Bastien for his work on mira and especially for his quick response to questions posted by the users. It is only because of this that I have spent many months persisting with mira.


Bharat
Professor,
Microbial Gene Research & Resources Facility (in Extremophiles)
School of Biomolecular & Physical Sciences
Griffith University
Brisbane, Australia


mira_talk-bounce@xxxxxxxxxxxxx wrote:

Thanks Bastien (and Bharat)
I'll increase the ram and give it another shot.
I'm not sure how repetitive my bug is relative to others, but I did
notice that there were a few small contigs in the gsAssembler output
that were several times the average coverage.
Cheers

Corey

On Sat, 2010-07-03 at 01:22 +0200, Bastien Chevreux wrote:
On Friday 02 July 2010 Corey Frazer wrote:
[...]
So, has something gone off the rails here, or am I just going to have to
let it run?
I think the problem is RAM: the machine is swapping itself to death. Obviously the miramem estimate was a bit wrong in this special case ... perhaps more repeats than "normal" in your genome?
If you could find a machine with 8 or better 12 GiB, I think you'd see MIRA finishing within a couple of hours.
Also, I suppose you are using 3.0.5. Try 3.1.15 (development version) which should use perhaps 20% less memory than the ... 8 GiB or so your process currently needs.
 http://www.chevreux.org/tmp/mira_3.1.15_dev_linux-gnu_x86_64_static.tar.bz2

Or wait for after the week-end, when 3.2.0rc1 will be released.

Regards,
  Bastien

