[mira_talk] Re: memory usage

  • From: Dimitar Kenanov <dimitar@xxxxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 31 Oct 2012 09:44:21 +0800

On 10/31/2012 03:25 AM, Bastien Chevreux wrote:
On Oct 30, 2012, at 10:16 , Dimitar Kenanov <dimitar@xxxxxxxxxxxxxx> wrote:
I was reading the manual about memory usage and saw that I can use my swap for 
up to 20% of the total memory needed.
I have 25M PE Illumina reads (50M reads in total). So if MIRA is using about 
1.5 GB RAM per 1M reads, I need roughly 25*3=75 GB RAM. Well, I don't have that 
much :) but I have 64 GB and I specially set up a 32 GB swap. So I would need an 
additional ~11 GB from the swap, which is less than 20% of the total needed.
Do you think I can try MIRA under those circumstances?
It will be painful, even if the swap is on SSD. If it's on disk, forget it 
immediately. However, I think the memory is too tight.
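
For reference, the arithmetic in the question can be checked with a few lines of Python. This is only a sketch of the numbers quoted in this thread (1.5 GB per 1M reads, 25M read pairs, 64 GB of RAM); the function name swap_share is made up for illustration and is not part of MIRA.

```python
# Rough check of the swap share for the numbers quoted in this thread.
GB_PER_MILLION_READS = 1.5   # rule of thumb quoted above (for ~100 bp reads)
READ_PAIRS_MILLIONS = 25     # 25M paired-end reads = 50M individual reads
PHYSICAL_RAM_GB = 64


def swap_share(needed_gb, ram_gb):
    """Return (GB that must come from swap, that amount as a fraction of the need)."""
    from_swap = max(0.0, needed_gb - ram_gb)
    return from_swap, from_swap / needed_gb


# Two reads per pair, so 25M pairs count as 50M reads.
needed_gb = 2 * READ_PAIRS_MILLIONS * GB_PER_MILLION_READS
swap_gb, fraction = swap_share(needed_gb, PHYSICAL_RAM_GB)
print(f"needed: {needed_gb:.0f} GB, from swap: {swap_gb:.0f} GB ({fraction:.1%})")
```

With these inputs about 11 of the 75 GB would come from swap, a bit under 15%. That is below the 20% mentioned in the manual, but as noted above it says nothing about how slow swap on a spinning disk would be.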

I see :) Yeah, my swap is on a disk. Then I think it is better to remove this part from the manual (in section 3.13.1):

"If your RAM is not large enough, you can still assemble projects by using disk swap. Up to 20% of the needed memory can be provided by swap without the speed penalty getting too large. Going above 20% is not recommended though, above 30% the machine will be almost permanently swapping at some point or another."

But from what I've read about SSDs, it is not really wise to use one as swap, because too much reading and writing will decrease its lifespan drastically.


And one more technical question, I think. Surely there is a connection between 
the length of the reads and the RAM needed. For example, the above estimate is 
for reads which are 100bp (from previous communication). But my reads after 
trimming are 64bp. How does this relate to the memory consumption?
First: I hope you did not trim via quality values with fastx or similar 
programs. You'll get a bias.
I used PRINSEQ. It seems to be doing a good job. I just trimmed all of the seqs by 12 bp from the 5' end because there was some systematic GC bias in all 3 samples in these 12bp. I also removed duplicates and masked low-complexity (LC) regions. I am not sure if the masking helps the assembly, but it seems logical that it would.


Regarding RAM: of course there is a linear effect, but there's also a basic memory need 
of >200 bytes per read … even if the read is 0 bytes long.
Regarding memory used by threads: compared to the memory needed by 50M reads, 
the additional needs are infinitesimal.
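
To illustrate the "fixed part plus linear part" point, here is a back-of-the-envelope sketch. It is an assumption-heavy toy model, not MIRA's actual memory accounting: it takes the 200 bytes per read mentioned above as the fixed part (the reply only gives it as a lower bound), treats 1.5 GB per 1M reads at 100bp as 1500 bytes per read (with 1 GB = 1e9 bytes), and assumes everything else scales linearly with read length. The names BASE_BYTES_PER_READ, BYTES_PER_READ_AT_100BP and estimate_gb are made up for this example.

```python
# Toy linear model: bytes_per_read = base + per_base * read_length.
# All constants are assumptions derived from figures quoted in this thread.
BASE_BYTES_PER_READ = 200        # fixed cost per read (quoted as ">200", so a lower bound)
BYTES_PER_READ_AT_100BP = 1500   # 1.5 GB per 1M reads, taking 1 GB = 1e9 bytes

# Solve the linear model for the per-base cost at the 100bp reference point.
PER_BASE_BYTES = (BYTES_PER_READ_AT_100BP - BASE_BYTES_PER_READ) / 100


def estimate_gb(n_reads, read_length):
    """Estimated memory in GB for n_reads reads of the given length (toy model)."""
    bytes_per_read = BASE_BYTES_PER_READ + PER_BASE_BYTES * read_length
    return n_reads * bytes_per_read / 1e9


for length in (100, 64):
    print(f"{length} bp: {estimate_gb(50_000_000, length):.1f} GB")
```

Under these assumptions, shortening the reads from 100bp to 64bp (36% shorter) lowers the estimate by only about 31%, because the fixed per-read part does not shrink. The real numbers depend on the data and on MIRA itself, so treat this as an illustration of the shape of the relationship only.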

I see. Thanks for the info :)


Hope that helps,
   Bastien


Well, it helps. I need more RAM, but there is no space on my machine. I would need to buy a server-grade motherboard and super expensive ECC RAM, and as I remember the cost of such a baby goes to around 40-50K.


Regards
Dimitar

--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
