[mira_talk] RE assembly parameters and more

Hi Davide,

I agree with you: mira is a great tool and this mailing list is very 
useful. Unfortunately,
the person who can usually answer best is Bastien...

I can try to help, though.

I can't answer your questions in detail, but my feeling is that you need more 
memory.

In my experience, swapping is not good and can slow things down a lot.

But maybe you already knew this ;-)
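
If you want to keep an eye on the swapping while mira runs, here is a minimal sketch in Python. It assumes a Linux machine, where the kernel exposes memory counters in /proc/meminfo. The function names are my own and nothing here comes from mira itself.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most entries are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def swap_used_gib(info):
    """Return the amount of swap in use, in GiB, from a parsed meminfo dict."""
    used_kb = info["SwapTotal"] - info["SwapFree"]
    return used_kb / (1024 ** 2)
```

On Linux you would call it as `swap_used_gib(parse_meminfo(open("/proc/meminfo").read()))`. If that number keeps climbing while the assembly is running, the job does not fit in RAM.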

Good luck
jorge.




"Davide Sassera (davide.sassera)" <davide.sassera@xxxxxxxx> 
Sent by: mira_talk-bounce@xxxxxxxxxxxxx
11/03/2009 11:10
Please reply to
mira_talk@xxxxxxxxxxxxx


To
mira_talk@xxxxxxxxxxxxx
cc
mira_talk@xxxxxxxxxxxxx
Subject
[mira_talk] assembly parameters and more






Dear All,
I'm currently assembling a 1.5 Mb bacterial genome.
I have half a Titanium plate (520K reads), half a normal GS FLX plate 
(270K reads) and a little bit of Sanger just to spice it up.
I know it's a lot of sequences for such a small genome, but we were afraid 
of having issues with contaminating DNA and chimeras formed by whole 
genome amplification, both of which we do in fact have.
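
To put a rough number on "a lot of sequences", here is a back-of-the-envelope coverage sketch in Python. The average read lengths used (400 bp for Titanium, 250 bp for GS FLX) are assumptions for illustration, not values measured on this data set, and the Sanger reads are left out because their number is not given.

```python
def estimated_coverage(read_sets, genome_size_bp):
    """Estimate average coverage as total sequenced bases / genome size.

    read_sets is a list of (number_of_reads, average_read_length_bp) pairs.
    """
    total_bases = sum(count * length for count, length in read_sets)
    return total_bases / genome_size_bp


# Read counts are from the message above; read lengths are ASSUMED averages.
read_sets = [
    (520_000, 400),  # half a Titanium plate, assumed ~400 bp per read
    (270_000, 250),  # half a GS FLX plate, assumed ~250 bp per read
]
genome_size_bp = 1_500_000  # 1.5 Mb

print(f"estimated coverage: {estimated_coverage(read_sets, genome_size_bp):.0f}x")
```

Under those assumed read lengths this comes out well above 100x, which would be far more than a de novo assembly of a small bacterial genome usually needs.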
 
I'm working on a 3.16 GHz dual core with 16 GB of RAM, and since I was sure 
my system was going to handle the data, I went with an ultra accurate 
assembly:
 
mira -project=050309 -job=denovo,accurate,454,sanger,genome -GE:not=2 
-SK:pr=97:mnr=YES -AS::rbl=8:nop=10:urdsip:8:klrs=1 454_SETTINGS 
-AS:mrl=30
 
Now: it seems things are harder than I thought. After 6 days it is still 
doing the second step and it's swapping 12 GB!!!
 
Now on with the questions:
1.      Am I being too strict with the parameters I'm using?
2.      Do all the assembly steps take the same time? It seems that step 2 
is taking much longer than step 1.
3.      Do all steps need the same amount of memory? Again, it seems the 
second step is more demanding.
4.      If my assumptions are correct, I will either wait for months or the 
assembly will soon stop for lack of memory, right?
5.      So what should I do now? Restart with softer parameters? Wait for 
4-5 steps to be completed, quit mira and use the latest CAF I get? 
Stop bothering you guys (Bastien above all) with stupid questions?
 
Thank you in advance, I really feel that the constant updates and this 
competent and relaxed mailing list make Mira stand above all the 
competitors 
 
Davide 

Davide Sassera
DIPAV
Università degli Studi di Milano
Milano, Italy


----- Original Message -----
From: Andreas Petzold <andpet@xxxxxxxxxxxxxx>
Date: Wednesday, March 11, 2009 10:56 am
Subject: [mira_talk] provide known repeats
To: mira_talk@xxxxxxxxxxxxx

> Hi Bastien,
> 
> the last assembly worked fine and now I have at least 3 % of my 
> fish genome (the coverage is simply too low and the data too few, 
> but I have to work with that). But I have another question (and maybe 
> another feature request): Is it possible to provide a file that contains 
> already known repeats that should be considered for assembly? 
> Or, if I masked the reads with RepeatMasker first, can mira use that 
> information for assembly?
> 
> On the other hand, is it necessary? Would it improve the 
> repeat tackling?
> 
> Greets,
> 
> Andreas
> 
> -- 
> 
> Andreas Petzold
> Genome Analysis
> Fritz Lipmann Institute
> Beutenbergstrasse 11, D-07745 Jena
> voice : ++49-3641-656488
> fax   : ++49-3641-656488
> email : andpet@xxxxxxxxxxxxxx
> 
