[mira_talk] mira versions comparison

  • From: Davide Sassera <davide.sassera@xxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 22 Apr 2009 09:32:12 +0200

Dear Bastien, dear all,

Some of you may remember me from a series of mails about problems with my assembly more than a month ago. It's not that I did not want to share my results, as Bastien asked; the assembly (version 2.9.43) is simply still running (34 days now) because of massive memory use, which has caused a lot of swapping (13 GB).

In the meantime, I ran some tests on 2.9.44, comparing its assembly with those from 2.9.37 and 2.9.43.

I used a subset of my dataset:
280623 GS FLX reads
455 Sanger reads
estimated genome size: 1.7 Mb
estimated GC content: 36%
chimeras present: yes, due to MDA
contaminating DNA present: probably (host DNA)

Machine:
Intel Core Duo, 3.16 GHz
8 GB of RAM

I ran the following:
mira -job=denovo,accurate,454,sanger,genome -GE:not=2 -AS:klrs=1 -AL:mrs=90 -CO:rodirs=15 -SK:mmhr=2 454_SETTINGS -AS:mrl=80 -AL:mrs=90 -CO:rodirs=20:mrpg=10


I set -SK:mmhr=2 because I had hubs.

Observations:
2.9.44 seems to have run smoothly, with no particular memory hogging (2.9.43 seemed much more demanding). It found a number of possible chimeras and cut them.
Its contigs are much longer than those from either 2.9.43 or 2.9.37.
2.9.37 produced longer contigs than 2.9.43, but some of them appear to be misassemblies (wrong GC%, wrong BLAST hits).
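
For anyone who wants to repeat the GC% check on their own contigs, here is a minimal sketch in Python. It is not part of MIRA and not necessarily how the check above was done; the function names and the 8-point tolerance are illustrative assumptions, and only the expected 36% comes from the dataset described above. It reads contigs in FASTA format and lists those whose GC content is far from the expected value.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text lines into a {contig_name: sequence} dictionary."""
    records: Dict[str, str] = {}
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""  # keep only the first word
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def gc_percent(seq: str) -> float:
    """GC content in percent, counting only unambiguous A/C/G/T bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def flag_gc_outliers(contigs: Dict[str, str],
                     expected_gc: float = 36.0,
                     tolerance: float = 8.0) -> List[str]:
    """Names of contigs whose GC% differs from expected_gc by more than
    tolerance percentage points (the tolerance is an arbitrary assumption)."""
    return sorted(name for name, seq in contigs.items()
                  if abs(gc_percent(seq) - expected_gc) > tolerance)
```

Typical use would be `flag_gc_outliers(parse_fasta(open("contigs.fasta")))`, where the file name is a placeholder. A flagged contig is only a candidate for closer inspection (for example by BLAST), since host contamination and local composition bias can also shift GC%.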

I attach the 2.9.44 log and a small assemblystats file in which I highlight some features (both in the attached archive).
If you are interested, I can send the 2.9.37 and 2.9.43 logs as well.

I hope this helps.

I also hope my 2.9.43 assembly will finish soon, so that I can try 2.9.44 on the entire dataset (but it looks like it will take at least two more weeks).

Thanks to Bastien for the new version; it's very promising!

D.

--
Davide Sassera
Sezione di Patologia Generale e Parassitologia
Dipartimento di Patologia Animale, Igiene e Sanità Pubblica Veterinaria Facoltà di Veterinaria
Università degli Studi di Milano
Via Celoria 10, 20133, Milano, ITALY
Tel: +39 0250318094
Fax: +39 0250318095
