[mira_announce] New version 2.9.39 in 64 and 32 bits

(this mail should have gone to mira_announce in the first place and not to 
mira_talk as it did initially)

Dear all,

version 2.9.39 of MIRA is available for download at the usual place 
(http://chevreux.org/mira_downloads.html).

Please note that this version is the first one compiled on my new machine 
("arcadia" with a core i7 and 12GB RAM) with a new Linux distribution (Kubuntu 
instead of OpenSUSE) and a new compiler version (gcc 4.3 instead of 3.4). 
While the binaries passed my main regression tests, it may be that I've 
overlooked something in adaptation and that the binaries don't run on some 
machines (although I don't really expect this). In any case, should you 
observe weird behaviour of the new binaries, please drop me a line and we'll 
see how to get that fixed.

A surprisingly high number of people inquired about 32 bit versions (I had to 
drop support on my old machine due to various upgrade reasons). Fortunately, 
VirtualBox and Kubuntu now make it way too easy for me to quickly install a 32 
bit compile environment ... so 32 bit versions are available again.

On the development side, there are only smaller improvements and bugfixes 
since 2.9.37. As can be seen from the change log, development lately 
concentrated on handling of Solexa data, mainly for mapping. It has now 
reached a state where I need just half a day (including some prettifying work 
and a manual check in gap4) to accurately pinpoint mutations in bacteria (with 
a false negative rate <<1% and a false positive rate of perhaps 5%).

Next on the development list:
- re-activation of miraEST in the 2.9.x development line (a couple of people
  asked for it)
- decreasing memory requirements to enable full de-novo hybrid assemblies of
  "long" sequences (Sanger, 454) and Solexa

To celebrate the transition to my new machine (geee, it's powerful), here's a 
small contest: what was the inspiration to name it "arcadia"? The first one 
with the correct solution (one try per person, send private mail to my email 
address, not the list :-) wins a free customisation wish for MIRA (if it's not 
too difficult ... things like "complete multiprocessor support now" or 
"assembly of eukaryotes in 4GB RAM" would not be possible, sorry).

Regards,
  Bastien


Change log since 2.9.37:
========================

2.9.39
------
- in mapping assemblies, repetitive reads are now distributed evenly and not
  stochastically over the backbone repeats
- mapping assemblies with Solexa now have some adjusted default parameters for
  "normal" and "accurate" levels. They run a bit slower but will squeeze a
  maximum out of your data.
- read clustering now temporarily needs more memory, but runs in a few seconds
  instead of hours for projects with 10 million reads
- new parameter -AL:shme (a temporary hack to handle Solexa reads more
  thoroughly)
- to counter a current deficiency of the Solexa technology, a new clipping
  filter for Solexa data now filters out reads that have either stretches of
  20 or more "A" bases, or stretches of 12 or more "A" bases and more than 80%
  "A" in total.
- on "out-of-memory" errors, MIRA now dumps a self-assessment of where the
  memory went, to give an idea of what really happened. Note 1: this is bound
  to happen only with eukaryotes or on very small machines. Note 2: development
  versions of MIRA by default dump some assessments also during the assembly.
- documented -OUT:sssip:stsip (which appeared in 2.9.12, my apologies)
- changed documentation for 454 assembly to point at publicly available data
  instead of the spneu project (which put too much strain on my website).
- renamed -CL:prc to -CL:pec to reflect its use on both ends of a read
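
For those who want to apply the poly-A rule from the list above to their own
reads before assembly, here is a rough sketch in Python. It is not MIRA's
code, just an illustration under a few assumptions: the rule is read as
"(stretch >= 20) or (stretch >= 12 and more than 80% A overall)", bases are
compared case-insensitively, and the function and parameter names are made up
for this example.

```python
def longest_a_run(seq: str) -> int:
    """Return the length of the longest uninterrupted stretch of 'A' in seq."""
    longest = current = 0
    for base in seq.upper():
        # Extend the current stretch on 'A', reset it on any other base.
        current = current + 1 if base == "A" else 0
        longest = max(longest, current)
    return longest


def is_poly_a_artifact(seq: str, hard_run: int = 20, soft_run: int = 12,
                       max_a_fraction: float = 0.8) -> bool:
    """Return True if a read would be filtered under the assumed rule.

    Assumed reading of the change log entry (not taken from MIRA's source):
    a read is dropped if it has a stretch of >= hard_run 'A' bases, or a
    stretch of >= soft_run 'A' bases together with more than max_a_fraction
    'A' bases over the whole read.
    """
    if not seq:
        return False
    run = longest_a_run(seq)
    if run >= hard_run:
        return True
    a_fraction = seq.upper().count("A") / len(seq)
    return run >= soft_run and a_fraction > max_a_fraction


# Hypothetical reads, for illustration only.
reads = ["A" * 20 + "CGT" * 5, "A" * 12 + "CGT" * 8, "ACGT" * 9]
kept = [r for r in reads if not is_poly_a_artifact(r)]
```

Only the first of the three example reads is dropped here; the second has a
stretch of 12 "A" bases but far less than 80% "A" in total.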


2.9.38x1
--------
- -CL:prc now also clips left (will have to rename that option). This catches
  very efficiently vector leftovers in Sanger reads and adaptor leftovers in
  454 reads (which also can occur there).
- -CL:prc now also clips when a non-ACGT base is at the ends
- bugfix: saving as gap4 directory did not save the first contig due to wrong
  handling of directory creation.
- bugfix: convert_project now sets the minimum coverage to 1 to circumvent a
  quirk in the computation of "Large contigs" of the assembly info
  display. Better fix in the future.
- version 1a: testing new pathfinder algorithm enabled

