[mira_talk] Re: new version 2.9.45 and solexa
- From: Bastien Chevreux <bach@xxxxxxxxxxxx>
- To: mira_talk@xxxxxxxxxxxxx
- Date: Wed, 13 May 2009 23:19:31 +0200
On Wednesday 13 May 2009 Jan Paces wrote:
> Previous versions of mira did not recognize the solexa naming scheme (/1
> /2), so I had to rename all reads (using .r .f). Now it seems mira can use
> the original naming scheme? Can the .r .f scheme also be used for solexa reads?
Yes. You just have to set the naming scheme you want for the sequencing type
you want (-LR:rns=...).
> Also, the quality score previously used by mira was the phred score. Now it
> looks like mira can use solexa quality scoring as well. Am I right? What is the
> default? Is it possible to explicitly tell mira which scoring scheme
> should be used? Because both schemes are very similar, can we expect no
> big changes in assembly?
MIRA has a kind of autodetection for the quality schemes. If negative values
are present in the quality files, it assumes Solexa scoring; if there are none,
it guesses phred scoring. To force MIRA to treat the quality values as phred
style from the start, use -LR:ssiqf=no
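
The detection rule described above can be sketched in a few lines of Python. This is only an illustration of the heuristic as stated in this mail, not MIRA's actual code; the function name guess_quality_scheme is made up for the example.

```python
from typing import Iterable


def guess_quality_scheme(quality_values: Iterable[int]) -> str:
    """Guess the scoring scheme of a set of base quality values.

    Hypothetical helper mirroring the rule described in the mail:
    any negative value means Solexa scoring, otherwise phred is assumed.
    """
    # Phred scores are never negative, Solexa scores can be.
    if any(q < 0 for q in quality_values):
        return "solexa"
    return "phred"
```

Note that a Solexa-scored file which happens to contain no negative values would be guessed as phred under this rule.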
I cannot imagine the scoring scheme having any impact on the assembly itself:
the two schemes differ only in the very low quality scores. You might have a tiny
difference in consensus quality scores for bad areas with low coverage and bad
base qualities, but even that should be very rare.
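
To see why the two schemes only diverge at the low end, here is a small sketch using the commonly cited Solexa-to-phred conversion, Q_phred = 10 * log10(10^(Q_solexa/10) + 1). Whether MIRA applies exactly this conversion internally is an assumption; the function name is chosen just for the example.

```python
import math


def solexa_to_phred(q_solexa: float) -> float:
    """Convert a Solexa-scaled quality value to the phred scale.

    Uses the standard relation between the two scales; this is an
    illustration, not code taken from MIRA.
    """
    return 10.0 * math.log10(10.0 ** (q_solexa / 10.0) + 1.0)


# Print a few values across the range to compare both scales.
for q in (-5, 0, 5, 10, 20, 40):
    print(q, round(solexa_to_phred(q), 2))
```

For high values the two scales are practically identical; a noticeable gap only appears for Solexa scores around 10 and below.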
> My first impression is, that mapping with new version is very slow.
> for a contig of the size 3.3k and average coverage of solexa reads 10x
> it took 25 minutes:
Jan, would you be willing to shift your interest away from eukaryotes with
100MB towards something smaller, say ... a nice Bacillus with 4MB for example?
:-) You are pushing MIRA a bit.
But these times hurt. I think I have an idea what caused this: setting up a
contig as backbone and then preparing all reads for mapping to it is
comparatively costly at the moment (it uses the same routines as de-novo). If you
have only a couple of contigs it does not matter, but for unfinished genomes ...
Actually, can you please send me the whole log file for the building of contig
1? I'd like to check a few things.
I cannot promise quick help, but I'll have a look.
> these are parameters I used for mapping:
>
> <code>
> mira
> -project=mi_v8 -job=mapping,genome,solexa,normal
> COMMON_SETTINGS
> -GE:not=16:uti=on:tismin=2300:tismax=2700
> -SB:lb=on:sbuip=1:bft=caf
> -AS:nop=3 -DP:ure=on -ED:ace=off
> -SK:mnr=on:bph=14:hss=4
> SOLEXA_SETTINGS
> -GE:uti=on:tismin=2300:tismax=2700
> </code>
For mapping short Solexa reads, set -AS:nop=1. Furthermore, do not switch on
-DP:ure unless you have clipped longer reads in the lot.
Lastly, -SK:mnr is not needed for pure mapping assemblies: MIRA will evenly
distribute repeats by itself and the whole megahub problem is virtually
non-existent for mappings.
Regards,
Bastien