[mira_talk] Mira Question: Solexa reads...memory error or bug?
- From: Alex Washington <alexwashington@xxxxxxxxx>
- To: mira_talk@xxxxxxxxxxxxx
- Date: Fri, 13 Mar 2009 11:55:27 -0800
Mira community,
Greetings. I have thoroughly searched through the MIRA package for an answer to
this problem, but to no avail.
I ran MIRA with the following input files and command.
Input files:
sa2_in.solexa.fasta
sa2_straindata_in.txt
sa2_backbone_in.fasta
*** No quality files are available in the project, so the default qualities are used.
Command:
> mira --project=sa2 --job=mapping,genome,normal,solexa
-SB:lsd=yes:bsn=sa2:bft=fasta:bbq=30 >2_log_assembly_sa.txt
The program quits with the following error:
"error ... out of memory ... if you have any questions please send the
last 1000 lines of output log to the author"
I should have a sufficient amount of memory, so this error is unexpected. Is
this a bug?
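One way to back up the memory claim on a Linux machine is to look at what the system reports and at the limits the process actually runs under. This is a minimal Python sketch, not part of MIRA: it assumes `/proc/meminfo` is available, and the helper names (`parse_meminfo`, `describe_limit`, `report`) are made up for illustration.

```python
import os
import resource


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict (values mostly in kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def describe_limit(limit_bytes):
    """Render an rlimit value (bytes) as text; RLIM_INFINITY means no limit is set."""
    if limit_bytes == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit_bytes // 1024} kB"


def report(meminfo_path="/proc/meminfo"):
    """Print system memory figures and the address-space limit of this process."""
    if os.path.exists(meminfo_path):
        with open(meminfo_path) as fh:
            info = parse_meminfo(fh.read())
        for key in ("MemTotal", "MemFree", "SwapFree"):
            print(f"{key}: {info.get(key)} kB")
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    print("Address-space limit (soft/hard):",
          describe_limit(soft), "/", describe_limit(hard))


if __name__ == "__main__":
    report()
```

Running it from the same shell that starts the assembly shows whether a per-process limit, rather than the physical memory of the machine, could be the constraint.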
Project Description:
S. aureus
Illumina Solexa data
No. of reads: 14,771,764
Attached: alex_log_assembly.txt
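The read count above can be cross-checked against the input file itself. The following is a small Python sketch, under the assumption that the file is plain FASTA with one `>` header line per read; `count_fasta_records` is an illustrative helper, not a MIRA tool.

```python
def count_fasta_records(lines):
    """Count FASTA records (lines starting with '>') and the total number of bases."""
    n_records = 0
    n_bases = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            n_records += 1
        else:
            n_bases += len(line)
    return n_records, n_bases
```

Usage would look like `with open("sa2_in.solexa.fasta") as fh: print(count_fasta_records(fh))`; the file is streamed line by line, so the check itself needs very little memory.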
Thanks
Alex Washington
Graduate Student, VCU
alexwashington@xxxxxxxxx
This is MIRA V2.9.39 (development version).
Please cite: Chevreux, B., Wetter, T. and Suhai, S. (1999), Genome Sequence
Assembly Using Trace Signals and Additional Sequence Information.
Computer Science and Biology: Proceedings of the German Conference on
Bioinformatics (GCB) 99, pp. 45-56.
Mail questions,ideas or suggestions to the MIRA talk mailing list:
mira_talk@xxxxxxxxxxxxx
To (un-)subsubcribe the MIRA mailing lists, see:
http://www.chevreux.org/mira_mailinglists.html
Mail bug reports to:
bach@xxxxxxxxxxxx
Compiled in boundtracking mode.
Compiled in bugtracking mode.
-SB:sbuip is 3, but must be no more than 2. Setting to 2
Parsing parameters: --project=sa2 --job=mapping,genome,normal,solexa
-SB:lsd=yes:bsn=sa2:bft=fasta:bbq=30
Parameters parsed without error, perfect.
------------------------------------------------------------------------------
Parameter settings seen for:
Sanger data (also common parameters), Solexa data
Used parameter settings:
General (-GE):
Project name in (proin) : sa2
Project name out (proout) : sa2
Number of threads (not) : 2
Keep contigs in memory (kcim) : no
EST SNP pipeline step (esps) : 1
Use template information (uti) : [san] yes
[sxa] no
Template insert size minimum (tismin): [san] -1
[sxa] -1
Template insert size maximum (tismax): [san] -1
[sxa] -1
Load reads options (-LR):
Load Sanger data (lsand) : no
Sanger file type (sanft) : fasta
External quality (eq) : none (none)
Ext. qual. override (eqo) : no
Discard reads on e.q. error (droeqe): no
Load 454 data (l454d) : no
Load Solexa data (lsxad) : yes
Solexa scores in qual file (ssiqf) : yes
Load SOLiD data (lsidd) : no
Read naming scheme (rns) : [san] Sanger Institute
(sanger)
[sxa] forward/reverse
(fr)
Merge with XML trace info (mxti) : [san] no
[sxa] no
Filecheck only (fo) : no
Assembly options (-AS):
Number of passes (nop) : 2
Skim each pass (sep) : yes
Maximum number of RMB break loops (rbl) : 1
Minimum read length (mrl) : [san] 80
[sxa] 20
Base default quality (bdq) : [san] 10
[sxa] 10
Automatic repeat detection (ard) : yes
Coverage threshold (ardct) : [san] 2
[sxa] 2
Minimum length (ardml) : [san] 400
[sxa] 200
Grace length (ardgl) : [san] 40
[sxa] 20
Use uniform read distribution (urd) : no
Start in pass (urdsip) : 3
Cutoff multiplier (urdcm) : [san] 1.5
[sxa] 1.5
Keep long repeats separated (klrs) : no
Spoiler detection (sd) : yes
Last pass only (sdlpo) : yes
Use genomic pathfinder (ugpf) : yes
Use emergency search stop (uess) : yes
ESS partner depth (esspd) : 500
Use emergency blacklist (uebl) : yes
Use max. contig build time (umcbt) : no
Build time in seconds (bts) : 10000
Strain and backbone options (-SB):
Load straindata (lsd) : yes
Load backbone (lb) : yes
Start backbone usage in pass (sbuip) : 0
Backbone file type (bft) : fasta
Backbone base quality (bbq) : 30
Backbone strain name (bsn) : sa2
Force for all (bsnffa) : no
Backbone rail from strain (brfs) :
Backbone rail length (brl) : 100
Backbone rail overlap (bro) : 40
Also build new contigs (abnc) : no
Dataprocessing options (-DP):
Use read extensions (ure) : [san] yes
[sxa] no
Read extension window length (rewl) : [san] 30
[sxa] 30
Read extension w. maxerrors (rewme) : [san] 2
[sxa] 2
First extension in pass (feip) : [san] 0
[sxa] 0
Last extension in pass (leip) : [san] 0
Clipping options (-CL):
Merge with SSAHA vector screen (msvs) : [san] no
[sxa] no
Gap size (msvsgs) : [san] 10
[sxa] 1
Max front gap (msvsmfg) : [san] 30
[sxa] 2
Max end gap (msvsmeg) : [san] 60
[sxa] 2
Strict front clip (msvssfc) : [san] 0
[sxa] 0
Strict end clip (msvssec) : [san] 0
[sxa] 0
Possible vector leftover clip (pvlc) : [san] yes
[sxa] no
maximum len allowed (pvcmla) : [san] 18
[sxa] 18
Quality clip (qc) : [san] no
[sxa] no
Minimum quality (qcmq) : [san] 20
[sxa] 20
Window length (qcwl) : [san] 30
[sxa] 30
Bad stretch quality clip (bsqc) : [san] yes
[sxa] no
Minimum quality (bsqcmq) : [san] 20
:
Window length (bsqcwl) : [san] 30
[sxa] 20
Masked bases clip (mbc) : [san] yes
[sxa] no
Gap size (mbcgs) : [san] 20
[sxa] 5
Max front gap (mbcmfg) : [san] 40
[sxa] 12
Max end gap (mbcmeg) : [san] 60
[sxa] 12
Clip poly A/T at ends (cpat) : [san] no
[sxa] no
Keep poly-a signal (cpkps) : [san] no
[sxa] no
Minimum signal length (cpmsl) : [san] 10
[sxa] 10
Max errors allowed (cpmea) : [san] 1
[sxa] 1
Max gap from ends (cpmgfe) : [san] 9
[sxa] 9
Ensure minimum left clip (emlc) : [san] yes
[sxa] yes
Minimum left clip req. (mlcr) : [san] 25
[sxa] 1
Set minimum left clip to (smlc) : [san] 30
[sxa] 1
Ensure minimum right clip (emrc) : [san] no
[sxa] no
Minimum right clip req. (mrcr) : [san] 10
[sxa] 10
Set minimum right clip to (smrc) : [san] 20
[sxa] 20
Propose end clips (pec) : yes
Parameters for SKIM algorithm (-SK):
Number of threads (not) : 2
Bases per hash (bph) : 12
Hash save stepping (hss) : 1
Percent required (pr) : 60
Maximum hashes in memory (mhim) : 15000000
Max hits per read (mhpr) : 5
Mask nasty repeats (mnr) : no
Repeat threshold (rt) : 10
Max. megahub ratio (mmhr) : 0
Pathfinder options (-PF):
Use quick rule (uqr) : [san] yes
[sxa] yes
Quick rule min len 1 (qrml1) : [san] 200
[sxa] 36
Quick rule min sim 1 (qrms1) : [san] 90
[sxa] 100
Quick rule min len 2 (qrml2) : [san] 100
[sxa] 35
Quick rule min sim 2 (qrms2) : [san] 95
[sxa] 100
Backbone quick overlap min len (bqoml) : [san] 150
[sxa] 20
Align parameters for Smith-Waterman align (-AL):
Bandwidth in percent (bip) : [san] 15
[sxa] 20
Bandwidth max (bmax) : [san] 100
[sxa] 80
Bandwidth min (bmin) : [san] 25
[sxa] 20
Minimum score (ms) : [san] 30
[sxa] 15
Minimum overlap (mo) : [san] 15
[sxa] 20
Minimum relative score in % (mrs) : [san] 65
[sxa] 60
Extra gap penalty (egp) : [san] no
[sxa] no
Solexa_hack_max_errors (shme) : [san] 3
[sxa] 3
extra gap penalty level (egpl) : [san] reject_codongaps
[sxa] low
Max. egp in percent (megpp) : [san] 100
[sxa] 100
Contig parameters (-CO):
Name prefix (np) : sa2
Reject on drop in relative alignment score in % (rodirs) : [san] 20
[sxa] 30
Mark repeats (mr) : yes
Only in result (mroir) : yes
Assume SNP instead of repeats (asir) : no
Minimum reads per group needed for tagging (mrpg) : [san] 2
[sxa] 3
Minimum neighbour quality needed for tagging (mnq) : [san] 20
[sxa] 20
Minimum Group Quality needed for RMB Tagging (mgqrt) : [san] 30
[sxa] 30
End-read Marking Exclusion Area in bases (emea) : [san] 25
[sxa] 4
Also mark gap bases (amgb) : [san] yes
[sxa] yes
Also mark gap bases - even multicolumn (amgbemc) : [san] yes
[sxa] yes
Also mark gap bases - need both strands (amgbnbs): [san] yes
[sxa] yes
Force non-IUPAC consensus per sequencing type (fnicpst) : [san] no
[sxa] no
Merge short reads (msr) : [san] no
[sxa] yes
Edit options (-ED):
Automatic contig editing (ace) : [san] no
[sxa] no
Sanger only:
Strict editing mode (sem) : no
Confirmation threshold in percent (ct) : 50
Directories (-DI):
When loading EXP files :
When loading SCF files :
For writing result files : sa2_d_results
For writing result info files : sa2_d_info
For writing log files : sa2_d_log
File names (-FN):
When loading sequences from FASTA : [san]
sa2_in.sanger.fasta
[sxa]
sa2_in.solexa.fasta
When loading qualities from FASTA quality : [san]
sa2_in.sanger.fasta.qual
[sxa]
sa2_in.solexa.fasta.qual
When loading project from CAF : sa2_in.caf
When loading EXP fofn : sa2_in.fofn
When loading project from PHD : sa2_in.phd.1
When loading strain data : sa2_straindata_in.txt
When loading XML trace info files : [san]
sa2_traceinfo_in.sanger.xml
[sxa]
sa2_traceinfo_in.solexa.xml
When loading SSAHA vector screen results :
sa2_ssahavectorscreen_in.txt
When loading backbone from CAF : sa2_backbone_in.caf
When loading backbone from GenBank : sa2_backbone_in.gbf
When loading backbone from FASTA : sa2_backbone_in.fasta
Output files (-OUTPUT/-OUT):
Save simple singlets in project (sssip) : [san] no
[sxa] no
Save tagged singlets in project (stsip) : [san] yes
[sxa] yes
Remove rollover logs (rrol) : yes
Remove log directory (rld) : no
Result files:
Saved as CAF (orc) : yes
Saved as FASTA (orf) : yes
Saved as GAP4 (directed assembly) (org) : no
Saved as phrap ACE (ora) : yes
Saved as HTML (orh) : no
Saved as Transposed Contig Summary (ors) : no
Saved as simple text format (ort) : no
Saved as wiggle (ort) : yes
Temporary result files:
Saved as CAF (otc) : yes
Saved as FASTA (otf) : no
Saved as GAP4 (directed assembly) (otg) : no
Saved as phrap ACE (ota) : no
Saved as HTML (oth) : no
Saved as Transposed Contig Summary (ots) : no
Saved as simple text format (ott) : no
Extended temporary result files:
Saved as CAF (oetc) : no
Saved as FASTA (oetf) : no
Saved as GAP4 (directed assembly) (oetg) : no
Saved as phrap ACE (oeta) : no
Saved as HTML (oeth) : no
Save also singlets (oetas) : no
Alignment output customisation:
TEXT characters per line (tcpl) : 60
HTML characters per line (hcpl) : 60
TEXT end gap fill character (tegfc) :
HTML end gap fill character (hegfc) :
File / directory output names:
CAF : sa2_out.caf
FASTA : sa2_out.unpadded.fasta
FASTA quality : sa2_out.unpadded.fasta.qual
FASTA (padded) : sa2_out.padded.fasta
FASTA qual.(pad): sa2_out.padded.fasta.qual
GAP4 (directory): sa2_out.gap4da
ACE : sa2_out.ace
HTML : sa2_out.html
Simple text : sa2_out.txt
TCS overview : sa2_out.tcs
Wiggle : sa2_out.wig
------------------------------------------------------------------------------
Deleting old directory sa2_d_log ... done.
Creating directory sa2_d_log ... done.
Deleting old directory sa2_d_results ... done.
Creating directory sa2_d_results ... done.
Deleting old directory sa2_d_info ... done.
Localtime: Fri Mar 13 14:35:14 2009
Loading backbone from FASTA file: sa2_backbone_in.fasta (quality:
sa2_backbone_in.fasta.qual)
Counting sequences in FASTA file:
[0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%]
Loading sequence data from FASTA file sa2_backbone_in.fasta:
[0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%]
Done.
Loaded 2 reads, 0 of which have quality accounted for.
Postprocessing backbone (this may take a while)
1 to process
gi|150392480|ref|NC_009632.1|_bb 2906507
Localtime: Fri Mar 13 14:35:15 2009
Seeing strain 1: "sa2"
Generated 1 unique strain ids for 2 reads.
Strain "default" has 1 reads.
Strain "sa2" has 1 reads.
Adding rails to 1 contigs (this may take a while).
Loading data (Solexa type data) from FASTA files,
Counting sequences in FASTA file:
[0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%]
========================== Memory self assessment ==============================
System information on process (/proc/self/status):
--------------------------------------------------------------------------------
Name: mira
State: R (running)
SleepAVG: 0%
Tgid: 16009
Pid: 16009
PPid: 15857
TracerPid: 0
Uid: 50104 50104 50104 50104
Gid: 50104 50104 50104 50104
FDSize: 256
Groups: 50104
VmPeak: 473192 kB
VmSize: 407656 kB
VmLck: 0 kB
VmHWM: 337928 kB
VmRSS: 328476 kB
VmData: 403872 kB
VmStk: 88 kB
VmExe: 3660 kB
VmLib: 0 kB
VmPTE: 672 kB
StaBrk: 009d5000 kB
Brk: 293e6000 kB
StaStk: 7fff26b6c490 kB
Threads: 1
SigQ: 0/16384
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
Cpus_allowed:
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed: 00000000,00000001
--------------------------------------------------------------------------------
Information on assembly object:
AS_readpool: 48441 reads.
AS_contigs: 0 contigs.
AS_bbcontigs: 1 contigs.
Mem used for reads: 79297704 (76 MiB)
Memory used in assembly structures:
Eff. Size Free cap. LostByAlign
AS_skim_edges: 0 24 B 0 B 0 B
AS_adsfacts: 0 24 B 0 B 0 B
AS_confirmed_edges: 0 24 B 0 B 0 B
AS_permanent_overlap_bans: 0 24 B 0 B 0 B
AS_readhitmiss: 0 24 B 0 B 0 B
AS_readhmcovered: 0 24 B 0 B 0 B
AS_count_rhm: 0 24 B 0 B 0 B
AS_weakestOvercallDelVote: 0 24 B 0 B 0 B
AS_weakestOvercallCoverage: 0 24 B 0 B 0 B
AS_clipleft: 0 24 B 0 B 0 B
AS_clipright: 0 24 B 0 B 0 B
AS_used_ids: 0 24 B 0 B 0 B
AS_multicopies: 0 24 B 0 B 0 B
AS_hasmcoverlaps: 0 24 B 0 B 0 B
AS_maxcoveragereached: 0 24 B 0 B 0 B
AS_coverageperseqtype: 0 24 B 0 B 0 B
AS_istroublemaker: 0 24 B 0 B 0 B
AS_isdebris: 0 24 B 0 B 0 B
AS_needalloverlaps: 0 40 B 0 B 0 B
AS_readsforrepeatresolve: 0 40 B 0 B 0 B
AS_allrmbsok: 0 24 B 0 B 0 B
AS_probablermbsnotok: 0 24 B 0 B 0 B
AS_weakrmbsnotok: 0 24 B 0 B 0 B
Total: 79298288 (76 MiB)
================================================================================
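To read the memory self assessment above more easily, the `Vm*` figures can be converted from kB to MiB. The sketch below is an illustration in Python, assuming the `/proc/self/status` layout shown in the log; `vm_fields_mib` is a hypothetical helper name. With the values from this log, VmPeak works out to about 462 MiB, while the 79297704 bytes reported for reads correspond to the 76 MiB printed by MIRA.

```python
import re


def vm_fields_mib(status_text):
    """Return {field: MiB} for the Vm* lines of /proc/<pid>/status-style text."""
    fields = {}
    pattern = re.compile(r"^(Vm\w+):[ \t]+(\d+)[ \t]+kB[ \t]*$", re.MULTILINE)
    for match in pattern.finditer(status_text):
        fields[match.group(1)] = int(match.group(2)) / 1024  # kB -> MiB
    return fields


# Values copied from the memory self assessment in the log.
sample = """VmPeak: 473192 kB
VmSize: 407656 kB
VmHWM: 337928 kB
VmRSS: 328476 kB
StaBrk: 009d5000 kB
"""

for name, mib in vm_fields_mib(sample).items():
    print(f"{name}: {mib:.0f} MiB")

# Bytes reported under "Mem used for reads", converted to MiB.
print(f"reads: {79297704 / 2**20:.0f} MiB")
```

Lines such as `StaBrk`, which hold addresses rather than sizes, are skipped because only fields starting with `Vm` are matched.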