Hi everyone,

I know this is probably an old issue. I am trying to assemble a eukaryotic genome of unknown size (a preliminary Velvet assembly suggested about 150 Mb). I have two Illumina data sets: one from a conventional short-insert library and one from a 3.5 kbp mate-pair library. After two days of running, MIRA reported "out of memory detected", and I think that is why the run ended (log attached).

Other than buying more memory (if it really was an out-of-memory failure), would it be wise to assemble the short-insert (or mate-pair) reads first, and then do another run in reference assembly mode using that first assembly as the reference? Does that make sense?

Thanks,
WJ
This is MIRA 4.9.3 . Please cite: Chevreux, B., Wetter, T. and Suhai, S. (1999), Genome Sequence Assembly Using Trace Signals and Additional Sequence Information. Computer Science and Biology: Proceedings of the German Conference on Bioinformatics (GCB) 99, pp. 45-56. To (un-)subscribe the MIRA mailing lists, see: http://www.chevreux.org/mira_mailinglists.html After subscribing, mail general questions to the MIRA talk mailing list: mira_talk@xxxxxxxxxxxxx To report bugs or ask for features, please use the SourceForge ticketing system at: http://sourceforge.net/p/mira-assembler/tickets/ This ensures that requests do not get lost. Compiled by: bach Sat Nov 8 19:53:37 CET 2014 On: Linux vk10464 2.6.32-41-generic #94-Ubuntu SMP Fri Jul 6 18:00:34 UTC 2012 x86_64 GNU/Linux Compiled in boundtracking mode. Compiled in bugtracking mode. Compiled with ENABLE64 activated. Runtime settings (sorry, for debug): Size of size_t : 8 Size of uint32 : 4 Size of uint32_t: 4 Size of uint64 : 8 Size of uint64_t: 8 Current system: Linux una0002-ib 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue Sep 9 21:36:05 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Looking for files named in data ...Pushing back filename: "/home/bio443/NGS/Crypto/300bp/Therout-D7_CCGTCC_L001_R1_001.fastq" Pushing back filename: "/home/bio443/NGS/Crypto/300bp/Therout-D7_CCGTCC_L001_R2_001.fastq" Pushing back filename: "/home/bio443/NGS/Crypto/3-4kb/Therout-D7_GTCCGC_L008_R1_001.fastq" Pushing back filename: "/home/bio443/NGS/Crypto/3-4kb/Therout-D7_GTCCGC_L008_R2_001.fastq" Manifest: projectname: MIRA_1st_Cryptocaryon job: genome,denovo,accurate parameters: COMMON_SETTINGS -NW:cmrnl=no -DI:trt=/scratch/biology/bio443/ Manifest load entries: 2 MLE 1: RGID: 1 RGN: Crypto_Theront7_300bp SN: StrainX SP: ---> <--- SPio: 0 SPC: -1 IF: 50 IT: 800 TSio: 0 ST: 6 (Solexa) namschem: 4 SID: 0 DQ: 30 BB: 0 Rail: 0 CER: 0 /home/bio443/NGS/Crypto/300bp/Therout-D7_CCGTCC_L001_R1_001.fastq 
/home/bio443/NGS/Crypto/300bp/Therout-D7_CCGTCC_L001_R2_001.fastq MLE 2: RGID: 2 RGN: Crypto_Theront7_3-4kbp SN: StrainX SP: <--- ---> SPio: 0 SPC: -2 IF: 2000 IT: 5000 TSio: 0 ST: 6 (Solexa) namschem: 4 SID: 0 DQ: 30 BB: 0 Rail: 0 CER: 0 /home/bio443/NGS/Crypto/3-4kb/Therout-D7_GTCCGC_L008_R1_001.fastq /home/bio443/NGS/Crypto/3-4kb/Therout-D7_GTCCGC_L008_R2_001.fastq Parameters parsed without error, perfect. -CL:pec and -CO:emeas1clpec are set, setting -CO:emea values to 1. ------------------------------------------------------------------------------ Parameter settings seen for: Sanger data Used parameter settings: General (-GE): Project name : MIRA_1st_Cryptocaryon Number of threads (not) : 7 Automatic memory management (amm) : yes Keep percent memory free (kpmf) : 15 Max. process size (mps) : 0 EST SNP pipeline step (esps) : 0 Colour reads by kmer frequency (crkf) : yes Preprocess only (ppo) : no Load reads options (-LR): Wants quality file (wqf) : [sxa] yes Filecheck only (fo) : no Assembly options (-AS): Number of passes (nop) : 0 Kmer series (kms) : Maximum number of RMB break loops (rbl) : 2 Maximum contigs per pass (mcpp) : 0 Minimum read length (mrl) : [sxa] 20 Minimum reads per contig (mrpc) : [sxa] 10 Enforce presence of qualities (epoq) : [sxa] yes Automatic repeat detection (ard) : yes Coverage threshold (ardct) : [sxa] 2.5 Minimum length (ardml) : [sxa] 300 Grace length (ardgl) : [sxa] 20 Use uniform read distribution (urd) : no Start in pass (urdsip) : 3 Cutoff multiplier (urdcm) : [sxa] 1.5 Spoiler detection (sd) : yes Last pass only (sdlpo) : yes Use genomic pathfinder (ugpf) : yes Use emergency search stop (uess) : yes ESS partner depth (esspd) : 500 Use emergency blacklist (uebl) : yes Use max. 
contig build time (umcbt) : no Build time in seconds (bts) : 10000 Strain and backbone options (-SB): Bootstrap new backbone (bnb) : [sxa] yes Start backbone usage in pass (sbuip) : 0 Backbone rail from strain (brfs) : Backbone rail length (brl) : 0 Backbone rail overlap (bro) : 0 Trim overhanging reads (tor) : yes (Also build new contigs (abnc)) : yes Dataprocessing options (-DP): Use read extensions (ure) : [sxa] no Read extension window length (rewl) : [sxa] 30 Read extension w. maxerrors (rewme) : [sxa] 2 First extension in pass (feip) : [sxa] 0 Last extension in pass (leip) : [sxa] 0 Clipping options (-CL): SSAHA2 or SMALT clipping: Gap size (msvsgs) : [sxa] 1 Max front gap (msvsmfg) : [sxa] 2 Max end gap (msvsmeg) : [sxa] 2 Strict front clip (msvssfc) : [sxa] 0 Strict end clip (msvssec) : [sxa] 0 Possible vector leftover clip (pvlc) : [sxa] no maximum len allowed (pvcmla) : [sxa] 18 Min qual. threshold for entire read (mqtfer): [sxa] 5 Number of bases (mqtfernob) : [sxa] 15 Quality clip (qc) : [sxa] no Minimum quality (qcmq) : [sxa] 20 Window length (qcwl) : [sxa] 30 Bad stretch quality clip (bsqc) : [sxa] no Minimum quality (bsqcmq) : [sxa] 5 Window length (bsqcwl) : [sxa] 20 Masked bases clip (mbc) : [sxa] no Gap size (mbcgs) : [sxa] 5 Max front gap (mbcmfg) : [sxa] 12 Max end gap (mbcmeg) : [sxa] 12 Lower case clip front (lccf) : [sxa] no Lower case clip back (lccb) : [sxa] no Clip poly A/T at ends (cpat) : [sxa] no Keep poly-a signal (cpkps) : [sxa] no Minimum signal length (cpmsl) : [sxa] 12 Max errors allowed (cpmea) : [sxa] 1 Max gap from ends (cpmgfe) : [sxa] 9 Clip 3 prime polybase (c3pp) : [sxa] yes Minimum signal length (c3ppmsl) : [sxa] 15 Max errors allowed (c3ppmea) : [sxa] 3 Max gap from ends (c3ppmgfe) : [sxa] 9 Clip known adaptors right (ckar) : [sxa] yes Ensure minimum left clip (emlc) : [sxa] no Minimum left clip req. 
(mlcr) : [sxa] 0 Set minimum left clip to (smlc) : [sxa] 0 Ensure minimum right clip (emrc) : [sxa] no Minimum right clip req. (mrcr) : [sxa] 10 Set minimum right clip to (smrc) : [sxa] 20 Apply SKIM chimera detection clip (ascdc) : yes Apply SKIM junk detection clip (asjdc) : no Propose end clips (pec) : [sxa] yes Kmer size (peckms) : 31 Minimum kmer for forward-rev (pmkfr) : 1 Rare kmer mask (rkm) : [sxa] no Handle Solexa GGCxG problem (pechsgp) : yes Front freq (pffreq) : [sxa] 0 Back freq (pbfreq) : [sxa] 0 Front forward-rev (pffore) : [sxa] yes Back forward-rev (pbfore) : [sxa] yes Front conf. multi-seq type (pfcmst) : [sxa] yes Back conf. multi-seq type (pbcmst) : [sxa] yes Front seen at low pos (pfsalp) : [sxa] no Back seen at low pos (pbsalp) : [sxa] no Clip bad solexa ends (cbse) : [sxa] yes Search PhiX174 (spx174) : [sxa] yes Filter PhiX174 (fpx174) : [sxa] no Parameters for SKIM algorithm (-SK): Number of threads (not) : 7 Also compute reverse complements (acrc) : yes Kmer size (kms) : 17 Automatic increase per pass (kmsaipp) : 1 Kmer size max(kmsmax) : 0 Kmer save stepping (kss) : 1 Percent required (pr) : [sxa] 95 Max hits per read (mhpr) : 2000 Filter megahubs (fmh) : yes Megahub cap (mhc) : 150000 Max megahub ratio (mmhr) : 0 SW check on backbones (swcob) : no Max kmers in memory (mkim) : 15000000 MemCap: hit reduction (mchr) : 4096 Parameters for Kmer Statistics (-KS): Freq. cov. estim. min (fcem) : 0 Freq. estim. min normal (fenn) : 0.4 Freq. estim. max normal (fexn) : 1.6 Freq. estim. repeat (fer) : 1.9 Freq. estim. heavy repeat (fehr) : 8 Freq. estim. 
crazy (fecr) : 20 Mask nasty repeats (mnr) : yes Nasty repeat ratio (nrr) : 100 Nasty repeat coverage (nrc) : 0 Lossless digital normalisation (ldn) : no Repeat level in info file (rliif) : 6 Million kmers per buffer (mkpb) : 4 Rare kmer early kill (rkek) : no Pathfinder options (-PF): Use quick rule (uqr) : [sxa] yes Quick rule min len 1 (qrml1) : [sxa] -95 Quick rule min sim 1 (qrms1) : [sxa] 100 Quick rule min len 2 (qrml2) : [sxa] -85 Quick rule min sim 2 (qrms2) : [sxa] 100 Backbone quick overlap min len (bqoml) : [sxa] 20 Max. start cache fill time (mscft) : 5 Align parameters for Smith-Waterman align (-AL): Bandwidth in percent (bip) : [sxa] 20 Bandwidth max (bmax) : [sxa] 80 Bandwidth min (bmin) : [sxa] 20 Minimum score (ms) : [sxa] 15 Minimum overlap (mo) : [sxa] 17 Minimum relative score in % (mrs) : [sxa] 90 Solexa_hack_max_errors (shme) : [sxa] -1 Extra gap penalty (egp) : [sxa] yes extra gap penalty level (egpl) : [sxa] reject_codongaps Max. egp in percent (megpp) : [sxa] 100 Contig parameters (-CO): Name prefix (np) : MIRA_1st_Cryptocaryon Reject on drop in relative alignment score in % (rodirs) : [sxa] 30 CMinimum relative score in % (cmrs) : [sxa] -1 Mark repeats (mr) : yes Only in result (mroir) : no Assume SNP instead of repeats (asir) : no Minimum reads per group needed for tagging (mrpg) : [sxa] 4 Minimum neighbour quality needed for tagging (mnq) : [sxa] 20 Minimum Group Quality needed for RMB Tagging (mgqrt) : [sxa] 30 End-read Marking Exclusion Area in bases (emea) : [sxa] 1 Set to 1 on clipping PEC (emeas1clpec) : yes Also mark gap bases (amgb) : [sxa] yes Also mark gap bases - even multicolumn (amgbemc) : [sxa] yes Also mark gap bases - need both strands (amgbnbs): [sxa] yes Force non-IUPAC consensus per sequencing type (fnicpst) : [sxa] no Merge short reads (msr) : [sxa] yes Max errors (msrme) : [sxa] 0 Keep ends unmerged (msrkeu) : [sxa] -1 Gap override ratio (gor) : [sxa] 66 Edit options (-ED): Mira automatic contig editing (mace) : yes 
Edit kmer singlets (eks) : yes Edit homopolymer overcalls (ehpo) : [sxa] no Misc (-MI): Large contig size (lcs) : 500 Large contig size for stats (lcs4s) : 5000 I know what I do (ikwid) : no Extra flag 1 / sanity track check (ef1) : no Extra flag 2 / dnredreadsatpeaks (ef2) : yes Extra flag 3 / pelibdisassemble (ef3) : no Extended log (el) : no Nag and Warn (-NW): Check NFS (cnfs) : stop Check multi pass mapping (cmpm) : stop Check template problems (ctp) : stop Check SRA read names (csrn) : stop Check duplicate read names (cdrn) : stop Check max read name length (cmrnl) : no Max read name length (mrnl) : 40 Check average coverage (cac) : stop Average coverage value (acv) : 80 Directories (-DI): Top directory for writing files : MIRA_1st_Cryptocaryon_assembly For writing result files : MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_results For writing result info files : MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_info For writing tmp files : /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp Tmp redirected to (trt) : /scratch/biology/bio443/ For writing checkpoint files : MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_chkpt Output files (-OUTPUT/-OUT): Save simple singlets in project (sssip) : [sxa] no Save tagged singlets in project (stsip) : [sxa] yes Remove rollover tmps (rrot) : yes Remove tmp directory (rtd) : no Result files: Saved as CAF (orc) : yes Saved as MAF (orm) : yes Saved as FASTA (orf) : yes Saved as GAP4 (directed assembly) (org) : no Saved as phrap ACE (ora) : no Saved as GFF3 (org3) : no Saved as HTML (orh) : no Saved as Transposed Contig Summary (ors) : yes Saved as simple text format (ort) : no Saved as wiggle (orw) : yes Temporary result files: Saved as CAF (otc) : yes Saved as MAF (otm) : no Saved as FASTA (otf) : no Saved as GAP4 (directed assembly) (otg) : no Saved as phrap ACE (ota) : no Saved as HTML (oth) : no Saved as Transposed Contig Summary (ots) : no Saved as simple text format (ott) : no Extended 
temporary result files: Saved as CAF (oetc) : no Saved as FASTA (oetf) : no Saved as GAP4 (directed assembly) (oetg) : no Saved as phrap ACE (oeta) : no Saved as HTML (oeth) : no Save also singlets (oetas) : no Alignment output customisation: TEXT characters per line (tcpl) : 60 HTML characters per line (hcpl) : 60 TEXT end gap fill character (tegfc) : HTML end gap fill character (hegfc) : File / directory output names: CAF : MIRA_1st_Cryptocaryon_out.caf MAF : MIRA_1st_Cryptocaryon_out.maf FASTA : MIRA_1st_Cryptocaryon_out.unpadded.fasta FASTA quality : MIRA_1st_Cryptocaryon_out.unpadded.fasta.qual FASTA (padded) : MIRA_1st_Cryptocaryon_out.padded.fasta FASTA qual.(pad): MIRA_1st_Cryptocaryon_out.padded.fasta.qual GAP4 (directory): MIRA_1st_Cryptocaryon_out.gap4da ACE : MIRA_1st_Cryptocaryon_out.ace HTML : MIRA_1st_Cryptocaryon_out.html Simple text : MIRA_1st_Cryptocaryon_out.txt TCS overview : MIRA_1st_Cryptocaryon_out.tcs Wiggle : MIRA_1st_Cryptocaryon_out.wig ------------------------------------------------------------------------------ Creating directory MIRA_1st_Cryptocaryon_assembly ... done. Creating directory MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_results ... done. Creating directory MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_info ... done. Creating directory MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_chkpt ... done. Symlink MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_tmp now pointing to /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta Tmp directory is not on a NFS mount, good. Localtime: Thu Nov 13 21:16:43 2014 Loading reads from /home/bio443/NGS/Crypto/300bp/Therout-D7_CCGTCC_L001_R1_001.fastq type fastq [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] Looking at FASTQ type ... guessing FASTQ-33 (Sanger) Running quality values adaptation ... done. 
Loading reads from /home/bio443/NGS/Crypto/300bp/Therout-D7_CCGTCC_L001_R2_001.fastq type fastq [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] Looking at FASTQ type ... guessing FASTQ-33 (Sanger) Running quality values adaptation ... done. Loading reads from /home/bio443/NGS/Crypto/3-4kb/Therout-D7_GTCCGC_L008_R1_001.fastq type fastq [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] Looking at FASTQ type ... guessing FASTQ-33 (Sanger) Running quality values adaptation ... done. Loading reads from /home/bio443/NGS/Crypto/3-4kb/Therout-D7_GTCCGC_L008_R2_001.fastq type fastq [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] Looking at FASTQ type ... guessing FASTQ-33 (Sanger) Running quality values adaptation ... done. 
List of read names which have problems with name length: Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:1943:2191/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:1763:2236/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2098:2150/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2021:2153/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2065:2198/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2090:2220/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2409:2150/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2277:2162/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2384:2177/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2492:2179/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2374:2226/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2669:2152/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2554:2214/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2767:2130/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2819:2168/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2852:2175/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2935:2190/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2998:2191/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2914:2221/1 Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:3111:2123/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1432:2086/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1471:2089/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1357:2093/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1421:2138/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1365:2155/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1479:2155/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1389:2166/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1448:2169/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1352:2169/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1494:2173/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1428:2180/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1461:2212/1 Name too long: 
GWZHISEQ01:321:C5AL1ACXX:8:1101:1500:2104/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1712:2108/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1624:2108/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1522:2128/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1646:2141/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1574:2151/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1713:2171/1 Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1558:2178/1 200285038 reads had a long name length, for brevity's sake not all were listed. WARNING! -------- MINOR warning -------- MIRA warncode: READ_NAME_TOO_LONG Title: Long read names 200285038 reads were detected with names longer than 40 characters (see output log for more details). While MIRA and many other programs have no problem with that, some older programs have restrictions concerning the length of the read name. Example given: the pipeline CAF -> caf2gap -> gap2caf will stop working at the gap2caf stage if there are read names having > 40 characters where the names differ only at >40 characters. This is a warning only, but as a couple of people were bitten by this, the default behaviour of MIRA is to stop when it sees that potential problem. You might want to rename your reads to have <= 40 characters. Instead of renaming reads in the input files, maybe the 'rename_prefix' functionality of manifest files is useful for you there. On the other hand, you also can ignore this potential problem and force MIRA to continue by using the parameter: '-NW:cmrnl=warn' or '-NW:cmrnl=no' Checking reads for trace data (loading qualities if needed): [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] No SCF data present in any read, EdIt automatic contig editing for Sanger data is now switched off. 200285038 reads with valid data for assembly. 
Localtime: Fri Nov 14 00:13:57 2014 Generated 100142519 unique DNA template ids for 200285038 valid reads. TODO: Like Readpool: strain x has y reads Have read pool with 200285038 reads. =========================================================================== Backbones: 0 Backbone rails: 0 Sequencing technology statistics: Sanger 454 IonTor PcBioHQ PcBioLQ Text Solexa Solid ------------------------------------------------------------ Total reads 0 0 0 0 0 0 200285038 0 Reads wo qual 0 0 0 0 0 0 0 0 Used reads 0 0 0 0 0 0 200285038 0 Avg. tot rlen 0 0 0 0 0 0 101 0 Avg. used rlen 0 0 0 0 0 0 101 0 W/o clips 0 0 0 0 0 0 200285038 0 Readgroup statistics: RG 1 Solexa avg total len: 101 avg clip len: 101 total bases: 6948467306 used bases: 6948467306 RG 2 Solexa avg total len: 101 avg clip len: 101 total bases: 13280321532 used bases: 13280321532 =========================================================================== Checking pairs of readgroup 1 (named: 'Crypto_Theront7_300bp'): found 68796706 Checking pairs of readgroup 2 (named: 'Crypto_Theront7_3-4kbp'): found 131488332 /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t0.0.txt /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t1.0.txt /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t2.0.txt /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t3.0.txt /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t4.0.txt /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t5.0.txt /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t6.0.txt Post-load clips: Localtime: Fri Nov 14 03:03:31 2014 freemem: 574005248 TNH: 5356 XME 1: 0.000212828 XME 2: 0.1 NEPB 1: 104857 NEPB 2: 104857 Writing temporary hstat files: 
[0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] done Localtime: Fri Nov 14 03:03:31 2014 Flushing buffers to disk: [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] done Analysing hstat files: [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] Localtime: Fri Nov 14 03:03:31 2014 clean up temporary stat files...Localtime: Fri Nov 14 03:03:31 2014 Localtime: Fri Nov 14 03:03:31 2014 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] CLIP MSG: Adaptor right found: 37070249 =========================================================================== Backbones: 0 Backbone rails: 0 Sequencing technology statistics: Sanger 454 IonTor PcBioHQ PcBioLQ Text Solexa Solid ------------------------------------------------------------ Total reads 0 0 0 0 0 0 200285038 0 Reads wo qual 0 0 0 0 0 0 0 0 Used reads 0 0 0 0 0 0 189862369 0 Avg. tot rlen 0 0 0 0 0 0 101 0 Avg. used rlen 0 0 0 0 0 0 93 0 W/o clips 0 0 0 0 0 0 161173976 0 Readgroup statistics: RG 1 Solexa avg total len: 101 avg clip len: 100 total bases: 6948467306 used bases: 6889335079 RG 2 Solexa avg total len: 101 avg clip len: 89 total bases: 13280321532 used bases: 10884842967 =========================================================================== Sorting reads ... done. 
Symlink MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_tmp now pointing to /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta Could not perform NFS check for directory /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta For a check to run smoothly, please make sure the Unix 'stat' command is available and understands the following call: stat -f -L -c %T /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta Make sure /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta is *NOT* on a NFS mount or else MIRA will run *very* slowly. PRED MAXTID 100142518 Hash analysis for proposed cutbacks:Localtime: Fri Nov 14 14:23:36 2014 freemem: 581263360 TNH: 12104703467 XME 1: 480.998 XME 2: 4 NEPB 1: 4194304 NEPB 2: 4194304 Writing temporary hstat files: [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] done Localtime: Fri Nov 14 18:39:26 2014 Flushing buffers to disk: [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] done Analysing hstat files: Ouch, out of memory detected. ========================== Memory self assessment ============================== Running in 64 bit mode. 
Dump from /proc/meminfo -------------------------------------------------------------------------------- MemTotal: 198337536 kB MemFree: 515652 kB Buffers: 7208 kB Cached: 665896 kB SwapCached: 1664068 kB Active: 187157832 kB Inactive: 8809564 kB Active(anon): 187131668 kB Inactive(anon): 8178952 kB Active(file): 26164 kB Inactive(file): 630612 kB Unevictable: 56888 kB Mlocked: 30316 kB SwapTotal: 18481148 kB SwapFree: 2203220 kB Dirty: 256816 kB Writeback: 0 kB AnonPages: 193686900 kB Mapped: 20564 kB Shmem: 0 kB Slab: 88632 kB SReclaimable: 46556 kB SUnreclaim: 42076 kB KernelStack: 2080 kB PageTables: 414288 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 117649916 kB Committed_AS: 213819196 kB VmallocTotal: 34359738367 kB VmallocUsed: 683980 kB VmallocChunk: 34256604296 kB HardwareCorrupted: 0 kB AnonHugePages: 21653504 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 7680 kB DirectMap2M: 201310208 kB -------------------------------------------------------------------------------- Dump from /proc/self/status -------------------------------------------------------------------------------- Name: mira State: R (running) Tgid: 9442 Pid: 9442 PPid: 9441 TracerPid: 0 Uid: 4317 4317 4317 4317 Gid: 4300 4300 4300 4300 Utrace: 0 FDSize: 64 Groups: 1000 4300 VmPeak: 213641816 kB VmSize: 213641816 kB VmLck: 0 kB VmHWM: 195985648 kB VmRSS: 193618032 kB VmData: 213537320 kB VmStk: 92 kB VmExe: 7492 kB VmLib: 0 kB VmPTE: 409948 kB VmSwap: 16240736 kB Threads: 1 SigQ: 0/1549352 SigPnd: 0000000000000000 ShdPnd: 0000000000000000 SigBlk: 0000000000000000 SigIgn: 0000000000381000 SigCgt: 0000000180000000 CapInh: 0000000000000000 CapPrm: 0000000000000000 CapEff: 0000000000000000 CapBnd: ffffffffffffffff Cpus_allowed: ffff Cpus_allowed_list: 0-15 Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003 
Mems_allowed_list: 0-1 voluntary_ctxt_switches: 6101548 nonvoluntary_ctxt_switches: 1748266 -------------------------------------------------------------------------------- Information on current assembly object: AS_readpool: 200285038 reads. AS_contigs: 0 contigs. AS_bbcontigs: 0 contigs. Mem used for reads: 192 (192 B) Memory used in assembly structures: Eff. Size Free cap. LostByAlign AS_writtenskimhitsperid: 0 24 B 0 B 0 B AS_skim_edges: 0 24 B 0 B 0 B AS_adsfacts: 0 24 B 0 B 0 B AS_confirmed_edges: 0 24 B 0 B 0 B AS_permanent_overlap_bans: 1 24 B 0 B 0 B AS_readhitmiss: 0 24 B 0 B 0 B AS_readhmcovered: 0 24 B 0 B 0 B AS_count_rhm: 0 24 B 0 B 0 B AS_clipleft: 200285038 764 MiB 0 B 0 B AS_clipright: 200285038 764 MiB 0 B 0 B AS_used_ids: 200285038 191 MiB 0 B 2 B AS_multicopies: 0 191 MiB 191 MiB 2 B AS_hasmcoverlaps: 0 191 MiB 191 MiB 2 B AS_maxcoveragereached: 200285038 764 MiB 0 B 0 B AS_coverageperseqtype: 0 24 B 0 B 0 B AS_istroublemaker: 200285038 191 MiB 0 B 2 B AS_isdebris: 200285038 191 MiB 0 B 2 B AS_needalloverlaps: 200285038 191 MiB 0 B 2 B AS_readsforrepeatresolve: 0 40 B 0 B 0 B AS_allrmbsok: 0 764 MiB 764 MiB 0 B AS_probablermbsnotok: 0 764 MiB 764 MiB 0 B AS_weakrmbsnotok: 0 764 MiB 764 MiB 0 B AS_readmaytakeskim: 0 40 B 0 B 0 B AS_skimstaken: 0 40 B 0 B 0 B AS_numskimoverlaps: 0 24 B 0 B 0 B AS_numleftextendskims: 0 24 B 0 B 0 B AS_rightextendskims: 0 24 B 0 B 0 B AS_skimleftextendratio: 0 24 B 0 B 0 B AS_skimrightextendratio: 0 24 B 0 B 0 B AS_skimmegahubs: 0 24 B 0 B 0 B AS_usedtmpfiles: 8 272 B 0 B 0 B Total: 6008552384 (5.6 GiB) ================================================================================ ========================== Memory self assessment ============================== Running in 64 bit mode. 
Dump from /proc/meminfo -------------------------------------------------------------------------------- MemTotal: 198337536 kB MemFree: 539592 kB Buffers: 7208 kB Cached: 641396 kB SwapCached: 1664068 kB Active: 187157368 kB Inactive: 8785500 kB Active(anon): 187131204 kB Inactive(anon): 8178952 kB Active(file): 26164 kB Inactive(file): 606548 kB Unevictable: 56888 kB Mlocked: 30316 kB SwapTotal: 18481148 kB SwapFree: 2203220 kB Dirty: 256664 kB Writeback: 0 kB AnonPages: 193686788 kB Mapped: 20776 kB Shmem: 0 kB Slab: 88996 kB SReclaimable: 46628 kB SUnreclaim: 42368 kB KernelStack: 2072 kB PageTables: 414288 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 117649916 kB Committed_AS: 213819196 kB VmallocTotal: 34359738367 kB VmallocUsed: 683980 kB VmallocChunk: 34256604296 kB HardwareCorrupted: 0 kB AnonHugePages: 21653504 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 7680 kB DirectMap2M: 201310208 kB -------------------------------------------------------------------------------- Dump from /proc/self/status -------------------------------------------------------------------------------- Name: mira State: R (running) Tgid: 9442 Pid: 9442 PPid: 9441 TracerPid: 0 Uid: 4317 4317 4317 4317 Gid: 4300 4300 4300 4300 Utrace: 0 FDSize: 64 Groups: 1000 4300 VmPeak: 213641816 kB VmSize: 213641816 kB VmLck: 0 kB VmHWM: 195985648 kB VmRSS: 193618124 kB VmData: 213537320 kB VmStk: 92 kB VmExe: 7492 kB VmLib: 0 kB VmPTE: 409948 kB VmSwap: 16240736 kB Threads: 1 SigQ: 0/1549352 SigPnd: 0000000000000000 ShdPnd: 0000000000000000 SigBlk: 0000000000000000 SigIgn: 0000000000381000 SigCgt: 0000000180000000 CapInh: 0000000000000000 CapPrm: 0000000000000000 CapEff: 0000000000000000 CapBnd: ffffffffffffffff Cpus_allowed: ffff Cpus_allowed_list: 0-15 Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003 
Mems_allowed_list: 0-1 voluntary_ctxt_switches: 6101548 nonvoluntary_ctxt_switches: 1748345 -------------------------------------------------------------------------------- Information on current assembly object: AS_readpool: 200285038 reads. AS_contigs: 0 contigs. AS_bbcontigs: 0 contigs. Mem used for reads: 192 (192 B) Memory used in assembly structures: Eff. Size Free cap. LostByAlign AS_writtenskimhitsperid: 0 24 B 0 B 0 B AS_skim_edges: 0 24 B 0 B 0 B AS_adsfacts: 0 24 B 0 B 0 B AS_confirmed_edges: 0 24 B 0 B 0 B AS_permanent_overlap_bans: 1 24 B 0 B 0 B AS_readhitmiss: 0 24 B 0 B 0 B AS_readhmcovered: 0 24 B 0 B 0 B AS_count_rhm: 0 24 B 0 B 0 B AS_clipleft: 200285038 764 MiB 0 B 0 B AS_clipright: 200285038 764 MiB 0 B 0 B AS_used_ids: 200285038 191 MiB 0 B 2 B AS_multicopies: 0 191 MiB 191 MiB 2 B AS_hasmcoverlaps: 0 191 MiB 191 MiB 2 B AS_maxcoveragereached: 200285038 764 MiB 0 B 0 B AS_coverageperseqtype: 0 24 B 0 B 0 B AS_istroublemaker: 200285038 191 MiB 0 B 2 B AS_isdebris: 200285038 191 MiB 0 B 2 B AS_needalloverlaps: 200285038 191 MiB 0 B 2 B AS_readsforrepeatresolve: 0 40 B 0 B 0 B AS_allrmbsok: 0 764 MiB 764 MiB 0 B AS_probablermbsnotok: 0 764 MiB 764 MiB 0 B AS_weakrmbsnotok: 0 764 MiB 764 MiB 0 B AS_readmaytakeskim: 0 40 B 0 B 0 B AS_skimstaken: 0 40 B 0 B 0 B AS_numskimoverlaps: 0 24 B 0 B 0 B AS_numleftextendskims: 0 24 B 0 B 0 B AS_rightextendskims: 0 24 B 0 B 0 B AS_skimleftextendratio: 0 24 B 0 B 0 B AS_skimrightextendratio: 0 24 B 0 B 0 B AS_skimmegahubs: 0 24 B 0 B 0 B AS_usedtmpfiles: 8 272 B 0 B 0 B Total: 6008552384 (5.6 GiB) ================================================================================ Dynamic s allocs: 0 Dynamic m allocs: 0 Align allocs: 0 Out of memory detected, exception message is: std::bad_alloc If you have questions on why this happened, please send the last 1000 lines of the output log (or better: the complete file) to the author together with a short summary of your assembly project. 
VCODE: 4.9.3 For general help, you will probably get a quicker response on the MIRA talk mailing list than if you mailed the author directly. To report bugs or ask for features, please use the SourceForge ticketing system at: http://sourceforge.net/p/mira-assembler/tickets/ This ensures that requests do not get lost. Failure, wrapped MIRA process aborted.
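For anyone reading the log, here is a rough back-of-the-envelope check of the numbers in it. This is only a sketch: the 150 Mb genome size is the preliminary Velvet-based guess from the post, not a measurement, and the helper names (`coverage`, `kb_to_gib`, `summarize`) are made up for illustration. The other figures are copied from the log: "used bases" per read group after clipping, `MemTotal` and `SwapTotal` from /proc/meminfo, `VmPeak` from /proc/self/status, and the `-NW:acv` value of 80.

```python
# Rough sanity check of the figures printed in the MIRA log above.
# ASSUMPTION: genome size is the preliminary ~150 Mb guess from the post.
GENOME_SIZE_BP = 150_000_000

# "used bases" per read group after clipping, copied from the log.
USED_BASES = {
    "Crypto_Theront7_300bp": 6_889_335_079,
    "Crypto_Theront7_3-4kbp": 10_884_842_967,
}

# Memory figures copied from the log (/proc reports kB as units of 1024 bytes).
MEM_TOTAL_KB = 198_337_536   # MemTotal
SWAP_TOTAL_KB = 18_481_148   # SwapTotal
VM_PEAK_KB = 213_641_816     # VmPeak of the mira process

ACV = 80  # "Average coverage value (acv)" as listed in the parameter dump


def coverage(bases, genome_size):
    """Expected average coverage: sequenced bases divided by genome size."""
    return bases / genome_size


def kb_to_gib(kb):
    """Convert kB (1024-byte units) to GiB."""
    return kb / (1024 * 1024)


def summarize():
    """Return per-read-group coverage and the combined coverage."""
    per_group = {
        name: coverage(bases, GENOME_SIZE_BP)
        for name, bases in USED_BASES.items()
    }
    return per_group, sum(per_group.values())


if __name__ == "__main__":
    per_group, total = summarize()
    for name, cov in per_group.items():
        print(f"{name}: ~{cov:.0f}x")
    print(f"combined: ~{total:.0f}x (acv value in the log: {ACV}x)")
    # Fraction of reads to keep if one wanted to land near the acv value.
    print(f"fraction to keep for ~{ACV}x: {ACV / total:.2f}")
    print(f"RAM: {kb_to_gib(MEM_TOTAL_KB):.1f} GiB, "
          f"swap: {kb_to_gib(SWAP_TOTAL_KB):.1f} GiB, "
          f"mira VmPeak: {kb_to_gib(VM_PEAK_KB):.1f} GiB")
```

If the 150 Mb guess is roughly right, the combined coverage comes out above the 80x value listed for `-NW:acv`, and the peak virtual size of the process is larger than the physical RAM of the node. So the run does look like a genuine out-of-memory failure, and assembling a subsample of the reads may be worth trying before buying more memory.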