[mira_talk] ouch... that memory

  • From: Wei-Jen Chang <wchang@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 17 Nov 2014 10:09:43 -0500

Hi everyone,

I know it's probably an old issue.

I am trying to assemble a eukaryotic genome of unknown size (a
preliminary Velvet assembly suggested 150 Mb). I have two Illumina data
sets, one from a traditional small-insert library and another from a 3.5 kbp
mate-pair library. After two days of running MIRA I saw "out of memory
detected", and I think that is why the run ended (log attached).

So, other than buying more memory (if it was indeed out of memory), would it
be wise to assemble the short-insert (or mate-pair) reads first, and then
do another run using the reference assembly mode? Does that make sense?

Thanks,


WJ
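
A rough coverage estimate helps put the data volume in context. The sketch
below is only a back-of-envelope calculation: it takes the "used bases" figures
per readgroup from the attached log and divides them by the 150 Mb genome size,
which is itself only the preliminary estimate mentioned above. The variable and
function names are illustrative and not part of MIRA.

```python
# Back-of-envelope coverage estimate from figures reported in the attached log.
# ASSUMPTION: genome size is the poster's preliminary ~150 Mb estimate.
GENOME_SIZE_BP = 150_000_000

# "used bases" per readgroup after clipping, copied from the log's
# readgroup statistics.
USED_BASES = {
    "Crypto_Theront7_300bp": 6_889_335_079,
    "Crypto_Theront7_3-4kbp": 10_884_842_967,
}


def coverage(bases, genome_size):
    """Average depth: sequenced bases divided by genome length."""
    return bases / genome_size


per_group = {name: coverage(b, GENOME_SIZE_BP) for name, b in USED_BASES.items()}
total_coverage = sum(per_group.values())

for name, cov in per_group.items():
    print(f"{name}: ~{cov:.0f}x")
print(f"total: ~{total_coverage:.0f}x")
```

Under that genome-size assumption the two libraries together come to well over
100x, which can be compared with the average coverage value (acv) of 80 listed
in the parameter settings of the log.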
This is MIRA 4.9.3 .

Please cite: Chevreux, B., Wetter, T. and Suhai, S. (1999), Genome Sequence
Assembly Using Trace Signals and Additional Sequence Information.
Computer Science and Biology: Proceedings of the German Conference on
Bioinformatics (GCB) 99, pp. 45-56.

To (un-)subscribe the MIRA mailing lists, see:
        http://www.chevreux.org/mira_mailinglists.html

After subscribing, mail general questions to the MIRA talk mailing list:
        mira_talk@xxxxxxxxxxxxx


To report bugs or ask for features, please use the SourceForge ticketing
system at:
        http://sourceforge.net/p/mira-assembler/tickets/
This ensures that requests do not get lost.


Compiled by: bach
Sat Nov  8 19:53:37 CET 2014
On: Linux vk10464 2.6.32-41-generic #94-Ubuntu SMP Fri Jul 6 18:00:34 UTC 2012 x86_64 GNU/Linux
Compiled in boundtracking mode.
Compiled in bugtracking mode.
Compiled with ENABLE64 activated.
Runtime settings (sorry, for debug):
        Size of size_t  : 8
        Size of uint32  : 4
        Size of uint32_t: 4
        Size of uint64  : 8
        Size of uint64_t: 8
Current system: Linux una0002-ib 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue Sep 9 21:36:05 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Looking for files named in data ...Pushing back filename: "/home/bio443/NGS/Crypto/300bp/Therout-D7_CCGTCC_L001_R1_001.fastq"
Pushing back filename: "/home/bio443/NGS/Crypto/300bp/Therout-D7_CCGTCC_L001_R2_001.fastq"
Pushing back filename: "/home/bio443/NGS/Crypto/3-4kb/Therout-D7_GTCCGC_L008_R1_001.fastq"
Pushing back filename: "/home/bio443/NGS/Crypto/3-4kb/Therout-D7_GTCCGC_L008_R2_001.fastq"
Manifest:
projectname: MIRA_1st_Cryptocaryon
job: genome,denovo,accurate
parameters: COMMON_SETTINGS -NW:cmrnl=no -DI:trt=/scratch/biology/bio443/
Manifest load entries: 2
MLE 1:
RGID: 1
RGN: Crypto_Theront7_300bp      SN: StrainX
SP: ---> <---   SPio: 0 SPC: -1 IF: 50  IT: 800 TSio: 0
ST: 6 (Solexa)  namschem: 4     SID: 0
DQ: 30
BB: 0   Rail: 0 CER: 0

/home/bio443/NGS/Crypto/300bp/Therout-D7_CCGTCC_L001_R1_001.fastq
/home/bio443/NGS/Crypto/300bp/Therout-D7_CCGTCC_L001_R2_001.fastq
MLE 2:
RGID: 2
RGN: Crypto_Theront7_3-4kbp     SN: StrainX
SP: <--- --->   SPio: 0 SPC: -2 IF: 2000        IT: 5000        TSio: 0
ST: 6 (Solexa)  namschem: 4     SID: 0
DQ: 30
BB: 0   Rail: 0 CER: 0

/home/bio443/NGS/Crypto/3-4kb/Therout-D7_GTCCGC_L008_R1_001.fastq 
/home/bio443/NGS/Crypto/3-4kb/Therout-D7_GTCCGC_L008_R2_001.fastq 

Parameters parsed without error, perfect.

-CL:pec and -CO:emeas1clpec are set, setting -CO:emea values to 1.
------------------------------------------------------------------------------
Parameter settings seen for:
Sanger data

Used parameter settings:
  General (-GE):
        Project name                                : MIRA_1st_Cryptocaryon
        Number of threads (not)                     : 7
        Automatic memory management (amm)           : yes
            Keep percent memory free (kpmf)         : 15
            Max. process size (mps)                 : 0
        EST SNP pipeline step (esps)                : 0
        Colour reads by kmer frequency (crkf)       : yes
        Preprocess only (ppo)                       : no

  Load reads options (-LR):
        Wants quality file (wqf)                    :  [sxa]  yes

        Filecheck only (fo)                         : no

  Assembly options (-AS):
        Number of passes (nop)                      : 0
        Kmer series (kms)                           : 
        Maximum number of RMB break loops (rbl)     : 2
        Maximum contigs per pass (mcpp)             : 0

        Minimum read length (mrl)                   :  [sxa]  20
        Minimum reads per contig (mrpc)             :  [sxa]  10
        Enforce presence of qualities (epoq)        :  [sxa]  yes

        Automatic repeat detection (ard)            : yes
            Coverage threshold (ardct)              :  [sxa]  2.5
            Minimum length (ardml)                  :  [sxa]  300
            Grace length (ardgl)                    :  [sxa]  20
            Use uniform read distribution (urd)     : no
              Start in pass (urdsip)                : 3
              Cutoff multiplier (urdcm)             :  [sxa]  1.5

        Spoiler detection (sd)                      : yes
            Last pass only (sdlpo)                  : yes

        Use genomic pathfinder (ugpf)               : yes

        Use emergency search stop (uess)            : yes
            ESS partner depth (esspd)               : 500
        Use emergency blacklist (uebl)              : yes
        Use max. contig build time (umcbt)          : no
            Build time in seconds (bts)             : 10000

  Strain and backbone options (-SB):
        Bootstrap new backbone (bnb)                :  [sxa]  yes
        Start backbone usage in pass (sbuip)        : 0
        Backbone rail from strain (brfs)            : 
        Backbone rail length (brl)                  : 0
        Backbone rail overlap (bro)                 : 0
        Trim overhanging reads (tor)                : yes

        (Also build new contigs (abnc))             : yes

  Dataprocessing options (-DP):
        Use read extensions (ure)                   :  [sxa]  no
            Read extension window length (rewl)     :  [sxa]  30
            Read extension w. maxerrors (rewme)     :  [sxa]  2
            First extension in pass (feip)          :  [sxa]  0
            Last extension in pass (leip)           :  [sxa]  0

  Clipping options (-CL):
        SSAHA2 or SMALT clipping:
            Gap size (msvsgs)                       :  [sxa]  1
            Max front gap (msvsmfg)                 :  [sxa]  2
            Max end gap (msvsmeg)                   :  [sxa]  2
            Strict front clip (msvssfc)             :  [sxa]  0
            Strict end clip (msvssec)               :  [sxa]  0
        Possible vector leftover clip (pvlc)        :  [sxa]  no
            maximum len allowed (pvcmla)            :  [sxa]  18
        Min qual. threshold for entire read (mqtfer):  [sxa]  5
            Number of bases (mqtfernob)             :  [sxa]  15
        Quality clip (qc)                           :  [sxa]  no
            Minimum quality (qcmq)                  :  [sxa]  20
            Window length (qcwl)                    :  [sxa]  30
        Bad stretch quality clip (bsqc)             :  [sxa]  no
            Minimum quality (bsqcmq)                :  [sxa]  5
            Window length (bsqcwl)                  :  [sxa]  20
        Masked bases clip (mbc)                     :  [sxa]  no
            Gap size (mbcgs)                        :  [sxa]  5
            Max front gap (mbcmfg)                  :  [sxa]  12
            Max end gap (mbcmeg)                    :  [sxa]  12
        Lower case clip front (lccf)                :  [sxa]  no
        Lower case clip back (lccb)                 :  [sxa]  no
        Clip poly A/T at ends (cpat)                :  [sxa]  no
            Keep poly-a signal (cpkps)              :  [sxa]  no
            Minimum signal length (cpmsl)           :  [sxa]  12
            Max errors allowed (cpmea)              :  [sxa]  1
            Max gap from ends (cpmgfe)              :  [sxa]  9
        Clip 3 prime polybase (c3pp)                :  [sxa]  yes
            Minimum signal length (c3ppmsl)         :  [sxa]  15
            Max errors allowed (c3ppmea)            :  [sxa]  3
            Max gap from ends (c3ppmgfe)            :  [sxa]  9
        Clip known adaptors right (ckar)            :  [sxa]  yes
        Ensure minimum left clip (emlc)             :  [sxa]  no
            Minimum left clip req. (mlcr)           :  [sxa]  0
            Set minimum left clip to (smlc)         :  [sxa]  0
        Ensure minimum right clip (emrc)            :  [sxa]  no
            Minimum right clip req. (mrcr)          :  [sxa]  10
            Set minimum right clip to (smrc)        :  [sxa]  20

        Apply SKIM chimera detection clip (ascdc)   : yes
        Apply SKIM junk detection clip (asjdc)      : no

        Propose end clips (pec)                     :  [sxa]  yes
            Kmer size (peckms)                      : 31
            Minimum kmer for forward-rev (pmkfr)    : 1
            Rare kmer mask (rkm)                    :  [sxa]  no
            Handle Solexa GGCxG problem (pechsgp)   : yes

            Front freq (pffreq)                     :  [sxa]  0
            Back freq (pbfreq)                      :  [sxa]  0
            Front forward-rev (pffore)              :  [sxa]  yes
            Back forward-rev (pbfore)               :  [sxa]  yes
            Front conf. multi-seq type (pfcmst)     :  [sxa]  yes
            Back conf. multi-seq type (pbcmst)      :  [sxa]  yes
            Front seen at low pos (pfsalp)          :  [sxa]  no
            Back seen at low pos (pbsalp)           :  [sxa]  no

        Clip bad solexa ends (cbse)                 :  [sxa]  yes
        Search PhiX174 (spx174)                     :  [sxa]  yes
            Filter PhiX174 (fpx174)                 :  [sxa]  no

  Parameters for SKIM algorithm (-SK):
        Number of threads (not)                     : 7

        Also compute reverse complements (acrc)     : yes
        Kmer size (kms)                             : 17
            Automatic increase per pass (kmsaipp)   : 1
            Kmer size max(kmsmax)                   : 0
        Kmer save stepping (kss)                    : 1
        Percent required (pr)                       :  [sxa]  95

        Max hits per read (mhpr)                    : 2000

        Filter megahubs (fmh)                       : yes
            Megahub cap (mhc)                       : 150000
            Max megahub ratio (mmhr)                : 0

        SW check on backbones (swcob)               : no

        Max kmers in memory (mkim)                  : 15000000
        MemCap: hit reduction (mchr)                : 4096

  Parameters for Kmer Statistics (-KS):
        Freq. cov. estim. min (fcem)                : 0
        Freq. estim. min normal (fenn)              : 0.4
        Freq. estim. max normal (fexn)              : 1.6
        Freq. estim. repeat (fer)                   : 1.9
        Freq. estim. heavy repeat (fehr)            : 8
        Freq. estim. crazy (fecr)                   : 20
        Mask nasty repeats (mnr)                    : yes
            Nasty repeat ratio (nrr)                : 100
            Nasty repeat coverage (nrc)             : 0
            Lossless digital normalisation (ldn)    : no

        Repeat level in info file (rliif)           : 6

        Million kmers per buffer (mkpb)             : 4
        Rare kmer early kill (rkek)                 : no

  Pathfinder options (-PF):
        Use quick rule (uqr)                        :  [sxa]  yes
            Quick rule min len 1 (qrml1)            :  [sxa]  -95
            Quick rule min sim 1 (qrms1)            :  [sxa]  100
            Quick rule min len 2 (qrml2)            :  [sxa]  -85
            Quick rule min sim 2 (qrms2)            :  [sxa]  100
        Backbone quick overlap min len (bqoml)      :  [sxa]  20
        Max. start cache fill time (mscft)          : 5

  Align parameters for Smith-Waterman align (-AL):
        Bandwidth in percent (bip)             :  [sxa]  20
        Bandwidth max (bmax)                   :  [sxa]  80
        Bandwidth min (bmin)                   :  [sxa]  20
        Minimum score (ms)                     :  [sxa]  15
        Minimum overlap (mo)                   :  [sxa]  17
        Minimum relative score in % (mrs)      :  [sxa]  90
        Solexa_hack_max_errors (shme)          :  [sxa]  -1
        Extra gap penalty (egp)                :  [sxa]  yes
            extra gap penalty level (egpl)     :  [sxa] reject_codongaps
            Max. egp in percent (megpp)        :  [sxa]  100

  Contig parameters (-CO):
        Name prefix (np)                                         : MIRA_1st_Cryptocaryon
        Reject on drop in relative alignment score in % (rodirs) :  [sxa]  30
        CMinimum relative score in % (cmrs)                      :  [sxa]  -1
        Mark repeats (mr)                                        : yes
            Only in result (mroir)                               : no
            Assume SNP instead of repeats (asir)                 : no
            Minimum reads per group needed for tagging (mrpg)    :  [sxa]  4
            Minimum neighbour quality needed for tagging (mnq)   :  [sxa]  20
            Minimum Group Quality needed for RMB Tagging (mgqrt) :  [sxa]  30
            End-read Marking Exclusion Area in bases (emea)      :  [sxa]  1
                Set to 1 on clipping PEC (emeas1clpec)           : yes
            Also mark gap bases (amgb)                           :  [sxa]  yes
                Also mark gap bases - even multicolumn (amgbemc) :  [sxa]  yes
                Also mark gap bases - need both strands (amgbnbs):  [sxa]  yes
        Force non-IUPAC consensus per sequencing type (fnicpst)  :  [sxa]  no
        Merge short reads (msr)                                  :  [sxa]  yes
            Max errors (msrme)                                   :  [sxa]  0
            Keep ends unmerged (msrkeu)                          :  [sxa]  -1
        Gap override ratio (gor)                                 :  [sxa]  66

  Edit options (-ED):
        Mira automatic contig editing (mace)        : yes
            Edit kmer singlets (eks)                : yes
            Edit homopolymer overcalls (ehpo)       :  [sxa]  no

  Misc (-MI):
        Large contig size (lcs)                     : 500
        Large contig size for stats (lcs4s)         : 5000

        I know what I do (ikwid)                    : no

        Extra flag 1 / sanity track check (ef1)     : no
        Extra flag 2 / dnredreadsatpeaks (ef2)      : yes
        Extra flag 3 / pelibdisassemble (ef3)       : no
        Extended log (el)                           : no

  Nag and Warn (-NW):
        Check NFS (cnfs)                            : stop
        Check multi pass mapping (cmpm)             : stop
        Check template problems (ctp)               : stop
        Check SRA read names (csrn)                 : stop
        Check duplicate read names (cdrn)           : stop
        Check max read name length (cmrnl)          : no
            Max read name length (mrnl)             : 40
        Check average coverage (cac)                : stop
            Average coverage value (acv)            : 80

  Directories (-DI):
        Top directory for writing files   : MIRA_1st_Cryptocaryon_assembly
        For writing result files          : MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_results
        For writing result info files     : MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_info
        For writing tmp files             : /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp
        Tmp redirected to (trt)           : /scratch/biology/bio443/
        For writing checkpoint files      : MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_chkpt

  Output files (-OUTPUT/-OUT):
        Save simple singlets in project (sssip)      :  [sxa]  no
        Save tagged singlets in project (stsip)      :  [sxa]  yes

        Remove rollover tmps (rrot)                  : yes
        Remove tmp directory (rtd)                   : no

    Result files:
        Saved as CAF                       (orc)     : yes
        Saved as MAF                       (orm)     : yes
        Saved as FASTA                     (orf)     : yes
        Saved as GAP4 (directed assembly)  (org)     : no
        Saved as phrap ACE                 (ora)     : no
        Saved as GFF3                     (org3)     : no
        Saved as HTML                      (orh)     : no
        Saved as Transposed Contig Summary (ors)     : yes
        Saved as simple text format        (ort)     : no
        Saved as wiggle                    (orw)     : yes

    Temporary result files:
        Saved as CAF                       (otc)     : yes
        Saved as MAF                       (otm)     : no
        Saved as FASTA                     (otf)     : no
        Saved as GAP4 (directed assembly)  (otg)     : no
        Saved as phrap ACE                 (ota)     : no
        Saved as HTML                      (oth)     : no
        Saved as Transposed Contig Summary (ots)     : no
        Saved as simple text format        (ott)     : no

    Extended temporary result files:
        Saved as CAF                      (oetc)     : no
        Saved as FASTA                    (oetf)     : no
        Saved as GAP4 (directed assembly) (oetg)     : no
        Saved as phrap ACE                (oeta)     : no
        Saved as HTML                     (oeth)     : no
        Save also singlets               (oetas)     : no

    Alignment output customisation:
        TEXT characters per line (tcpl)              : 60
        HTML characters per line (hcpl)              : 60
        TEXT end gap fill character (tegfc)          :  
        HTML end gap fill character (hegfc)          :  

    File / directory output names:
        CAF             : MIRA_1st_Cryptocaryon_out.caf
        MAF             : MIRA_1st_Cryptocaryon_out.maf
        FASTA           : MIRA_1st_Cryptocaryon_out.unpadded.fasta
        FASTA quality   : MIRA_1st_Cryptocaryon_out.unpadded.fasta.qual
        FASTA (padded)  : MIRA_1st_Cryptocaryon_out.padded.fasta
        FASTA qual.(pad): MIRA_1st_Cryptocaryon_out.padded.fasta.qual
        GAP4 (directory): MIRA_1st_Cryptocaryon_out.gap4da
        ACE             : MIRA_1st_Cryptocaryon_out.ace
        HTML            : MIRA_1st_Cryptocaryon_out.html
        Simple text     : MIRA_1st_Cryptocaryon_out.txt
        TCS overview    : MIRA_1st_Cryptocaryon_out.tcs
        Wiggle          : MIRA_1st_Cryptocaryon_out.wig
------------------------------------------------------------------------------
Creating directory MIRA_1st_Cryptocaryon_assembly ... done.
Creating directory MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_results ... done.
Creating directory MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_info ... done.
Creating directory MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_chkpt ... done.
Symlink MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_tmp now pointing to /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta

Tmp directory is not on a NFS mount, good.

Localtime: Thu Nov 13 21:16:43 2014

Loading reads from 
/home/bio443/NGS/Crypto/300bp/Therout-D7_CCGTCC_L001_R1_001.fastq type fastq
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... 
[50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... 
[100%] Looking at FASTQ type ... guessing FASTQ-33 (Sanger)
Running quality values adaptation ... done.
Loading reads from 
/home/bio443/NGS/Crypto/300bp/Therout-D7_CCGTCC_L001_R2_001.fastq type fastq
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... 
[50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... 
[100%] Looking at FASTQ type ... guessing FASTQ-33 (Sanger)
Running quality values adaptation ... done.
Loading reads from 
/home/bio443/NGS/Crypto/3-4kb/Therout-D7_GTCCGC_L008_R1_001.fastq type fastq
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... 
[50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... 
[100%] Looking at FASTQ type ... guessing FASTQ-33 (Sanger)
Running quality values adaptation ... done.
Loading reads from 
/home/bio443/NGS/Crypto/3-4kb/Therout-D7_GTCCGC_L008_R2_001.fastq type fastq
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... 
[50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... 
[100%] Looking at FASTQ type ... guessing FASTQ-33 (Sanger)
Running quality values adaptation ... done.
List of read names which have problems with name length:
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:1943:2191/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:1763:2236/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2098:2150/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2021:2153/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2065:2198/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2090:2220/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2409:2150/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2277:2162/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2384:2177/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2492:2179/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2374:2226/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2669:2152/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2554:2214/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2767:2130/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2819:2168/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2852:2175/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2935:2190/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2998:2191/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:2914:2221/1
Name too long: GWZHISEQ03:450:C5B99ACXX:1:1101:3111:2123/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1432:2086/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1471:2089/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1357:2093/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1421:2138/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1365:2155/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1479:2155/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1389:2166/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1448:2169/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1352:2169/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1494:2173/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1428:2180/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1461:2212/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1500:2104/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1712:2108/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1624:2108/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1522:2128/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1646:2141/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1574:2151/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1713:2171/1
Name too long: GWZHISEQ01:321:C5AL1ACXX:8:1101:1558:2178/1
200285038 reads had a long name length, for brevity's sake not all were listed.
WARNING!

-------- MINOR warning --------

MIRA warncode: READ_NAME_TOO_LONG
Title: Long read names

200285038 reads were detected with names longer than 40 characters (see output
log for more details).

While MIRA and many other programs have no problem with that, some older
programs have restrictions concerning the length of the read name.

Example given: the pipeline
     CAF -> caf2gap -> gap2caf
will stop working at the gap2caf stage if there are read names having > 40
characters where the names differ only at >40 characters.

This is a warning only, but as a couple of people were bitten by this, the
default behaviour of MIRA is to stop when it sees that potential problem.

You might want to rename your reads to have <= 40 characters. Instead of
renaming reads in the input files, maybe the 'rename_prefix' functionality of
manifest files is useful for you there.

On the other hand, you also can ignore this potential problem and force MIRA to
continue by using the parameter: '-NW:cmrnl=warn' or '-NW:cmrnl=no'

Checking reads for trace data (loading qualities if needed):
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... 
[50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... 
[100%] 
No SCF data present in any read, EdIt automatic contig editing for Sanger data 
is now switched off.
200285038 reads with valid data for assembly.
Localtime: Fri Nov 14 00:13:57 2014

Generated 100142519 unique DNA template ids for 200285038 valid reads.
TODO: Like Readpool: strain x has y reads
Have read pool with 200285038 reads.

===========================================================================
Backbones: 0    Backbone rails: 0
Sequencing technology statistics:

                Sanger  454     IonTor  PcBioHQ PcBioLQ Text    Solexa  Solid
                ------------------------------------------------------------
Total reads     0       0       0       0       0       0       200285038       0
Reads wo qual   0       0       0       0       0       0       0       0
Used reads      0       0       0       0       0       0       200285038       0
Avg. tot rlen   0       0       0       0       0       0       101     0
Avg. used rlen  0       0       0       0       0       0       101     0
W/o clips       0       0       0       0       0       0       200285038       0


Readgroup statistics:
RG 1    Solexa  avg total len: 101      avg clip len: 101       total bases: 6948467306 used bases: 6948467306
RG 2    Solexa  avg total len: 101      avg clip len: 101       total bases: 13280321532        used bases: 13280321532
===========================================================================


Checking pairs of readgroup 1 (named: 'Crypto_Theront7_300bp'):  found 68796706
Checking pairs of readgroup 2 (named: 'Crypto_Theront7_3-4kbp'):  found 131488332
/scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t0.0.txt
/scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t1.0.txt
/scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t2.0.txt
/scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t3.0.txt
/scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t4.0.txt
/scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t5.0.txt
/scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta/MIRA_1st_Cryptocaryon_int_clippings_t6.0.txt
Post-load clips:
Localtime: Fri Nov 14 03:03:31 2014
freemem: 574005248
TNH: 5356
XME 1: 0.000212828
XME 2: 0.1
NEPB 1: 104857
NEPB 2: 104857
Writing temporary hstat files:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... 
[50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... 
[100%] done
Localtime: Fri Nov 14 03:03:31 2014
Flushing buffers to disk:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... 
[50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... 
[100%] done

Analysing hstat files:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... 
[50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... 
[100%] 
Localtime: Fri Nov 14 03:03:31 2014
clean up temporary stat files...Localtime: Fri Nov 14 03:03:31 2014
Localtime: Fri Nov 14 03:03:31 2014
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... 
[50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... 
[100%] 
CLIP MSG: Adaptor right found: 37070249

===========================================================================
Backbones: 0    Backbone rails: 0
Sequencing technology statistics:

                Sanger  454     IonTor  PcBioHQ PcBioLQ Text    Solexa  Solid
                ------------------------------------------------------------
Total reads     0       0       0       0       0       0       200285038       0
Reads wo qual   0       0       0       0       0       0       0       0
Used reads      0       0       0       0       0       0       189862369       0
Avg. tot rlen   0       0       0       0       0       0       101     0
Avg. used rlen  0       0       0       0       0       0       93      0
W/o clips       0       0       0       0       0       0       161173976       0


Readgroup statistics:
RG 1    Solexa  avg total len: 101      avg clip len: 100       total bases: 6948467306 used bases: 6889335079
RG 2    Solexa  avg total len: 101      avg clip len: 89        total bases: 13280321532        used bases: 10884842967
===========================================================================


Sorting reads ... done.
Symlink MIRA_1st_Cryptocaryon_assembly/MIRA_1st_Cryptocaryon_d_tmp now pointing to /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta
Could not perform NFS check for directory /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta

For a check to run smoothly, please make sure the Unix 'stat' command is available
and understands the following call: stat -f -L -c %T /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta



Make sure /scratch/biology/bio443//MIRA_1st_Cryptocaryon_d_tmp_ZFeDta is *NOT* on a NFS mount or else MIRA will run *very* slowly.
PRED MAXTID 100142518
Hash analysis for proposed cutbacks:Localtime: Fri Nov 14 14:23:36 2014
freemem: 581263360
TNH: 12104703467
XME 1: 480.998
XME 2: 4
NEPB 1: 4194304
NEPB 2: 4194304
Writing temporary hstat files:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... 
[50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... 
[100%] done
Localtime: Fri Nov 14 18:39:26 2014
Flushing buffers to disk:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... 
[50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... 
[100%] done

Analysing hstat files:
Ouch, out of memory detected.


========================== Memory self assessment ==============================
Running in 64 bit mode.

Dump from /proc/meminfo
--------------------------------------------------------------------------------
MemTotal:       198337536 kB
MemFree:          515652 kB
Buffers:            7208 kB
Cached:           665896 kB
SwapCached:      1664068 kB
Active:         187157832 kB
Inactive:        8809564 kB
Active(anon):   187131668 kB
Inactive(anon):  8178952 kB
Active(file):      26164 kB
Inactive(file):   630612 kB
Unevictable:       56888 kB
Mlocked:           30316 kB
SwapTotal:      18481148 kB
SwapFree:        2203220 kB
Dirty:            256816 kB
Writeback:             0 kB
AnonPages:      193686900 kB
Mapped:            20564 kB
Shmem:                 0 kB
Slab:              88632 kB
SReclaimable:      46556 kB
SUnreclaim:        42076 kB
KernelStack:        2080 kB
PageTables:       414288 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    117649916 kB
Committed_AS:   213819196 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      683980 kB
VmallocChunk:   34256604296 kB
HardwareCorrupted:     0 kB
AnonHugePages:  21653504 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        7680 kB
DirectMap2M:    201310208 kB
--------------------------------------------------------------------------------
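
The dump above is easier to read once the kB figures are converted to GiB. The
following is a minimal sketch, not part of MIRA, that parses text in the
/proc/meminfo format and reports how much RAM and swap were not free at the
time of the dump. The function name `parse_meminfo` and the `SAMPLE` excerpt
are illustrative, and the four sample values are copied from the dump.

```python
def parse_meminfo(text):
    """Return {field: value_in_kB} for lines shaped like 'Name:   123 kB'."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only entries with an explicit kB unit; counters such as
        # HugePages_Total carry no unit and are skipped.
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


# Excerpt copied from the dump above.
SAMPLE = """\
MemTotal:       198337536 kB
MemFree:          515652 kB
SwapTotal:      18481148 kB
SwapFree:        2203220 kB
"""

KIB_PER_GIB = 1024 * 1024

info = parse_meminfo(SAMPLE)
mem_not_free_gib = (info["MemTotal"] - info["MemFree"]) / KIB_PER_GIB
swap_used_gib = (info["SwapTotal"] - info["SwapFree"]) / KIB_PER_GIB

print(f"RAM not free: {mem_not_free_gib:.1f} GiB")
print(f"Swap in use:  {swap_used_gib:.1f} GiB")
```

On a live Linux system the same function could be fed the contents of
/proc/meminfo instead of the excerpt. "Not free" here is simply MemTotal minus
MemFree, so it also counts buffers and page cache.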

Dump from /proc/self/status
--------------------------------------------------------------------------------
Name:   mira
State:  R (running)
Tgid:   9442
Pid:    9442
PPid:   9441
TracerPid:      0
Uid:    4317    4317    4317    4317
Gid:    4300    4300    4300    4300
Utrace: 0
FDSize: 64
Groups: 1000 4300 
VmPeak: 213641816 kB
VmSize: 213641816 kB
VmLck:         0 kB
VmHWM:  195985648 kB
VmRSS:  193618032 kB
VmData: 213537320 kB
VmStk:        92 kB
VmExe:      7492 kB
VmLib:         0 kB
VmPTE:    409948 kB
VmSwap: 16240736 kB
Threads:        1
SigQ:   0/1549352
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000381000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed:   ffff
Cpus_allowed_list:      0-15
Mems_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003
Mems_allowed_list:      0-1
voluntary_ctxt_switches:        6101548
nonvoluntary_ctxt_switches:     1748266
--------------------------------------------------------------------------------

Information on current assembly object:

AS_readpool: 200285038 reads.
AS_contigs: 0 contigs.
AS_bbcontigs: 0 contigs.
Mem used for reads: 192 (192 B)

Memory used in assembly structures:
                                           Eff. Size   Free cap. LostByAlign
     AS_writtenskimhitsperid:          0        24 B         0 B         0 B
               AS_skim_edges:          0        24 B         0 B         0 B
                 AS_adsfacts:          0        24 B         0 B         0 B
          AS_confirmed_edges:          0        24 B         0 B         0 B
   AS_permanent_overlap_bans:          1        24 B         0 B         0 B
              AS_readhitmiss:          0        24 B         0 B         0 B
            AS_readhmcovered:          0        24 B         0 B         0 B
                AS_count_rhm:          0        24 B         0 B         0 B
                 AS_clipleft:  200285038     764 MiB         0 B         0 B
                AS_clipright:  200285038     764 MiB         0 B         0 B
                 AS_used_ids:  200285038     191 MiB         0 B         2 B
              AS_multicopies:          0     191 MiB     191 MiB         2 B
            AS_hasmcoverlaps:          0     191 MiB     191 MiB         2 B
       AS_maxcoveragereached:  200285038     764 MiB         0 B         0 B
       AS_coverageperseqtype:          0        24 B         0 B         0 B
           AS_istroublemaker:  200285038     191 MiB         0 B         2 B
                 AS_isdebris:  200285038     191 MiB         0 B         2 B
          AS_needalloverlaps:  200285038     191 MiB         0 B         2 B
    AS_readsforrepeatresolve:          0        40 B         0 B         0 B
                AS_allrmbsok:          0     764 MiB     764 MiB         0 B
        AS_probablermbsnotok:          0     764 MiB     764 MiB         0 B
            AS_weakrmbsnotok:          0     764 MiB     764 MiB         0 B
          AS_readmaytakeskim:          0        40 B         0 B         0 B
               AS_skimstaken:          0        40 B         0 B         0 B
          AS_numskimoverlaps:          0        24 B         0 B         0 B
       AS_numleftextendskims:          0        24 B         0 B         0 B
         AS_rightextendskims:          0        24 B         0 B         0 B
      AS_skimleftextendratio:          0        24 B         0 B         0 B
     AS_skimrightextendratio:          0        24 B         0 B         0 B
             AS_skimmegahubs:          0        24 B         0 B         0 B
             AS_usedtmpfiles:          8       272 B         0 B         0 B
Total: 6008552384 (5.6 GiB)

================================================================================


========================== Memory self assessment ==============================
Running in 64 bit mode.

Dump from /proc/meminfo
--------------------------------------------------------------------------------
MemTotal:       198337536 kB
MemFree:          539592 kB
Buffers:            7208 kB
Cached:           641396 kB
SwapCached:      1664068 kB
Active:         187157368 kB
Inactive:        8785500 kB
Active(anon):   187131204 kB
Inactive(anon):  8178952 kB
Active(file):      26164 kB
Inactive(file):   606548 kB
Unevictable:       56888 kB
Mlocked:           30316 kB
SwapTotal:      18481148 kB
SwapFree:        2203220 kB
Dirty:            256664 kB
Writeback:             0 kB
AnonPages:      193686788 kB
Mapped:            20776 kB
Shmem:                 0 kB
Slab:              88996 kB
SReclaimable:      46628 kB
SUnreclaim:        42368 kB
KernelStack:        2072 kB
PageTables:       414288 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    117649916 kB
Committed_AS:   213819196 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      683980 kB
VmallocChunk:   34256604296 kB
HardwareCorrupted:     0 kB
AnonHugePages:  21653504 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        7680 kB
DirectMap2M:    201310208 kB
--------------------------------------------------------------------------------

Dump from /proc/self/status
--------------------------------------------------------------------------------
Name:   mira
State:  R (running)
Tgid:   9442
Pid:    9442
PPid:   9441
TracerPid:      0
Uid:    4317    4317    4317    4317
Gid:    4300    4300    4300    4300
Utrace: 0
FDSize: 64
Groups: 1000 4300 
VmPeak: 213641816 kB
VmSize: 213641816 kB
VmLck:         0 kB
VmHWM:  195985648 kB
VmRSS:  193618124 kB
VmData: 213537320 kB
VmStk:        92 kB
VmExe:      7492 kB
VmLib:         0 kB
VmPTE:    409948 kB
VmSwap: 16240736 kB
Threads:        1
SigQ:   0/1549352
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000381000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed:   ffff
Cpus_allowed_list:      0-15
Mems_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003
Mems_allowed_list:      0-1
voluntary_ctxt_switches:        6101548
nonvoluntary_ctxt_switches:     1748345
--------------------------------------------------------------------------------

Information on current assembly object:

AS_readpool: 200285038 reads.
AS_contigs: 0 contigs.
AS_bbcontigs: 0 contigs.
Mem used for reads: 192 (192 B)

Memory used in assembly structures:
                                           Eff. Size   Free cap. LostByAlign
     AS_writtenskimhitsperid:          0        24 B         0 B         0 B
               AS_skim_edges:          0        24 B         0 B         0 B
                 AS_adsfacts:          0        24 B         0 B         0 B
          AS_confirmed_edges:          0        24 B         0 B         0 B
   AS_permanent_overlap_bans:          1        24 B         0 B         0 B
              AS_readhitmiss:          0        24 B         0 B         0 B
            AS_readhmcovered:          0        24 B         0 B         0 B
                AS_count_rhm:          0        24 B         0 B         0 B
                 AS_clipleft:  200285038     764 MiB         0 B         0 B
                AS_clipright:  200285038     764 MiB         0 B         0 B
                 AS_used_ids:  200285038     191 MiB         0 B         2 B
              AS_multicopies:          0     191 MiB     191 MiB         2 B
            AS_hasmcoverlaps:          0     191 MiB     191 MiB         2 B
       AS_maxcoveragereached:  200285038     764 MiB         0 B         0 B
       AS_coverageperseqtype:          0        24 B         0 B         0 B
           AS_istroublemaker:  200285038     191 MiB         0 B         2 B
                 AS_isdebris:  200285038     191 MiB         0 B         2 B
          AS_needalloverlaps:  200285038     191 MiB         0 B         2 B
    AS_readsforrepeatresolve:          0        40 B         0 B         0 B
                AS_allrmbsok:          0     764 MiB     764 MiB         0 B
        AS_probablermbsnotok:          0     764 MiB     764 MiB         0 B
            AS_weakrmbsnotok:          0     764 MiB     764 MiB         0 B
          AS_readmaytakeskim:          0        40 B         0 B         0 B
               AS_skimstaken:          0        40 B         0 B         0 B
          AS_numskimoverlaps:          0        24 B         0 B         0 B
       AS_numleftextendskims:          0        24 B         0 B         0 B
         AS_rightextendskims:          0        24 B         0 B         0 B
      AS_skimleftextendratio:          0        24 B         0 B         0 B
     AS_skimrightextendratio:          0        24 B         0 B         0 B
             AS_skimmegahubs:          0        24 B         0 B         0 B
             AS_usedtmpfiles:          8       272 B         0 B         0 B
Total: 6008552384 (5.6 GiB)

================================================================================
Dynamic s allocs: 0
Dynamic m allocs: 0
Align allocs: 0
Out of memory detected, exception message is: std::bad_alloc


If you have questions on why this happened, please send the last 1000
lines of the output log (or better: the complete file) to the author
together with a short summary of your assembly project.



VCODE: 4.9.3 


For general help, you will probably get a quicker response on the
    MIRA talk mailing list
than if you mailed the author directly.

To report bugs or ask for features, please use the SourceForge ticketing
system at:
        http://sourceforge.net/p/mira-assembler/tickets/
This ensures that requests do not get lost.
Failure, wrapped MIRA process aborted.
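
The figures in the dumps above can be cross-checked with a few lines of
Python. This is only a rough sketch and not part of MIRA's output: the
helper names (parse_kb_fields, kb_to_gib) are made up for illustration, the
values are copied by hand from the first memory self assessment, and it
assumes the "Name: value kB" layout shown in the log.

```python
import re

def parse_kb_fields(text):
    """Return {field: kB} for lines shaped like 'Name:   12345 kB'."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^\s*([A-Za-z_()0-9]+):\s+(\d+) kB\s*$", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields

def kb_to_gib(kb):
    """Convert kB as reported by /proc (1 kB = 1024 bytes) to GiB."""
    return kb / (1024 * 1024)

# Values copied from the first dump in the log (hand-picked subset).
dump = """\
MemTotal:       198337536 kB
MemFree:          515652 kB
SwapTotal:      18481148 kB
SwapFree:        2203220 kB
VmPeak: 213641816 kB
VmRSS:  193618032 kB
VmSwap: 16240736 kB
"""

# "Total:" line of the "Memory used in assembly structures" table, in bytes.
tracked_bytes = 6008552384

f = parse_kb_fields(dump)
rss_share = f["VmRSS"] / f["MemTotal"]             # mira's share of physical RAM
swap_used_kb = f["SwapTotal"] - f["SwapFree"]      # system-wide swap in use
tracked_share = tracked_bytes / (f["VmRSS"] * 1024)  # tracked structures vs. RSS

print(f"RAM total:        {kb_to_gib(f['MemTotal']):.1f} GiB")
print(f"mira resident:    {kb_to_gib(f['VmRSS']):.1f} GiB ({rss_share:.1%} of RAM)")
print(f"swap in use:      {kb_to_gib(swap_used_kb):.1f} GiB")
print(f"tracked structs:  {tracked_bytes / 2**30:.1f} GiB ({tracked_share:.1%} of RSS)")
```

Read this way, the mira process alone holds nearly all of the physical RAM
and most of the swap, while the structures listed in the table account for
only a small part of that. The rest was presumably allocated by the step
that was running when the error hit ("Analysing hstat files"), but the log
itself does not break that down.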
