I am doing a de novo assembly of a transcriptome. I have paired-end reads as
FASTQ files. I used these files in a Trinity assembly and they worked fine.
I am now trying to do a similar assembly with MIRA for comparison, using all
default parameters, but the run aborts with a fatal error. The log file is
attached.
I am looking for suggestions as to what the problem could be and where to
start troubleshooting.
Thanks
--
Ewelina Rubin, PhD
Postdoctoral Research Fellow
Graduate School of Oceanography
University of Rhode Island
215 South Ferry Road
Narragansett, RI 02882
<401.874.6105>
This is MIRA 4.0.2 .
Please cite: Chevreux, B., Wetter, T. and Suhai, S. (1999), Genome Sequence
Assembly Using Trace Signals and Additional Sequence Information.
Computer Science and Biology: Proceedings of the German Conference on
Bioinformatics (GCB) 99, pp. 45-56.
To (un-)subscribe the MIRA mailing lists, see:
http://www.chevreux.org/mira_mailinglists.html
After subscribing, mail general questions to the MIRA talk mailing list:
mira_talk@xxxxxxxxxxxxx
To report bugs or ask for features, please use the SourceForge ticketing
system at:
http://sourceforge.net/p/mira-assembler/tickets/
This ensures that requests do not get lost.
Compiled by: bach
Fri Apr 18 14:57:20 CEST 2014
On: Linux vk10464 2.6.32-41-generic #94-Ubuntu SMP Fri Jul 6 18:00:34 UTC 2012
x86_64 GNU/Linux
Compiled in boundtracking mode.
Compiled in bugtracking mode.
Compiled with ENABLE64 activated.
Runtime settings (sorry, for debug):
Size of size_t : 8
Size of uint32 : 4
Size of uint32_t: 4
Size of uint64 : 8
Size of uint64_t: 8
Current system: Linux node582 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue Sep 9
21:36:05 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Looking for files named in data ...Pushing back filename: "SBPLim1Q30B.fastq"
Pushing back filename: "SBPLim2Q30B.fastq"
Manifest:
projectname: 3096_Plim_assembly
job: est,denovo,accurate
parameters:
Manifest load entries: 1
MLE 1:
RGID: 1
RGN: DataIlluminaPairedLib SN: StrainX
SP: SPio: 0 SPC: 0 IF: -1 IT: -1 TSio: 0
ST: 6 (Solexa) namschem: 4 SID: 0
DQ: 30
BB: 0 Rail: 0 CER: 0
SBPLim1Q30B.fastq SBPLim2Q30B.fastq
Parameters parsed without error, perfect.
-CL:pec and -CO:emeas1clpec are set, setting -CO:emea values to 1.
------------------------------------------------------------------------------
Parameter settings seen for:
Sanger data
Used parameter settings:
General (-GE):
Project name : 3096_Plim_assembly
Number of threads (not) : 15
Automatic memory management (amm) : yes
Keep percent memory free (kpmf) : 15
Max. process size (mps) : 0
EST SNP pipeline step (esps) : 0
Colour reads by hash frequency (crhf) : no
Load reads options (-LR):
Wants quality file (wqf) : [sxa] yes
Filecheck only (fo) : no
Assembly options (-AS):
Number of passes (nop) : 4
Skim each pass (sep) : yes
Maximum number of RMB break loops (rbl) : 2
Maximum contigs per pass (mcpp) : 0
Minimum read length (mrl) : [sxa] 20
Minimum reads per contig (mrpc) : [sxa] 4
Enforce presence of qualities (epoq) : [sxa] yes
Automatic repeat detection (ard) : no
Coverage threshold (ardct) : [sxa] 2.5
Minimum length (ardml) : [sxa] 300
Grace length (ardgl) : [sxa] 20
Use uniform read distribution (urd) : no
Start in pass (urdsip) : 3
Cutoff multiplier (urdcm) : [sxa] 1.5
Spoiler detection (sd) : no
Last pass only (sdlpo) : yes
Use genomic pathfinder (ugpf) : no
Use emergency search stop (uess) : yes
ESS partner depth (esspd) : 500
Use emergency blacklist (uebl) : yes
Use max. contig build time (umcbt) : yes
Build time in seconds (bts) : 360
Strain and backbone options (-SB):
Bootstrap new backbone (bnb) : yes
Start backbone usage in pass (sbuip) : 3
Backbone rail from strain (brfs) :
Backbone rail length (brl) : 0
Backbone rail overlap (bro) : 0
Trim overhanging reads (tor) : yes
(Also build new contigs (abnc)) : yes
Dataprocessing options (-DP):
Use read extensions (ure) : [sxa] no
Read extension window length (rewl) : [sxa] 30
Read extension w. maxerrors (rewme) : [sxa] 2
First extension in pass (feip) : [sxa] 0
Last extension in pass (leip) : [sxa] 0
Clipping options (-CL):
SSAHA2 or SMALT clipping:
Gap size (msvsgs) : [sxa] 1
Max front gap (msvsmfg) : [sxa] 2
Max end gap (msvsmeg) : [sxa] 2
Strict front clip (msvssfc) : [sxa] 0
Strict end clip (msvssec) : [sxa] 0
Possible vector leftover clip (pvlc) : [sxa] no
maximum len allowed (pvcmla) : [sxa] 18
Min qual. threshold for entire read (mqtfer): [sxa] 5
Number of bases (mqtfernob) : [sxa] 15
Quality clip (qc) : [sxa] no
Minimum quality (qcmq) : [sxa] 20
Window length (qcwl) : [sxa] 30
Bad stretch quality clip (bsqc) : [sxa] no
Minimum quality (bsqcmq) : [sxa] 5
Window length (bsqcwl) : [sxa] 20
Masked bases clip (mbc) : [sxa] yes
Gap size (mbcgs) : [sxa] 5
Max front gap (mbcmfg) : [sxa] 12
Max end gap (mbcmeg) : [sxa] 12
Lower case clip front (lccf) : [sxa] no
Lower case clip back (lccb) : [sxa] no
Clip poly A/T at ends (cpat) : [sxa] yes
Keep poly-a signal (cpkps) : [sxa] yes
Minimum signal length (cpmsl) : [sxa] 15
Max errors allowed (cpmea) : [sxa] 1
Max gap from ends (cpmgfe) : [sxa] 20000
Clip 3 prime polybase (c3pp) : [sxa] yes
Minimum signal length (c3ppmsl) : [sxa] 15
Max errors allowed (c3ppmea) : [sxa] 3
Max gap from ends (c3ppmgfe) : [sxa] 9
Clip known adaptors right (ckar) : [sxa] yes
Ensure minimum left clip (emlc) : [sxa] no
Minimum left clip req. (mlcr) : [sxa] 0
Set minimum left clip to (smlc) : [sxa] 0
Ensure minimum right clip (emrc) : [sxa] no
Minimum right clip req. (mrcr) : [sxa] 10
Set minimum right clip to (smrc) : [sxa] 20
Apply SKIM chimera detection clip (ascdc) : yes
Apply SKIM junk detection clip (asjdc) : no
Propose end clips (pec) : [sxa] yes
Bases per hash (pecbph) : 31
Handle Solexa GGCxG problem (pechsgp) : yes
Front freq (pffreq) : [sxa] 0
Back freq (pbfreq) : [sxa] 0
Minimum kmer for forward-rev (pmkfr) : 1
Front forward-rev (pffore) : [sxa] yes
Back forward-rev (pbfore) : [sxa] yes
Front conf. multi-seq type (pfcmst) : [sxa] yes
Back conf. multi-seq type (pbcmst) : [sxa] yes
Front seen at low pos (pfsalp) : [sxa] no
Back seen at low pos (pbsalp) : [sxa] no
Clip bad solexa ends (cbse) : [sxa] yes
Search PhiX174 (spx174) : [sxa] yes
Filter PhiX174 (fpx174) : [sxa] yes
Rare kmer mask (rkm) : [sxa] 2
Parameters for SKIM algorithm (-SK):
Number of threads (not) : 15
Also compute reverse complements (acrc) : yes
Bases per hash (bph) : 23
Automatic increase per pass (bphaipp) : 1
Automatic incr. cov. threshold (bphaict): 20
Hash save stepping (hss) : 1
Percent required (pr) : [sxa] 95
Max hits per read (mhpr) : 30
Max megahub ratio (mmhr) : 0
SW check on backbones (swcob) : no
Max hashes in memory (mhim) : 15000000
MemCap: hit reduction (mchr) : 4096
Parameters for Hash Statistics (-HS):
Freq. cov. estim. min (fcem) : 30
Freq. estim. min normal (fenn) : 0.4
Freq. estim. max normal (fexn) : 1.6
Freq. estim. repeat (fer) : 1.9
Freq. estim. heavy repeat (fehr) : 8
Freq. estim. crazy (fecr) : 20
Mask nasty repeats (mnr) : yes
Nasty repeat ratio (nrr) : 100
Nasty repeat coverage (nrc) : 200
Lossless digital normalisation (ldn) : yes
Repeat level in info file (rliif) : 6
Million hashes per buffer (mhpb) : 16
Rare kmer early kill (rkek) : no
Pathfinder options (-PF):
Use quick rule (uqr) : [sxa] yes
Quick rule min len 1 (qrml1) : [sxa] -95
Quick rule min sim 1 (qrms1) : [sxa] 100
Quick rule min len 2 (qrml2) : [sxa] -85
Quick rule min sim 2 (qrms2) : [sxa] 100
Backbone quick overlap min len (bqoml) : [sxa] 20
Max. start cache fill time (mscft) : 5
Align parameters for Smith-Waterman align (-AL):
Bandwidth in percent (bip) : [sxa] 20
Bandwidth max (bmax) : [sxa] 80
Bandwidth min (bmin) : [sxa] 20
Minimum score (ms) : [sxa] 15
Minimum overlap (mo) : [sxa] 25
Minimum relative score in % (mrs) : [sxa] 90
Solexa_hack_max_errors (shme) : [sxa] -1
Extra gap penalty (egp) : [sxa] yes
extra gap penalty level (egpl) : [sxa] reject_codongaps
Max. egp in percent (megpp) : [sxa] 100
Contig parameters (-CO):
Name prefix (np) :
3096_Plim_assembly
Reject on drop in relative alignment score in % (rodirs) : [sxa] 15
Mark repeats (mr) : yes
Only in result (mroir) : no
Assume SNP instead of repeats (asir) : no
Minimum reads per group needed for tagging (mrpg) : [sxa] 4
Minimum neighbour quality needed for tagging (mnq) : [sxa] 20
Minimum Group Quality needed for RMB Tagging (mgqrt) : [sxa] 30
End-read Marking Exclusion Area in bases (emea) : [sxa] 1
Set to 1 on clipping PEC (emeas1clpec) : yes
Also mark gap bases (amgb) : [sxa] yes
Also mark gap bases - even multicolumn (amgbemc) : [sxa] yes
Also mark gap bases - need both strands (amgbnbs): [sxa] yes
Force non-IUPAC consensus per sequencing type (fnicpst) : [sxa] no
Merge short reads (msr) : [sxa] yes
Max errors (msrme) : [sxa] 0
Keep ends unmerged (msrkeu) : [sxa] -1
Gap override ratio (gor) : [sxa] 66
Edit options (-ED):
Mira automatic contig editing (mace) : yes
Edit kmer singlets (eks) : yes
Edit homopolymer overcalls (ehpo) : [sxa] no
Misc (-MI):
Large contig size (lcs) : 500
Large contig size for stats (lcs4s) : 1000
I know what I do (ikwid) : no
Extra flag 1 / sanity track check (ef1) : no
Extra flag 2 / dnredreadsatpeaks (ef2) : yes
Extra flag 3 / pelibdisassemble (ef3) : yes
Extended log (el) : no
Nag and Warn (-NW):
Check NFS (cnfs) : stop
Check multi pass mapping (cmpm) : stop
Check template problems (ctp) : stop
Check duplicate read names (cdrn) : stop
Check max read name length (cmrnl) : stop
Max read name length (mrnl) : 40
Check average coverage (cac) : stop
Average coverage value (acv) : 80
Directories (-DI):
Top directory for writing files : 3096_Plim_assembly_assembly
For writing result files :
3096_Plim_assembly_assembly/3096_Plim_assembly_d_results
For writing result info files :
3096_Plim_assembly_assembly/3096_Plim_assembly_d_info
For writing tmp files :
3096_Plim_assembly_assembly/3096_Plim_assembly_d_tmp
Tmp redirected to (trt) :
For writing checkpoint files :
3096_Plim_assembly_assembly/3096_Plim_assembly_d_chkpt
Output files (-OUTPUT/-OUT):
Save simple singlets in project (sssip) : [sxa] no
Save tagged singlets in project (stsip) : [sxa] yes
Remove rollover tmps (rrot) : yes
Remove tmp directory (rtd) : no
Result files:
Saved as CAF (orc) : yes
Saved as MAF (orm) : yes
Saved as FASTA (orf) : yes
Saved as GAP4 (directed assembly) (org) : no
Saved as phrap ACE (ora) : no
Saved as GFF3 (org3) : no
Saved as HTML (orh) : no
Saved as Transposed Contig Summary (ors) : yes
Saved as simple text format (ort) : no
Saved as wiggle (orw) : no
Temporary result files:
Saved as CAF (otc) : yes
Saved as MAF (otm) : no
Saved as FASTA (otf) : no
Saved as GAP4 (directed assembly) (otg) : no
Saved as phrap ACE (ota) : no
Saved as HTML (oth) : no
Saved as Transposed Contig Summary (ots) : no
Saved as simple text format (ott) : no
Extended temporary result files:
Saved as CAF (oetc) : no
Saved as FASTA (oetf) : no
Saved as GAP4 (directed assembly) (oetg) : no
Saved as phrap ACE (oeta) : no
Saved as HTML (oeth) : no
Save also singlets (oetas) : no
Alignment output customisation:
TEXT characters per line (tcpl) : 60
HTML characters per line (hcpl) : 60
TEXT end gap fill character (tegfc) :
HTML end gap fill character (hegfc) :
File / directory output names:
CAF : 3096_Plim_assembly_out.caf
MAF : 3096_Plim_assembly_out.maf
FASTA : 3096_Plim_assembly_out.unpadded.fasta
FASTA quality : 3096_Plim_assembly_out.unpadded.fasta.qual
FASTA (padded) : 3096_Plim_assembly_out.padded.fasta
FASTA qual.(pad): 3096_Plim_assembly_out.padded.fasta.qual
GAP4 (directory): 3096_Plim_assembly_out.gap4da
ACE : 3096_Plim_assembly_out.ace
HTML : 3096_Plim_assembly_out.html
Simple text : 3096_Plim_assembly_out.txt
TCS overview : 3096_Plim_assembly_out.tcs
Wiggle : 3096_Plim_assembly_out.wig
------------------------------------------------------------------------------
Creating directory 3096_Plim_assembly_assembly ... done.
Creating directory 3096_Plim_assembly_assembly/3096_Plim_assembly_d_results ...
done.
Creating directory 3096_Plim_assembly_assembly/3096_Plim_assembly_d_info ...
done.
Creating directory 3096_Plim_assembly_assembly/3096_Plim_assembly_d_chkpt ...
done.
Creating directory 3096_Plim_assembly_assembly/3096_Plim_assembly_d_tmp ...
done.
Tmp directory is not on a NFS mount, good.
Localtime: Tue Dec 6 15:24:56 2016
Loading reads from SBPLim1Q30B.fastq type fastq
Localtime: Tue Dec 6 15:24:56 2016
Loading data from FASTQ file: SBPLim1Q30B.fastq
(sorry, no progress indicator for that, possible only with zlib >=1.34)
========================== Memory self assessment ==============================
Running in 64 bit mode.
Dump from /proc/meminfo
--------------------------------------------------------------------------------
MemTotal: 132273356 kB
MemFree: 126483024 kB
Buffers: 0 kB
Cached: 1215228 kB
SwapCached: 0 kB
Active: 700432 kB
Inactive: 916444 kB
Active(anon): 664160 kB
Inactive(anon): 905896 kB
Active(file): 36272 kB
Inactive(file): 10548 kB
Unevictable: 2097152 kB
Mlocked: 36800 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 2499964 kB
Mapped: 235484 kB
Shmem: 1168196 kB
Slab: 750028 kB
SReclaimable: 458036 kB
SUnreclaim: 291992 kB
KernelStack: 6992 kB
PageTables: 7964 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 66136676 kB
Committed_AS: 3708772 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 720740 kB
VmallocChunk: 34289539476 kB
HardwareCorrupted: 0 kB
AnonHugePages: 2162688 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 4104 kB
DirectMap2M: 2072576 kB
DirectMap1G: 132120576 kB
--------------------------------------------------------------------------------
Dump from /proc/self/status
--------------------------------------------------------------------------------
Name: mira
State: R (running)
Tgid: 2375
Pid: 2375
PPid: 2374
TracerPid: 0
Uid: 3084 3084 3084 3084
Gid: 597 597 597 597
Utrace: 0
FDSize: 64
Groups: 597
VmPeak: 108416 kB
VmSize: 108416 kB
VmLck: 0 kB
VmHWM: 5804 kB
VmRSS: 5804 kB
VmData: 5624 kB
VmStk: 92 kB
VmExe: 5792 kB
VmLib: 0 kB
VmPTE: 44 kB
VmSwap: 0 kB
Threads: 1
SigQ: 0/1033194
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed: 00ff
Cpus_allowed_list: 0-7
Mems_allowed:
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003
Mems_allowed_list: 0-1
voluntary_ctxt_switches: 49
nonvoluntary_ctxt_switches: 2
--------------------------------------------------------------------------------
Information on current assembly object:
AS_readpool: 1 reads.
AS_contigs: 0 contigs.
AS_bbcontigs: 0 contigs.
Mem used for reads: 184 (184 B)
Memory used in assembly structures:
Eff. Size Free cap. LostByAlign
AS_writtenskimhitsperid: 0 24 B 0 B 0 B
AS_skim_edges: 0 24 B 0 B 0 B
AS_adsfacts: 0 24 B 0 B 0 B
AS_confirmed_edges: 0 24 B 0 B 0 B
AS_permanent_overlap_bans: 1 24 B 0 B 0 B
AS_readhitmiss: 0 24 B 0 B 0 B
AS_readhmcovered: 0 24 B 0 B 0 B
AS_count_rhm: 0 24 B 0 B 0 B
AS_clipleft: 0 24 B 0 B 0 B
AS_clipright: 0 24 B 0 B 0 B
AS_used_ids: 0 24 B 0 B 0 B
AS_multicopies: 0 24 B 0 B 0 B
AS_hasmcoverlaps: 0 24 B 0 B 0 B
AS_maxcoveragereached: 0 24 B 0 B 0 B
AS_coverageperseqtype: 0 24 B 0 B 0 B
AS_istroublemaker: 0 24 B 0 B 0 B
AS_isdebris: 0 24 B 0 B 0 B
AS_needalloverlaps: 0 24 B 0 B 0 B
AS_readsforrepeatresolve: 0 40 B 0 B 0 B
AS_allrmbsok: 0 24 B 0 B 0 B
AS_probablermbsnotok: 0 24 B 0 B 0 B
AS_weakrmbsnotok: 0 24 B 0 B 0 B
AS_readmaytakeskim: 0 40 B 0 B 0 B
AS_skimstaken: 0 40 B 0 B 0 B
AS_numskimoverlaps: 0 24 B 0 B 0 B
AS_numleftextendskims: 0 24 B 0 B 0 B
AS_rightextendskims: 0 24 B 0 B 0 B
AS_skimleftextendratio: 0 24 B 0 B 0 B
AS_skimrightextendratio: 0 24 B 0 B 0 B
AS_usedtmpfiles: 0 16 B 0 B 0 B
Total: 944 (944 B)
================================================================================
Dynamic s allocs: 0
Dynamic m allocs: 0
Align allocs: 0
Fatal error (may be due to problems of the input data or parameters):
********************************************************************************
* tried to set a base '.' (ASCII: 46), which is not a valid IUPAC base nor N, *
* X, - or @. *
********************************************************************************
->Thrown: void Read::setSequenceFromString(const char * sequence)
->Caught: main
Aborting process, probably due to error in the input data or parametrisation.
Please check the output log for more information.
For help, please write a mail to the mira talk mailing list.
Subscribing / unsubscribing to mira talk, see:
http://www.freelists.org/list/mira_talk
CWD: /gpfs/data/epscor/erubin/3096_P_lim
Thank you for noticing that this is *NOT* a crash, but a
controlled program stop.
Failure, wrapped MIRA process aborted.
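
The fatal error above says MIRA hit a '.' character in a read sequence while loading SBPLim1Q30B.fastq. One possible starting point for troubleshooting is to scan both FASTQ files for sequence lines that contain characters other than the ones the error message names. The script below is only a minimal sketch under some assumptions: the FASTQ files use plain four-line records (no wrapped sequences), and the accepted character set is my reading of the error message, not something taken from the MIRA source. The function name `find_invalid_reads` is made up for this example.

```python
import sys

# Characters the MIRA error message lists as acceptable: IUPAC codes
# plus N, X, - and @. Assumption: this set approximates what MIRA 4.0.2
# accepts; it is not taken from the MIRA source code.
VALID_BASES = set("ACGTURYSWKMBDHVN" "X-@")


def find_invalid_reads(handle, valid=VALID_BASES):
    """Yield (read_name, line_number, bad_chars) for every record whose
    sequence line contains characters outside `valid`.

    Assumes strict four-line FASTQ records: name, sequence, '+', qualities.
    """
    name = ""
    for i, line in enumerate(handle):
        if i % 4 == 0:
            # First line of a record: the read name.
            name = line.rstrip("\n")
        elif i % 4 == 1:
            # Second line of a record: the sequence itself.
            bad = set(line.strip().upper()) - valid
            if bad:
                yield name, i + 1, "".join(sorted(bad))


if __name__ == "__main__":
    # Usage: python check_fastq.py SBPLim1Q30B.fastq SBPLim2Q30B.fastq
    for path in sys.argv[1:]:
        with open(path) as fh:
            for name, lineno, bad in find_invalid_reads(fh):
                print(f"{path}:{lineno}: {name} contains {bad!r}")
```

If this reports reads containing '.', those are probably no-call positions written as '.' instead of 'N' by an upstream step. Replacing '.' with 'N' on the sequence lines only (not the quality lines), or dropping the affected read pairs, might let MIRA load the data, but I have not tested that on these files.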