[mira_talk] Re: FW: 454 assembly

  • From: "Shabhonam Caim (TGAC)" <Shabhonam.Caim@xxxxxxxxxxx>
  • To: "mira_talk@xxxxxxxxxxxxx" <mira_talk@xxxxxxxxxxxxx>
  • Date: Thu, 22 Jul 2010 10:36:37 +0100

Thanks for your reply, Thomas, but I am still getting an error.
I used the following command, this time providing the qual file as well:

mira --project=test --job=denovo,genome,draft 454_SETTINGS -FN:fai=2.GAC.454Reads.fna -FN:fqui=2.GAC.454Reads.qual

            Minimum reads per group needed for tagging (mrpg)    :  [san]  2
                                                                    [454]  4
            Minimum neighbour quality needed for tagging (mnq)   :  [san]  20
                                                                    [454]  20
            Minimum Group Quality needed for RMB Tagging (mgqrt) :  [san]  30
                                                                    [454]  25
            End-read Marking Exclusion Area in bases (emea)      :  [san]  25
                                                                    [454]  10
            Also mark gap bases (amgb)                           :  [san]  yes
                                                                    [454]  no
                Also mark gap bases - even multicolumn (amgbemc) :  [san]  yes
                                                                    [454]  yes
                Also mark gap bases - need both strands (amgbnbs):  [san]  yes
                                                                    [454]  yes
        Force non-IUPAC consensus per sequencing type (fnicpst)  :  [san]  no
                                                                    [454]  no
        Merge short reads (msr)                                  :  [san]  no
                                                                    [454]  no
        Gap override ratio (gor)                                 :  [san]  66
                                                                    [454]  66

  Edit options (-ED):
        Automatic contig editing (ace)              :  [san]  no
                                                       [454]  yes
     Sanger only:
        Strict editing mode (sem)                   : no
        Confirmation threshold in percent (ct)      : 50

  Directories (-DI):
        When loading EXP files            :
        When loading SCF files            :
        Top directory for writing files   : test_assembly
        For writing result files          : test_assembly/test_d_results
        For writing result info files     : test_assembly/test_d_info
        For writing log files             : test_assembly/test_d_log
        For writing checkpoint files      : test_assembly/test_d_chkpt

  File names (-FN):
        When loading sequences from FASTA            :  [san]  test_in.sanger.fasta
                                                        [454]  2.GAC.454Reads.fna
        When loading qualities from FASTA quality    :  [san]  test_in.sanger.fasta.qual
                                                        [454]  2.GAC.454Reads.qual
        When loading sequences from FASTQ            :  [san]  test_in.sanger.fastq
                                                        [454]  test_in.454.fastq
        When loading project from CAF                : test_in.sanger.caf
        When loading project from MAF (disabled)     : test_in.sanger.maf
        When loading EXP fofn                        : test_in.fofn
        When loading project from PHD                : test_in.phd.1
        When loading strain data                     : test_straindata_in.txt
        When loading XML trace info files            :  [san]  test_traceinfo_in.sanger.xml
                                                        [454]  test_traceinfo_in.454.xml
        When loading SSAHA vector screen results     : test_ssaha2vectorscreen_in.txt

        When loading backbone from MAF               : test_backbone_in.maf
        When loading backbone from CAF               : test_backbone_in.caf
        When loading backbone from GenBank           : test_backbone_in.gbf
        When loading backbone from FASTA             : test_backbone_in.fasta


  Output files (-OUTPUT/-OUT):
        Save simple singlets in project (sssip)      :  [san]  no
                                                        [454]  no
        Save tagged singlets in project (stsip)      :  [san]  yes
                                                        [454]  yes

        Remove rollover logs (rrol)                  : yes
        Remove log directory (rld)                   : no

    Result files:
        Saved as CAF                       (orc)     : yes
        Saved as FASTA                     (orf)     : yes
        Saved as GAP4 (directed assembly)  (org)     : no
        Saved as phrap ACE                 (ora)     : yes
        Saved as HTML                      (orh)     : no
        Saved as Transposed Contig Summary (ors)     : yes
        Saved as simple text format        (ort)     : no
        Saved as wiggle                    (orw)     : yes

    Temporary result files:
        Saved as CAF                       (otc)     : yes
        Saved as CAF                       (otm)     : no
        Saved as FASTA                     (otf)     : no
        Saved as GAP4 (directed assembly)  (otg)     : no
        Saved as phrap ACE                 (ota)     : no
        Saved as HTML                      (oth)     : no
        Saved as Transposed Contig Summary (ots)     : no
        Saved as simple text format        (ott)     : no

    Extended temporary result files:
        Saved as CAF                      (oetc)     : no
        Saved as FASTA                    (oetf)     : no
        Saved as GAP4 (directed assembly) (oetg)     : no
        Saved as phrap ACE                (oeta)     : no
        Saved as HTML                     (oeth)     : no
        Save also singlets               (oetas)     : no

    Alignment output customisation:
        TEXT characters per line (tcpl)              : 60
        HTML characters per line (hcpl)              : 60
        TEXT end gap fill character (tegfc)          :
        HTML end gap fill character (hegfc)          :

    File / directory output names:
        CAF             : test_out.caf
        MAF             : test_out.maf
        FASTA           : test_out.unpadded.fasta
        FASTA quality   : test_out.unpadded.fasta.qual
        FASTA (padded)  : test_out.padded.fasta
        FASTA qual.(pad): test_out.padded.fasta.qual
        GAP4 (directory): test_out.gap4da
        ACE             : test_out.ace
        HTML            : test_out.html
        Simple text     : test_out.txt
        TCS overview    : test_out.tcs
        Wiggle          : test_out.wig
------------------------------------------------------------------------------
Deleting old directory test_assembly ... done.
Creating directory test_assembly ... done.
Creating directory test_assembly/test_d_log ... done.
Creating directory test_assembly/test_d_results ... done.
Creating directory test_assembly/test_d_info ... done.
Creating directory test_assembly/test_d_chkpt ... done.
Localtime: Thu Jul 22 10:31:49 2010



========================== Memory self assessment ==============================
Running in 64 bit mode.

Dump from /proc/meminfo
--------------------------------------------------------------------------------
MemTotal:      8174224 kB
MemFree:         94968 kB
Buffers:          4644 kB
Cached:        4976444 kB
SwapCached:     287504 kB
Active:        6618732 kB
Inactive:      1307228 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      8174224 kB
LowFree:         94968 kB
SwapTotal:     2031608 kB
SwapFree:      1476740 kB
Dirty:            1984 kB
Writeback:           0 kB
AnonPages:     2936468 kB
Mapped:          32408 kB
Slab:            81840 kB
PageTables:      37252 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   6118720 kB
Committed_AS:  5290616 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    263728 kB
VmallocChunk: 34359474579 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
--------------------------------------------------------------------------------

Dump from /proc/self/status
--------------------------------------------------------------------------------
Name:   mira
State:  R (running)
SleepAVG:       0%
Tgid:   6158
Pid:    6158
PPid:   17617
TracerPid:      0
Uid:    8395    8395    8395    8395
Gid:    3658    3658    3658    3658
FDSize: 256
Groups: 3658
VmPeak:     4972 kB
VmSize:     4920 kB
VmLck:         0 kB
VmHWM:      1744 kB
VmRSS:      1744 kB
VmData:      464 kB
VmStk:        84 kB
VmExe:      4336 kB
VmLib:         0 kB
VmPTE:        28 kB
StaBrk: 00a7e000 kB
Brk:    017f0000 kB
StaStk: 7fffc798fa70 kB
Threads:        1
SigQ:   0/71680
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
Cpus_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,00000000,000000ff
Mems_allowed:   00000000,00000001
--------------------------------------------------------------------------------

Information on current assembly object:

AS_readpool: 0 reads.
AS_contigs: 0 contigs.
AS_bbcontigs: 0 contigs.
Mem used for reads: 112 (112 B)

Memory used in assembly structures:
                                           Eff. Size   Free cap. LostByAlign
     AS_writtenskimhitsperid:          0        24 B         0 B         0 B
               AS_skim_edges:          0        24 B         0 B         0 B
                 AS_adsfacts:          0        24 B         0 B         0 B
          AS_confirmed_edges:          0        24 B         0 B         0 B
   AS_permanent_overlap_bans:          0        24 B         0 B         0 B
              AS_readhitmiss:          0        24 B         0 B         0 B
            AS_readhmcovered:          0        24 B         0 B         0 B
                AS_count_rhm:          0        24 B         0 B         0 B
                 AS_clipleft:          0        24 B         0 B         0 B
                AS_clipright:          0        24 B         0 B         0 B
                 AS_used_ids:          0        24 B         0 B         0 B
              AS_multicopies:          0        24 B         0 B         0 B
            AS_hasmcoverlaps:          0        24 B         0 B         0 B
       AS_maxcoveragereached:          0        24 B         0 B         0 B
       AS_coverageperseqtype:          0        24 B         0 B         0 B
           AS_istroublemaker:          0        24 B         0 B         0 B
                 AS_isdebris:          0        24 B         0 B         0 B
          AS_needalloverlaps:          0        40 B         0 B         0 B
    AS_readsforrepeatresolve:          0        40 B         0 B         0 B
                AS_allrmbsok:          0        24 B         0 B         0 B
        AS_probablermbsnotok:          0        24 B         0 B         0 B
            AS_weakrmbsnotok:          0        24 B         0 B         0 B
          AS_readmaytakeskim:          0        40 B         0 B         0 B
               AS_skimstaken:          0        40 B         0 B         0 B
          AS_numskimoverlaps:          0        24 B         0 B         0 B
       AS_numleftextendskims:          0        24 B         0 B         0 B
         AS_rightextendskims:          0        24 B         0 B         0 B
      AS_skimleftextendratio:          0        24 B         0 B         0 B
     AS_skimrightextendratio:          0        24 B         0 B         0 B
             AS_usedlogfiles:          1        48 B         0 B         0 B
Total: 920 (920 B)

================================================================================
Dynamic allocs: 0
Align allocs: 0

Fatal Error (may be due to problems of the input data):
"You did not specify any input sequences to be loaded."

->Thrown: void Assembly::loadSequenceData_new()

->Caught: main

Cheers
Shab


From: mira_talk-bounce@xxxxxxxxxxxxx [mailto:mira_talk-bounce@xxxxxxxxxxxxx] On Behalf Of Thomas Müller
Sent: 22 July 2010 10:12
To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: FW: 454 assembly

Try:
mira --project=test --job=denovo,genome,draft,est 454_SETTINGS -FN:fai=2.GAC.454Reads.fna

But you should really also add the .qual file with -FN:fqui=2.GAC.454Reads.qual

cheers
Thomas

On Jul 22, 2010, at 10:53 AM, Shabhonam Caim (TGAC) wrote:



Hello MIRA users,

I am trying to assemble 454 reads with MIRA using the following command:
mira-3.0.0 mira --project=test --job=denovo,genome,draft,2.GAC.454Reads.fna

and I am getting the following error:

This is MIRA V3.0.0 (production version).

Please cite: Chevreux, B., Wetter, T. and Suhai, S. (1999), Genome Sequence
Assembly Using Trace Signals and Additional Sequence Information.
Computer Science and Biology: Proceedings of the German Conference on
Bioinformatics (GCB) 99, pp. 45-56.

Mail general questions to the MIRA talk mailing list:
        mira_talk@xxxxxxxxxxxxx

To (un-)subsubcribe the MIRA mailing lists, see:
        http://www.chevreux.org/mira_mailinglists.html

To report bugs or ask for features, please use the new ticketing system at:
        http://sourceforge.net/apps/trac/mira-assembler/
This ensures that requests don't get lost.


Compiled by: bach
Sun Jan 31 20:23:36 CET 2010
On: Linux arcadia64 2.6.27-11-generic #1 SMP Wed Apr 1 20:53:41 UTC 2009 x86_64 GNU/Linux
Compiled in boundtracking mode.
Compiled in bugtracking mode.
Compilation settings (sorry, for debug):
        Size of size_t  : 8
        Size of uint32  : 4
        Size of uint32_t: 4
        Size of uint64  : 8
        Size of uint64_t: 8
Current system: Linux n57140 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux



Parsing parameters: --project=454asembly --job=denovo, genome, draft pk.454.fasta

Seen no assembly quality in job definition, assuming 'normal'.
Seen no assembly type in job definition, assuming 'genome'.

,..

========================= Parameter parsing error(s) ==========================

* Parameter section: '(none)'
*       unrecognised string or unexpected character: genome

* Parameter section: '(none)'
*       unrecognised string or unexpected character: draft

* Parameter section: '(none)'
*       unrecognised string or unexpected character: pk

* Parameter section: '(none)'
*       unrecognised string or unexpected character: 454

* Parameter section: '(none)'
*       unrecognised string or unexpected character: fasta

===============================================================================

Fatal Error (may be due to problems of the input data):
"Error while parsing parameters, sorry."

->Thrown: void MIRAParameters::parse(istream & is, vector<MIRAParameters> & Pv, MIRAParameters * singlemp)

->Caught: main

Alternatively, could someone please send me the commands to run a 454 assembly (a basic de novo assembly with default parameters)?

cheers

Shab
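
A note on the parameter parsing errors quoted above: the "Parsing parameters:" line shows spaces after the commas in --job. A POSIX shell splits arguments at unquoted whitespace, so those spaces probably broke the job definition into several arguments. The short Python sketch below uses shlex.split to approximate that splitting. It models the shell only, not MIRA's own parser, and the helper name split_like_shell is made up for illustration.

```python
import shlex

# Argument string exactly as MIRA echoed it in its "Parsing parameters:" line.
as_typed = "--project=454asembly --job=denovo, genome, draft pk.454.fasta"
# The same arguments with the spaces after the commas removed.
no_spaces = "--project=454asembly --job=denovo,genome,draft pk.454.fasta"


def split_like_shell(argument_string):
    """Approximate POSIX shell word splitting (hypothetical helper)."""
    return shlex.split(argument_string)


for label, text in (("as typed", as_typed), ("no spaces", no_spaces)):
    print(label)
    for token in split_like_shell(text):
        print("   ", repr(token))
```

With the spaces, "genome," and "draft" arrive as separate arguments, which is consistent with the "unrecognised string" errors reported for genome and draft. The bare file name pk.454.fasta is a stray argument in both cases; the reply above passes the file through -FN:fai= instead.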


--
Crop Plant Biodiversity and Breeding Informatics Group (350b)
Institute of Plant Breeding, Seed Science and Population Genetics
University of Hohenheim
Fruwirthstrasse 21
D-70599 Stuttgart
Phone: +49-711-459 24293
