[mira_talk] Re: Failure, wrapped MIRA process aborted

  • From: "Mendez Garcia, Celia" <cmendezg@xxxxxxxxxxxx>
  • To: "mira_talk@xxxxxxxxxxxxx" <mira_talk@xxxxxxxxxxxxx>
  • Date: Tue, 8 Jul 2014 17:01:00 +0000

Hi Bastien,

many thanks. You guys are great.
Everything worked perfectly fine.

I would like to ask how you assess the quality of your assemblies. I 
was mapping against a reference, which of course helped, but then I found out 
that my reference, a closed genome and all that, is not as good as 
one might have thought. Should I assemble de novo and then compare? How do you 
differentiate between artifacts, missing data, etc.? I attach a couple of info 
.txt files I got from sub-sampled data.

Thanks.

Celia
________________________________________
From: mira_talk-bounce@xxxxxxxxxxxxx [mira_talk-bounce@xxxxxxxxxxxxx] on behalf 
of Bastien Chevreux [bach@xxxxxxxxxxxx]
Sent: Wednesday, June 04, 2014 1:34 PM
To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: Failure, wrapped MIRA process aborted

On 04 Jun 2014, at 18:37 , Peter Cock <p.j.a.cock@xxxxxxxxxxxxxx> wrote:
> […]
> Personally I would tell MIRA to ignore the long read names.

Or use “rename_prefix” in the manifest file to have on-the-fly renaming of 
reads.

In your case
  rename_prefix=HWI-ST330:422:C4AVHACXX clostraur
should do the trick.
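
As a rough illustration of what such a prefix renaming amounts to, here is a minimal Python sketch. It is not MIRA's code: the function name rename_read and the example read name are made up for this illustration, and only the prefix and replacement strings come from the line above.

```python
def rename_read(name: str, prefix: str, replacement: str) -> str:
    """Swap a leading prefix of a read name for a shorter replacement.

    Names that do not start with the prefix are returned unchanged.
    """
    if name.startswith(prefix):
        return replacement + name[len(prefix):]
    return name


# Hypothetical Illumina-style read name, invented for this example.
example = "HWI-ST330:422:C4AVHACXX:1:1101:1234:5678/1"

# Prefix and replacement as given in the rename_prefix line above.
renamed = rename_read(example, "HWI-ST330:422:C4AVHACXX", "clostraur")
print(renamed)
```

The shortened names keep the part after the prefix, so reads should stay distinguishable as long as they were unique before.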

In other news:
- mapping in draft mode is not that much faster than in accurate mode, I 
recommend accurate.
- you are mapping almost 20m reads. If the reference is a bacterium, it’s 
almost sure MIRA will tell you that this is not a good idea … and tell you what 
to do :-)
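
To see why 20m reads is a lot for a bacterium, a back-of-the-envelope coverage estimate helps. The sketch below is only arithmetic: the read length of 100 bp, the genome size of about 6 Mb and the target coverage of 100x are assumptions chosen for illustration, not values stated in this thread or recommended by MIRA.

```python
import math


def expected_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average coverage = total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def reads_for_coverage(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Number of reads needed to reach a given average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


NUM_READS = 20_000_000    # "almost 20m reads" from the message above
READ_LENGTH = 100         # assumption: read length in bp
GENOME_SIZE = 6_000_000   # assumption: bacterial genome of about 6 Mb
TARGET_COVERAGE = 100     # assumption: illustrative target only

coverage = expected_coverage(NUM_READS, READ_LENGTH, GENOME_SIZE)
needed = reads_for_coverage(TARGET_COVERAGE, READ_LENGTH, GENOME_SIZE)

print(f"Estimated average coverage: {coverage:.0f}x")
print(f"Reads needed for {TARGET_COVERAGE}x: {needed}")
```

Under these assumptions the full data set lands at several hundred-fold coverage, which is why sub-sampling comes up.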

B.


Localtime: Tue Jun 10 16:32:10 2014

Assembly information:
=====================

Localtime: Tue Jun 10 16:32:10 2014
MIRA version: 4.0.2 

Num. reads assembled: 325198
Num. singlets: 0


Coverage assessment (calculated from contigs >= 5000 with coverage >= 0):
=========================================================
  Avg. total coverage: 8.33
  Avg. coverage per sequencing technology
        Sanger: 0.00
        454:    0.00
        IonTor: 0.00
        PcBioHQ:        0.00
        PcBioLQ:        0.00
        Text:   3.00
        Solexa: 5.33
        Solid:  0.00


Large contigs (makes less sense for EST assemblies):
====================================================
With    Contig size             >= 500
        AND (Total avg. Cov     >= 3
             OR Cov(san)        >= 0
             OR Cov(454)        >= 0
             OR Cov(ion)        >= 0
             OR Cov(pbh)        >= 0
             OR Cov(pbl)        >= 0
             OR Cov(txt)        >= 1
             OR Cov(sxa)        >= 2
             OR Cov(sid)        >= 0
            )

  Length assessment:
  ------------------
  Number of contigs:    1
  Total consensus:      6001982
  Largest contig:       6001982
  N50 contig size:      6001982
  N90 contig size:      6001982
  N95 contig size:      6001982

  Coverage assessment:
  --------------------
  Max coverage (total): 1656
  Max coverage per sequencing technology
        Sanger: 0
        454:    0
        IonTor: 0
        PcBioHQ:        0
        PcBioLQ:        0
        Text:   3
        Solexa: 1653
        Solid:  0

  Quality assessment:
  -------------------
  Average consensus quality:                    38
  Consensus bases with IUPAC:                   17052   (you might want to check these)
  Strong unresolved repeat positions (SRMc):    1407    (you might want to check these)
  Weak unresolved repeat positions (WRMc):      109     (you might want to check these)
  Sequencing Type Mismatch Unsolved (STMU):     0       (excellent)
  Contigs having only reads wo qual:            0       (excellent)
  Contigs with reads wo qual values:            1       (you might want to check these)


All contigs:
============
  Length assessment:
  ------------------
  Number of contigs:    1
  Total consensus:      6001982
  Largest contig:       6001982
  N50 contig size:      6001982
  N90 contig size:      6001982
  N95 contig size:      6001982

  Coverage assessment:
  --------------------
  Max coverage (total): 1656
  Max coverage per sequencing technology
        Sanger: 0
        454:    0
        IonTor: 0
        PcBioHQ:        0
        PcBioLQ:        0
        Text:   3
        Solexa: 1653
        Solid:  0

  Quality assessment:
  -------------------
  Average consensus quality:                    38
  Consensus bases with IUPAC:                   17052   (you might want to check these)
  Strong unresolved repeat positions (SRMc):    1407    (you might want to check these)
  Weak unresolved repeat positions (WRMc):      109     (you might want to check these)
  Sequencing Type Mismatch Unsolved (STMU):     0       (excellent)
  Contigs having only reads wo qual:            0       (excellent)
  Contigs with reads wo qual values:            1       (you might want to check these)

Localtime: Tue Jul  1 04:20:09 2014

Assembly information:
=====================

Localtime: Tue Jul  1 04:20:09 2014
MIRA version: 4.0.2 

Num. reads assembled: 865156
Num. singlets: 0


Coverage assessment (calculated from contigs >= 5000 with coverage >= 0):
=========================================================
  Avg. total coverage: 17.22
  Avg. coverage per sequencing technology
        Sanger: 0.00
        454:    0.00
        IonTor: 0.00
        PcBioHQ:        0.00
        PcBioLQ:        0.00
        Text:   3.00
        Solexa: 14.22
        Solid:  0.00


Large contigs (makes less sense for EST assemblies):
====================================================
With    Contig size             >= 500
        AND (Total avg. Cov     >= 6
             OR Cov(san)        >= 0
             OR Cov(454)        >= 0
             OR Cov(ion)        >= 0
             OR Cov(pbh)        >= 0
             OR Cov(pbl)        >= 0
             OR Cov(txt)        >= 1
             OR Cov(sxa)        >= 5
             OR Cov(sid)        >= 0
            )

  Length assessment:
  ------------------
  Number of contigs:    1
  Total consensus:      6002288
  Largest contig:       6002288
  N50 contig size:      6002288
  N90 contig size:      6002288
  N95 contig size:      6002288

  Coverage assessment:
  --------------------
  Max coverage (total): 4207
  Max coverage per sequencing technology
        Sanger: 0
        454:    0
        IonTor: 0
        PcBioHQ:        0
        PcBioLQ:        0
        Text:   3
        Solexa: 4204
        Solid:  0

  Quality assessment:
  -------------------
  Average consensus quality:                    42
  Consensus bases with IUPAC:                   14309   (you might want to check these)
  Strong unresolved repeat positions (SRMc):    2374    (you might want to check these)
  Weak unresolved repeat positions (WRMc):      167     (you might want to check these)
  Sequencing Type Mismatch Unsolved (STMU):     0       (excellent)
  Contigs having only reads wo qual:            0       (excellent)
  Contigs with reads wo qual values:            1       (you might want to check these)


All contigs:
============
  Length assessment:
  ------------------
  Number of contigs:    1
  Total consensus:      6002288
  Largest contig:       6002288
  N50 contig size:      6002288
  N90 contig size:      6002288
  N95 contig size:      6002288

  Coverage assessment:
  --------------------
  Max coverage (total): 4207
  Max coverage per sequencing technology
        Sanger: 0
        454:    0
        IonTor: 0
        PcBioHQ:        0
        PcBioLQ:        0
        Text:   3
        Solexa: 4204
        Solid:  0

  Quality assessment:
  -------------------
  Average consensus quality:                    42
  Consensus bases with IUPAC:                   14309   (you might want to check these)
  Strong unresolved repeat positions (SRMc):    2374    (you might want to check these)
  Weak unresolved repeat positions (WRMc):      167     (you might want to check these)
  Sequencing Type Mismatch Unsolved (STMU):     0       (excellent)
  Contigs having only reads wo qual:            0       (excellent)
  Contigs with reads wo qual values:            1       (you might want to check these)
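
One simple way to put two such info files side by side is to pull out the "label: number" lines and normalise the flagged positions by consensus length. The Python sketch below does that for excerpts of the two reports above. It is not a MIRA tool; the names parse_info and per_kb are hypothetical, and the per-kb rate is just one possible way to compare runs, not an established quality score.

```python
import re

# Excerpts copied from the two info reports above (first and second run).
RUN1 = """
  Avg. total coverage: 8.33
  Total consensus:      6001982
  Average consensus quality:                    38
  Consensus bases with IUPAC:                   17052   (you might want to check these)
  Strong unresolved repeat positions (SRMc):    1407    (you might want to check these)
"""
RUN2 = """
  Avg. total coverage: 17.22
  Total consensus:      6002288
  Average consensus quality:                    42
  Consensus bases with IUPAC:                   14309   (you might want to check these)
  Strong unresolved repeat positions (SRMc):    2374    (you might want to check these)
"""

# A label, a colon, whitespace, then a number followed by whitespace or line end.
LINE = re.compile(r"\s*([^:]+):\s+([0-9]+(?:\.[0-9]+)?)(?=\s|$)")


def parse_info(text: str) -> dict:
    """Collect the first occurrence of every 'label: number' line."""
    metrics = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            metrics.setdefault(match.group(1).strip(), float(match.group(2)))
    return metrics


def per_kb(metrics: dict, label: str) -> float:
    """Flagged positions per 1000 consensus bases."""
    return 1000 * metrics[label] / metrics["Total consensus"]


run1, run2 = parse_info(RUN1), parse_info(RUN2)
for label in ("Consensus bases with IUPAC",
              "Strong unresolved repeat positions (SRMc)"):
    print(f"{label}: {per_kb(run1, label):.3f} vs {per_kb(run2, label):.3f} per kb")
```

Lines without a numeric value, such as the Localtime stamp or the version string, are skipped by the pattern.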
