[mira_talk] Re: Reference vs. De novo assembly.

  • From: Andrzej N <andrzej.k.n@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Tue, 8 Dec 2009 12:57:40 -0600

Hello Bastien,

The computer just finished the assembly:

./mira -project=mito1 -job=denovo,genome,accurate,454 -AS:ardct=3:mrl=100

*Num. reads assembled: 194346
Num. singlets: 79

Large contigs:
--------------
With    Contig size        >= 500
    AND (Total avg. Cov    >= 29
         OR Cov(san)    >= 0
         OR Cov(454)    >= 27
         OR Cov(sxa)    >= 0
         OR Cov(sid)    >= 0
        )

  Length assessment:
  ------------------
  Number of contigs:    174
  Total consensus:    780952
  Largest contig:    50281
  N50 contig size:    12021
  N90 contig size:    1755
  N95 contig size:    1300

  Coverage assessment:
  --------------------
  Max coverage (total):    899
  Max coverage
    Sanger:    0
    454:    1013
    Solexa:    0
    Solid:    0
  Avg. total coverage (size >= 5000): 85.52
  Avg. coverage (contig size >= 5000)
    Sanger:    0.00
    454:    86.66
    Solexa:    0.00
    Solid:    0.00

  Quality assessment:
  -------------------
  Average consensus quality:            78
  Consensus bases with IUPAC (IUPc):        392    (you might want to check these)
  Strong unresolved repeat positions (SRMc):    1    (you might want to check these)
  Weak unresolved repeat positions (WRMc):    0    (excellent)
  Sequencing Type Mismatch Unsolved (STMU):    0    (excellent)
  Contigs having only reads wo qual:        0    (excellent)
  Contigs with reads wo qual values:        0    (excellent)*
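
For anyone wanting to recompute the N50/N90 figures in the length
assessment above from a list of contig lengths, here is a minimal sketch.
It assumes the usual definition (the contig length L such that contigs of
length >= L hold at least the given fraction of all consensus bases); the
exact way MIRA computes these numbers may differ in detail. The function
name and the example lengths are made up for illustration.

```python
def nxx(lengths, fraction):
    """Return the Nxx contig size for a list of contig lengths.

    Nxx is taken here as the length L such that contigs of length >= L
    together contain at least `fraction` of all bases (0.5 gives N50).
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the largest contig down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length


# Hypothetical contig lengths, not taken from the assembly above.
contigs = [50, 40, 30, 20, 10]
print("N50:", nxx(contigs, 0.50))
print("N90:", nxx(contigs, 0.90))
```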

As you can see, the biggest contig is now about 50,000 bp, but the "problem"
remains. Is there any way to tell MIRA not to stack more reads once coverage
passes, for example, 100x? As you can see, in some regions MIRA piles reads
up to a total coverage of 899 (I think that is MIRA's limit for stacking
reads on top of each other). Can we tell MIRA to stop doing that? It somehow
artificially increases my coverage... I don't know whether it is important,
but it doesn't look good.

I will try running MIRA with similar parameters but increase the minimum
read length (mrl) to 200, then 300, and... 400, since I have so many
sequences... this should be interesting.
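
Before rerunning, it may help to check how many reads would survive each
length cutoff. The following is only a rough sketch: it assumes the reads
are available as plain FASTA, treats a read as passing when its length is
at least the cutoff (whether MIRA's mrl is inclusive is an assumption
here), and ignores any clipping MIRA applies before checking the length.
The function names and the example reads are made up.

```python
def read_lengths(fasta_lines):
    """Yield the sequence length of each record in FASTA-formatted lines."""
    length = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def count_passing(lengths, thresholds=(100, 200, 300, 400)):
    """Count, for each threshold, how many reads are at least that long."""
    lengths = list(lengths)
    return {t: sum(1 for n in lengths if n >= t) for t in thresholds}


# Tiny made-up example with reads of 120, 240 and 440 bases.
example = [">r1", "ACGT" * 30, ">r2", "ACGT" * 60, ">r3", "ACGT" * 110]
print(count_passing(read_lengths(example)))
```

With a real data set, an open file handle can be passed to read_lengths()
in place of the example list.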

Do you have any other suggestions before I go and start cloning? ;)

Thank you for any suggestions.

Andrzej

On Mon, Dec 7, 2009 at 4:26 PM, Andrzej N <andrzej.k.n@xxxxxxxxx> wrote:

> Actually, I do have the pieces; I would like to have the whole! ;) Some day...
>
> Andrzej
>
>
> On Mon, Dec 7, 2009 at 4:25 PM, Andrzej N <andrzej.k.n@xxxxxxxxx> wrote:
>
>> But it is soooo good cake! ;).
>>
>> Andrzej
>>
>>
>> On Mon, Dec 7, 2009 at 1:55 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
>>
>>> On Monday 07 December 2009 Andrzej N wrote:
>>> > Hello, I tried joining contigs "by hand" and managed to go down from
>>> > 165 contigs to about 120; no other similarities were found between
>>> > them.
>>> >
>>> > I'm running the assembly using the call parameters you suggested.
>>> >
>>> > For some reason I think I need to use more than just the "best"
>>> > contigs to find places to join them.
>>> >
>>> > If I didn't have a reference, I couldn't build this genome at all
>>> > based only on de novo assembly :(.
>>>
>>> Well, that's what scaffolding programs and paired-end (or template)
>>> sequencing are for. If you have paired-end: have a look at BAMBUS (see
>>> the docs written by Gregory).
>>>
>>> If you don't have paired-end: well, you'd need to sift through the contig
>>> debris by hand, trying to evaluate whether they are repetitive or not
>>> (rRNA stretches, phages etc.) and try to place them in the project. A task
>>> which would normally not be possible without targeted sequencing (primer
>>> walking etc.) and tedious hours in both wet lab and at the computer.
>>>
>>> Yes, genome assembly ain't a piece of cake :-)
>>>
>>> Regards,
>>>   Bastien
>>>
>>> --
>>> You have received this mail because you are subscribed to the mira_talk
>>> mailing list. For information on how to subscribe or unsubscribe, please
>>> visit http://www.chevreux.org/mira_mailinglists.html
>>>
>>
>>
>
