[mira_talk] Re: Megahub info
- From: Björn Nystedt <bjorn.nystedt@xxxxxxxxx>
- To: mira_talk@xxxxxxxxxxxxx
- Date: Fri, 26 Jun 2009 13:32:14 +0200
On Fri, 26 Jun 2009 12:11:18 +0200
Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
> > Phrap can assemble "reads" of any length in a sensible way. We often have
> > long segments of a genome that for various reasons have been proven
> > correct;
> > [...]
>
> Wait a minute: are you telling me that if I give phrap two "reads" of, say,
> 50 megabases with an overlap of, say, 200kb, then phrap will actually join
> those reads?
Yep, exactly!
It works quite well, especially if you add a bunch of short reads that span the
overlap; this is sometimes quite an efficient way to proceed in a genome project.
> Ouch, if that's the case, that won't be available with MIRA anytime soon. I
> use a banded Smith-Waterman that is almost O(n) in time, but O(n^2) in space
> used. And these are some of the most tricky parts of MIRA which I am not too
> keen to touch at the moment.
Ok, I see.
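To put a number on the O(n^2) space point, here is a back-of-the-envelope
sketch. The function name and the one-byte-per-cell figure are my own
assumptions for illustration, not how MIRA actually stores its matrix:

```python
def full_matrix_bytes(overlap_len, bytes_per_cell=1):
    """Memory for a full n x n dynamic-programming matrix.

    bytes_per_cell=1 is an assumed, optimistic value; a real
    aligner may well need more per cell.
    """
    return overlap_len * overlap_len * bytes_per_cell


# Overlap lengths from a typical read up to the 200kb example above.
for n in (2_000, 20_000, 200_000):
    gib = full_matrix_bytes(n) / 2**30
    print(f"{n:>7} bp overlap: {gib:10.3f} GiB")
```

For the 200kb overlap that is 200,000^2 = 40,000,000,000 cells, so tens of
gigabytes even at one byte per cell.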
So what about an optional final joining step that starts where a
mapping+denovo assembly ends today? I am no algorithm wizard, but it seems to
me that even a fairly simple joining approach would do better than most people
manage by manual fiddling around in gap4/consed. It would require a different
approach from the core MIRA algorithm, so it is a bit of a hack, but a
useful one :)
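Just to show the kind of thing I mean, here is a minimal sketch of such a
joining step. It is only an illustration under strong assumptions: it matches
the overlap exactly (no sequencing errors, no reverse complement), it is
quadratic in the worst case, and the names find_overlap and join_contigs are
made up here, not anything in MIRA or phrap:

```python
def find_overlap(left, right, min_overlap=20):
    """Length of the longest suffix of `left` that is also a prefix
    of `right`, or 0 if it is shorter than `min_overlap`.

    Exact matching only; real contig ends would need an
    error-tolerant alignment instead.
    """
    max_len = min(len(left), len(right))
    for length in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:length]):
            return length
    return 0


def join_contigs(left, right, min_overlap=20):
    """Merge two contigs on their exact end overlap, or return None."""
    length = find_overlap(left, right, min_overlap)
    if length == 0:
        return None
    # Keep the overlap once: all of `left`, then the rest of `right`.
    return left + right[length:]


joined = join_contigs("ACGTACGTTTGACC", "TTGACCGGATAA", min_overlap=4)
print(joined)
```

A real version would of course score the overlap with an alignment and use
the short reads spanning the join as supporting evidence.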
> > [...]
> > As discussed, fake reads of up to 20kb can be fed into MIRA already now,
> > but there was the issue with the megahubs, making me a bit unsure that the
> > assembly algorithm is really designed for this, although it appears to work
> > pretty well (but I have not had time to investigate it too much yet).
>
> That megahub problem actually is something which could be solved, I'll have a
> look.
>
> > However, for longer fake reads (such as complete manually
> > checked contigs, or manually combined PCR products), we need to cut them
> > into 20kb overlapping pieces, which is kind of against the whole idea of
> > producing long correct segments.
> > If anything can be done in this direction it would be great!
>
> Not directly with very long fake reads, no. But I do have an idea how this
> could be solved efficiently (if not elegantly).
>
> Won't be available exactly tomorrow though.
Anything is welcome, whenever possible.
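Until then, the cutting workaround I mentioned can be sketched roughly as
below. The 20kb piece length is the limit discussed above; the 2kb overlap
and the function name split_with_overlap are just assumptions for the
example:

```python
def split_with_overlap(seq, piece_len=20_000, overlap=2_000):
    """Cut `seq` into pieces of at most `piece_len` bases, where each
    piece shares `overlap` bases with the previous one.

    Returns a list of (start_offset, piece) tuples. The overlap
    value is an assumed default, not a MIRA requirement.
    """
    if not 0 <= overlap < piece_len:
        raise ValueError("overlap must be >= 0 and smaller than piece_len")
    if not seq:
        return []
    step = piece_len - overlap
    pieces = []
    start = 0
    while True:
        pieces.append((start, seq[start:start + piece_len]))
        if start + piece_len >= len(seq):
            break  # this piece reaches the end of the sequence
        start += step
    return pieces


# Example: a 50kb checked contig becomes three overlapping fake reads.
contig = "ACGT" * 12_500
for offset, piece in split_with_overlap(contig):
    print(offset, len(piece))
```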
Thanks again!
Björn
> Regards,
> Bastien
>
>
>
> --
> You have received this mail because you are subscribed to the mira_talk
> mailing list. For information on how to subscribe or unsubscribe, please
> visit http://www.chevreux.org/mira_mailinglists.html
--
====================================
Björn Nystedt (Sällström)
PhD Student
Molecular Evolution
EBC, Uppsala University
Norbyv. 18C, 752 36 Uppsala
Sweden
phone: +46 (0)18-471 45 88
email: Bjorn.Nystedt@xxxxxxxxx
====================================