On Fri, 26 Jun 2009 12:11:18 +0200 Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:

> > Phrap can assemble "reads" of any length in a sensible way. We often have
> > long segments of a genome that for various reasons have been proven
> > correct;
> > [...]
>
> Wait a minute: are you telling me that giving phrap two "reads" of, say, 50
> megabases and having an overlap of, say, 200kb, then phrap will actually join
> those reads?

Yepp, exactly! It works quite well, especially if you add a bunch of short
reads that span the overlap; this is sometimes quite an efficient way to
proceed in a genome project.

> Ouch, if that's the case, that won't be available with MIRA anytime soon. I
> use a banded Smith-Waterman that is almost O(n) in time, but O(n^2) in space
> used. And these are some of the most tricky parts of MIRA which I am not too
> keen to touch at the moment.

Ok, I see. So what about the idea of an optional final joining step, starting
where a mapping+denovo assembly ends today? I am no algorithm wizard, but it
seems to me that a fairly simple joining approach would do better than most
people manage by manual fiddling around in gap4/consed. Admittedly, it would
require an approach other than the core MIRA algorithm, so it is a bit of a
hack. But a useful one :)

> > [...]
> > As discussed, fake reads of up to 20kb can be fed into MIRA already now,
> > but there was the issue with the megahubs, making me a bit unsure that the
> > assembly algorithm is really designed for this, although it appears to work
> > pretty well (but I have not had time to investigate it too much yet).
>
> That megahub problem actually is something which could be solved, I'll have a
> look.
>
> > However, for longer fake reads (such as for example complete manually
> > checked contigs, or manually combined PCR products), we need to cut them
> > into 20kb overlapping pieces, which is kind of against the whole idea of
> > producing long correct segments.
> > If anything can be done in this direction it would be great!
>
> Not directly with very long fake reads, no. But I do have an idea how this
> could be solved efficiently (if not elegantly).
>
> Won't be available exactly tomorrow though.

Anything is welcome, whenever possible. Thanks again!

Björn

> Regards,
>   Bastien
>
> --
> You have received this mail because you are subscribed to the mira_talk
> mailing list. For information on how to subscribe or unsubscribe, please
> visit http://www.chevreux.org/mira_mailinglists.html

--
====================================
Björn Nystedt (Sällström)
PhD Student
Molecular Evolution
EBC, Uppsala University
Norbyv. 18C, 752 36 Uppsala
Sweden
phone: +46 (0)18-471 45 88
email: Bjorn.Nystedt@xxxxxxxxx
====================================
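
For reference, the workaround mentioned in this thread (cutting a long fake
read into overlapping pieces of at most 20kb before feeding it to MIRA) could
be sketched roughly as below. This is only an illustration and not part of
MIRA or phrap: the function name and the 2kb overlap between neighbouring
pieces are assumptions made for the example; only the 20kb piece limit comes
from the discussion above.

```python
def split_into_overlapping_pieces(seq, piece_len=20000, overlap=2000):
    """Cut seq into pieces of at most piece_len bases.

    Consecutive pieces share `overlap` bases, so an assembler can in
    principle re-join them. Returns a list of (start_offset, piece) tuples.
    The default overlap of 2000 is an assumed value, not a MIRA setting.
    """
    if not 0 <= overlap < piece_len:
        raise ValueError("overlap must be >= 0 and smaller than piece_len")
    if not seq:
        return []

    step = piece_len - overlap  # distance between consecutive piece starts
    pieces = []
    start = 0
    while True:
        pieces.append((start, seq[start:start + piece_len]))
        if start + piece_len >= len(seq):
            break  # this piece already reaches the end of the sequence
        start += step
    return pieces


if __name__ == "__main__":
    contig = "ACGT" * 12500  # hypothetical 50kb manually checked contig
    for offset, piece in split_into_overlapping_pieces(contig):
        print(offset, len(piece))
```

The offsets are returned alongside the pieces so that each fake read can be
named after its position in the original segment, which makes it easier to
check afterwards that the pieces were put back together in the right order.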