On Friday 26 June 2009 Björn Nystedt wrote:
> [...]
> Basically this is about integrating data; if I have long pieces of the
> genome that I know are correct, how do I best combine that information with
> the full set of shotgun and paired-end reads to make the most accurate and
> complete assembly?

Hmmm, fragmenting them into longer "fake" reads and mixing them into the lot
is actually the only way to go with MIRA at the moment.

> Phrap can assemble "reads" of any length in a sensible way. We often have
> long segments of a genome that for various reasons have been proven
> correct;
> [...]

Wait a minute: are you telling me that if you give phrap two "reads" of, say,
50 megabases each, with an overlap of, say, 200 kb, phrap will actually join
those reads?

Ouch, if that's the case, that won't be available in MIRA anytime soon. I use
a banded Smith-Waterman that is almost O(n) in time, but O(n^2) in space. And
these are some of the trickiest parts of MIRA, which I am not too keen to
touch at the moment.

> [...]
> As discussed, fake reads of up to 20kb can be fed into MIRA already now,
> but there was the issue with the megahubs, making me a bit unsure that the
> assembly algorithm is really designed for this, although it appears to work
> pretty well (but I have not had time to investigate it too much yet).

That megahub problem actually is something which could be solved, I'll have a
look.

> However, for longer fake reads (such as for example complete manually
> checked contigs, or manually combined PCR products), we need to cut them
> into 20kb overlapping pieces, which is kind of against the whole idea of
> producing long correct segments.
> If anything can be done in this direction it would be great!

Not directly with very long fake reads, no. But I do have an idea how this
could be solved efficiently (if not elegantly). It won't be available
tomorrow, though.
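For the fragmenting workaround discussed above, here is a minimal Python sketch of cutting one long, trusted sequence into overlapping pieces of at most 20 kb. The function name, the 2 kb overlap and the default values are assumptions made for illustration; they are not MIRA parameters and not part of any MIRA tool, and a suitable overlap would have to be chosen for the data at hand.

```python
def split_into_fake_reads(sequence, max_len=20000, overlap=2000):
    """Cut a long sequence into overlapping pieces of at most max_len bases.

    Hypothetical helper: max_len mirrors the 20 kb limit mentioned in the
    thread, overlap is an arbitrary illustrative choice.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    if not sequence:
        return []

    step = max_len - overlap  # distance between consecutive piece starts
    pieces = []
    start = 0
    while True:
        end = start + max_len
        pieces.append(sequence[start:end])
        if end >= len(sequence):
            break  # this piece reached the end of the sequence
        start += step
    return pieces


if __name__ == "__main__":
    # Toy input: a 50 kb sequence standing in for a manually checked contig.
    contig = "".join("ACGT"[(i * i + i // 7) % 4] for i in range(50000))
    for number, piece in enumerate(split_into_fake_reads(contig), start=1):
        print(f"fake_read_{number}: {len(piece)} bases")
```

Each piece shares its first `overlap` bases with the end of the previous piece, so the original sequence can be rebuilt by dropping that prefix from every piece after the first.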

Regards,
  Bastien