Yes, I tried that, and I do have some VERY small regions of similarity. But now the question is: how can I actually use this information from the dot plot to build a consensus from those two contigs, i.e. "slide them" against each other? I have about 160 contigs (of "better quality") and about 5000 contigs in total (an insane number to start doing this "by hand").

Thank you,
Andrzej

On Thu, Dec 3, 2009 at 2:19 PM, Sven Klages <sir.svencelot@xxxxxxxxxxxxxx> wrote:

> You might want to join contigs via "Find internal joins" (dot plot) or
> directly in the "Join Editor".
>
> But keep in mind, you cannot join contigs if they don't overlap. You can
> just change the contig layout accordingly (kind of manual scaffolding).
>
> cheers,
> Sven
>
> 2009/12/3 Andrzej N <andrzej.k.n@xxxxxxxxx>
>
>> THANK YOU VERY MUCH FOR THE ANSWER :).
>>
>> It's a very basic question! Can ANYBODY tell me how to join contigs in GAP4?
>> Yes, I did set up EVERYTHING, I see them etc. (the contigs look really
>> accurate, not many errors etc.) but I can't make this stuff work for me...
>> at my place nobody has any idea how to do this stuff...
>>
>> How can I join contigs if they don't overlap?
>>
>> Do you know of any type of manual...
>>
>> I always hear "by hand"... HOW?
>>
>> THANK YOU!
>>
>> Andrzej
>>
>> PS. I will try the settings you provided.
>>
>> On Thu, Dec 3, 2009 at 1:48 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
>>
>>> On Wednesday 02 December 2009 Andrzej N wrote:
>>> > I need some help... I did a *de novo* assembly of several plant
>>> > mitochondrial genome sequences (454, Titanium, one end reads), about
>>> > 200000 reads used for assembly (this should give me about... 100x
>>> > coverage). Yes, I know, overkill, but... MIRA created about 160 contigs
>>> > with a quality score around 78 (what is it exactly?) (the total number
>>> > of contigs is about 5,000, but that includes smaller ones that don't
>>> > help much, i.e., "junk"). These contigs don't go together to create one
>>> > big consensus contig.
>>>
>>> Hello Andrzej,
>>>
>>> 100x is not only overkill, it is also a bit dangerous for many assemblers
>>> (including MIRA), as there are some unwanted side-effects of ultra-high
>>> coverage. One of them: as sequencing errors are not totally random, they
>>> tend to accumulate at certain points. If you now have very high coverage,
>>> these sequencing errors will be recognised as valid variants and hence
>>> split off into other contigs.
>>>
>>> Plus you've got plant mitochondrial genomes, and these I've come to fear
>>> a bit. 454 data from those I've seen so far suggest pretty uneven
>>> coverage, which might lead MIRA to have problems if the uniform read
>>> distribution is used, mistakenly recognising some parts as repeats when
>>> they're not.
>>>
>>> > I also did a reference assembly, to an already finished and assembled
>>> > sequence. MIRA is covering all of this reference sequence with only
>>> > one small break (so I get two huge contigs of about 200000bp each).
>>> >
>>> > Now is the interesting part. When I take these contigs from the
>>> > *de novo* assembly and blast them against the ones generated by the
>>> > reference assembly, they cover the entire sequence very nicely... So,
>>> > my question is why MIRA is not creating larger contigs during *de novo*
>>> > assembly. These contigs are next to each other and show a certain
>>> > amount of sequence overlap (I set up BLAST on my computer to blast them
>>> > against each other) but MIRA is not seeing this and combining them.
>>>
>>> Oh, MIRA is probably seeing them, but refuses to join because the ends
>>> contain too many sequencing errors (mistakenly recognised as valid
>>> variants) or because the ends lie in regions with exceptionally high
>>> coverage (mistakenly recognised as repeats).
>>>
>>> > What parameters in MIRA need to be changed to help build larger contigs?
>>> > My adjustments to date have not helped do much more than your default
>>> > settings for "fast assembly".
>>>
>>> Umm ... the 'draft' options are really just that: for drafts. And if
>>> you've got 60kb chunks it's not too bad already. But use at least
>>> 'normal' or 'accurate' mode.
>>>
>>> Now, other things you probably want to do:
>>> 1) decrease sensitivity of repeat marker base recognition. I'd suggest
>>>    to add
>>>      454_SETTINGS -CO:mrpg=12
>>>    and see what happens then
>>> 2) eventually assemble without uniform read distribution
>>>      -AS:urd=no
>>>    and loosen the repeat detection thresholds
>>>      454_SETTINGS -AS:ardct=3:mrl=800
>>>    or switch off repeat detection altogether
>>>      -AS:ard=no
>>>
>>> If everything else fails: join the large contigs by hand in 'gap4', it
>>> just takes a couple of minutes for a plant mitochondrion :-)
>>>
>>> Hope that helps,
>>> Bastien
>>>
>>> --
>>> You have received this mail because you are subscribed to the mira_talk
>>> mailing list. For information on how to subscribe or unsubscribe, please
>>> visit http://www.chevreux.org/mira_mailinglists.html
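
For the question at the top of this message (how to "slide" two contigs against each other once a dot plot shows a shared region), here is a minimal Python sketch of the idea. It is not what gap4's Join Editor does internally. It only handles an exact match between the end of one contig and the start of the next, whereas real 454 contig ends contain sequencing errors and need an error-tolerant alignment. The names `overlap_length`, `join_contigs` and `min_len` are made up for illustration.

```python
def overlap_length(left, right, min_len=3):
    """Length of the longest suffix of `left` that exactly equals a prefix
    of `right`, or 0 if no such overlap of at least `min_len` bases exists.

    Assumption: exact matching only, same strand, no gaps or mismatches.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    longest_possible = min(len(left), len(right))
    # "Slide" right over the end of left, trying the longest overlap first.
    for k in range(longest_possible, min_len - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def join_contigs(left, right, min_len=3):
    """Merge two contigs on their exact end overlap.

    Returns the joined sequence, or None if the contigs do not overlap,
    in which case they can only be ordered next to each other, not joined.
    """
    k = overlap_length(left, right, min_len)
    if k == 0:
        return None
    # Keep all of left, then append the part of right beyond the overlap.
    return left + right[k:]


if __name__ == "__main__":
    contig_a = "ACGTTGCA"   # toy sequences, not real data
    contig_b = "TGCAGGTT"
    print(overlap_length(contig_a, contig_b))  # 4  (shared "TGCA")
    print(join_contigs(contig_a, contig_b))    # ACGTTGCAGGTT
```

With 5000 contigs this pairwise check would also have to be run against the reverse complement of each contig, and a very short overlap is likely to be a chance match, so `min_len` would need to be much larger than in the toy example.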