Hey Bastien,

I would be interested in seeing that AMOS HOWTO, or in hearing from anyone
else who uses it. I can't get it to produce scaffolds for me.

On Sun, Aug 16, 2009 at 3:09 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:

> On Friday 14 August 2009 Marcin Swiatek wrote:
> > I seem to have difficulties getting good results from Mira. Or perhaps
> > 'expected' would be a better word. Here is my story: I am trying to
> > assemble the genome of a strain of a Lactobacillus bacteria. It is a
> > naughty little microbe [...]
>
> Hi Marcin,
>
> welcome to the club. Lactobacillus has been a nightmare project for me
> as well, especially as I had no paired-end data at the time.
>
> > I got decently looking results, but there is one thing I
> > don't understand: where all these paired ends went? They are in the input
> > files I think, I saw these reads in the generated traceinfo file...
>
> My first guess would be: in the contigs, where they belong.
>
> > However, while both Celera and Newbler produced contigs *and* scaffolds,
> > in Mira's output I find contigs only.
>
> In the beginning the users I know liked to use MIRA and combine it with
> dedicated scaffolders (BAMBUS, own scripts etc.), therefore I never really
> felt the urge to implement a scaffolder of my own. This has changed
> considerably, as inquiries for a scaffolder have noticeably increased in
> the past year. I think I'll have to cave in at some point: not for the 3.0
> version which I'm finalising at the moment, but it's now pretty high on
> the TODO.
>
> In the meantime: some time ago I asked a few people who I know use the
> AMOS scaffolder to write a short HOWTO for data coming from MIRA, but I
> haven't heard back from any of them so far.
>
> > Contigs computed by Mira (using
> > 'accurate') are quite similar in number and size distribution to what I
> > get with other assemblers, but I see no scaffolds and no evidence of use
> > of paired end data.
> > [...]
> > Now the questions.
> > Firstly, how do I tell if paired ends were indeed used
> > or not. Secondly, if they weren't, how do I go about putting them to use.
>
> MIRA uses them without making too much noise about it. One way for you to
> check: in the output, there's a line saying
>
> Generated XXX unique template ids for YYY valid reads.
>
> If XXX is smaller than YYY, then MIRA has assigned read pairs to templates
> and uses that information later on in the assembly:
>
> [1321] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> [1381] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++s+
> [1440] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> [1500] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> [1560] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> [1620] ++++++++++++++++++++++++++++++++++++++++++++++++++t+t+t+++++
>
> The "+" shows reads assembled without problems, "s" means a read has been
> rejected at a given contig position because of a template size violation,
> and "t" because of a template direction violation. So you'll see the
> template usage only when there's a (temporary) problem during
> construction; all other reads are assembled without any further notice.
>
> > And if they were, why don't I see scaffolds (or longer contigs with
> > little gaps in them).
>
> Because there's no scaffolder. As I wrote, it's on the TODO. In the
> meantime, I'd propose to use the one from AMOS, as I heard it works quite
> well (never used it though).
>
> > I will have another query, but I think I will try that one by one.
>
> No problem.
>
> Best,
> Bastien
>
> --
> You have received this mail because you are subscribed to the mira_talk
> mailing list. For information on how to subscribe or unsubscribe, please
> visit http://www.chevreux.org/mira_mailinglists.html

--
"If scientists knew what they were doing they wouldn't call it research"
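
For anyone who wants to automate the check Bastien describes in the quoted
mail (the "Generated XXX unique template ids for YYY valid reads" line plus
the "+", "s" and "t" markers), here is a minimal Python sketch. It is not
part of MIRA: the function name summarize_mira_log and the returned keys are
made up for illustration, and it assumes the log lines look exactly like the
ones quoted in the mail. Any marker characters other than "+", "s" and "t"
are simply not reported.

```python
import re
from collections import Counter

# Patterns are modelled on the two kinds of lines quoted in the mail;
# other MIRA versions may word them differently (assumption).
TEMPLATE_RE = re.compile(
    r"Generated (\d+) unique template ids for (\d+) valid reads"
)
PROGRESS_RE = re.compile(r"^\s*\[(\d+)\]\s+(\S+)\s*$")


def summarize_mira_log(text):
    """Summarise template usage from the text of a MIRA log (hypothetical helper)."""
    templates = reads = None
    marks = Counter()
    for line in text.splitlines():
        m = TEMPLATE_RE.search(line)
        if m:
            templates, reads = int(m.group(1)), int(m.group(2))
            continue
        p = PROGRESS_RE.match(line)
        if p:
            # Count every marker character on a progress line.
            marks.update(p.group(2))
    return {
        "templates": templates,
        "reads": reads,
        # Fewer templates than reads means some reads share a template,
        # i.e. read pairs were recognised.
        "pairs_detected": templates is not None and templates < reads,
        "placed_ok": marks["+"],
        "size_violations": marks["s"],
        "direction_violations": marks["t"],
    }
```

Usage would be something like summarize_mira_log(open("mira.log").read()),
where "mira.log" stands for wherever you captured MIRA's output. A non-zero
count of "s" or "t" only tells you the template information was consulted
during contig construction; as the mail says, those rejections are temporary.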