[mira_talk] Re: Where are my scaffolds?

  • From: Brian Forde <bforde@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 17 Aug 2009 02:17:02 -0700

Hey Bastien,

I would be interested in seeing that AMOS HOWTO, or in hearing from anyone
else who uses it. I can't get it to produce scaffolds for me.

On Sun, Aug 16, 2009 at 3:09 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:

> On Friday 14 August 2009 Marcin Swiatek wrote:
> > I seem to have difficulties getting good results from Mira. Or perhaps
> > 'expected' would be a better word. Here is my story: I am trying to
> > assemble the genome of a strain of a Lactobacillus bacterium. It is a
> > naughty little microbe [...]
>
> Hi Marcin,
>
> welcome to the club. Lactobacillus has been a nightmare project for me
> too, especially as I had no paired-end data at the time.
>
> > I got decent-looking results, but there is one thing I
> > don't understand: where did all these paired ends go? They are in the
> > input files, I think; I saw these reads in the generated traceinfo file...
>
> My first guess would be: in the contigs, where they belong.
>
> > However, while both Celera and Newbler produced contigs *and* scaffolds,
> > in Mira's output I find contigs only.
>
> In the beginning, the users I know liked to use MIRA and combine it with
> dedicated scaffolders (BAMBUS, own scripts etc.), so I never really felt
> the urge to implement my own scaffolder. This has changed considerably, as
> inquiries for a scaffolder have noticeably increased in the past year. I
> think I'll have to cave in at some point: not for the 3.0 version which
> I'm finalising at the moment, but it's now pretty high on the TODO.
>
> In the meantime: some time ago I asked a few people who I know use the
> AMOS scaffolder to write a short HOWTO for data coming from MIRA, but I
> haven't heard back from any of them so far.
>
> > Contigs computed by Mira (using
> > 'accurate') are quite similar in number and size distribution to what I
> > get with other assemblers, but I see no scaffolds and no evidence of use
> > of paired end data.
> > [...]
> > Now the questions. Firstly, how do I tell whether paired ends were indeed
> > used? Secondly, if they weren't, how do I go about putting them to use?
>
> MIRA uses them without making too much noise about it. One way for you to
> check: in the output, there's a line saying
>
>   Generated XXX unique template ids for YYY valid reads.
>
> If XXX is smaller than YYY, then MIRA has assigned read pairs to templates
> and uses that information later on in the assembly:
>
> [1321] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> [1381] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++s+
> [1440] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> [1500] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> [1560] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> [1620] ++++++++++++++++++++++++++++++++++++++++++++++++++t+t+t+++++
>
> The "+" shows reads assembled without problems, "s" means a read has been
> rejected at a given contig position because of a template size violation,
> and "t" because of a template direction violation. So you'll see the
> template usage only when there's a (temporary) problem during construction;
> all other reads are assembled without any further notice.
>
> > And if they were, why don't I see scaffolds (or longer contigs with
> > little gaps in them)?
>
> Because there's no scaffolder. As I wrote, it's on the TODO. In the
> meantime, I'd suggest using the one from AMOS, as I've heard it works quite
> well (I've never used it myself, though).
>
> > I will have another query, but I think I will take them one at a time.
>
> No problem.
>
>
> Best,
>  Bastien
>
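
The check Bastien describes above can be scripted. What follows is only a
rough sketch: it assumes the log looks exactly like the excerpt quoted in
this mail (one "Generated XXX unique template ids for YYY valid reads" line,
plus progress lines of the form "[number] markers"). The function name
summarize_log and the file name mira.log are made up for illustration; they
are not part of MIRA.

```python
import re

# Patterns inferred from the log excerpt quoted in this thread (an assumption,
# not a documented MIRA log format).
TEMPLATE_RE = re.compile(
    r"Generated (\d+) unique template ids for (\d+) valid reads"
)
PROGRESS_RE = re.compile(r"^\[(\d+)\]\s+(\S+)$")


def summarize_log(lines):
    """Report template/read counts and tally '+', 's' and 't' progress markers."""
    templates = None
    reads = None
    markers = {"+": 0, "s": 0, "t": 0}
    for line in lines:
        found = TEMPLATE_RE.search(line)
        if found:
            templates = int(found.group(1))
            reads = int(found.group(2))
            continue
        progress = PROGRESS_RE.match(line.strip())
        if progress:
            for ch in progress.group(2):
                # Any other marker characters are ignored in this sketch.
                if ch in markers:
                    markers[ch] += 1
    # Fewer templates than reads means some reads share a template (pairs).
    pairs_used = templates is not None and templates < reads
    return {
        "templates": templates,
        "reads": reads,
        "pairs_used": pairs_used,
        "markers": markers,
    }


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python check_pairs.py mira.log
    with open(sys.argv[1]) as handle:
        print(summarize_log(handle))
```

If pairs_used comes out true, read pairs were assigned to templates. The "s"
and "t" tallies only count the temporary rejections during contig
construction, so low numbers there do not mean the pairs were ignored.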



-- 
"If scientists knew what they were doing they wouldn't call it research"
