[mira_talk] Re: Assembling 454 and Solexa mate-pair data - rethinking ...

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 16 Sep 2009 22:58:12 +0200

On Tuesday 15 September 2009 Martin A. Hansen wrote:
> I have tried yet another approach for this assembly. I assumed that the
> Solexa data was contaminated, so I ran MIRA with the 454 contigs (used as
> long Sanger reads) and the Solexa mate-pairs - but only the mate pairs that
> could be mapped to the contigs (using Bowtie and allowing for 3
> mismatches). This reduced the number of mate-pairs from 3M to 1M, and reads
> that span the gaps between the contigs should still be included.
> [...]
> This strikes me as completely wrong. The long contigs are gone.
> According to the log both Sanger and Solexa reads were loaded (I omitted
> the quals on purpose expecting a simple run).

Ummm ... MIRA did not reject the "reads" longer than 20kb? Then I have to ask:
did you turn off all clipping for the "Sanger" reads? Especially as you did not
load quals ... I suspect that the quality clipping algorithms (turned on by
default) threw them out without a second look.

Search the output log for the name of one or two of your fake Sangers ... you 
might find them with a message saying that they don't have enough bases 
(after quality clip).
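
If grepping by hand gets tedious, the log search could be scripted roughly like this. It is only a minimal sketch: the function names, the log file path and the read names are placeholders I made up, and it does a plain substring match because I am not assuming any particular wording of the log message.

```python
from collections import defaultdict


def find_read_mentions(log_lines, read_names):
    """Collect every log line that mentions one of the given read names.

    Returns {read_name: [(line_number, line_text), ...]}; names that never
    appear in the log are simply absent from the result.
    """
    hits = defaultdict(list)
    for line_number, line in enumerate(log_lines, start=1):
        for name in read_names:
            # Plain substring match: no assumption about the message format.
            if name in line:
                hits[name].append((line_number, line.rstrip("\n")))
    return dict(hits)


def scan_log_file(log_path, read_names):
    """Same as find_read_mentions, but reads the log from a file path."""
    with open(log_path, errors="replace") as handle:
        return find_read_mentions(handle, read_names)
```

Calling scan_log_file("assembly_log.txt", ["contig_0001"]) (both names hypothetical) would list each line that names that read, so you can see what the log says about it after loading.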

Also have a look at the "*_int_clippings.0.txt" in the log directory ... this 
should also tell you what happened to your reads after loading.

Regards,
  Bastien



