[mira_talk] Re: PacBio CCS questions

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 16 Aug 2013 21:50:41 +0200

On Aug 15, 2013, at 1:34 , Matthew D. Pagel <pagel@xxxxxxx> wrote:
> Is there a quick-and-dirty algorithm out there for identifying inversions from
> one subread to the next within a single PB read

I have a similar but more pressing question at the moment: is there an easy way 
to identify reads which form such an FR structure but where the PB algorithms 
apparently did not recognise an adapter?

Background: I'm working on the read improvement routines at the moment. In the 
49 PB reads I took as an initial test set (out of >30k from the E. coli Nature 
paper), two reads already show such an inversion where there should be none … 
ergo it's a sequencing artefact. If that rate holds (2 of 49, roughly 4% of 
reads), it will wreak havoc with most assembly algorithms. I hate situations 
like these.
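
A minimal sketch of one quick-and-dirty way to flag such reads, assuming that a 
missed adapter leaves a read whose second part is roughly the reverse complement 
of its first part. It collects k-mers whose reverse complement reappears later 
in the same read and takes the median midpoint as the likely turn-around 
position. The function name find_fr_pivot and the defaults k=12 and min_hits=5 
are illustrative assumptions, not part of MIRA or of any PacBio tool.

```python
from collections import defaultdict
from statistics import median

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_fr_pivot(read, k=12, min_hits=5):
    """Return the estimated turn-around position of an FR structure, or None.

    Hypothetical helper: k and min_hits are untuned, illustrative defaults.
    """
    read = read.upper()

    # Index every k-mer of the read by its start position.
    positions = defaultdict(list)
    for i in range(len(read) - k + 1):
        positions[read[i:i + k]].append(i)

    # For each k-mer starting at j, look for its reverse complement at an
    # earlier, non-overlapping position i. The midpoint of the region spanned
    # by both k-mers is one estimate of where the read turns around.
    pivots = []
    for j in range(len(read) - k + 1):
        for i in positions.get(revcomp(read[j:j + k]), ()):
            if i + k <= j:
                pivots.append((i + j + k) / 2)

    if len(pivots) < min_hits:
        return None
    return median(pivots)
```

A read for which find_fr_pivot returns a position rather than None would be a 
candidate for a closer look or for splitting at that position. Exact k-mer 
matches become rarer as the error rate rises, so k and min_hits would probably 
need tuning on raw PB data.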

Bonus question: are PB adapter sequences listed somewhere on the net? The only 
place I found any is in the metadata XML files, which give 
   ATCTCTCTCttttcctcctcctccgttgttgttgttGAGAGAGAT

Are there others?
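
For screening reads against the adapter quoted above, a minimal sketch, 
assuming that a plain edit-distance search is acceptable: the pattern may start 
anywhere in the read, and the best-scoring end position is reported. The name 
best_adapter_hit and the 25% error threshold are illustrative assumptions, not 
something taken from PacBio software.

```python
# Adapter as found in the metadata XML files (see above).
ADAPTER = "ATCTCTCTCttttcctcctcctccgttgttgttgttGAGAGAGAT"


def best_adapter_hit(read, adapter=ADAPTER, max_err_rate=0.25):
    """Return (edit_distance, end_position) of the best approximate adapter
    match in read, or None if it exceeds max_err_rate * len(adapter).

    Hypothetical helper; end_position is the index just past the match.
    """
    read = read.upper()
    adapter = adapter.upper()
    m = len(adapter)

    # prev[i] = fewest edits needed to align adapter[:i] against some
    # substring of the read that ends at the previous read position.
    prev = list(range(m + 1))
    best_dist, best_end = m, 0

    for j, base in enumerate(read, 1):
        cur = [0] * (m + 1)  # cur[0] = 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if adapter[i - 1] == base else 1
            cur[i] = min(prev[i - 1] + cost,  # match or substitution
                         prev[i] + 1,         # extra base in the read
                         cur[i - 1] + 1)      # adapter base missing in read
        if cur[m] < best_dist:
            best_dist, best_end = cur[m], j
        prev = cur

    if best_dist <= max_err_rate * m:
        return best_dist, best_end
    return None
```

Since a read can come from either strand, one would probably also scan for the 
reverse complement of the adapter. The runtime is proportional to read length 
times adapter length, which is slow in pure Python but fine for a test set of a 
few dozen reads.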

B.
--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
