[mira_talk] Re: PacBio CCS questions

From: Bastien Chevreux <bach@xxxxxxxxxxxx>
To: mira_talk@xxxxxxxxxxxxx
Date: Sat, 17 Aug 2013 10:13:35 +0200

On Aug 17, 2013, at 1:58 , Chris Hoefler <hoeflerb@xxxxxxxxx> wrote:
> Do either of these help?
> https://s3.amazonaws.com/files.pacb.com/pdf/Guide_Pacific_Biosciences_Template_Preparation_and_Sequencing.pdf
> http://www.smrtcommunity.com/servlet/servlet.FileDownload?file=00P7000000HYU49EAH
> 
> This is the only official documentation from PacBio that I could find about 
> their adapter sequences and barcodes.

See? And I didn't find them, though I did use Google with quite a number of 
different keywords. Probably the wrong ones. My thanks to Matthew and you :-)

> How long are these chimeras? The worst offenders can probably be removed by 
> filtering read lengths and quality scores. But apparently these artifacts do 
> appear in longer reads at a non-negligible level as a result of the way the 
> libraries are constructed.
> http://www.microbiomejournal.com/content/1/1/10
> 
> The PacBioToCA paper puts the number at ~2.5%. HGAP gets rid of these during 
> the preassembly step by looking at the quality of the error correction. If 
> there is a chimeric seed read, the short reads won't align across the 
> junction of the inversion, resulting in a "coverage gap" in the preassembler 
> alignment. These gaps are identified by a low consensus quality in the middle 
> of the read.
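
To make the quoted heuristic concrete, here is a minimal sketch. It assumes the per-base coverage of a seed read (from the aligned shorter reads) is already available as a list of integers. The function name, the `min_cov` threshold, and the `edge` margin are illustrative assumptions, not HGAP's actual parameters or code.

```python
def find_interior_gaps(coverage, min_cov=1, edge=50):
    """Return half-open (start, end) intervals where coverage drops below
    min_cov, keeping only intervals at least `edge` bases from both read ends.

    All thresholds are hypothetical; a real pipeline would tune them.
    """
    gaps = []
    start = None
    for i, cov in enumerate(coverage):
        if cov < min_cov:
            if start is None:
                start = i          # a low-coverage run begins here
        elif start is not None:
            gaps.append((start, i))  # the run ended at the previous base
            start = None
    if start is not None:
        gaps.append((start, len(coverage)))  # run extends to the read end

    n = len(coverage)
    # Low coverage at the very ends of a read is expected, so only
    # interior gaps are reported as possible chimeric junctions.
    return [(s, e) for s, e in gaps if s >= edge and e <= n - edge]


# Toy read: good coverage, a 10-base hole in the middle, good coverage again.
toy = [5] * 100 + [0] * 10 + [5] * 100
suspect = find_interior_gaps(toy)
```

Note that a read crossing a region that is simply under-sampled by the shorter reads would be flagged in exactly the same way, so a check like this cannot by itself tell a chimera from a coverage fluctuation.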

Does it show that I have not read the PacBioToCA paper (yet, and 
intentionally)? I want to develop my own ideas for "best practice" while 
learning the characteristics of a new sequencing technology before looking at 
what others have done. But feel free to cite from papers when appropriate :-)

Incidentally, the above strategy crossed my mind sometime this week when I 
discovered those chimeras. I discarded it after some more thought because I 
think it will lead to too many "false positives," i.e., one would break 
otherwise perfect reads. I do have an idea for how to do it differently, but 
I'll need to work out a couple of things first.

B.

References:
- [mira_talk] PacBio CCS questions
  - From: Matthew D. Pagel
- [mira_talk] Re: PacBio CCS questions
  - From: Bastien Chevreux
- [mira_talk] Re: PacBio CCS questions
  - From: Chris Hoefler
