[mira_talk] Re: PacBio CCS questions

  • From: "Matthew D. Pagel" <pagel@xxxxxxx>
  • To: Chris Hoefler <hoeflerb@xxxxxxxxx>
  • Date: Thu, 15 Aug 2013 04:35:46 -0400

Chris,

Thank you very much for your response.

I understand now what I misunderstood before: specifically that the "circle" is
not dsDNA, but rather essentially ssDNA with the "inserts" reverse
complementary to each other, which when allowed to anneal forms the "SMRTbell"
structure.  When sequenced, this produces an FRFRFR pattern in 3 passes around
the circle.  I actually came to this realization while trying to explain PacBio
to a non-molecular biology person last night, but hadn't had a chance to get
back online to confirm until now. As I've heard before, often you don't really
learn something until you find yourself in a position to teach it to others.
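
For anyone who finds this easier to see in code, here is a minimal Python
sketch of the model described above. It is a toy under stated assumptions, not
PacBio software: the function names, the "NNNNNNNN" placeholder adapter, and
the exact-match orientation test are all made up for illustration. Real
subreads are noisy, so in practice the F/R call would come from an aligner's
strand flag rather than string equality.

```python
# Toy model of a polymerase walking around a SMRTbell template.
# All names and the placeholder adapter are hypothetical illustrations.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_polymerase_read(insert, adapter, passes):
    """Concatenate what is read in `passes` laps around the template.

    One lap is: insert, adapter, reverse complement of insert, adapter.
    """
    lap = [insert, adapter, reverse_complement(insert), adapter]
    return "".join(lap * passes)


def split_subreads(read, adapter):
    """Cut the polymerase read at every adapter and drop empty pieces."""
    return [piece for piece in read.split(adapter) if piece]


def orientation_pattern(subreads, reference):
    """Label each subread F, R, or ? by exact comparison to the reference."""
    rc = reverse_complement(reference)
    labels = []
    for sub in subreads:
        if sub == reference:
            labels.append("F")
        elif sub == rc:
            labels.append("R")
        else:
            labels.append("?")
    return "".join(labels)


def suspicious_pairs(pattern):
    """Indices i where subreads i and i+1 share an orientation (FF or RR)."""
    return [
        i
        for i in range(len(pattern) - 1)
        if pattern[i] in "FR" and pattern[i] == pattern[i + 1]
    ]


if __name__ == "__main__":
    insert, adapter = "ACGGTCA", "NNNNNNNN"
    read = simulate_polymerase_read(insert, adapter, passes=3)
    pattern = orientation_pattern(split_subreads(read, adapter), insert)
    print(pattern)                    # FRFRFR
    print(suspicious_pairs(pattern))  # []
```

In this model, alternating orientations are simply what a healthy multi-pass
read looks like. An FF or RR neighbour pair is the case that would deserve a
closer look (for example a missed or spurious adapter call).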

Hopefully that answers my questions for a while, but undoubtedly more will crop
up as I go.

Thanks

--Matt

On Wed, Aug 14, 2013 11:19 PM Chris Hoefler <hoeflerb@xxxxxxxxx> wrote:
>
>I think you should have a look at this technical data from PacBio,
>https://github.com/PacificBiosciences/cDNA_primer/wiki/Understanding-PacBio-transcriptome-data#wiki-readexplained
>http://www.pacificbiosciences.com/pdf/TechnicalNote_Experimental_Design_for_Targeted_Sequencing.pdf
>
>They have fairly good explanations of the differences between raw
>polymerase reads, filtered subreads, and circular consensus (CCS) reads.
>
>The short answer to your question is that consecutive subreads from the
>same polymerase read are expected to be in opposite orientations because
>they result from the polymerase walking "around" a linear DNA fragment that
>has been made topologically circular by the library preparation procedure,
>http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2926623/
>
>> Upon further analysis, approximately 1/4 of my non-CCS subreads are from
>> reads with multiple subreads (that follow the FR or RF...never FF or RR).
>
>This is completely expected. CCS reads are a subset of filtered subreads
>where at least two full passes of the template have been completed.
>Depending on the length of your insert, quite a few of the multi-pass
>subreads may not be full length and therefore will not be processed as CCS
>reads.
>
>
>On Wed, Aug 14, 2013 at 6:34 PM, Matthew D. Pagel <pagel@xxxxxxx> wrote:
>
>> I was looking at an alignment of filtered but uncorrected reads from PacBio
>> data (covering a particular area of the genome where PacBio and 454 data
>> were in apparent disagreement).  I noticed that subreads of a given read
>> that was labelled as "CCS" were not oriented in the same direction. My
>> understanding of the technology was that a given read tracks a single
>> polymerase in a ZMW.  I also understood that reads are presumed to be of a
>> contiguous sequencing reaction (no disassociation of polymerase from
>> template). Thus, I would assume that each subread should be read in the
>> same direction as the previous, given a small circular piece of DNA
>> resulting from library preparation.
>>
>> I do note that my subreads alternate in orientation vs the reference along
>> the CCS read (RFRFRFR).  Upon further analysis, approximately 1/4 of my
>> non-CCS subreads are from reads with multiple subreads (that follow the FR
>> or RF...never FF or RR).  My assumption is that I'm dealing with a region
>> of the genome that has an inverted duplication.  Further, the
>> read-to-subread algorithm is inappropriately identifying adaptor sequence
>> (or I happen to have the adaptor sequence in my genome verbatim).
>>
>> Alternatively, does this indicate a problem with the library preparation?
>> Is the polymerase in the ZMW encountering some secondary structure and
>> jumping to the other strand of dsDNA? Anything else that you can think of
>> that is going on?
>>
>> Is there a quick-and-dirty algorithm out there for identifying inversions
>> from one subread to the next within a single PB read, assuming that these
>> data really indicate a genomic duplication and inversion rather than a
>> re-read of the same genomic region?  Is there a way to identify "false
>> adaptors" otherwise, as some genomic duplications are oriented the same
>> direction (FF) rather than as an inversion?
>> _______________________________________________________
>> Matt Pagel
>> Graduate Student
>> Penn State Biochemistry, Microbiology and Molecular Biology
>>
>>
>> --
>> You have received this mail because you are subscribed to the mira_talk
>> mailing list. For information on how to subscribe or unsubscribe, please
>> visit http://www.chevreux.org/mira_mailinglists.html
>>
>
>
>
>-- 
>Chris Hoefler, PhD
>Postdoctoral Research Associate
>Straight Lab
>Texas A&M University
>2128 TAMU
>College Station, TX 77843-2128


