[mira_talk] Re: haplotype phasing

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 31 Jan 2014 20:58:00 +0100

On 31 Jan 2014, at 19:57 , Adrian Pelin <apelin20@xxxxxxxxx> wrote:
> I was wondering if the contigs that MIRA builds due to intragenomic 
> variation/heterozygosity are phased haplotypes?
> In other words, could we expect these small contigs to represent portions of 
> the chromatids?

Most of the time, yes. At least if the SNPs determining the haplotype are closer 
together than the average read length (or the average library size in paired 
libs).
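
As a rough illustration of that rule of thumb, here is a minimal Python sketch. 
The function name and the simple "one read or read pair must span both SNPs" 
threshold are assumptions made for illustration; this is not how MIRA decides 
anything internally.

```python
def snps_linkable(snp_distance, read_length, insert_size=None):
    """Return True if two SNPs could, in principle, be linked by sequencing data.

    Simplifying assumption: a single read (or a read pair whose fragment
    covers insert_size bases) has to span both SNP positions.
    All distances are in base pairs.
    """
    # For unpaired data only the read itself can span the two SNPs.
    span = read_length if insert_size is None else max(read_length, insert_size)
    return snp_distance < span
```

For example, snps_linkable(1000, 100, insert_size=300) is False, while 
snps_linkable(1000, 100, insert_size=3000) is True.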

But things are not always easy. Assume the following to be “the truth” in 
reality:

haplotype 1:   xxxxAxxxx … 1000 bp … yyyyAyyyy
haplotype 2:   xxxxCxxxx … 1000 bp … yyyyCyyyy

Now, if you sequence that with, say, Illumina 100bp paired end, you may get the 
following four cases (with equal chances):

I)
1 large contig with xxxxAxxxx … 1000 bp … yyyyAyyyy
+ 2 small contigs, one xxxxCxxxx and another one yyyyCyyyy

II)
1 large contig with xxxxAxxxx … 1000 bp … yyyyCyyyy
+ 2 small contigs, one xxxxCxxxx and another one yyyyAyyyy

III)
1 large contig with xxxxCxxxx … 1000 bp … yyyyAyyyy
+ 2 small contigs, one xxxxAxxxx and another one yyyyCyyyy

IV)
1 large contig with xxxxCxxxx … 1000 bp … yyyyCyyyy
+ 2 small contigs, one xxxxAxxxx and another one yyyyAyyyy


The reason is that the placeholder “1000 bp” stands for absolutely identical 
sequence which - in the absence of further SNPs, longer reads or libraries 
with larger insert sizes - makes it impossible to cross that stretch 
correctly.
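
To make the four cases concrete, here is a small Python sketch that enumerates 
the possible large contigs and marks which ones match a real haplotype. The 
names are made up for illustration, and the 50% figure rests on the "equal 
chances" assumption stated above.

```python
from itertools import product

# The two true haplotypes, written as (allele at first SNP, allele at second SNP).
TRUE_HAPLOTYPES = {("A", "A"), ("C", "C")}

def enumerate_outcomes():
    """List every possible large contig and whether it is a real haplotype."""
    outcomes = []
    for left, right in product("AC", repeat=2):
        outcomes.append(((left, right), (left, right) in TRUE_HAPLOTYPES))
    return outcomes

outcomes = enumerate_outcomes()
for (left, right), correct in outcomes:
    label = "correctly phased" if correct else "chimeric"
    print(f"xxxx{left}xxxx ... 1000 bp ... yyyy{right}yyyy: {label}")

# With equal chances for all four cases, only half of the large contigs
# carry the true phase across the identical stretch.
fraction_correct = sum(ok for _, ok in outcomes) / len(outcomes)
```

Cases I and IV come out as correctly phased, II and III as chimeric.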

B.


