On 31 Jan 2014, at 19:57, Adrian Pelin <apelin20@xxxxxxxxx> wrote:
> I was wondering whether the contigs that MIRA builds due to intragenomic
> variation/heterozygosity are phased haplotypes?
> In other words, could we expect these small contigs to represent portions of
> the chromatids?

Most of the time, yes. At least if the SNPs determining the haplotype are closer together than the average read length (or the average insert size in paired libraries).

But things are not always easy. Assume the following to be “the truth” in reality:

  haplotype 1: xxxxAxxxx … 1000 bp … yyyyAyyyy
  haplotype 2: xxxxCxxxx … 1000 bp … yyyyCyyyy

Now, if you sequence that with, say, Illumina 100bp paired end, you may get the following four cases (with equal chances):

I)   1 large contig with xxxxAxxxx … 1000 bp … yyyyAyyyy
     + 2 small contigs, one xxxxCxxxx and another one yyyyCyyyy
II)  1 large contig with xxxxAxxxx … 1000 bp … yyyyCyyyy
     + 2 small contigs, one xxxxCxxxx and another one yyyyAyyyy
III) 1 large contig with xxxxCxxxx … 1000 bp … yyyyAyyyy
     + 2 small contigs, one xxxxAxxxx and another one yyyyCyyyy
IV)  1 large contig with xxxxCxxxx … 1000 bp … yyyyCyyyy
     + 2 small contigs, one xxxxAxxxx and another one yyyyAyyyy

The reason for this is that the placeholder “1000 bp” stands for absolutely identical sequence which, in the absence of further SNPs, longer reads or libraries with larger insert sizes, makes it impossible to cross that stretch correctly: nothing in the data says which allele at the first site belongs with which allele at the second.

B.
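The four cases above can be enumerated mechanically. The following is only an illustrative Python sketch of that combinatorics, not anything MIRA does internally; the names `SITE_X`, `SITE_Y`, `TRUE_HAPLOTYPES` and `enumerate_assemblies` are made up for this example. It assumes, as in the example, exactly two heterozygous sites separated by identical sequence that no read or read pair spans.

```python
from itertools import product

# Two heterozygous sites from the example; the allele (A or C) is the
# only thing that distinguishes the two haplotypes at each site.
SITE_X = {"A": "xxxxAxxxx", "C": "xxxxCxxxx"}
SITE_Y = {"A": "yyyyAyyyy", "C": "yyyyCyyyy"}

# "The truth": haplotype 1 carries A at both sites, haplotype 2 carries C.
TRUE_HAPLOTYPES = {("A", "A"), ("C", "C")}


def enumerate_assemblies():
    """List every way the two sites can end up joined in the large contig."""
    cases = []
    for x, y in product("AC", repeat=2):
        # Whichever allele is not in the large contig becomes a small contig.
        other_x = "C" if x == "A" else "A"
        other_y = "C" if y == "A" else "A"
        cases.append({
            "large": f"{SITE_X[x]} ... 1000 bp ... {SITE_Y[y]}",
            "small": [SITE_X[other_x], SITE_Y[other_y]],
            "correctly_phased": (x, y) in TRUE_HAPLOTYPES,
        })
    return cases


for number, case in enumerate(enumerate_assemblies(), start=1):
    print(number, case["large"], "+", case["small"],
          "phased:", case["correctly_phased"])
```

With the cases taken as equally likely, the large contig matches a true haplotype in only two of the four outcomes (I and IV); in II and III it is a chimera of both haplotypes.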