On Feb 4, 2014, at 8:22 AM, Austen Chen <cyausten@xxxxxxxxx> wrote: > Firstly I apologise for invading your privacy which was done unintentionally. > After sending the email I couldn't see it on the list, so i wasn't sure if > it's being sent or not, and with your name and photos popping out everywhere > on the web page, looks like you were happy for people to contact you and > that's why i thought i could just ask you if i had sent my mail successfully, > and i didn't mean to demand a response from you straightaway, and i know you > are very helpful and quick in response, and there is no need to do so, so > merely misunderstanding and sorry to have upset you. > > Still thank you very much for your help, and i don't think I had explained my > problem clearly. you are right that I made a mapping, imported that into > gap4, made some edits/finishing there and then exported the tag positions > from within gap4. this tag position has been illustrated in your mira manual > page 17 (version 4.0rc5): Figure 1.18 "MCVc" tag showing a genome deletion > in Solexa mapping assembly. as you can see in the figure the top line is a > bacterial reference sequence, and the missing dark red stretch in the > consensus is given at a position of 554830 in reference to the reference > sequence. what i have done is that i have made two separate mapping > assemblies, one is A1 against an annotated GenBank bacterial reference > sequence, and another is A3 against the same reference sequence, and I want > to find if A1 and A3 have any common missing regions, and that's where i had > my problem. For example, as you can see in Gap4 A1 has no coverage at a > position of 165410 with a missing length of 404bp (165410-165814) > > position 165410 > length 404 > type MCVc > comment 'gff3str=.' > > <image.png> > > and as for A3, it has no coverage at a position of 165209 with a missing > length of 827bp (165209-166036) > > position 165209 > length 827 > type MCVc > comment 'gff3str=.' > > <image.png> > > Presumably A1 and A3 would have a common missing region from 165410-165814, > but this is not the case as each missing position in A1 and A3 is not > referring to the same position in relation to the reference sequence, and > hope i have explained it here clearly. it looks like i am able to find > missing coverage in each strain but unable to find common missing coverage > between these 2 strains, and that's why I was asking you for help if my > interpretation was correct and you could give me some suggestions on how to > tackle this problem. > If both of the genomes are mapped to the same reference sequence, you could make sam and then bam files from each assembly, merge them with picard-tools' bam merge tool, and then view the merged assembly in Tablet. That way you can see each of the read groups, and their relationship to the reference sequence. J