[mira_talk] Re: Help with MCVc (Missing CoVerage in consensus) after mapping assembly

  • From: Austen Chen <cyausten@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Tue, 4 Feb 2014 23:52:44 +1030

Firstly, I apologise for invading your privacy, which was unintentional.
After sending the email I couldn't see it on the list, so I wasn't sure
whether it had been sent. With your name and photos appearing all over the
web page, it looked like you were happy for people to contact you, and
that's why I thought I could just ask you whether my mail had gone through.
I didn't mean to demand a response from you straight away. I know you are
very helpful and quick to respond, but there was no need for that here. It
was merely a misunderstanding, and I'm sorry to have upset you.

Still, thank you very much for your help. I don't think I explained my
problem clearly. You are right that I made a mapping, imported it into
gap4, made some edits/finishing there and then exported the tag positions
from within gap4. This kind of tag is illustrated in your MIRA manual,
page 17 (version 4.0rc5): Figure 1.18, '"MCVc" tag showing a genome
deletion in Solexa mapping assembly'. As you can see in the figure, the top
line is a bacterial reference sequence, and the missing dark red stretch in
the consensus is at position 554830 relative to the reference sequence.

What I have done is make two separate mapping assemblies: A1 against an
annotated GenBank bacterial reference sequence, and A3 against the same
reference sequence. I want to find out whether A1 and A3 have any common
missing regions, and that's where I ran into my problem. For example, as
you can see in gap4, A1 has no coverage at position 165410, with a missing
length of 404 bp (165410-165814):

position 165410
length 404
type MCVc
comment 'gff3str=.'

[image: Inline image 1]

A3, in turn, has no coverage at position 165209, with a missing length of
827 bp (165209-166036):

position 165209
length 827
type MCVc
comment 'gff3str=.'

[image: Inline image 1]
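
For comparing the two tag records above, it may help to turn them into start/end intervals first. The following is only a minimal sketch: it assumes the plain `key value` layout exactly as pasted in this mail, the helper name `parse_tags` is made up for illustration (it is not part of MIRA or gap4), and the end coordinate is simply position + length, which matches the ranges quoted above.

```python
def parse_tags(text):
    """Parse 'key value' tag records (as pasted in this mail) into dicts.

    A new record starts at each 'position' line. Hypothetical helper,
    not part of MIRA or gap4.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        if key == "position":
            records.append({})
        if not records:
            continue  # ignore anything before the first 'position' line
        records[-1][key] = value.strip()
    for rec in records:
        # Assumption: end = position + length, as in the quoted ranges.
        rec["start"] = int(rec["position"])
        rec["end"] = rec["start"] + int(rec["length"])
    return records


a1 = parse_tags("""
position 165410
length 404
type MCVc
comment 'gff3str=.'
""")
a3 = parse_tags("""
position 165209
length 827
type MCVc
comment 'gff3str=.'
""")
print(a1[0]["start"], a1[0]["end"])  # 165410 165814
print(a3[0]["start"], a3[0]["end"])  # 165209 166036
```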

I would have expected A1 and A3 to share a common missing region at
165410-165814, but this does not seem to be the case, because the tag
positions in A1 and A3 do not refer to the same position relative to the
reference sequence. I hope I have explained it clearly this time. It looks
like I can find the missing coverage in each strain, but not the missing
coverage common to the two strains. That's why I was asking whether my
interpretation is correct and whether you could give me some suggestions on
how to tackle this problem.
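
To make the question concrete, here is a rough sketch of the comparison I have in mind. It assumes the MCVc positions of both assemblies have already been brought into the same reference coordinates, which is exactly the step I am unsure about, so the conversion itself is not shown. The function name `common_regions` and the `(start, end)` tuples are made up for illustration, and the gaps within each list are assumed not to overlap one another.

```python
def common_regions(gaps_a, gaps_b):
    """Return the overlaps between two lists of (start, end) intervals.

    Assumes both lists use the SAME coordinate system and that the
    intervals inside each list do not overlap one another.
    """
    a = sorted(gaps_a)
    b = sorted(gaps_b)
    i = j = 0
    shared = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            shared.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


# MCVc ranges quoted in this mail (position, position + length).
a1_gaps = [(165410, 165814)]
a3_gaps = [(165209, 166036)]
print(common_regions(a1_gaps, a3_gaps))  # [(165410, 165814)]
```

If the two sets of positions really were in the same frame, the shared region would simply be the A1 gap, since it lies entirely inside the A3 gap.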

I just want to say that both you and MIRA are great, and I prefer MIRA to
any other program, because you have made this kind of work so much fun and
so interesting. Thank you so much.

Austen


