[mira_talk] Re: Help with MCVc (Missing CoVerage in consensus) after mapping assembly

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 19 Feb 2014 13:51:02 +0100 (CET)

> On February 19, 2014 at 1:30 PM Austen Chen <cyausten@xxxxxxxxx> wrote:
> Just another simple or dumb question: as you know, I am looking for missing
> regions b/t strains, and I have found that quite a few regions listed as
> present in featuresequences.txt actually have only one read each when I
> checked them in Gap4 (see below for example).
> [...]
> Can you please advise whether they are still good enough to be regarded as
> present?

We're speaking of high-throughput data, right? I.e., average coverage >= 50x?

For those cases: technically the regions are "present", but I regard them as
absent, especially if they border MCVc tags and no known problematic
sequencing motif (e.g. multiple GGCxG on fwd/rev strands in Illumina) is in the
area. The single matching read, or sometimes two or three reads, is probably due
to a DNA sample that is not 100% clonal, where a couple of individuals still
have the corresponding stretch. I actually see that quite often in mapping
assemblies with larger deletions.
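
As a rough illustration of that rule of thumb, here is a minimal Python sketch.
It is not part of MIRA: the Region record, the classify() function and the
label strings are hypothetical, and the counts are assumed to have been
collected by hand (e.g. while looking at the regions in Gap4). The thresholds
(>= 50x average coverage, at most three stray reads) are taken from the text
above. Sending everything that is not clear-cut to "check by hand" is an
assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str             # hypothetical identifier of the feature/region
    read_count: int       # number of reads covering the region
    borders_mcvc: bool    # True if the region sits next to an MCVc tag
    problem_motif: bool   # True if a known problematic motif (e.g. GGCxG) is nearby


def classify(region, avg_coverage, max_stray_reads=3, min_avg_coverage=50.0):
    """Return 'present', 'likely absent' or 'check by hand' for one region."""
    if avg_coverage < min_avg_coverage:
        # The rule of thumb is only meant for high-coverage projects.
        return "check by hand"
    if region.read_count == 0:
        return "likely absent"
    if region.read_count <= max_stray_reads:
        # One to three reads at high average coverage: probably a minority
        # of individuals in a sample that is not 100% clonal.
        if region.borders_mcvc and not region.problem_motif:
            return "likely absent"
        return "check by hand"
    return "present"


if __name__ == "__main__":
    example = Region("region_1", read_count=1, borders_mcvc=True, problem_motif=False)
    print(example.name, classify(example, avg_coverage=60.0))
```

Anything the sketch labels "check by hand" still needs a look in the alignment
viewer, since the sketch knows nothing about the actual reads.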

BTW, things like these are the reason why I always check results by hand before
I give them back to biologists.

B.

Attachment: image.png
Description: PNG image
