[mira_talk] Re: Settings for ddRAD data using EST mode

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 8 Oct 2014 13:41:09 +0200 (CEST)

> On October 6, 2014 at 6:48 PM Magnus Popp <magnus.popp@xxxxxxxxxx> wrote:
> To complicate matters, there's some heterozygosity in my organisms (di- and
> possibly tetraploid plants)
> so I only really want to get rid of IUPAC calls stemming from poor quality,
> not completely eliminate them.
> As this is iontorrent PGM data, the read length varies quite a bit and many
> contigs have a 3’ tail consisting of very few (1-4) reads.

MIRA usually tries to get you unambiguous calls and only falls back to IUPAC if
that fails. From what you wrote you would hope for MIRA to give you

 - e.g.: an IUPAC in case it needs to decide between an A at qual 60 and a G at
qual 61
 - e.g.: a single base G in case it needs to decide between an A at qual 6 and a
G at qual 7

Am I summarising correctly? If yes: can you tell me why you think that this is a
good idea? I have some trouble grasping the reasoning behind it.
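
To make the two cases above concrete, here is a toy sketch in Python. It is not
MIRA's consensus algorithm; the function name call_column and the min_qual
threshold of 30 are made up purely for illustration. It emits an IUPAC code only
when both competing bases are well supported and otherwise picks the base with
the higher quality.

```python
# Toy illustration only: NOT how MIRA computes its consensus.
# IUPAC codes for the six possible two-base ambiguities.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

def call_column(base1, qual1, base2, qual2, min_qual=30):
    """Decide between two competing bases in one consensus column.

    min_qual is a hypothetical cutoff: if both bases reach it, the
    disagreement is treated as real and reported as IUPAC; otherwise
    the higher-quality base wins.
    """
    if base1 == base2:
        return base1
    if qual1 >= min_qual and qual2 >= min_qual:
        return IUPAC[frozenset((base1, base2))]
    return base1 if qual1 >= qual2 else base2

print(call_column("A", 60, "G", 61))  # both well supported
print(call_column("A", 6, "G", 7))    # both poorly supported
```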

> 1) Do you have any suggestions on how to reduce the number of IUPACs due to
> low quality in the 3’ end of reads?

Clip the 3' end a bit harder? Not ideal, I know.
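
As a rough sketch of what harder 3' clipping could look like when done as a
preprocessing step outside MIRA: the Python function below trims bases from the
3' end until the mean quality of the last few bases reaches a threshold. The
function name clip_3prime, the window size and the cutoff are assumptions for
illustration and do not correspond to any MIRA parameter.

```python
def clip_3prime(seq, quals, min_qual=20, window=5):
    """Trim the 3' end of a read until the trailing window looks good.

    seq   : read sequence as a string
    quals : list of per-base quality values, same length as seq
    Bases are removed one at a time from the 3' end until the mean
    quality of the last `window` bases is at least `min_qual`.
    Both defaults are hypothetical and should be tuned per data set.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and qualities differ in length")
    end = len(seq)
    while end > 0:
        tail = quals[max(0, end - window):end]
        if sum(tail) / len(tail) >= min_qual:
            break
        end -= 1
    return seq[:end], quals[:end]

read = "ACGTACGTAC"
read_quals = [30, 30, 30, 30, 30, 30, 5, 4, 3, 2]
clipped_seq, clipped_quals = clip_3prime(read, read_quals)
print(clipped_seq, clipped_quals)
```

Because the test is on a window mean, a single weak base can survive at the new
3' end; a smaller window or a higher cutoff clips more aggressively.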

> 2) Is there a way of telling mira to clip a contig where it goes below a
> certain minimum coverage?

No. I think there was functionality in miraconvert which would N out the
consensus where coverage was below a given level, but atm I only see that for
quality (see the -q parameter). I'm not sure why I dropped the coverage version
... need to check.
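
In the meantime, masking by coverage can be approximated in a post-processing
step. The Python sketch below is not miraconvert code: it assumes you already
have the consensus as a string and a per-position coverage list of the same
length (how you obtain that list is up to you), and the names
mask_low_coverage, trim_flanking_n and min_cov are hypothetical.

```python
def mask_low_coverage(consensus, coverage, min_cov=5):
    """Replace consensus bases with N where coverage is below min_cov."""
    if len(consensus) != len(coverage):
        raise ValueError("consensus and coverage differ in length")
    return "".join(
        base if cov >= min_cov else "N"
        for base, cov in zip(consensus, coverage)
    )

def trim_flanking_n(seq):
    """Drop N runs at either end, which clips thin contig tails.

    Note: this also removes any Ns that were already at the ends.
    """
    return seq.strip("N")

consensus = "ACGTRACGT"
coverage = [12, 11, 10, 9, 8, 4, 3, 2, 1]
masked = mask_low_coverage(consensus, coverage, min_cov=5)
print(masked)
print(trim_flanking_n(masked))
```

Masking first and trimming afterwards keeps internal low-coverage stretches
visible as N while removing only the thin tails at the contig ends.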

> 3) And somewhat unrelated - what’s the quality score in the fasta.qual files
> and how is it calculated?

//www.freelists.org/post/mira_talk/Quality-Values,4

HTH,
  B.
