Hi and thanks for the reply!

Concerning your examples below, I’m OK with the first case, as I would interpret an IUPAC "R" as the result of merging what are probably two alleles, but ideally mira would give me two contigs. I see that wasn’t what I wrote in my question, but that’s what I meant, sorry…

In the second example, the preferred result would be that the contig (or perhaps the read) is clipped when the quality drops that low. I will align each RAD locus/“EST” across a handful of fairly closely related species, and although the errors (be it a single base or an IUPAC base) will most of the time only result in apomorphic characters, they will mess up the diversity and branch length estimates in my phylogenetic analyses. So I’d rather have somewhat shorter RAD loci/ESTs with ± reliable data than longer ones with shaky 3’ ends at this stage.

Unless you suggest otherwise, I’ll probably use miraconvert and set -q to something that works with my data and then simply truncate at the first N, or perhaps just merge the fasta and qual files and use a suitable cutoff in fastq quality trimmer.

Thanks again!

Cheers,
Magnus

On 8 Oct 2014, at 13:41, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:

> On October 6, 2014 at 6:48 PM Magnus Popp <magnus.popp@xxxxxxxxxx> wrote:
>> To complicate matters, there's some heterozygosity in my organisms (di- and possibly tetraploid plants) so I only really want to get rid of IUPAC calls stemming from poor quality, not completely eliminate them. As this is Ion Torrent PGM data, the read length varies quite a bit and many contigs have a 3’ tail consisting of very few (1-4) reads.
>
> MIRA usually tries to get you unambiguous calls and only falls back to IUPAC if that fails.
> From what you wrote you would hope for MIRA to give you
> - e.g.: an IUPAC in case it needs to decide between an A at qual 60 and a G at qual 61
> - e.g.: a single base G in case it needs to decide between an A at qual 6 and a G at qual 7
>
> Am I summarising correctly? If yes: can you tell me why you think that this is a good idea? I do have some trouble grasping the reasoning for this.
>
>> 1) Do you have any suggestions on how to reduce the number of IUPACs due to low quality in the 3’ end of reads?
>
> Clip the 3' a bit harder? Not ideal, I know.
>
>> 2) Is there a way of telling mira to clip a contig where it goes below a certain minimum coverage?
>
> No. I think there was a functionality in miraconvert which would N out consensus on the coverage being below a given level, but atm I see that only wrt quality (see -q parameter). I'm not sure why I dropped the version with coverage ... need to check.
>
>> 3) And somewhat unrelated - what’s the quality score in the fasta.qual files and how is it calculated?
>
> //www.freelists.org/post/mira_talk/Quality-Values,4
>
> HTH,
> B.

__________________________________________
Magnus Popp
Natural History Museum
University of Oslo
P.O. Box 1172 Blindern
NO-0318 Oslo, Norway
Phone: +47 22851875
Fax: +47 22851835
Visiting address: Office 112C, Botanical Museum, Sars gate 1, Tøyen
www.nhm.uio.no/english/about/organization/research-collections/people/magnuspo/index.html
www.forbio.uio.no/
__________________________________________
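
A minimal Python sketch of the "merge the fasta and qual files and cut at a quality threshold" idea mentioned at the top of this mail. It is only an illustration, not how miraconvert or fastq quality trimmer work internally: the function names are made up, the threshold of 20 is an arbitrary assumption, and the example contig is invented. It assumes a FASTA file plus a FASTA-style .qual file with one integer quality per base.

```python
def parse_fasta(text):
    """Return {name: sequence} from FASTA-formatted text."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def parse_qual(text):
    """Return {name: [int, ...]} from FASTA-style .qual text."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].extend(int(q) for q in line.split())
    return records


def truncate_at_low_quality(seq, quals, min_qual=20, stop_at_n=True):
    """Cut seq and quals just before the first base whose quality is below
    min_qual (or, optionally, the first N). Everything 3' of it is dropped."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    for i, (base, q) in enumerate(zip(seq, quals)):
        if q < min_qual or (stop_at_n and base in "Nn"):
            return seq[:i], quals[:i]
    return seq, quals


# Invented example: an IUPAC "R" with good quality, then a shaky 3' tail.
fasta_text = ">contig1\nACGTRACGTA\n"
qual_text = ">contig1\n40 40 38 35 30 30 25 7 6 5\n"

seqs = parse_fasta(fasta_text)
quals = parse_qual(qual_text)
trimmed = {
    name: truncate_at_low_quality(seqs[name], quals[name], min_qual=20)
    for name in seqs
}
print(trimmed["contig1"][0])
```

Note that this cuts at the first low-quality base wherever it occurs, so a poor base near the 5' end would throw away most of the contig. A well-supported IUPAC call such as the "R" above is kept, since only the quality value (or an N) triggers the cut.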