Hi,

I’m attempting to assemble ddRAD data from an Ion Torrent PGM using the EST mode of MIRA 4.0.2. The data comes from genomic DNA that has been completely digested with two different restriction enzymes, ligated with indexed adaptors, size selected (inserts ca. 300-420 bp) and then sequenced. The result is a reduced genome representation, and in this particular case I end up with 2-3k loci of 350-420 bp for each sample. These are analogous to ESTs, except that the contigs aren’t necessarily coding and don’t have a poly-A tail, so basically not very EST-like at all...

What I’ve seen so far using the EST mode is quite promising, but I need to get rid of the 3’ end of the reads in order to minimise the number of IUPAC-coded bases in my consensus. To complicate matters, there’s some heterozygosity in my organisms (di- and possibly tetraploid plants), so I only really want to get rid of IUPAC calls stemming from poor quality, not eliminate them completely. As this is Ion Torrent PGM data, the read length varies quite a bit, and many contigs have a 3’ tail consisting of very few (1-4) reads.

What I’ve done so far is:

#1
project = casfas
job = est,accurate
readgroup = casfas
data = casfas_18.fastq
technology = iontor
PARAMETERS = COMMON_SETTINGS -AS:nop=5 sep=on IONTOR_SETTINGS -AS:mrl=70 mrpc=5

and then some tests with:

#2
project = casfas
job = est,accurate
readgroup = casfas
data = casfas_18.fastq
technology = iontor
PARAMETERS = COMMON_SETTINGS -AS:nop=5 sep=on IONTOR_SETTINGS -AS:mrl=70 mrpc=5 -CL pec=on cpat=off
(and cpat=on too)

and:

#3
project = casfas
job = est,accurate
readgroup = casfas
data = casfas_18.fastq
technology = iontor
PARAMETERS = COMMON_SETTINGS -AS:nop=5 sep=on IONTOR_SETTINGS -AS:mrl=70 mrpc=5 -CL pec=on cpat=off qc=on

Compared with #1, #2 reduces the number of IUPACs by 50% (or even more for some species); #3 is better than #1 but worse than #2.
#3 also gave an odd increase in the number of contigs in my single test run, but I have to run a few more to see whether that’s consistent.

So, my questions are:

1) Do you have any suggestions on how to reduce the number of IUPACs due to low quality in the 3’ end of reads?
2) Is there a way of telling MIRA to clip a contig where it goes below a certain minimum coverage?
3) And somewhat unrelated - what’s the quality score in the fasta.qual files and how is it calculated?

Cheers,
Magnus
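
A comparison of IUPAC counts between runs such as #1-#3 could be scripted along the following lines. This is only a minimal sketch and not necessarily how the numbers above were obtained. It assumes the consensus is available as plain FASTA text, it counts only the ambiguity codes R, Y, S, W, K, M, B, D, H and V (N and gap characters are deliberately ignored), and the function name count_iupac is made up for illustration.

```python
# IUPAC ambiguity codes for two or three possible bases.
# Assumption: N and gap characters ('-', '*') are not counted.
IUPAC_AMBIGUOUS = set("RYSWKMBDHV")


def count_iupac(fasta_text):
    """Return {contig_name: number of ambiguous IUPAC bases} for FASTA text."""
    counts = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            counts[name] = 0
        elif name is not None:
            # Sequence line: count ambiguity codes case-insensitively.
            counts[name] += sum(1 for base in line.upper() if base in IUPAC_AMBIGUOUS)
    return counts


if __name__ == "__main__":
    # Tiny inline example; for real data, read the consensus FASTA of each run.
    example = ">contig1\nACGTRYAC\nacgwt\n>contig2\nACGTACGT\n"
    per_contig = count_iupac(example)
    print(per_contig)
    print("total ambiguous bases:", sum(per_contig.values()))
```

Summing the per-contig values gives one number per run, which makes it easy to compare parameter sets on the same input reads.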