On Wed, Feb 12, 2014 at 9:38 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
> On 12 Feb 2014, at 19:15 , Peter Cock <p.j.a.cock@xxxxxxxxxxxxxx> wrote:
>> For paired end libraries, does MIRA report any of the observed
>> template/insert size, orientation, fraction assembled in same
>> contig, or something like the SAM FLAG for "properly mapped"
>> somewhere?
>
> That's something where proper output did not make it into 4.0. There
> are ways to extract all that info (except fraction) from the output log,
> but they're tedious and the format will change. If you want to give it
> a try anyway, grep for "^ATG" in the log. You'll see reports (mean,
> stdev, skewness, inferred min/max) for every readgroup. Pass 1
> does not count, look in pass 2 and there for the final predictions.
>
> If you are just interested in min/max of the templates, look at the
> header of the MAF files (results or, during assembly, checkpoint).
>
> B.

Great - thanks Bastien. I'm doing this with grep:

$ grep ^ATG -A 3 assembly.log
ATG PREDICTIONS
rgid: 1 c: 161712 sp: -2 m: 113.6219227891 d: 31.3431858333 s: -0.1626077222 -: 50 +: 176
rgid: 1 c: 620869 sp: -1 m: 244.2576642220 d: 123.1716308278 s: 0.7693070253 -: 22 +: 515
Final prediction: rgid: 1 c: 620869 sp: -1 m: 244.2576642220 d: 123.1716308278 s: 0.7693070253 -: 22 +: 515
--
ATG PREDICTIONS
rgid: 1 c: 161180 sp: -2 m: 111.8964201514 d: 30.8984276809 s: -0.1555124536 -: 50 +: 173
rgid: 1 c: 613365 sp: -1 m: 231.4318040621 d: 114.9710732224 s: 0.7802482977 -: 24 +: 484
Final prediction: rgid: 1 c: 613365 sp: -1 m: 231.4318040621 d: 114.9710732224 s: 0.7802482977 -: 24 +: 484
--
ATG PREDICTIONS
rgid: 1 c: 160397 sp: -2 m: 111.2749847255 d: 30.7681223973 s: -0.1499654587 -: 49 +: 172
rgid: 1 c: 610365 sp: -1 m: 230.1019698574 d: 114.2280587811 s: 0.7776165138 -: 24 +: 481
Final prediction: rgid: 1 c: 610365 sp: -1 m: 230.1019698574 d: 114.2280587811 s: 0.7776165138 -: 24 +: 481
--
ATG PREDICTIONS
rgid: 1 c: 159008 sp: -2 m: 111.2163714004 d: 30.7810822718 s: -0.1497065318 -: 49 +: 172
rgid: 1 c: 604403 sp: -1 m: 229.8911552540 d: 114.0643954342 s: 0.7746355633 -: 24 +: 480
Final prediction: rgid: 1 c: 604403 sp: -1 m: 229.8911552540 d: 114.0643954342 s: 0.7746355633 -: 24 +: 480
--
ATG PREDICTIONS
rgid: 1 c: 159089 sp: -2 m: 111.3048965994 d: 30.6778592767 s: -0.1515380996 -: 49 +: 172
rgid: 1 c: 608715 sp: -1 m: 229.7921947666 d: 113.9986506361 s: 0.7775140932 -: 24 +: 480
Final prediction: rgid: 1 c: 608715 sp: -1 m: 229.7921947666 d: 113.9986506361 s: 0.7775140932 -: 24 +: 480

Those are the predictions for each of the five passes, settling down
to a mean template size of approx 230, standard deviation 114,
skew 0.77, min 24, max 480.

However, the MAF header seems to have the min/max from the first
pass (22, 515). Is that an error?

@ReadGroup
@RG name MiSeq
@RG ID 1
@RG technology Solexa
@RG strainname StrainX
@RG templatesize 22 515
@RG segmentplacement FR
@RG segmentnaming solexa
@EndReadGroup

Thanks,

Peter

-- 
You have received this mail because you are subscribed to the mira_talk mailing list.
For information on how to subscribe or unsubscribe, please visit http://www.chevreux.org/mira_mailinglists.html
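
For anyone who would rather tabulate the "Final prediction" lines than read
them by eye, here is a minimal Python sketch. It assumes the log lines look
exactly like the grep output quoted in this thread, and that m, d, s, "-" and
"+" are mean, standard deviation, skew, min and max (the reading used in this
thread). The function and key names are made up for illustration and are not
part of MIRA, and since the log format is expected to change, treat the
pattern as fragile.

```python
import re

# Matches lines shaped like the "Final prediction" lines quoted in this
# thread. Assumption: the field order and labels are as shown there.
FINAL_RE = re.compile(
    r"Final prediction:\s*rgid:\s*(?P<rgid>\d+)\s+c:\s*(?P<c>\d+)\s+"
    r"sp:\s*(?P<sp>-?\d+)\s+m:\s*(?P<mean>-?[\d.]+)\s+"
    r"d:\s*(?P<stdev>-?[\d.]+)\s+s:\s*(?P<skew>-?[\d.]+)\s+"
    r"-:\s*(?P<tmin>-?\d+)\s+\+:\s*(?P<tmax>-?\d+)"
)


def final_predictions(log_text):
    """Return one dict per 'Final prediction' line, in log order."""
    results = []
    for match in FINAL_RE.finditer(log_text):
        g = match.groupdict()
        results.append({
            "rgid": int(g["rgid"]),
            "mean": float(g["mean"]),
            "stdev": float(g["stdev"]),
            "skew": float(g["skew"]),
            "tmin": int(g["tmin"]),   # the "-:" field, read here as min
            "tmax": int(g["tmax"]),   # the "+:" field, read here as max
        })
    return results


def last_per_readgroup(predictions):
    """Keep only the last prediction seen for each read group ID."""
    latest = {}
    for p in predictions:
        latest[p["rgid"]] = p
    return latest


# Two of the blocks quoted in this thread (the first and the last).
SAMPLE = """\
ATG PREDICTIONS
rgid: 1 c: 161712 sp: -2 m: 113.6219227891 d: 31.3431858333 s: -0.1626077222 -: 50 +: 176
rgid: 1 c: 620869 sp: -1 m: 244.2576642220 d: 123.1716308278 s: 0.7693070253 -: 22 +: 515
Final prediction: rgid: 1 c: 620869 sp: -1 m: 244.2576642220 d: 123.1716308278 s: 0.7693070253 -: 22 +: 515
ATG PREDICTIONS
rgid: 1 c: 159089 sp: -2 m: 111.3048965994 d: 30.6778592767 s: -0.1515380996 -: 49 +: 172
rgid: 1 c: 608715 sp: -1 m: 229.7921947666 d: 113.9986506361 s: 0.7775140932 -: 24 +: 480
Final prediction: rgid: 1 c: 608715 sp: -1 m: 229.7921947666 d: 113.9986506361 s: 0.7775140932 -: 24 +: 480
"""

if __name__ == "__main__":
    for rgid, p in last_per_readgroup(final_predictions(SAMPLE)).items():
        print(rgid, p["mean"], p["stdev"], p["skew"], p["tmin"], p["tmax"])
```

The min/max of the last prediction per read group can then be compared by
hand against the "@RG templatesize" line in the MAF header.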