Hi Bastien,

I have prepared a mini example by picking out all reads (both pre-Titanium and Titanium) that went into contigs that blasted to the same reference gene. These reads were then assembled with cap3 and with three different versions of Mira (V2.9.15, V2.9.37 and V2.9.43), and the contigs from the Mira versions were assembled again with cap3. The results aren't quite as obvious as with the full data set, but the example does have the advantage of assembling in minutes rather than days.
The results are:

First assembly:

Used reads  Contigs  Total bases
      2002       13         5696  Reads assembled with cap3
      2002       42        15643  Reads assembled with Mira V2.9.15
      2016       59        19234  Reads assembled with Mira V2.9.37
      2008       67        20981  Reads assembled with Mira V2.9.43

Reassembled with cap3 with default params:

Mira contigs  Used contigs  Singletons  Cap3 contigs
          13             8           5             3  cap3
          42            40           2             3  v2.9.15
          59            51           1             4  v2.9.37
          67            59           6             4  v2.9.43

Will send the reads via PM.

I should also mention that these reads have been heavily trimmed for adapter and poly A/T using several passes of blast. Poly A/T will still be present, but only where it was originally <10 bases long and within the middle half of the read.
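The poly A/T retention rule above can be sketched as a small check. This is only an illustration, not the blast-based trimming that was actually run: the function name `retained_poly_runs`, the `min_run` threshold for what counts as a poly A/T run, and the reading of "middle half" as the span between 25% and 75% of the read length are all assumptions.

```python
import re


def retained_poly_runs(read, min_run=4, max_run=10):
    """Return (start, end, base) for poly-A/T runs the stated rule would leave in.

    Assumptions (not from the original trimming pipeline):
      - a "run" is min_run or more consecutive A's or T's;
      - "middle half" means the run lies fully between 25% and 75%
        of the read length;
      - run length is measured on the read as given.
    """
    n = len(read)
    lo, hi = n / 4, 3 * n / 4
    pattern = r"A{%d,}|T{%d,}" % (min_run, min_run)
    kept = []
    for m in re.finditer(pattern, read.upper()):
        length = m.end() - m.start()
        # Keep only short runs (< max_run) located inside the middle half.
        if length < max_run and m.start() >= lo and m.end() <= hi:
            kept.append((m.start(), m.end(), m.group()[0]))
    return kept


if __name__ == "__main__":
    example = "ACGTACGTACGT" + "AAAAAA" + "CGTACGTACGTA"
    print(retained_poly_runs(example))
```

A run at the start or end of a read, or one of 10 or more bases, yields no entry, which matches the rule as described.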
I have also now tested 2.9.37 with the original pre-Titanium dataset I was using to show the change in behaviour between Mira versions. This is using not-so-well-trimmed reads.
Number of reads  Total bases  Number of contigs
         169796      2865603               8540  V2.9.15
         146840      5673833              22863  V2.9.37
         149758      6167756              24376  V2.9.43

Mira contigs in   Used    Bases  Cap3 singletons 'reads'   Bases  Cap3 contigs out
           8540   2277  2021257                     6263  459553               630  V2.9.15
          22863  16141  1918793                     6722  656203              1116  V2.9.37
          24376  17545  2007147                     6831  724953              1167  V2.9.43

Which shows the main difference is between .15 and .37.

Richard

Bastien Chevreux wrote:
On Wednesday 03 June 2009 Richard Gregory wrote:

The "good" results were from V2.9.15. Making sure this effect was real, I've just tried assembling an old dataset using 2.9.43 and exactly the same input reads. This example uses pre-Titanium reads, a pool of two samples of relatively degraded cDNA with an average read length of 120bp. [...]

Hmmm ... 2.9.15 is from one and a half years ago. A lot has changed in the meantime and I'll need to investigate that.

The question is: why does MIRA now keep apart things that it thinks do not belong together? One idea I have is that (as I changed the poly-A/T clip handling) it now sees more "differences" in these parts, which has the effect you noticed.

In the end, it would be best if this could be looked at with some very specific examples at hand. Would it be possible for you to make one of these data sets available to me? If yes, you could show me a few cases that trouble you and I would have a deeper look at what happened (or did not happen).

Regards,
Bastien

PS: Please note that this could probably only happen mid to end of next week or later, as this weekend is already reserved for something else and then I'm travelling.
-- You have received this mail because you are subscribed to the mira_talk mailing list. For information on how to subscribe or unsubscribe, please visit http://www.chevreux.org/mira_mailinglists.html