[mira_talk] Re: assembly options for non-redundant contigs

From: Richard Gregory <R.Gregory@xxxxxxxxxxxxxxx>
To: "mira_talk@xxxxxxxxxxxxx" <mira_talk@xxxxxxxxxxxxx>
Date: Tue, 09 Jun 2009 03:39:54 +0100

Hi Bastien,

Have prepared a mini example, by picking out all reads (both pre-Titanium and Titanium) that went into contigs that blasted to the same reference gene. These reads were then assembled with 3 different versions of Mira, V2.9.15, .37, .43, and cap3, and contigs from the Mira versions were assembled again with cap3. The results aren't quite as obvious as the full data set, but it does have the advantage of assembling in minutes rather than days.


The results are:-
First assembly:
Used            total
reads contigs   bases
2002   13       5696    Reads assembled with cap3
2002   42      15643    Reads assembled with Mira V2.9.15
2016   59      19234    Reads assembled with Mira V2.9.37
2008   67      20981    Reads assembled with Mira V2.9.43

Reassembled with cap3 with default params:
 Mira     Used            Cap3
Contigs contigs  Singletons  Contigs
  13       8         5         3       cap3
  42      40         2         3       v2.9.15
  59      51         1         4       v2.9.37
  67      59         6         4       v2.9.43

Will send the reads via PM.

I should also mention these reads have been heavily trimmed for adapter and poly A/T using several passes of blast. Poly A/T will be present, but only when it was originally <10 bases long and within the middle half of the read.

Have also now tested 2.9.37 with the original pre-Titanium dataset I was using to show the change in behaviour between Mira versions. This is using not-so-well trimmed reads.

 Number    Total    Number of
of Reads   Bases     Contigs
 169796   2865603      8540    V2.9.15
 146840   5673833     22863    V2.9.37
 149758   6167756     24376    V2.9.43


      Mira           Cap3             Cap3
    Contigs        Singletons        Contigs
  In      Used    Bases  'reads'   Bases   Out
 8540     2277   2021257  6263    459553   630  V2.9.15
22863    16141   1918793  6722    656203  1116  V2.9.37
24376    17545   2007147  6831    724953  1167  V2.9.43

This shows that the main difference is between V2.9.15 and V2.9.37.
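
As a quick check on the reassembly table above, here is a minimal Python sketch that computes what fraction of the input Mira contigs cap3 merged into its own contigs ("Used" divided by "In"). The numbers are copied from that table. The names `REASSEMBLY` and `merged_fraction` are illustrative only and are not part of Mira or cap3.

```python
# Numbers copied from the reassembly table above:
# version -> (Mira contigs in, Mira contigs used by cap3).
# All names here are illustrative assumptions, not part of Mira or cap3.
REASSEMBLY = {
    "V2.9.15": (8540, 2277),
    "V2.9.37": (22863, 16141),
    "V2.9.43": (24376, 17545),
}


def merged_fraction(contigs_in: int, contigs_used: int) -> float:
    """Fraction of input contigs that cap3 merged into larger contigs."""
    return contigs_used / contigs_in


fractions = {
    version: merged_fraction(*counts) for version, counts in REASSEMBLY.items()
}

for version, frac in fractions.items():
    print(f"{version}: {frac:.1%} of Mira contigs re-assembled by cap3")
```

On these figures cap3 re-assembles roughly 27% of the V2.9.15 contigs, against roughly 71% and 72% for V2.9.37 and V2.9.43. The jump therefore falls between .15 and .37.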


Richard


Bastien Chevreux wrote:

On Wednesday 03 June 2009 Richard Gregory wrote:
The "good" results were from V2.9.15 . Making sure this effect was real,
I've just tried assembling an old dataset using 2.9.43 and exactly the
same input reads. This example uses pre-Titanium reads, a pool of two
samples of relatively degraded cdna with an average read length 120bp.
[...]
Hmmm ... 2.9.15 is from one and a half years ago. A lot has changed in the meantime and I'll need to investigate that.
The question is: why does MIRA put things apart that it now thinks do not belong together? One idea I have is (as I changed poly-A/T clip handling) that it now sees more "differences" in these parts and therefore has the effect you noticed.
In the end, it would be best if this could be looked at with some very specific examples at hand. Would it be possible for you to make available for me one of these data sets? If yes, you could show me a few cases that trouble you and I would have a deeper look at what happened (or did not happen).
Regards,
  Bastien
PS: Please note that this could probably only happen mid to end of next week or later, as this weekend is already reserved for something else and then I'm travelling.

