[mira_talk] Re: big transcriptome files too long running time

  • From: "Kapli, Paschalia" <Paschalia.Kapli@xxxxxxxxx>
  • To: "mira_talk@xxxxxxxxxxxxx" <mira_talk@xxxxxxxxxxxxx>
  • Date: Tue, 13 Oct 2015 22:45:36 +0000

Dear Bastien,


Thank you for this! I ran it for longer, and it finished after about a month.


Best regards,

Paschalia



________________________________
From: mira_talk-bounce@xxxxxxxxxxxxx <mira_talk-bounce@xxxxxxxxxxxxx> on behalf
of Bastien Chevreux <bach@xxxxxxxxxxxx>
Sent: 01 September 2015 03:28
To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: big transcriptome files too long running time

On 31 Aug 2015, at 4:44, Kapli, Paschalia <Paschalia.Kapli@xxxxxxxxx> wrote:
[...]
Another problem is that I am running the analyses on a cluster with a 48-hour
wall-time limit. Thus I need to resume the analysis every two days, which I am
afraid is stalling it even more.

You ... might want to ask your sysadmin for a longer runtime, as I fear the
above will lead to nothing. Or use a tool other than MIRA.

If neither of the above is an option for you, here's what I would do:
pre-assembly to get troublemakers out, then "easy" assembly.

1.) run MIRA with, say, 1 million reads
2.) extract from the result the top 1% most highly expressed genes (that is
result part 1). Depending on the data you have, this top 1% of genes can
account for anything between 10 and 90% of the reads (if it is just 10%, then
maybe take not 1% but 5%).
3.) use mirabait to subtract the reads belonging to this top 1% from the
complete set (use 32 <= k <= 50)
4.) rerun MIRA with the resulting set (that is result part 2)
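
The bookkeeping around steps 1 and 2 above could be sketched in Python roughly
as follows. This is only an illustration under stated assumptions, not part of
MIRA: the function names subsample_fastq and top_expressed are hypothetical,
the reads are assumed to be plain four-line FASTQ records (read pairing is
ignored), and read_counts is assumed to be a dict of contig name to number of
assembled reads that you have filled from the pre-assembly's contig
statistics. It does not call MIRA or mirabait.

```python
import random
from itertools import islice


def subsample_fastq(lines, n_reads, seed=0):
    """Pick n_reads records uniformly at random from four-line FASTQ lines.

    Uses reservoir sampling, so the input is read once and only n_reads
    records are held in memory. Assumes unwrapped four-line records.
    """
    rng = random.Random(seed)
    it = iter(lines)
    reservoir = []
    seen = 0
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:  # end of input (or a truncated last record)
            break
        seen += 1
        if len(reservoir) < n_reads:
            reservoir.append(record)
        else:
            # Keep the new record with probability n_reads / seen.
            j = rng.randrange(seen)
            if j < n_reads:
                reservoir[j] = record
    return reservoir


def top_expressed(read_counts, fraction=0.01):
    """Return the top `fraction` of contigs by read count and their read share.

    read_counts: dict of contig name -> number of reads (assumed input).
    The share is the fraction of all counted reads held by the chosen contigs.
    """
    ranked = sorted(read_counts, key=read_counts.get, reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    top = ranked[:n_top]
    total = sum(read_counts.values())
    share = sum(read_counts[c] for c in top) / total if total else 0.0
    return top, share


if __name__ == "__main__":
    # Toy data only; real counts would come from the pre-assembly result.
    counts = {"contig%d" % i: 1 for i in range(199)}
    counts["contig_rrna"] = 50
    top, share = top_expressed(counts, 0.01)
    if share <= 0.10:  # threshold is a judgment call, see step 2
        top, share = top_expressed(counts, 0.05)
    print(len(top), round(share, 3))
```

The contigs returned by top_expressed would then be the bait for step 3; the
10% cut-off used to widen from 1% to 5% is only one reading of the advice in
step 2 and can be adjusted to the data.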

Hope this helps,
B.

PS: (but you also want a longer runtime, believe me)
