Hello Bastien,
I'll keep in mind to include more detail in any future questions =)
Regarding your guess, you're right: I do have different strains in my
data.
Best regards,
Pati
On Tue, Oct 13, 2015 at 11:11 PM, FreeLists Mailing List Manager <
ecartis@xxxxxxxxxxxxx> wrote:
mira_talk Digest Tue, 13 Oct 2015 Volume: 03 Issue: 118
In This Issue:
[mira_talk] "Xs" found in EST de novo assemblies
[mira_talk] Re: "Xs" found in EST de novo assemblies
[mira_talk] Re: big transcriptome files too long running time
----------------------------------------------------------------------
Date: Tue, 13 Oct 2015 00:23:40 -0600
Subject: [mira_talk] "Xs" found in EST de novo assemblies
From: Patricia Carvajal <patriciacarvajal.mxli@xxxxxxxxx>
Hello!!
I have de novo EST assemblies and I found Xs in my unpadded contigs.
How should those Xs be interpreted in this case?
Please advise.
Best regards,
Pati
------------------------------
Subject: [mira_talk] Re: "Xs" found in EST de novo assemblies
From: Bastien Chevreux <bach@xxxxxxxxxxxx>
Date: Tue, 13 Oct 2015 05:56:15 -0400
On 13 Oct 2015, at 2:23 , Patricia Carvajal <
patriciacarvajal.mxli@xxxxxxxxx> wrote:
I have de novo EST assemblies and I found Xs in my unpadded contigs.
How should those Xs be interpreted in this case?
Please advise.
When posting questions like these, it is always helpful to give a bit more
detail on your data and what you actually did, e.g. the manifest file.
Anyway … am I correct in guessing you had several strains in your
assembly? MIRA writes X in its results only in that case, where it
denotes missing coverage for a given strain.
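To see where a strain lacks coverage, the runs of X in an unpadded contig sequence can be located with a few lines of Python. This is only a minimal sketch: the function name `x_runs` and the example sequence are hypothetical, and reading the sequence from the FASTA output is left out.

```python
def x_runs(sequence):
    """Yield (start, end) for each run of X in `sequence`.

    Positions are 0-based and `end` is exclusive, so
    sequence[start:end] consists of X characters only.
    """
    start = None
    for i, base in enumerate(sequence):
        if base in "Xx":
            if start is None:
                start = i          # a new run of X begins here
        elif start is not None:
            yield (start, i)       # the run ended just before position i
            start = None
    if start is not None:
        yield (start, len(sequence))  # run reaches the end of the contig


if __name__ == "__main__":
    # Hypothetical unpadded contig sequence for one strain.
    for start, end in x_runs("ACGTXXXACGX"):
        print(f"no coverage for this strain at {start}..{end - 1}")
```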
B.
------------------------------
From: "Kapli, Paschalia" <Paschalia.Kapli@xxxxxxxxx>
Subject: [mira_talk] Re: big transcriptome files too long running time
Date: Tue, 13 Oct 2015 22:45:36 +0000
Dear Bastien,
thank you for this! I did run it for longer and it finished after about a
month.
Best regards,
Paschalia
________________________________
From: mira_talk-bounce@xxxxxxxxxxxxx <mira_talk-bounce@xxxxxxxxxxxxx> on
behalf of Bastien Chevreux <bach@xxxxxxxxxxxx>
Sent: 01 September 2015 03:28
To: mira_talk@xxxxxxxxxxxxx
Subject: [mira_talk] Re: big transcriptome files too long running time
On 31 Aug 2015, at 4:44 , Kapli, Paschalia <Paschalia.Kapli@xxxxxxxxx>
wrote:
[...]
Another problem is that I am running the analyses on a cluster with a
48-hour wall-time limit. Thus I need to resume the analysis every two days,
which I am afraid is stalling it even more.
You ... might want to ask your sysadmin for a longer runtime, as I fear the
above will lead to nothing. Or use a tool other than MIRA.
If neither of the above is an option for you, here's what I would do:
pre-assembly to get troublemakers out, then "easy" assembly.
1.) run MIRA with, say, 1 million reads
2.) extract from the result the top 1% highest expressed genes (this is
result part 1). Depending on the data you have, this top 1% of genes can
account for anything between 10% and 90% of the reads (if it is just 10%,
then maybe take not 1% but 5%).
3.) use mirabait to subtract the reads matching this top 1% from the
complete set (use 32 <= k <= 50)
4.) rerun MIRA with the resulting set (this is result part 2)
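
The selection in step 2 can be sketched in Python. This is only an illustration under stated assumptions: it takes the number of reads per contig as the expression measure, and the `read_counts` mapping, the function names and the 10% threshold for widening to 5% are hypothetical choices, not MIRA output formats or options.

```python
def top_expressed(read_counts, fraction=0.01):
    """Return (contig names, read share) for the top `fraction` of contigs.

    `read_counts` maps contig name -> number of reads in that contig
    (an assumed input; how it is obtained is not shown here).
    """
    if not read_counts:
        return [], 0.0
    # Rank contigs from most to fewest reads.
    ranked = sorted(read_counts, key=read_counts.get, reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    top = ranked[:n_top]
    total = sum(read_counts.values())
    share = sum(read_counts[name] for name in top) / total if total else 0.0
    return top, share


def choose_top_set(read_counts, low_share=0.10):
    """Take the top 1%; widen to 5% if it covers only about 10% of reads."""
    top, share = top_expressed(read_counts, 0.01)
    if share <= low_share:
        top, share = top_expressed(read_counts, 0.05)
    return top, share
```

The contigs returned by `choose_top_set` would be the bait set for step 3; the reads that do not match them go into the second assembly.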
Hope this helps,
B.
PS: (but you also want a longer runtime, believe me)
------------------------------
End of mira_talk Digest V3 #118
*******************************