[mira_talk] Re: Internal logic/programming/debugging error during mapping assembly with MAF reference

  • From: John Eppley <jmeppley@xxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Tue, 12 Aug 2014 11:41:08 -1000

Thanks! In case you were curious, the dev version of MIRA crashes on the
same data. This time, though, it's a core dump:

...
008-A-S7_rep_c119 first rail: rr_####297980####
008-A-S7_rep_c119 last rail: rr_####297981####
Adding rails: length 276 and overlap 138
makeIntelligentConsensus() complete calc .. mict_fin        50
mict_pre        13
mict_shadow     14
mict_fallout    649
mict_newin      490
mict_helper1    1587
mict_newin      490
mict_restofloop 516
mict_totalloop  5093
done.
008-A-S7_rep_c120 first rail: rr_####297982####
008-A-S7_rep_c120 last rail: rr_####297985####
src/tcmalloc.cc:289] Attempt to free invalid pointer 0x7fff18826780
Aborted (core dumped)
Failure, wrapped MIRA process aborted.


If you want to debug, I've attached a small set of reads (with manifests)
that reproduces the problem for me (with 4.0.2 and the devel code). Let me
know if there is anything else you'd need.

Meanwhile, I'm going to take your advice and come up with a different
approach. So you don't need to worry about this on my account.

Thanks for the help,
-john



On Mon, Aug 11, 2014 at 10:08 PM, Bastien Chevreux <bach@xxxxxxxxxxxx>
wrote:

> On 12 Aug 2014, at 1:06 , John Eppley <jmeppley@xxxxxxxxxx> wrote:
>
> I get the following message from Mira 4.0.2:
> […]
> I'm trying to do something like an iterative assembly using two pairs of
> Illumina (MiSeq) fastq files. The plan is to do a de novo assembly with one
> pair of files, and then a mapping assembly using the first assembly (as a
> MAF file) as the reference. My eventual goal is to be able to assemble
> something that might not otherwise fit into RAM.
>
>
> Good thinking, but that approach will unfortunately backfire in terms of
> RAM usage: contigs are incredibly RAM expensive beasts and I suspect you
> will end up using more RAM doing this than by doing a full de-novo, I’m
> sorry.
>
> […]
> The data are randomly fragmented transcripts from a mixed population,
> hence the est approach.
> My first question is: is this a reasonable thing to attempt? Can Mira pull
> off this sort of iterative assembly?
>
>
> The approach will also not work for other reasons: if a transcript is
> broken into two (or more) parts in the first assembly because of missing
> data, these parts will not be joined in the subsequent mapping. There are a
> couple of other reasons (those can be worked around, though), but I think
> that this one is already a no-go and cannot be worked around.
>
> […]
> If so, then what is there to do about this error?
> In the meantime, I'll try to reproduce with a smaller set of reads.
>
>
> You cannot do anything yourself as this looks like a programming error.
> Must be something weird though, it’s deep within the core contig routines.
> It may already be fixed (I’m not sure); you can try my current
> development branch and report whether it still appears, if you wish:
>
> http://www.chevreux.org/tmp/mira_develop-0-g81642a1_linux-gnu_x86_64_static.tar.bz2
>
> If it is still present, any small data set you can give me to reproduce
> and go on a bug hunt is welcome.
>
> Best,
>   Bastien
>
>
>


-- 
John Eppley
Senior Software Engineer
Center For Microbial Oceanography: Research & Education
617 500 2468
jmeppley@xxxxxxxxxx

Attachment: mira_test_data.tar.bz2
Description: BZip2 compressed data
