Thanks!

In case you were curious, the dev version of mira crashes on the same data. This time, though, it's a core dump:

...
008-A-S7_rep_c119 first rail: rr_####297980####
008-A-S7_rep_c119 last rail: rr_####297981####
Adding rails: length 276 and overlap 138
makeIntelligentConsensus() complete calc .. mict_fin 50 mict_pre 13 mict_shadow 14 mict_fallout 649 mict_newin 490 mict_helper1 1587 mict_newin 490 mict_restofloop 516 mict_totalloop 5093 done.
008-A-S7_rep_c120 first rail: rr_####297982####
008-A-S7_rep_c120 last rail: rr_####297985####
src/tcmalloc.cc:289] Attempt to free invalid pointer 0x7fff18826780
Aborted (core dumped)
Failure, wrapped MIRA process aborted.

If you want to debug, I've attached a small set of reads (with manifests) that reproduces the problem for me (with 4.0.2 and the devel code). Let me know if there is anything else you'd need.

Meanwhile, I'm going to take your advice and come up with a different approach, so you don't need to worry about this on my account.

Thanks for the help,
-john

On Mon, Aug 11, 2014 at 10:08 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:

> On 12 Aug 2014, at 1:06, John Eppley <jmeppley@xxxxxxxxxx> wrote:
>
> I get the following message from Mira 4.0.2:
> […]
> I'm trying to do something like an iterative assembly using two pairs of
> Illumina (MiSeq) fastq files. The plan is to do a denovo assembly with one
> pair of files, and then a mapping assembly using the first assembly (as a
> MAF file) as the reference. My eventual goal is to be able to assemble
> something that might not otherwise fit into RAM.
>
>
> Good thinking, but that approach will unfortunately backfire in terms of
> RAM usage: contigs are incredibly RAM expensive beasts and I suspect you
> will end up using more RAM doing this than by doing a full de-novo, I’m
> sorry.
>
> […]
> The data are randomly fragmented transcripts from a mixed population,
> hence the est approach.
> My first question is: is this a reasonable thing to attempt? Can Mira pull
> off this sort of iterative assembly?
>
>
> The approach will also not work for other reasons: if a transcript is
> broken into two (or more) parts in the first assembly because of missing
> data, these parts will not be joined in the subsequent mapping. There are a
> couple of other reasons (those can be worked around though), but I think
> that this one will already be a no-go and cannot be worked around.
>
> […]
> If so, then what is there to do about this error?
> In the meantime, I'll try to reproduce with a smaller set of reads.
>
>
> You cannot do anything yourself as this looks like a programming error.
> Must be something weird though, it’s deep within the core contig routines.
> It may already be fixed (I’m not sure), you can have a try at my current
> development branch and report whether it still appears if you wish:
>
> http://www.chevreux.org/tmp/mira_develop-0-g81642a1_linux-gnu_x86_64_static.tar.bz2
>
> If it is still present, any small data set you can give me to reproduce
> and go on a bug hunt is welcome.
>
> Best,
> Bastien
>
>

--
John Eppley
Senior Software Engineer
Center For Microbial Oceanography: Research & Education
617 500 2468
jmeppley@xxxxxxxxxx
Attachment:
mira_test_data.tar.bz2
Description: BZip2 compressed data