[mira_talk] STOP SENDING EMAIL RE: mira_talk Digest V4 #86

  • From: Alex Washington <alexwashington@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 19 May 2011 07:31:33 -0800


> -----Original Message-----
> From: ecartis@xxxxxxxxxxxxx
> Sent: Tue, 17 May 2011 01:09:50 -0400 (EDT)
> To: ecartis@xxxxxxxxxxxxx
> Subject: mira_talk Digest V4 #86
> 
> mira_talk Digest      Mon, 16 May 2011        Volume: 04  Issue: 086
> 
> In This Issue:
>               [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr
>               [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr
>               [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr
>               [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr
>               [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr
>               [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr
>               [mira_talk] Re: results folder empty
>               [mira_talk] Re: results folder empty
>               [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr
>               [mira_talk] Re: Mira says "killed" as last word after being
>               [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr
>               [mira_talk] Re: Mira says "killed" as last word after being
>               [mira_talk] Re: results folder empty
>               [mira_talk] Re: Mira says "killed" as last word after being
>               [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr
> 
> ----------------------------------------------------------------------
> 
> Date: Mon, 16 May 2011 08:33:32 +0100
> Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent
> From: Peter <peter@xxxxxxxxxxxxxxxxxxxxx>
> 
> On Mon, May 16, 2011 at 12:13 AM, Bastien Chevreux wrote:
>> Dear all,
>> 
>> as main highlight for this version, MIRA now knows about a new
>> sequencing
>> technology: Ion Torrent. I think it fares pretty well compared to other
>> assemblers, but feedback is always appreciated.
>> 
>> Fetch it here:
>> 
>> http://sourceforge.net/projects/mira-assembler/files/MIRA/development/
>> 
>> Since the last announcement (3.2.1.7), several other improvements ...
> 
> I wonder if Nick Loman at Birmingham UK is on the list?
> 
> http://pathogenomics.bham.ac.uk/blog/2011/05/first-look-at-ion-torrent-data-de-novo-assembly/
> 
> Peter
> 
> ------------------------------
> 
> Date: Mon, 16 May 2011 11:44:27 +0200 (MEST)
> From: "Bastien Chevreux" <bach@xxxxxxxxxxxx>
> Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent
> 
>> From: Peter<peter@xxxxxxxxxxxxxxxxxxxxx>
>> I wonder if Nick Loman at Birmingham UK is on the list?
> 
> Even if he were not, he knows about .17 :-)
> 
> B.
> 
> 
> 
> ------------------------------
> 
> Date: Mon, 16 May 2011 07:46:28 -0400
> From: Phillip San Miguel <pmiguel@xxxxxxxxxx>
> Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent
> 
> Hi Bastien,
>      Looks like a parameter parsing bug has crept in?
> nohup mira -DIR:lrt=/scratch/pmiguel/ --project=TL3360
> --job=denovo,genome,accurate,solexa SOLEXA_SETTINGS
> -GE:tismin=200:tismax=700 > & ! log_assembly
> 
> works fine for  MIRA V3.2.1.15. but errors out with MIRA V3.2.1.17. I
> think the salient section in the log file is:
> 
> Parsing parameters: -DIR:lrt=/scratch/pmiguel/ --project=TL3360
> --job=denovo,genome,accurate,solexa SOLEXA_SETTINGS
> -GE:tismin=200:tismax=700
> 
> ///
> -SB:sbuip is 3, but must be no more than 2. Setting to 2
> 
> 
> ========================= Parameter parsing error(s) ==========================
> 
> * Parameter section: '-DIR'
> *       unrecognised string or unexpected character: lrt
> 
> * Parameter section: '-DIR'
> *       unrecognised string or unexpected character: scratch
> *       (may be due to previous errors)
> 
> * Parameter section: '-DIR'
> *       unrecognised string or unexpected character: pmiguel
> *       (may be due to previous errors)
> 
> ===============================================================================
> 
> Fatal error (may be due to problems of the input data or parameters):
> 
> "Error while parsing parameters, sorry."
> 
> ->Thrown: void MIRAParameters::parse(istream & is,
> vector<MIRAParameters> & Pv, MIRAParameters * singlemp)
> ->Caught: main
> 
> 
> Regards,
> Phillip
> 
> 
> On 5/15/2011 7:13 PM, Bastien Chevreux wrote:
>> 
>> Dear all,
>> 
>> 
>> as main highlight for this version, MIRA now knows about a new
>> sequencing technology: Ion Torrent. I think it fares pretty well
>> compared to other assemblers, but feedback is always appreciated.
>> 
>> 
>> Fetch it here:
>> 
>> http://sourceforge.net/projects/mira-assembler/files/MIRA/development/
>> 
>> 
>> Since the last announcement (3.2.1.7), several other improvements were
>> made. Here are the most prominent ones:
>> 
>> 
>> - distinct improvement of assembly quality for 454 data containing PCR
>>   artefacts
>> - faster and better assemblies in hybrid Solexa + ... scenarios
>> - Solexa: automatic clipping for known sequencing adaptors and new
>>   quality filtering routines
>> - memory usage is kept more closely to constraints given via parameters
>> - convert_project revamped: less memory usage in most cases and a lot
>>   of new options to convert / filter / rework data
>> - new documentation: Ion Torrent and MIRA utilities (both not complete
>>   yet)
>> - lots of other small bugfixes and improvements
>> 
>> 
>> 
> 
> 
> 
> 
> ------------------------------
> 
> Date: Mon, 16 May 2011 14:21:52 +0200 (MEST)
> From: "Bastien Chevreux" <bach@xxxxxxxxxxxx>
> Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent
> 
>> From: Phillip San Miguel
>> Looks like a parameter parsing bug has crept in?
>>   nohup mira -DIR:lrt=/scratch/pmiguel/ --project=TL3360
>>   --job=denovo,genome,accurate,solexa
>>   SOLEXA_SETTINGS -GE:tismin=200:tismax=700 > & ! log_assembly
>> works fine for  MIRA V3.2.1.15. but errors out with MIRA V3.2.1.17.
> 
> Ooops ... no bug, but forgot to put that in the CHANGES file (and update
> the docs):
> 
>      The "log" directory is no more, long live the "tmp" directory.
> 
> In fact, the contents are currently exactly the same, it's just been
> renamed to reduce the number of questions like "can I delete contents of
> the log directory during a run, they're so big". (Answer: for most, no,
> you cannot).
> 
> Consequently, I also had to rename a couple of parameters to keep naming
> consistent. -DIR:lrt is one of these. Its new name is ... "-DIR:trt".
> Use that and all should be good.
> 
> I'm sorry for the confusion that this will create during a (hopefully)
> short transition period.
> 
> Best,
>   Bastien
> 
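For anyone with wrapper scripts that still pass the old name, here is a minimal sketch of a rename helper. Only the -DIR:lrt to -DIR:trt rename is confirmed in the message above; the helper name `migrate_args` and the table `RENAMED_PARAMS` are made up for illustration, and any other renamed parameters would have to be added by hand.

```python
# Old-to-new MIRA parameter names. Only -DIR:lrt -> -DIR:trt is confirmed
# in this thread; the table is an illustration, not a complete list.
RENAMED_PARAMS = {"-DIR:lrt": "-DIR:trt"}


def migrate_args(args):
    """Return a copy of a MIRA argument list with renamed parameters updated."""
    migrated = []
    for arg in args:
        for old, new in RENAMED_PARAMS.items():
            # Match "-DIR:lrt" or "-DIR:lrt=<value>", but not longer names.
            if arg == old or arg.startswith(old + "="):
                arg = new + arg[len(old):]
                break
        migrated.append(arg)
    return migrated


# The command line from the report, split into arguments.
old_call = [
    "mira", "-DIR:lrt=/scratch/pmiguel/", "--project=TL3360",
    "--job=denovo,genome,accurate,solexa",
    "SOLEXA_SETTINGS", "-GE:tismin=200:tismax=700",
]
new_call = migrate_args(old_call)
print(" ".join(new_call))
```

The sketch only looks at the start of each argument, so a renamed parameter chained after another one with ":" (as in -GE:tismin=200:tismax=700) would not be caught.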
> ------------------------------
> 
> Date: Mon, 16 May 2011 14:12:57 +0100
> From: Tony Travis <a.travis@xxxxxxxxxx>
> Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent
> 
> On 16/05/11 13:21, Bastien Chevreux wrote:
>> *> From:* Phillip San Miguel
>>  > Looks like a parameter parsing bug has crept in?
>>  > nohup mira -DIR:lrt=/scratch/pmiguel/ --project=TL3360
>> --job=denovo,genome,accurate,solexa
>>  > SOLEXA_SETTINGS -GE:tismin=200:tismax=700 > & ! log_assembly
>>  > works fine for MIRA V3.2.1.15. but errors out with MIRA V3.2.1.17.
>> 
>> Ooops ... no bug, but forgot to put that in the CHANGES file (and update
>> the docs):
>> 
>> The "log" directory is no more, long live the "tmp" directory.
> 
> Hi, Bastien.
> 
> Should that not be the /var/tmp directory?
> 
> Typically, /tmp is 'small', 'fast' and volatile; /var/tmp is 'large', not
> so fast, but persistent across reboots.
> 
> My three-week(!) MIRA assembly of 20x10^6 paired-end Solexa + 270x10^4 454
> reads has just crashed when you announced your call for testing ;-)
> 
> Maybe someone is trying to tell me something?
> 
> I've just tried your development version, but it crashed:
> 
>> mira -project=meta \
>>         -job=denovo,genome,normal,sanger,454,solexa \
>>         COMMON_SETTINGS -GE:not=16 -SK:nrr=100 \
>>         SANGER_SETTINGS -GE:tismin=5000:tismax=40000 -LR:lsd=on,ft=fasta
>> -CL:msvs=yes \
>>         454_SETTINGS -LR:lsd=on,ft=fasta -CL:msvs=yes \
>>         SOLEXA_SETTINGS -GE:tismin=160:tismax=220 -LR:lsd=on,ft=fasta
>> -CL:msvs=yes
>  > [...]
>> Internal logic/programming/debugging error (*sigh* this should not have
>> happened).
>> Please file a bug report on
>> http://sourceforge.net/apps/trac/mira-assembler/
>> 
>> "While trying to set the name of read
>>    <trace>
>> Encountered character with ASCII code 32. T
>> It is probably due to your input data, but normally, MIRA should have
>> caught that earlier!"
>> 
>> ->Thrown: void Read::setName(const string & name)
>> ->Caught: main
>> 
>> Aborting process, probably due to an internal error.
> 
> Bye,
> 
>    Tony.
> 
>> In fact, the contents are currently exactly the same, it's just been
>> renamed to reduce the number of questions like "can I delete contents
>> of the log directory during a run, they're so big". (Answer: for most,
>> no, you cannot).
>> 
>> Consequently, I also had to rename a couple of parameters to keep naming
>> consistent. -DIR:lrt is one of these. Its new name is ... "-DIR:trt".
>> Use that and all should be good.
>> 
>> I'm sorry for the confusion that this will create during a (hopefully)
>> short transition period.
>> 
>> Best,
>> Bastien
>> 
>> 
>> 
> 
> 
> --
> Dr. A.J.Travis, University of Aberdeen, Rowett Institute of Nutrition
> and Health, Greenburn Road, Bucksburn, Aberdeen AB21 9SB, Scotland, UK
> tel +44(0)1224 712751, fax +44(0)1224 716687, http://www.rowett.ac.uk
> mailto:a.travis@xxxxxxxxxx, http://bioinformatics.rri.sari.ac.uk
> 
> ------------------------------
> 
> Date: Mon, 16 May 2011 09:16:21 -0400
> From: Phillip San Miguel <pmiguel@xxxxxxxxxx>
> Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent
> 
> On 5/16/2011 8:21 AM, Bastien Chevreux wrote:
>>> From: Phillip San Miguel
>>> Looks like a parameter parsing bug has crept in?
>>>    nohup mira -DIR:lrt=/scratch/pmiguel/ --project=TL3360
>>>    --job=denovo,genome,accurate,solexa
>>>    SOLEXA_SETTINGS -GE:tismin=200:tismax=700 > & ! log_assembly
>>> works fine for  MIRA V3.2.1.15. but errors out with MIRA V3.2.1.17.
>> Ooops ... no bug, but forgot to put that in the CHANGES file (and update
>> the docs):
>> 
>>       The "log" directory is no more, long live the "tmp" directory.
>> 
>> In fact, the contents are currently exactly the same, it's just been
>> renamed to reduce the number of questions like "can I delete contents
>> of the log directory during a run, they're so big". (Answer: for most,
>> no, you cannot).
>> 
>> Consequently, I also had to rename a couple of parameters to keep naming
>> consistent. -DIR:lrt is one of these. Its new name is ... "-DIR:trt".
>> Use that and all should be good.
>> 
>> I'm sorry for the confusion that this will create during a (hopefully)
>> short transition period.
>> 
>> Best,
>>    Bastien
>> 
>> 
>> 
>> 
> Ah, yes. That solves the issue. Thanks.
> 
> I am now using MIRA V3.2.1.17 to de novo assemble 13 million solexa
> reads (101 base PE reads).  That is 1.3 billion bases of sequence. The
> genome size is about 4.5 million bases (Salmonella). So that is
> 200x-300x coverage--more than I intended.
> 
> Anyone want to predict the N50 contig length?
> 
> I tried MIRA V3.2.1.15 on a 70% GC bacterial genome (Deinococcus) at
> around 100x coverage with solexa PE 101 base reads. My N50 contig size
> was 4630 bases. That seems short to me, but it might be a result of the
> 70% GC. So I decided to de novo assemble a 50% GC data set from the same
> run.
> 
> Phillip
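The coverage estimate above is just total bases divided by genome size. A small sketch of that arithmetic, using the read count, read length and genome size from the message; the function names are made up here, and the 100x target in the second call is only an example value, not something Phillip stated as his goal.

```python
import math


def coverage(num_reads, read_length, genome_size):
    """Average depth of coverage: total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def reads_for_coverage(target, read_length, genome_size):
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target * genome_size / read_length)


# 13 million reads of 101 bases on a ~4.5 Mb genome (figures from the mail).
depth = coverage(13_000_000, 101, 4_500_000)
print(f"estimated coverage: {depth:.0f}x")  # about 292x

# Reads needed for an example target of 100x with the same read length.
print(reads_for_coverage(100, 101, 4_500_000))
```

The result sits at the upper end of the 200x-300x range quoted above.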
> 
> ------------------------------
> 
> Subject: [mira_talk] Re: results folder empty
> From: Shaun Tyler <Shaun.Tyler@xxxxxxxxxxxxxxx>
> Date: Mon, 16 May 2011 10:43:19 -0500
> 
> I've been helping Tarah get started with using MIRA (talk about the blind
> leading the blind!!!)  The log file she sent was not for the run in
> question.  Unfortunately she started the run again which overwrote the
> original log.  The second run terminated because we filled up the disk
> space so no need to dwell on that.
> 
> Maybe that's what happened the first time too but the situation was a
> little different.  Tarah said that after several days the terminal window
> had come back to the prompt.  When we were looking over the output the
> info folder had the info_assembly file giving the stats on the contigs,
> etc. but when we went to look at the results the folder was empty :-<
> Could it be really bad timing that the drive filled up just as the run
> completed and the final files couldn't be written ????
> 
> Just looked again and the info folder only had the assembly,
> callparameters and readrepeats files.
> 
> Is there any way to get MIRA to restart using the results that were
> already compiled???
> 
> Shaun
> 
> 
> 
> ***********************************************
> Shaun Tyler
> Head, DNA Core Facility and
> International Depositary Authority of Canada
> National Microbiology Laboratory
> Public Health Agency of Canada
> Canadian Science Centre for Human and Animal Health
> 1015 Arlington St., Suite H3130
> Winnipeg, MB   R3E 3R2
> Ph:      204-789-6030
> Fax:    204-789-2018
> EMail:    shaun_tyler@xxxxxxxxxxxxxxx
> 
> 
> 
> From: Bastien Chevreux <bach@xxxxxxxxxxxx>
> To:   mira_talk@xxxxxxxxxxxxx
> Date: 2011-05-14 03:59 AM
> Subject:      [mira_talk] Re: results folder empty
> Sent by:      mira_talk-bounce@xxxxxxxxxxxxx
> 
> 
> 
> On Wednesday 11 May 2011 12:34:35 Sven Klages wrote:
>> The attached log file is not complete ... MIRA maybe not yet finished?
> 
> Having logs (and MIRA) stopping at that place would be indeed very
> unusual.
> 
>> The log usually ends with an "end of assembly" message ..
> 
> Well, not if there was a "kill" from outside or a segmentation fault.
> 
> Tarah:
> a) is the process still running?
> b) if not: does the window where you started the run somewhere say
> "killed"?
> 
> B.
> 
> 
> 
> ------------------------------
> 
> Subject: [mira_talk] Re: results folder empty
> From: Tarah Lynch <tarah.lynch@xxxxxxxxxxxxxxx>
> Date: Mon, 16 May 2011 11:02:57 -0500
> 
> As Shaun mentioned, I accidentally wrote over the log file, so not sure
> if it said 'killed' :0/
> We are currently re-running MIRA after moving files around to
> create more free disk space.
> Will keep you posted, thanks!
> Tarah
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Tarah Lynch
> Postdoctoral Fellow
> National Microbiology Laboratory
> Canadian Science Centre for Human and Animal Health
> Public Health Agency of Canada
> 1015 Arlington St.
> Winnipeg, MB R3E 3R2
> P: (204) 789-5000
> E: Tarah.Lynch@xxxxxxxxxxxxxxx
> 
> 
> 
> 
> 
> From:   Bastien Chevreux <bach@xxxxxxxxxxxx>
> To:     mira_talk@xxxxxxxxxxxxx
> Date:   2011-05-14 03:59 AM
> Subject:        [mira_talk] Re: results folder empty
> Sent by:        mira_talk-bounce@xxxxxxxxxxxxx
> 
> 
> 
> On Wednesday 11 May 2011 12:34:35 Sven Klages wrote:
>> The attached log file is not complete ... MIRA maybe not yet finished?
> 
> Having logs (and MIRA) stopping at that place would be indeed very
> unusual.
> 
>> The log usually ends with an "end of assembly" message ..
> 
> Well, not if there was a "kill" from outside or a segmentation fault.
> 
> Tarah:
> a) is the process still running?
> b) if not: does the window where you started the run somewhere say
> "killed"?
> 
> B.
> 
> 
> 
> 
> ------------------------------
> 
> From: Bastien Chevreux <bach@xxxxxxxxxxxx>
> Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent
> Date: Mon, 16 May 2011 19:11:35 +0200
> 
> On May 16, 2011, at 15:12 , Tony Travis wrote:
>> On 16/05/11 13:21, Bastien Chevreux wrote:
>>> The "log" directory is no more, long live the "tmp" directory.
>> Should that not be the /var/tmp directory?
> Ahm, no. What I mean are the <projectname>_assembly/<projectname>_d_log
> respectively *_d_tmp directories.
> 
>> My three-week(!) MIRA assembly of 20x10^6 paired-end Solexa + 270x10^4 454
>> reads has just crashed when you announced your call for testing ;-)
> 
> Usual question from my side: log please :-)
> 
>> I've just tried your development version, but it crashed:
>>> Internal logic/programming/debugging error (*sigh* this should not have
>>> happened).
>>> Please file a bug report on
>>> http://sourceforge.net/apps/trac/mira-assembler/
>>> 
>>> "While trying to set the name of read
>>>   <trace>
>>> Encountered character with ASCII code 32. T
>>> It is probably due to your input data, but normally, MIRA should have
>>> caught that earlier!"
> 
> Not a crash, but certainly not normal either. Here too: log please. I
> think I have an idea what could have caused the bug, maybe I will also
> need a sample of the data.
> 
> B.
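While waiting for the log, one way to narrow this down is to scan the read names for the character the error message complains about (ASCII code 32 is a space). This is only a sketch: the thread does not say which characters MIRA accepts in read names, so the printable-ASCII range 33..126 used below is an assumption, and the function and sample names are made up.

```python
def bad_name_chars(name):
    """Return (position, ASCII code) pairs for characters outside 33..126.

    The accepted range is an assumption for illustration; the thread only
    confirms that code 32 (space) was rejected.
    """
    return [(i, ord(ch)) for i, ch in enumerate(name) if not 33 <= ord(ch) <= 126]


# Hypothetical read names; the second contains a space, the third a tab.
names = ["read_0001", "read 0002", "read_0003\t"]
for name in names:
    problems = bad_name_chars(name)
    if problems:
        print(repr(name), problems)
```

How the names are pulled out of the FASTA or trace input is left open here, since the thread does not show the offending data.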
> 
> 
> 
> ------------------------------
> 
> Date: Mon, 16 May 2011 13:14:35 -0400
> Subject: [mira_talk] Re: Mira says "killed" as last word after being
> From: Adrian Pelin <apelin20@xxxxxxxxx>
> 
> My hybrid,accurate,454,solexa,denovo assembly is on its 4th day and I was
> wondering if that is normal. Here is a bit of info:
> 240,000 454 reads
> 2,000,000 Solexa paired-end reads
> 48 GB RAM
> 2 quad-core 2.4 GHz Xeons
> Fedora 14, using about 80% of RAM.
> 
> There is some contamination in the DNA though :( we have chunks of
> bacterial and fungal DNA and we are only interested in the fungal
> mitochondrial DNA, ~70-80kb.
> We already have the genome assembled into 9 contigs; the problem is that
> this is still a lot. We would like to reduce it further to 4-5 contigs if
> possible and get different contigs from MIRA's algorithms.
> Can anyone confirm that it is normal to have such a long wait time?
> 
> Here is a bit of the log, the last 300 lines; more than happy to provide
> more:
> http://pastebin.ca/2061695
> 
> Thanks everyone.
> 
> 
> On Sun, May 15, 2011 at 7:25 PM, Bastien Chevreux <bach@xxxxxxxxxxxx>
> wrote:
> 
>>  On Sunday 15 May 2011 22:46:51 Adrian Pelin wrote:
>> 
>>> Yes sorry about that, i started to use the stable version.... you know
>> 
>>> because I thought stable is better and that is the case in most
>> 
>>> programs.
>> 
>> 
>> I think I'll soon need to make a stable 3.4, too much has changed since
>> 3.2.1 stable.
>> 
>> 
>>> But I am using .15 as .16 came out just today and my assembly
>> 
>>> is running for 3 days now.
>> 
>> 
>> .16 is so yesterday ... .17 is better (I know, too many versions in a
>> short
>> time frame, but I simply could not resist to use the week-end to get Ion
>> Torrent done right ;-)
>> 
>> 
>> B.
>> 
>> 
>> 
> 
> 
> 
> ------------------------------
> 
> From: Bastien Chevreux <bach@xxxxxxxxxxxx>
> Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent
> Date: Mon, 16 May 2011 19:17:01 +0200
> 
> On May 16, 2011, at 15:16 , Phillip San Miguel wrote:
>> I am now using MIRA V3.2.1.17 to de novo assemble 13 million solexa
>> reads (101 base PE reads).  That is 1.3 billion bases of sequence. The
>> genome size is about 4.5 million bases (Salmonella). So that is
>> 200x-300x coverage--more than I intended.
> Do yourself a favour: go with 6m reads, that should be plenty enough.
> 
>> Anyone want to predict the N50 contig length?
> 
> Depends on the genome itself, how repetitive it is. With PE reads I would
> hope for N50 >20kb though.
> 
>> I tried MIRA V3.2.1.15 on a 70% GC bacterial genome (Deinococcus) at
>> around 100x coverage with solexa PE 101 base reads. My N50 contig size
>> was 4630 bases. That seems short to me, but it might be a result of the
>> 70% GC. So I decided to de novo assemble a 50% GC data set from the same
>> run.
> 
> That's bad, really bad. You are the second report I get that apparently,
> MIRA has problems with high GC Solexa data sets. The first being a
> supersecret bug of a big company, I cannot get the data to see what's
> causing havoc. Would it be possible for me to have a look at that thing?
> No promises, but it might help.
> 
> B.
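For readers less familiar with the metric being compared here: N50 is the contig length at which contigs of that length or longer contain at least half of all assembled bases. A minimal sketch of the usual calculation; the contig lengths in the example are invented and are not from either assembly discussed above.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running total past half is the N50.
        if running >= half:
            return length


# Invented contig lengths (total 290): 80 + 70 = 150 reaches half (145).
example = [80, 70, 50, 40, 30, 20]
print(n50(example))
```

A few long contigs dominate the value, which is why it is commonly quoted alongside the contig count.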
> 
> 
> 
> ------------------------------
> 
> Subject: [mira_talk] Re: Mira says "killed" as last word after being
> almost don
> From: Bastien Chevreux <bach@xxxxxxxxxxxx>
> Date: Mon, 16 May 2011 19:43:42 +0200
> 
> On May 16, 2011, at 19:14 , Adrian Pelin wrote:
>> My hybrid,accurate,454,solexa,denovo assembly is on its 4th day and I was
>> wondering if that is normal. Here is a bit of info:
>> 240,000 454 reads
>> 2,000,000 Solexa paired-end reads
>> There is some contamination in the DNA though :( we have chunks of
>> bacterial and fungal DNA and we are only interested in the fungal
>> mitochondrial DNA, ~70-80kb.
> 
> Oh, old friends of mine: mitochondria and chloroplasts. They are
> inherently difficult as most data sets I have seen up to now have a
> wildly varying coverage and, to complicate things, most of the time
> contain DNA from slightly different mitochondria/chloroplasts. Assembly
> hell par excellence.
> 
> And then you have a small target (80kb) you sequence with tons and tons
> of reads. Ouch. Coverage >1000x ? MIRA will have a hard time.
> 
>> We already have the genome assembled into 9 contigs; the problem is that
>> this is still a lot. We would like to reduce it further to 4-5 contigs if
>> possible and get different contigs from MIRA's algorithms.
>> Can anyone confirm that it is normal to have such a long wait time?
> 
> That depends on the definition of "normal". However, as comparison, the
> run time for a small ~4.5mb bacterial genome with 800k 454bFLX reads and
> 3.5m Solexa reads is < 1 day (and that just because MIRA starts to build
> huge contigs of 1.5mb which slows down some things tremendously ... I'm
> working on that).
> 
> MIRA *has* a hard time. That it takes so long is a sign that a lot of
> SNPs and repeat markers were found and disentangling them is a
> time-consuming process. To assess where MIRA is:
>   grep "^Pass:" log_assembly.txt
> (or whatever file you redirected the output to) and compare that to the
> number of passes with which MIRA is configured (see -AS:nop in the
> parameter section atop said file).
> 
> The memory usage is less of a problem: MIRA just grabbed all it could /
> was allowed to, in order to load big tables and do less disk IO. No need
> to worry.
> 
> B.
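The same check as the grep above, as a small script for those who prefer to poll the log from Python. This is only a sketch: the function name is made up, and the sample log text (including the "Pass: n / m" layout) is invented for the example, since the thread does not show what the real lines look like.

```python
def pass_lines(log_text):
    """Return all log lines that start with 'Pass:' (same as grep '^Pass:')."""
    return [line for line in log_text.splitlines() if line.startswith("Pass:")]


# Invented sample; the real layout of MIRA's "Pass:" lines may differ.
sample_log = """Loading reads ...
Pass: 1 / 5
Searching for overlaps ...
Pass: 2 / 5
Building contigs ...
Pass: 3 / 5
"""

found = pass_lines(sample_log)
print(f"{len(found)} pass line(s) so far, latest: {found[-1]}")
```

For a real run, read the redirected output instead, for example `pass_lines(open("log_assembly.txt").read())`, and compare the latest line with the -AS:nop value listed at the top of the log.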
> 
> 
> ------------------------------
> 
> From: Bastien Chevreux <bach@xxxxxxxxxxxx>
> Subject: [mira_talk] Re: results folder empty
> Date: Mon, 16 May 2011 19:45:51 +0200
> 
> On May 16, 2011, at 17:43 , Shaun Tyler wrote:
>> Maybe that's what happened the first time too but the situation was a
>> little different. Tarah said that after several days the terminal window
>> had come back to the prompt. When we were looking over the output the
>> info folder had the info_assembly file giving the stats on the contigs,
>> etc. but when we went to look at the results the folder was empty :-<
>> Could it be really bad timing that the drive filled up just as the run
>> completed and the final files couldn't be written ????
>> 
> That would be really bad luck, but possible. Statistics get written to
> disk before the results.
>> Just looked again and the info folder only had the assembly,
>> callparameters and readrepeats files.
>> 
> You will not see more there before the run ends.
>> Is there anyway to get MIRA to restart using the results that were
>> already compiled???
>> 
> Terribly sorry, but no. This checkpointing capability has been on my
> TODO list for quite a while (and I had already started on it), but
> somehow those darn instrument providers keep coming up with new
> throughputs, new machines, new technologies, and I have had to shift
> development priorities a couple of times already to get projects done at
> work in some areas I never would have thought I'd ever venture into.
> B.
> 
> ------------------------------
> 
> Date: Mon, 16 May 2011 14:06:52 -0400
> Subject: [mira_talk] Re: Mira says "killed" as last word after being
> From: Adrian Pelin <apelin20@xxxxxxxxx>
> 
> Oh I see, it did 3 passes out of 5. So another day or two.
> By the way, would a solid state drive increase the speed of assemblies
> with MIRA and other assemblers in general?
> 
> Adrian
> 
> On Mon, May 16, 2011 at 1:43 PM, Bastien Chevreux <bach@xxxxxxxxxxxx>
> wrote:
> 
>> On May 16, 2011, at 19:14 , Adrian Pelin wrote:
>>> My hybrid,accurate,454,solexa,denovo assembly is on its 4th day and I
>>> was wondering if that is normal. Here is a bit of info:
>>> 240,000 454 reads
>>> 2,000,000 Solexa paired-end reads
>>> There is some contamination in the DNA though :( we have chunks of
>>> bacterial and fungal DNA and we are only interested in the fungal
>>> mitochondrial DNA, ~70-80kb.
>> 
>> Oh, old friends of mine: mitochondria and chloroplasts. They are
>> inherently difficult as most data sets I have seen up to now have a
>> wildly varying coverage and, to complicate things, most of the time
>> contain DNA from slightly different mitochondria/chloroplasts. Assembly
>> hell par excellence.
>> 
>> And then you have a small target (80kb) you sequence with tons and tons
>> of reads. Ouch. Coverage >1000x ? MIRA will have a hard time.
>> 
>>> We already have the genome assembled into 9 contigs; the problem is that
>>> this is still a lot. We would like to reduce it further to 4-5 contigs if
>>> possible and get different contigs from MIRA's algorithms.
>>> Can anyone confirm that it is normal to have such a long wait time?
>> 
>> That depends on the definition of "normal". However, as comparison, the
>> run time for a small ~4.5mb bacterial genome with 800k 454bFLX reads and
>> 3.5m Solexa reads is < 1 day (and that just because MIRA starts to build
>> huge contigs of 1.5mb which slows down some things tremendously ... I'm
>> working on that).
>> 
>> MIRA *has* a hard time. That it takes so long is a sign that a lot of
>> SNPs and repeat markers were found and disentangling them is a
>> time-consuming process. To assess where MIRA is:
>>  grep "^Pass:" log_assembly.txt
>> (or whatever file you redirected the output to) and compare that to the
>> number of passes with which MIRA is configured (see -AS:nop in the
>> parameter section atop said file).
>> 
>> The memory usage is less of a problem: MIRA just grabbed all it could /
>> was allowed to, in order to load big tables and do less disk IO. No need
>> to worry.
>> 
>> B.
>> 
>> 
>> --
>> You have received this mail because you are subscribed to the mira_talk
>> mailing list. For information on how to subscribe or unsubscribe, please
>> visit http://www.chevreux.org/mira_mailinglists.html
>> 
> 
> 
> 
> ------------------------------
> 
> Date: Tue, 17 May 2011 01:23:21 +0100
> From: Tony Travis <a.travis@xxxxxxxxxx>
> Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent
> 
> On 16/05/11 18:11, Bastien Chevreux wrote:
>> [...]
>>> My three-week(!) MIRA assembly of 20x10^6 paired-end Solexa + 270x10^4
>>> 454 reads has just crashed when you announced your call for testing ;-)
>> 
>> Usual question from my side: log please :-)
> 
> Hi, Bastien.
> 
> Sorry, I'm re-running the assembly with the 'stable' version of MIRA and
> I've overwritten the log files from the 'testing' version...
> 
>>> I've just tried your development version, but it crashed:
>>>> Internal logic/programming/debugging error (*sigh* this should not
>>>> have happened).
>>>> Please file a bug report on
>>>> http://sourceforge.net/apps/trac/mira-assembler/
>>>> 
>>>> "While trying to set the name of read
>>>> <trace>
>>>> Encountered character with ASCII code 32. T
>>>> It is probably due to your input data, but normally, MIRA should have
>>>> caught that earlier!"
>> 
>> Not a crash, but certainly not normal either. Here too: log please. I
>> think I have an idea what could have caused the bug, maybe I will also
>> need a sample of the data.
> 
> Ah, of course, I mean the program terminated unexpectedly ;-)
> 
> This same data did not cause any problem with the stable version. MIRA
> crashed because someone else ran a job with a large memory footprint and
> between us we exhausted the swap space. I'm trying again, but using
> "ramzswap" this time to compress and keep swapped memory pages in RAM.
> 
> Bye,
> 
>    Tony.
> --
> Dr. A.J.Travis, University of Aberdeen, Rowett Institute of Nutrition
> and Health, Greenburn Road, Bucksburn, Aberdeen AB21 9SB, Scotland, UK
> tel +44(0)1224 712751, fax +44(0)1224 716687, http://www.rowett.ac.uk
> mailto:a.travis@xxxxxxxxxx, http://bioinformatics.rri.sari.ac.uk
> 
> ------------------------------
> 
> End of mira_talk Digest V4 #86
> ******************************

--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
