> -----Original Message----- > From: ecartis@xxxxxxxxxxxxx > Sent: Tue, 17 May 2011 01:09:50 -0400 (EDT) > To: ecartis@xxxxxxxxxxxxx > Subject: mira_talk Digest V4 #86 > > mira_talk Digest Mon, 16 May 2011 Volume: 04 Issue: 086 > > In This Issue: > [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr > [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr > [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr > [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr > [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr > [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr > [mira_talk] Re: results folder empty > [mira_talk] Re: results folder empty > [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr > [mira_talk] Re: Mira says "killed" as last word after being > [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr > [mira_talk] Re: Mira says "killed" as last word after being > [mira_talk] Re: results folder empty > [mira_talk] Re: Mira says "killed" as last word after being > [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torr > > ---------------------------------------------------------------------- > > Date: Mon, 16 May 2011 08:33:32 +0100 > Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent > From: Peter <peter@xxxxxxxxxxxxxxxxxxxxx> > > On Mon, May 16, 2011 at 12:13 AM, Bastien Chevreux wrote: >> Dear all, >> >> as main highlight for this version, MIRA now knows about a new >> sequencing >> technology: Ion Torrent. I think it fares pretty well compared to other >> assemblers, but feedback is always appreciated. >> >> Fetch it here: >> >> http://sourceforge.net/projects/mira-assembler/files/MIRA/development/ >> >> Since the last announcement (3.2.1.7), several other improvements ... > > I wonder if Nick Loman at Birmingham UK is on the list? 
> > http://pathogenomics.bham.ac.uk/blog/2011/05/first-look-at-ion-torrent-data-de-novo-assembly/ > > Peter > > ------------------------------ > > Date: Mon, 16 May 2011 11:44:27 +0200 (MEST) > From: "Bastien Chevreux" <bach@xxxxxxxxxxxx> > Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent > >> From: Peter<peter@xxxxxxxxxxxxxxxxxxxxx> >> I wonder if Nick Loman at Birmingham UK is on the list? > > Even if he were not, he knows about .17 :-) > > B. > > > > ------------------------------ > > Date: Mon, 16 May 2011 07:46:28 -0400 > From: Phillip San Miguel <pmiguel@xxxxxxxxxx> > Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent > > Hi Bastien, > Looks like a parameter parsing bug has crept in? > nohup mira -DIR:lrt=/scratch/pmiguel/ --project=TL3360 > --job=denovo,genome,accurate,solexa SOLEXA_SETTINGS > -GE:tismin=200:tismax=700 > & ! log_assembly > > works fine for MIRA V3.2.1.15. but errors out with MIRA V3.2.1.17. I > think the salient section in the log file is: > > Parsing parameters: -DIR:lrt=/scratch/pmiguel/ --project=TL3360 > --job=denovo,genome,accurate,solexa SOLEXA_SET > TINGS -GE:tismin=200:tismax=700 > > /// > -SB:sbuip is 3, but must be no more than 2. Setting to 2 > > > ========================= Parameter parsing error(s) > ========================== > > * Parameter section: '-DIR' > * unrecognised string or unexpected character: lrt > > * Parameter section: '-DIR' > * unrecognised string or unexpected character: scratch > * (may be due to previous errors) > > * Parameter section: '-DIR' > * unrecognised string or unexpected character: pmiguel > * (may be due to previous errors) > > =============================================================================== > > Fatal error (may be due to problems of the input data or parameters): > > "Error while parsing parameters, sorry." 
> > ->Thrown: void MIRAParameters::parse(istream & is, > vector<MIRAParameters> & Pv, MIRAParameters * singlemp) > ->Caught: main > > > Regards, > Phillip > > > On 5/15/2011 7:13 PM, Bastien Chevreux wrote: >> >> Dear all, >> >> >> as main highlight for this version, MIRA now knows about a new >> sequencing technology: Ion Torrent. I think it fares pretty well >> compared to other assemblers, but feedback is always appreciated. >> >> >> Fetch it here: >> >> http://sourceforge.net/projects/mira-assembler/files/MIRA/development/ >> >> >> Since the last announcement (3.2.1.7), several other improvements were >> made. Here are the most prominent ones: >> >> >> - distinct improvement of assembly quality for 454 data containing PCR >> >> artefacts >> >> - faster and better assemblies in hybrid Solexa + ... scenarios >> >> - Solexa: automatic clipping for known sequencing adaptors and new >> quality >> >> filtering routines >> >> - memory usage is kept more closely to constraints given via parameters >> >> - convert_project revamped: less memory usage in most cases and a lot >> of new >> >> options to convert / filter / rework data >> >> - new documentation: Ion Torrent and MIRA utilities (both not complete >> yet) >> >> - lots of other small bugfixes and improvements >> >> >> > > > >
> ------------------------------ > > Date: Mon, 16 May 2011 14:21:52 +0200 (MEST) > From: "Bastien Chevreux" <bach@xxxxxxxxxxxx> > Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent > >> From: Phillip San Miguel >> Looks like a parameter parsing bug has crept in? >> nohup mira -DIR:lrt=/scratch/pmiguel/ --project=TL3360 --job=denovo,genome,accurate,solexa >> SOLEXA_SETTINGS -GE:tismin=200:tismax=700 > & ! log_assembly >> works fine for MIRA V3.2.1.15. but errors out with MIRA V3.2.1.17. > > Ooops ... no bug, but forgot to put that in the CHANGES file (and update the docs): > > The "log" directory is no more, long live the "tmp" directory. > > In fact, the contents are currently exactly the same, it's just been renamed to reduce the number of questions like "can I delete contents of the log directory during a run, they're so big". (Answer: for most, no, you cannot). > > Consequently, I also had to rename a couple of parameters to keep naming consistent. -DIR:lrt is one of these. Its new name is ... "-DIR:trt". Use that and all should be good. > > I'm sorry for the confusion that this will create during a (hopefully) short transition period. > > Best, > Bastien > >
> ------------------------------ > > Date: Mon, 16 May 2011 14:12:57 +0100 > From: Tony Travis <a.travis@xxxxxxxxxx> > Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent > > On 16/05/11 13:21, Bastien Chevreux wrote: >> *> From:* Phillip San Miguel >> > Looks like a parameter parsing bug has crept in? >> > nohup mira -DIR:lrt=/scratch/pmiguel/ --project=TL3360 >> --job=denovo,genome,accurate,solexa >> > SOLEXA_SETTINGS -GE:tismin=200:tismax=700 > & ! log_assembly >> > works fine for MIRA V3.2.1.15. but errors out with MIRA V3.2.1.17. >> >> Ooops ... no bug, but forgot to put that in the CHANGES file (and update >> the docs): >> >> The "log" directory is no more, long live the "tmp" directory. > > Hi, Bastien. > > Should that not be the /var/tmp directory? > > Typically, /tmp is 'small', 'fast' and volatile; /var/tmp is 'large', not > so fast, but persistent across reboots. > > My three-week(!) MIRA assembly of 20x10^6 paired-end Solexa + 270x10^4 454 > reads has just crashed when you announced your call for testing ;-) > > Maybe someone is trying to tell me something? > > I've just tried your development version, but it crashed: > >> mira -project=meta \ >> -job=denovo,genome,normal,sanger,454,solexa \ >> COMMON_SETTINGS -GE:not=16 -SK:nrr=100 \ >> SANGER_SETTINGS -GE:tismin=5000:tismax=40000 -LR:lsd=on,ft=fasta >> -CL:msvs=yes \ >> 454_SETTINGS -LR:lsd=on,ft=fasta -CL:msvs=yes \ >> SOLEXA_SETTINGS -GE:tismin=160:tismax=220 -LR:lsd=on,ft=fasta >> -CL:msvs=yes > > [...]
>> Internal logic/programming/debugging error (*sigh* this should not have >> happened). >> Please file a bug report on >> http://sourceforge.net/apps/trac/mira-assembler/ >> >> "While trying to set the name of read >> <trace> >> Encountered character with ASCII code 32. T >> It is probably due to your input data, but normally, MIRA should have >> caught that earlier!" >> >> ->Thrown: void Read::setName(const string & name) >> ->Caught: main >> >> Aborting process, probably due to an internal error. > > Bye, > > Tony. > >> In fact, the contents are currently exactly the same, it's just been >> renamed to reduce the number of question like in "can I delete contents >> of the log directory during a run, they're so big". (Answer: for most, >> no, you cannot). >> >> Consequently, I also had to rename a couple of parameters to keep naming >> consistent. -DIR:lrt is one of these. It's new name is ... "-DIR:trt". >> Use that and all should be good. >> >> I'm sorry for the confusion that this will create during a (hopefully) >> short transition period. >> >> Best, >> Bastien >> >> >> > > > -- > Dr. A.J.Travis, University of Aberdeen, Rowett Institute of Nutrition > and Health, Greenburn Road, Bucksburn, Aberdeen AB21 9SB, Scotland, UK > tel +44(0)1224 712751, fax +44(0)1224 716687, http://www.rowett.ac.uk > mailto:a.travis@xxxxxxxxxx, http://bioinformatics.rri.sari.ac.uk > > ------------------------------ > > Date: Mon, 16 May 2011 09:16:21 -0400 > From: Phillip San Miguel <pmiguel@xxxxxxxxxx> > Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent > > On 5/16/2011 8:21 AM, Bastien Chevreux wrote: >>> From: Phillip San Miguel >>> Looks like a parameter parsing bug has crept in? >>> nohup mira -DIR:lrt=/scratch/pmiguel/ >>> --project=TL3360--job=denovo,genome,accurate,solexa >>> SOLEXA_SETTINGS-GE:tismin=200:tismax=700> & ! log_assembly >>> works fine for MIRA V3.2.1.15. but errors out with MIRA V3.2.1.17. >> Ooops ... 
no bug, but forgot to put that in the CHANGES file (and update >> the docs): >> >> The "log" directory is no more, long live the "tmp" directory. >> >> In fact, the contents are currently exactly the same, it's just been >> renamed to reduce the number of question like in "can I delete contents >> of the log directory during a run, they're so big". (Answer: for most, >> no, you cannot). >> >> Consequently, I also had to rename a couple of parameters to keep naming >> consistent. -DIR:lrt is one of these. It's new name is ... "-DIR:trt". >> Use that and all should be good. >> >> I'm sorry for the confusion that this will create during a (hopefully) >> short transition period. >> >> Best, >> Bastien >> >> >> >> > Ah, yes. That solves the issue. Thanks. > > I am now using MIRA V3.2.1.17 to de novo assemble 13 million solexa > reads (101 base PE reads). That is 1.3 billion bases of sequence. The > genome size is about 4.5 million bases (Salmonella). So that is > 200x-300x coverage--more than I intended. > > Anyone want to predict the N50 contig length? > > I tried MIRA V3.2.1.15 on a 70% GC bacterial genome (Deinococcus) at > around 100x coverage with solexa PE 101 base reads. My N50 contig size > was 4630 bases. That seems short to me, but it might be a result of the > 70% GC. So I decided to de novo assemble a 50% GC data set from the same > run. > > Phillip > > ------------------------------ > > Subject: [mira_talk] Re: results folder empty > From: Shaun Tyler <Shaun.Tyler@xxxxxxxxxxxxxxx> > Date: Mon, 16 May 2011 10:43:19 -0500 > > I've been helping Tarah get started with using MIRA (talk about the blind > leading the blind!!!) The log file she sent was not for the run in > question. Unfortunately she started the run again which overwrote the > original log. The second run terminated because we filled up the disk > space so no need to dwell on that. > > Maybe that's what happened the first time too but the situation was a > little different. 
Tarah said that after several days the terminal window > had come back to the prompt. When we were looking over the output the > info > folder had the info_assembly file giving the stats on the contigs, etc. > but > when we went to look at the results the folder was empty :-< Could it be > really bad timing that the drive filled up just as the run completed and > the final files couldn't be written ???? > > Just looked again and the info folder only had the assembly, > callparameters > and readrepeats files. > > Is there anyway to get MIRA to restart using the results that were > already > compiled??? > > Shaun > > > > *********************************************** > Shaun Tyler > Head, DNA Core Facility and > International Depositary Authority of Canada > National Microbiology Laboratory > Public Health Agency of Canada > Canadian Science Centre for Human and Animal Health > 1015 Arlington St., Suite H3130 > Winnipeg, MB R3E 3R2 > Ph: 204-789-6030 > Fax: 204-789-2018 > EMail: shaun_tyler@xxxxxxxxxxxxxxx > > > > From: Bastien Chevreux <bach@xxxxxxxxxxxx> > To: mira_talk@xxxxxxxxxxxxx > Date: 2011-05-14 03:59 AM > Subject: [mira_talk] Re: results folder empty > Sent by: mira_talk-bounce@xxxxxxxxxxxxx > > > > On Wednesday 11 May 2011 12:34:35 Sven Klages wrote: >> The attached log file is not complete ... MIRA maybe not yet finished? > > Having logs (and MIRA) stopping at that place would be indeed very > unusual. > >> The log usually ends with an "end of assembly" message .. > > Well, not if there was a "kill" from outside or a segmentation fault. > > Tarah: > a) is the process still running? > b) if not: does the window where you started the run somewhwere say > "killed"? > > B. 
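Since the second run described above stopped on a full disk, a quick free-space check before launching a multi-day assembly may be worth having. The following is only a sketch: it is not part of MIRA, the helper names are made up for illustration, and the 50 GiB threshold is an arbitrary placeholder rather than a figure from this thread.

```python
import shutil


def free_gib(path="."):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


def enough_space(path=".", needed_gib=50.0):
    """True if at least `needed_gib` GiB are free under `path`.

    The 50 GiB default is a placeholder, not a MIRA requirement.
    """
    return free_gib(path) >= needed_gib


if __name__ == "__main__":
    project_dir = "."  # hypothetical: the directory the assembly will write to
    print(f"{free_gib(project_dir):.1f} GiB free in {project_dir}")
    if not enough_space(project_dir):
        print("warning: less free space than the chosen threshold")
```

How much space a given run really needs depends on the data set, so the threshold is best picked from the size of a previous run's project directory.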
> > > > ------------------------------ > > Subject: [mira_talk] Re: results folder empty > From: Tarah Lynch <tarah.lynch@xxxxxxxxxxxxxxx> > Date: Mon, 16 May 2011 11:02:57 -0500 > > As Shaun mentioned, I accidentally wrote over the log file, so not sure > if > it said 'killed' :0/ > We are currently re-running MIRA right now after moving files around to > create more free disk space. > Will keep you posted, thanks! > Tarah > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Tarah Lynch > Postdoctoral Fellow > National Microbiology Laboratory > Canadian Science Centre for Human and Animal Health > Public Health Agency of Canada > 1015 Arlington St. > Winnipeg, MB R3E 3R2 > P: (204) 789-5000 > E: Tarah.Lynch@xxxxxxxxxxxxxxx > > > > > > From: Bastien Chevreux <bach@xxxxxxxxxxxx> > To: mira_talk@xxxxxxxxxxxxx > Date: 2011-05-14 03:59 AM > Subject: [mira_talk] Re: results folder empty > Sent by: mira_talk-bounce@xxxxxxxxxxxxx > > > > On Wednesday 11 May 2011 12:34:35 Sven Klages wrote: >> The attached log file is not complete ... MIRA maybe not yet finished? > > Having logs (and MIRA) stopping at that place would be indeed very > unusual. > >> The log usually ends with an "end of assembly" message .. > > Well, not if there was a "kill" from outside or a segmentation fault. > > Tarah: > a) is the process still running? > b) if not: does the window where you started the run somewhwere say > "killed"? > > B. > > > > > ------------------------------ > > From: Bastien Chevreux <bach@xxxxxxxxxxxx> > Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent > Date: Mon, 16 May 2011 19:11:35 +0200 > > On May 16, 2011, at 15:12 , Tony Travis wrote: >> On 16/05/11 13:21, Bastien Chevreux wrote: >>> The "log" directory is no more, long live the "tmp" directory. >> Should that not be the /var/tmp directory? > Ahm, no. What I mean are the <projectname>_assembly/<projectname>_d_log > respectively *_d_tmp directories. > >> My three-week(!) 
MIRA assembly of 20x106 paired-end Solexa + 270x104 454 >> reads has just crashed when you announced your call for testing ;-) > > Usual question from my side: log please :-) > >> I've just tried your development version, but it crashed: >>> Internal logic/programming/debugging error (*sigh* this should not have >>> happened). >>> Please file a bug report on >>> http://sourceforge.net/apps/trac/mira-assembler/ >>> >>> "While trying to set the name of read >>> <trace> >>> Encountered character with ASCII code 32. T >>> It is probably due to your input data, but normally, MIRA should have >>> caught that earlier!" > > Not a crash, but certainly not normal either. Here too: log please. I > think I have an idea what could have cause the bug, maybe I will also > need a sample of the data. > > B. > > > > ------------------------------ > > Date: Mon, 16 May 2011 13:14:35 -0400 > Subject: [mira_talk] Re: Mira says "killed" as last word after being > From: Adrian Pelin <apelin20@xxxxxxxxx> > > My hybrid,acurate,454,solexa,denovo assembly is on its 4th day and I was > wondering if that is normal. Here is a bit of info: > 240,000 454 reads > 2000000 solexa pair-end reads > 48 GB ram > 2 quad core 2.4 Ghz xeons > Fedora 14, using about 80% of ram. > > There are some contaminations of the DNA thought:( we have chunks of > bacterial and fungal DNA and we are only interested in the Fungal > mitochondrial DNA ~70-80kb. > We already have the genome assembled into 9 contigs, problem is that is > still a lot, we would like to reduce it further to 4-5 contigs if > possible > and get different contigs from Mira algorithms. > Can anyone confirm that it is normal to have such a long wait time? > > Here is a bit of log, last 300 lines, more than ahppy to provide more: > http://pastebin.ca/2061695 > > Thanks everyone. 
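Whether a run time like this is "normal" depends a lot on coverage, which can be estimated as number of reads x read length / target size. A minimal sketch, checked against the figures Phillip gives earlier in this digest (13 million reads of 101 bases, a genome of about 4.5 million bases) and against the 6 million read subset suggested in the replies. The function name is hypothetical. Read lengths for the 454 and Solexa data in the message above are not stated, so they are not filled in here.

```python
def raw_coverage(n_reads, read_len, target_size):
    """Average raw coverage: total sequenced bases divided by target size."""
    return n_reads * read_len / target_size


# Figures quoted in this digest: 13 million 101-base reads, ~4.5 Mb genome.
full_set = raw_coverage(13_000_000, 101, 4_500_000)

# The same genome with a 6 million read subset.
subset = raw_coverage(6_000_000, 101, 4_500_000)

print(f"full set: {full_set:.0f}x, subset: {subset:.0f}x")
# prints "full set: 292x, subset: 135x"
```

This is raw coverage only. It ignores clipping, and reads that come from contaminants or from anything other than the target do not count towards the target's real coverage.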
> > > On Sun, May 15, 2011 at 7:25 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> > wrote: > >> On Sunday 15 May 2011 22:46:51 Adrian Pelin wrote: >> >>> Yes sorry about that, i started to use the stable version.... you know >> >>> because I thought stable is better and that is the case in most >> >>> programs. >> >> >> I think I'll soon need to make a stable 3.4, too much has changed since >> 3.2.1 stable. >> >> >>> But I am using .15 as .16 came out just today and my assembly >> >>> is running for 3 days now. >> >> >> .16 is so yesterday ... .17 is better (I know, too many versions in a >> short >> time frame, but I simply could not resist to use the week-end to get Ion >> Torrent done right ;-) >> >> >> B. >> >> >> > > > > ------------------------------ > > From: Bastien Chevreux <bach@xxxxxxxxxxxx> > Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent > Date: Mon, 16 May 2011 19:17:01 +0200 > > On May 16, 2011, at 15:16 , Phillip San Miguel wrote: >> I am now using MIRA V3.2.1.17 to de novo assemble 13 million solexa >> reads (101 base PE reads). That is 1.3 billion bases of sequence. The >> genome size is about 4.5 million bases (Salmonella). So that is >> 200x-300x coverage--more than I intended. > Do yourself a favour: go with 6m reads, that should be plenty enough. > >> Anyone want to predict the N50 contig length? > > Depends on the genome itself, how repetitive it is. With PE reads I would > hope for N50 >20kb though. > >> I tried MIRA V3.2.1.15 on a 70% GC bacterial genome (Deinococcus) at >> around 100x coverage with solexa PE 101 base reads. My N50 contig size >> was 4630 bases. That seems short to me, but it might be a result of the >> 70% GC. So I decided to de novo assemble a 50% GC data set from the same >> run. > > That's bad, really bad. You are the second report I get that apparently, > MIRA has problems with high GC Solexa data sets. 
The first being a > supersecret bug of a big company, I cannot get the data to see what's > causing havoc. Would it be possible for me to have a look at that thing? > No promises, but it might help. > > B. > > > > ------------------------------ > > Subject: [mira_talk] Re: Mira says "killed" as last word after being > almost don > From: Bastien Chevreux <bach@xxxxxxxxxxxx> > Date: Mon, 16 May 2011 19:43:42 +0200 > > On May 16, 2011, at 19:14 , Adrian Pelin wrote: >> My hybrid,acurate,454,solexa,denovo assembly is on its 4th day and I was >> wondering if that is normal. Here is a bit of info: >> 240,000 454 reads >> 2000000 solexa pair-end reads >> There are some contaminations of the DNA thought:( we have chunks of >> bacterial and fungal DNA and we are only interested in the Fungal >> mitochondrial DNA ~70-80kb. > > Oh, old friends of mine: mitochondria and chloroplasts. They are > inherently difficult as most data sets I have seen up to know have a > wildly varying coverage and, to complicate things, most of the time > contain DNA from slightly different mitochondria/chloroplasts. Assembly > hell par excellence. > > And then you have a small target (80kb) you sequence with tons and tons > of reads. Ouch. Coverage >1000x ? MIRA will have a hard time. > >> We already have the genome assembled into 9 contigs, problem is that is >> still a lot, we would like to reduce it further to 4-5 contigs if >> possible and get different contigs from Mira algorithms. >> Can anyone confirm that it is normal to have such a long wait time? > > That depends on the definition of "normal". However, as comparison, the > run time for a small ~4.5mb bacterial genome with 800k 454bFLX reads and > 3.5m Solexa reads is < 1 day (and that just because MIRA starts to build > huge contigs of 1.5mb which slows down some things tremendously ... I'm > working on that). > > MIRA *has* a hard time. 
That it takes so long is a sign that a lot of > SNPs or repeat markers were found and disentangling them is a > time consuming process. To assess where MIRA is: > grep "^Pass:" log_assembly.txt > (or to whatever you redirected the output) and compare that to the number > of passes with which MIRA is configured (see -AS:nop in the parameter > section atop said file) > > The memory usage is less of a problem: MIRA just grabbed all it could / > was allowed to, in order to load big tables and do less disk IO. No need to worry. > > B. > > > ------------------------------ > > From: Bastien Chevreux <bach@xxxxxxxxxxxx> > Subject: [mira_talk] Re: results folder empty > Date: Mon, 16 May 2011 19:45:51 +0200 > > On May 16, 2011, at 17:43 , Shaun Tyler wrote: >> Maybe that's what happened the first time too but the situation was a >> little different. Tarah said that after several days the terminal window >> had come back to the prompt. When we were looking over the output the >> info folder had the info_assembly file giving the stats on the contigs, >> etc. but when we went to look at the results the folder was empty :-< >> Could it be really bad timing that the drive filled up just as the run >> completed and the final files couldn't be written ???? >> > That would be really bad luck, but possible. Statistics get written to > disk before the results. >> Just looked again and the info folder only had the assembly, >> callparameters and readrepeats files. >> > You will not see more there before the run ends. >> Is there any way to get MIRA to restart using the results that were >> already compiled??? >> > Terribly sorry, but no. 
This checkpointing capability is on my TODO since > quite a while (and I already had started), but somehow those darn > instrument providers keep coming with new throughputs, new machines, new > technologies and I had to shift development priorities a couple of times > already to get projects done at work in some areas I never would have > thought I'd ever venture. > B. > > ------------------------------ > > Date: Mon, 16 May 2011 14:06:52 -0400 > Subject: [mira_talk] Re: Mira says "killed" as last word after being > From: Adrian Pelin <apelin20@xxxxxxxxx> > > Oh I see it did 3 Passes out of 5. So another day or two. > By the way, would a solid state drive increase speed of assemblies of > mira > and other assemblers in general? > > Adrian > > On Mon, May 16, 2011 at 1:43 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> > wrote: > >> On May 16, 2011, at 19:14 , Adrian Pelin wrote: >>> My hybrid,acurate,454,solexa,denovo assembly is on its 4th day and I >>> was >> wondering if that is normal. Here is a bit of info: >>> 240,000 454 reads >>> 2000000 solexa pair-end reads >>> There are some contaminations of the DNA thought:( we have chunks of >> bacterial and fungal DNA and we are only interested in the Fungal >> mitochondrial DNA ~70-80kb. >> >> Oh, old friends of mine: mitochondria and chloroplasts. They are >> inherently >> difficult as most data sets I have seen up to know have a wildly varying >> coverage and, to complicate things, most of the time contain DNA from >> slightly different mitochondria/chloroplasts. Assembly hell par >> excellence. >> >> And then you have a small target (80kb) you sequence with tons and tons >> of >> reads. Ouch. Coverage >1000x ? MIRA will have a hard time. >> >>> We already have the genome assembled into 9 contigs, problem is that is >> still a lot, we would like to reduce it further to 4-5 contigs if >> possible >> and get different contigs from Mira algorithms. >>> Can anyone confirm that it is normal to have such a long wait time? 
>> >> That depends on the definition of "normal". However, as comparison, the >> run >> time for a small ~4.5mb bacterial genome with 800k 454bFLX reads and >> 3.5m >> Solexa reads is < 1 day (and that just because MIRA starts to build huge >> contigs of 1.5mb which slows down some things tremendously ... I'm >> working >> on that). >> >> MIRA *has* a hard time. That it takes so long is a sign that a lot of >> SNPs >> respectively repeat markers were found and disentangling them is a time >> consuming process. To asses where MIRA is: >> grep "^Pass:" log_assembly.txt >> (or to whatever you redirected the output) and compare that to the >> number >> of passes with which MIRA is configured (see -AS:nop in the parameter >> section atop said file) >> >> The memory usage is less of a problem: MIRA just grabbed all it could / >> was >> allowed to load big tables and to less disk IO. No need to worry. >> >> B. >> >> >> -- >> You have received this mail because you are subscribed to the mira_talk >> mailing list. For information on how to subscribe or unsubscribe, please >> visit http://www.chevreux.org/mira_mailinglists.html >> > > > > ------------------------------ > > Date: Tue, 17 May 2011 01:23:21 +0100 > From: Tony Travis <a.travis@xxxxxxxxxx> > Subject: [mira_talk] Re: Call for testing: MIRA 3.2.1.17 and Ion Torrent > > On 16/05/11 18:11, Bastien Chevreux wrote: >> [...] >>> My three-week(!) MIRA assembly of 20x106 paired-end Solexa + 270x104 >>> 454 reads has just crashed when you announced your call for testing ;-) >> >> Usual question from my side: log please :-) > > Hi, Bastien. > > Sorry, I'm re-running the assembly with the 'stable' version of MIRA and > I've overwritten the log files from the 'testing 'version... > >>> I've just tried your development version, but it crashed: >>>> Internal logic/programming/debugging error (*sigh* this should not >>>> have happened). 
>>>> Please file a bug report on >>>> http://sourceforge.net/apps/trac/mira-assembler/ >>>> >>>> "While trying to set the name of read >>>> <trace> >>>> Encountered character with ASCII code 32. T >>>> It is probably due to your input data, but normally, MIRA should have >>>> caught that earlier!" >> >> Not a crash, but certainly not normal either. Here too: log please. I >> think I have an idea what could have cause the bug, maybe I will also >> need a sample of the data. > > Ah, of course, I mean the program terminated unexpectedly ;-) > > This same data did not cause any problem with the stable version. MIRA > crashed because someone else ran a job with a large memory footprint and > between us we exhausted the swap space. I'm trying again, but using > "ramzswap" this time to compress and keep swapped memory pages in RAM. > > Bye, > > Tony. > -- > Dr. A.J.Travis, University of Aberdeen, Rowett Institute of Nutrition > and Health, Greenburn Road, Bucksburn, Aberdeen AB21 9SB, Scotland, UK > tel +44(0)1224 712751, fax +44(0)1224 716687, http://www.rowett.ac.uk > mailto:a.travis@xxxxxxxxxx, http://bioinformatics.rri.sari.ac.uk > > ------------------------------ > > End of mira_talk Digest V4 #86 > ****************************** -- You have received this mail because you are subscribed to the mira_talk mailing list. For information on how to subscribe or unsubscribe, please visit http://www.chevreux.org/mira_mailinglists.html
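The progress check Bastien describes earlier in this digest (grep for lines starting with "Pass:" in the redirected output, then compare with -AS:nop in the parameter section) can also be scripted. A minimal sketch, assuming only that progress lines begin with "Pass:". The exact layout of those lines is not shown in the thread, so the sample lines in the comments are invented, and the function names are hypothetical.

```python
import sys


def last_pass_line(lines):
    """Return the last line that starts with 'Pass:', or None if there is none.

    Invented sample input (not real MIRA output):
        ["loading...", "Pass: 1", "working...", "Pass: 2"]  ->  "Pass: 2"
    """
    last = None
    for line in lines:
        if line.startswith("Pass:"):
            last = line.rstrip("\n")
    return last


def main(argv):
    # Default file name taken from the grep example in this digest.
    path = argv[1] if len(argv) > 1 else "log_assembly.txt"
    try:
        with open(path, errors="replace") as handle:
            print(last_pass_line(handle) or "no 'Pass:' line found yet")
    except OSError as exc:
        print(f"cannot read {path}: {exc}")


if __name__ == "__main__":
    main(sys.argv)
```

The configured number of passes still has to be read from the -AS:nop entry at the top of the same file and compared by eye, as in the original advice.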