Hi Chris,

My genome is about 1.4 Mb, and I have 28245 reads with a median length of ~2000 bp. I am really inexperienced with PacBio, so could you clarify what you mean by "filter"? (I am assembling a parasite genome from whole-organism sequencing data by extracting the reads that show homology to a closely related genome.)

Best,
Chenling

2014-05-15 11:32 GMT-07:00 Chris Hoefler <hoeflerb@xxxxxxxxx>:

> Just FYI, you will need a lot more than 16 GB to error-correct and
> assemble your reads. If you can get access to a high memory machine (or a
> cluster) that would be best. What is your expected genome size? What is
> your post-filter median read length and yield?
>
> Best,
> Chris
>
> On Thu, May 15, 2014 at 11:04 AM, Chenling Antelope <
> chenlingantelope@xxxxxxxxx> wrote:
>
>> THANKS Bastien and Andrej :)
>>
>> 2014-05-15 0:26 GMT-07:00 Andrej Benjak <abenjak@xxxxxxxxx>:
>>
>> Hi Chenling,
>>>
>>> For correcting PacBio reads and/or de novo assemblies you can use the
>>> SMRT portal. As an alternative to the PIA local installation, you can
>>> download the PacBio virtual machine with the SMRT portal installed and
>>> configured (not the latest version, but almost):
>>>
>>> https://github.com/PacificBiosciences/Bioinformatics-Training/wiki/SMRT-Analysis-Virtual-Machine-Install
>>>
>>> Cheers,
>>> Andrej
>>>
>>> On 05/15/2014 09:02 AM, Bastien Chevreux wrote:
>>>
>>> On 15 May 2014, at 2:55 , Chenling Antelope <chenlingantelope@xxxxxxxxx>
>>> <chenlingantelope@xxxxxxxxx> wrote:
>>>
>>> Thanks Bastien for the answer!
>>> However I am currently unable to correct my reads because I lack the glib
>>> version required by celera.
>>>
>>> Then you should get that from somewhere :-)
>>>
>>> Also, I used miramem to estimate the RAM required, which is a lot smaller
>>> than my actual RAM (16 GB).
>>>
>>> miramem does not know about PacBio reads yet, especially not about the
>>> worst memory eater for that scenario: the Smith-Waterman overlapper.
>>>
>>> Is there something else I can do to troubleshoot?
>>>
>>> You could try to remove all reads >= 10kb (or 9kb, 8kb, etc.) to save
>>> memory at the overlap stage.
>>>
>>> But again: it makes absolutely no sense to currently use MIRA with
>>> non-corrected PacBio reads. These simply contain too much crap which MIRA
>>> is not prepared for. You will get “something” as result, but it will be
>>> total nonsense.
>>>
>>> B.
>>>
>>
>
> --
> Chris Hoefler, PhD
> Postdoctoral Research Associate
> Straight Lab
> Texas A&M University
> 2128 TAMU
> College Station, TX 77843-2128
>
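
For anyone who wants to try the suggestion above of removing all reads >= 10kb before the overlap stage, here is a minimal sketch in plain Python. It is not part of MIRA or the SMRT tools. The function names (`parse_fastq`, `drop_long_reads`, `write_fastq`) are made up for illustration, and the parser assumes a strict four-lines-per-record FASTQ file, which may not match how your reads were exported.

```python
import io
from typing import Iterable, Iterator, TextIO, Tuple

# (header, sequence, quality); hypothetical record type for this sketch
Record = Tuple[str, str, str]


def parse_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield records from a FASTQ file, assuming exactly 4 lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a trailing blank line)
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def drop_long_reads(records: Iterable[Record],
                    max_len: int = 10_000) -> Iterator[Record]:
    """Keep only reads strictly shorter than max_len bases."""
    for header, seq, qual in records:
        if len(seq) < max_len:
            yield header, seq, qual


def write_fastq(records: Iterable[Record], handle: TextIO) -> None:
    """Write records back out in 4-line FASTQ form."""
    for header, seq, qual in records:
        handle.write(f"{header}\n{seq}\n+\n{qual}\n")


if __name__ == "__main__":
    # Tiny in-memory demo with a cutoff of 10 bases instead of 10 kb.
    demo = io.StringIO(
        "@short\nACGT\n+\nIIII\n"
        "@long\n" + "A" * 12 + "\n+\n" + "I" * 12 + "\n"
    )
    out = io.StringIO()
    write_fastq(drop_long_reads(parse_fastq(demo), max_len=10), out)
    print(out.getvalue(), end="")  # only the @short record remains
```

To use it on real data, open the input and output files with `open(...)` in place of the `io.StringIO` objects and lower `max_len` step by step (10000, 9000, 8000, ...) as suggested. Note that this only reduces memory use at the overlap stage; it does not address the point that uncorrected PacBio reads are unsuitable for MIRA.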