[mira_talk] Re: Filtering (size or quality) for Ion Torrent data

  • From: Adam Witney <awitney@xxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 03 Aug 2012 09:51:17 +0100


Hi Nick,

Take a look at seqtk or sickle; these might do what you need:

https://github.com/lh3/seqtk/
https://github.com/najoshi/sickle
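
If neither tool fits, the length split asked about in the quoted message can also be sketched in a few lines of Python. This is only a minimal sketch and not part of seqtk or sickle. It assumes the reads are in plain FASTQ with exactly four lines per record (no wrapped sequence lines); the function names and the 150 bp default cutoff are illustrative choices, not anything MIRA requires.

```python
def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples.

    Assumes strict 4-line FASTQ records (an assumption, see above).
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a trailing blank line)
            break
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def split_by_length(in_handle, short_handle, long_handle, cutoff=150):
    """Write reads shorter than `cutoff` to short_handle and all other
    reads (length >= cutoff) to long_handle. Returns (n_short, n_long)."""
    n_short = n_long = 0
    for header, seq, plus, qual in read_fastq(in_handle):
        record = f"{header}\n{seq}\n{plus}\n{qual}\n"
        if len(seq) < cutoff:
            short_handle.write(record)
            n_short += 1
        else:
            long_handle.write(record)
            n_long += 1
    return n_short, n_long


# Example call (the file names here are hypothetical):
# with open("reads.fastq") as src, \
#      open("short.fastq", "w") as short, \
#      open("long.fastq", "w") as long_:
#     print(split_by_length(src, short, long_, cutoff=150))
```

Reads are streamed one record at a time, so memory use should stay small even on a modest machine.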

Adam

On 03/08/2012 08:41, Nicholas Heng wrote:

Hi Bastien and other MIRA-using Folk,

I have a dataset from an Ion Torrent 318 run that's about 480 Mb in size 
(average readlength 177 bp).  My underpowered 8 GB bioinformatics machine spent 
7 solid days churning away at the data using MIRA 3.4 to no avail.

Can any of you next-gen sequencing gurus please suggest a program that would allow 
me, one with absolutely no programming skill, to split the dataset into, say, <150 
bp and >=150 bp subsets so that it's small enough for MIRA to handle?  Alternatively, 
a program that would sort by sequence quality... but this may be harder.

I'm sequencing the bacterial genome de novo (no reference), and it's about 
2-2.5 Mb in size.  There is a 454 run being done but I'd like subsets of Ion 
Torrent for a hybrid assembly with MIRA.  And no, we can't afford a more 
accurate Illumina or SOLiD run at the present time.

Any help is greatly appreciated.

Cheers,
Nick.


================================
Nicholas (Nick) C.K. Heng, Ph.D.
Department of Oral Sciences
Faculty of Dentistry
University of Otago
P.O. Box 647
Dunedin 9054
NEW ZEALAND.
Ph: +643 4799254
Fx: +643 4797078
================================


