[mira_talk] Re: reducing number of Illumina reads

  • From: Robert Bruccoleri <bruc@xxxxxxxxxxxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 21 Apr 2011 18:08:06 -0400

However, there is a problem with collapsing the reads -- you could significantly change the statistics used to determine repetitive regions, and cause misassemblies as a result.


Mira will generate coverage-equivalent reads, which can help in this situation.

--Bob

ALLO (Alfredo Lopez De Leon) wrote:

You can try to collapse your reads with the FASTX toolkit. This will leave you with a set made up of unique reads.

This method will preserve the sequence coverage and remove the redundancy.
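
To make the idea of collapsing concrete, here is a minimal Python sketch. It is not the FASTX implementation; the function name `collapse_reads` and the toy reads are hypothetical and only illustrate merging identical sequences into one record while keeping the copy count in the header.

```python
from collections import Counter

def collapse_reads(sequences):
    """Merge identical read sequences (hypothetical helper).

    Returns (sequence, copy_count) pairs, most abundant first.
    """
    counts = Counter(sequences)
    return counts.most_common()

# Toy input: six reads, three distinct sequences.
reads = ["ACGT", "ACGT", "TTGA", "ACGT", "TTGA", "GGCC"]

# Write one FASTA-style record per unique sequence,
# with the rank and copy count in the header line.
for rank, (seq, n) in enumerate(collapse_reads(reads), start=1):
    print(f">{rank}-{n}")
    print(seq)
```

Note that only the count in the header records how many copies were seen; an assembler that ignores that count sees each sequence exactly once.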

http://hannonlab.cshl.edu/fastx_toolkit/

AlLo

*From:* mira_talk-bounce@xxxxxxxxxxxxx [mailto:mira_talk-bounce@xxxxxxxxxxxxx] *On Behalf Of* Goldman, Thomas
*Sent:* Thursday, April 21, 2011 2:27 PM
*To:* mira_talk@xxxxxxxxxxxxx
*Subject:* [mira_talk] reducing number of Illumina reads

Hello all,

I have a 24GB RHEL5 machine on which I was able to do a de novo assembly of 454 paired-end and fragment reads (~1.5 million reads). I also have about 6 million 36bp Illumina reads.

I would like to:

1) Map the Illumina reads to the 454 backbone

2) Include the Illumina reads with the 454 reads for a de novo assembly

But I believe I don't have enough memory to handle all the Illumina reads. I think my VM could handle maybe 20% of the Illumina reads. What is the best way to reduce the Illumina reads used for the mapping and/or the de novo assemblies? Would it be to just randomly pick 20% of the reads out of the fastq file? Is there a tool out there I could use for this?
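
For reference, randomly picking a fraction of the reads can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `subsample_fastq` is hypothetical, every FASTQ record is assumed to span exactly four lines (no wrapped sequence or quality lines), and the demo records are made up.

```python
import random
from itertools import islice

def subsample_fastq(lines, fraction, seed=0):
    """Yield the lines of roughly `fraction` of the FASTQ records in `lines`.

    Assumes strict 4-line records; an incomplete trailing record is dropped.
    """
    rng = random.Random(seed)  # fixed seed makes the selection reproducible
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            break
        # Keep each record independently with probability `fraction`.
        if rng.random() < fraction:
            yield from record

# In-memory demo with 1000 made-up records; a real run would pass
# an open file handle instead of a list.
demo = []
for i in range(1000):
    demo += [f"@read{i}\n", "ACGT\n", "+\n", "IIII\n"]

kept = list(subsample_fastq(demo, 0.2, seed=1))
print(len(kept) // 4, "of 1000 records kept")
```

Because records are streamed one at a time, memory use stays small regardless of file size. If the reads were paired, the same selection would have to be applied to both mate files so that pairs stay together.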

Thanks,

Tom


