fastqSample, distributed with the wgs-assembler package, can subset your fastq files for you. I haven't checked whether it picks reads at random or in series (i.e. every 3rd or every 4th read, etc.). There's a little blurb about it at the bottom of this page:

http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=FastqToCA#fastqSample

Otherwise it wouldn't be too tough to write a quick perl script to output every fifth read from your data set to another file.

- E

On Apr 21, 2011, at 5:08 PM, Robert Bruccoleri wrote:

> However, there is a problem with taking this approach -- you could
> significantly change the statistics for the determination of repetitive
> regions and cause misassemblies as a result.
>
> Mira will generate coverage equivalent reads which can help in this situation.
>
> --Bob
>
> ALLO (Alfredo Lopez De Leon) wrote:
>>
>> You can try to collapse your reads with the FASTX toolkit this will leave
>> you with a set made of unique reads.
>> This method will preserve the sequence coverage and remove the redundancy.
>> http://hannonlab.cshl.edu/fastx_toolkit/
>>
>> AlLo
>>
>> From: mira_talk-bounce@xxxxxxxxxxxxx [mailto:mira_talk-bounce@xxxxxxxxxxxxx]
>> On Behalf Of Goldman, Thomas
>> Sent: Thursday, April 21, 2011 2:27 PM
>> To: mira_talk@xxxxxxxxxxxxx
>> Subject: [mira_talk] reducing number of Illumina reads
>>
>> Hello all,
>>
>> I have a 24GB RHEL5 machine on which I was able to do a de novo assembly of
>> 454 paired-end and fragment reads (~1.5 million reads). I also have about 6
>> million 36bp Illumina reads.
>>
>> I would like to:
>> 1) Map the Illumina reads to the 454 backbone
>> 2) Include the Illumina reads with the 454 reads for a de novo assembly
>> But I believe I don’t have enough memory to handle all the Illumina reads. I
>> think my VM could handle maybe 20% of the Illumina reads. What is the best
>> way to reduce the Illumina reads used for the mapping and/or the de novo
>> assemblies? Would it be to just randomly pick 20% of the reads out of the
>> fastq file? Is there a tool out there I could use for this?
>>
>> Thanks,
>> Tom
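
P.S. For the "quick script" idea, here is a rough sketch in Python rather than perl. It is not what fastqSample does internally (I haven't checked that); it just illustrates the two options discussed in this thread: keeping every Nth read, or keeping a random fraction such as 20%. The function names are made up for this example, and it assumes plain 4-line FASTQ records (no wrapped sequence lines). Bob's caveat about changing the repeat statistics applies to either approach.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield one FASTQ record (a list of 4 lines) at a time.

    Assumes unwrapped FASTQ: header, sequence, '+', qualities.
    `handle` can be an open file or any iterable of lines.
    """
    lines = iter(handle)
    while True:
        record = list(islice(lines, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record at end of input")
        yield record


def every_nth(records, n):
    """Keep records 0, n, 2n, ... (the 'every fifth read' approach for n=5)."""
    return (rec for i, rec in enumerate(records) if i % n == 0)


def random_fraction(records, fraction, seed=None):
    """Keep each record independently with probability `fraction`.

    The output size is therefore only approximately fraction * total.
    """
    rng = random.Random(seed)
    return (rec for rec in records if rng.random() < fraction)


def subsample_file(in_path, out_path, fraction=0.2, seed=None):
    """Write a random subset of the reads in in_path to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for rec in random_fraction(read_fastq(src), fraction, seed):
            dst.writelines(rec)
```

Something like `subsample_file("illumina.fastq", "illumina_20pct.fastq", fraction=0.2, seed=1)` would then write roughly a fifth of the reads (the file names are placeholders). Fixing the seed makes the subset reproducible between runs.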