From: Mcnally, Alan <alan.mcnally@xxxxxxxxx>
> I have sent Bastian the first 10,000 lines of my FastQ input
> file.................hopefully he can solve for me...........its getting
> very frustrating

The file you sent me is anything but valid FASTQ. Here's an excerpt, the first few lines (I've Xed out the bases):

------------------------ snip --------------------------
@HWI-ST300:133:B0908ABXX:3:1101:1242:2117 1:Y:0:ATCACG
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
@HWI-ST300:133:B0908ABXX:3:1101:1242:2117 2:Y:0:ATCACG
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+
####################################################################################################
+
####################################################################################################
------------------------ snip --------------------------

That file is so plainly wrong, and so far from FASTQ, that I do not know where to start:

1) the first read (HWI-ST300:133:B0908ABXX:3:1101:1242:2117 1:Y:0:ATCACG) has no quality line
2) the second read (HWI-ST300:133:B0908ABXX:3:1101:1242:2117 2:Y:0:ATCACG) has two quality lines

May I *strongly* suggest you leave out any script (I have in mind the ominous shuffleseq.pl you wrote about) and simply cat the first n million lines of your data sets together? That's how I do it and it works like a charm :-)

Furthermore, those are CASAVA 1.8 data sets, where Illumina has changed the read naming scheme. I.e., you will notice that the first two reads actually have the very same name. For this atrocity the developers at Illumina should be nailed, crucified, tarred, feathered, shot into space and subsequently dissected by slimy green purple aliens with 18 tentacles and a single eyeball (in this order, and preferably alive during the whole procedure).

Anyway, while the development version of MIRA now knows these things, you need to rename your reads for all public versions out there. More specifically, you need to pull the first character of the comment to the read name, separating it with a slash. I.e., a line reading

@HWI-ST300:133:B0908ABXX:3:1101:1242:2117 1:Y:0:ATCACG

must be changed to

@HWI-ST300:133:B0908ABXX:3:1101:1242:2117/1 1:Y:0:ATCACG

and a line with

@HWI-ST300:133:B0908ABXX:3:1101:1242:2117 2:Y:0:ATCACG

to

@HWI-ST300:133:B0908ABXX:3:1101:1242:2117/2 2:Y:0:ATCACG

Best,
  Bastien
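
P.S. A minimal sketch of the renaming described above, in Python. This is not something MIRA ships; the function names rename_header and rename_fastq are made up for illustration, and the sketch assumes plain four-line FASTQ records (header, bases, "+" line, qualities) with CASAVA 1.8 style headers. It walks the file record by record rather than touching every line that starts with "@", because a quality line may legitimately start with "@" as well. As a side effect, it refuses input such as the excerpt above, where a header sits where the "+" line should be.

```python
import re

# CASAVA 1.8 header: "@<read name> <read number>:<filtered>:<control>:<index>"
_CASAVA18 = re.compile(r"^(@\S+) ([12]):")


def rename_header(header):
    """Append '/1' or '/2' to the read name of a CASAVA 1.8 header line."""
    m = _CASAVA18.match(header)
    if m is None:
        return header  # not a CASAVA 1.8 style header, leave it untouched
    name, read_no = m.group(1), m.group(2)
    if name.endswith("/" + read_no):
        return header  # already renamed
    # Keep the original comment (" 1:Y:0:ATCACG") after the new name.
    return name + "/" + read_no + header[m.end(1):]


def rename_fastq(lines):
    """Yield renamed FASTQ lines; raise ValueError on malformed records."""
    record = []
    for lineno, line in enumerate(lines, 1):
        record.append(line.rstrip("\n"))
        if len(record) < 4:
            continue
        head, seq, plus, qual = record
        if (not head.startswith("@") or not plus.startswith("+")
                or len(seq) != len(qual)):
            raise ValueError("malformed FASTQ record ending at line %d" % lineno)
        yield rename_header(head)
        yield seq
        yield plus
        yield qual
        record = []
    if record:
        raise ValueError("truncated FASTQ record at end of input")
```

To use it, iterate over an open file and write out what comes back, e.g. "for line in rename_fastq(open('reads.fastq')): print(line)", where reads.fastq is a placeholder file name.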