[mira_talk] Re: MITObim/Mira aborting process with Genbank data

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 6 Apr 2015 21:37:11 +0200

On 06 Apr 2015, at 18:26 , Hernan Vazquez Miranda <miran050@xxxxxxx> wrote:

They look like this (head downyS10.fastq). I don't recall "I"s being so
pervasive in fastq files, but I figured they'd be a little different coming
from an SRA file deposited in Genbank.

At first sight these look like valid reads, and I have no idea why MIRA suddenly
thinks there should be quality values >>100. I suspect that something is
really wrong somewhere in the file and that the FASTQ itself is broken in such a
weird way that I haven’t written a test for it.
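
A quick sanity check outside of MIRA can help narrow this down. The sketch below is only an illustration, not what MIRA does internally: it assumes plain 4-line FASTQ records (no wrapped sequence lines) and the Sanger/Phred+33 quality offset, under which "I" decodes to quality 40. The names phred33, check_fastq and max_q are made up for this example.

```python
from typing import Iterable, Iterator, List, Tuple


def phred33(qual: str) -> List[int]:
    """Decode a quality string, assuming the Sanger/Phred+33 offset."""
    return [ord(ch) - 33 for ch in qual]


def check_fastq(lines: Iterable[str], max_q: int = 93) -> Iterator[Tuple[int, str]]:
    """Yield (record_index, problem) for each suspicious 4-line record.

    Assumption: every record is exactly four lines (header, sequence,
    separator, quality). Wrapped FASTQ files would need a real parser.
    """
    buf: List[str] = []
    index = 0
    for raw in lines:
        buf.append(raw.rstrip("\r\n"))
        if len(buf) < 4:
            continue
        header, seq, sep, qual = buf
        buf = []
        if not header.startswith("@"):
            yield index, "header does not start with '@'"
        if not sep.startswith("+"):
            yield index, "separator line does not start with '+'"
        if len(seq) != len(qual):
            yield index, "sequence and quality lengths differ"
        scores = phred33(qual)
        if scores and (min(scores) < 0 or max(scores) > max_q):
            yield index, "quality value outside 0..%d" % max_q
        index += 1
    if buf:
        # Leftover lines mean the file ended in the middle of a record.
        yield index, "truncated record (%d of 4 lines)" % len(buf)
```

To try it on the file in question, iterate over check_fastq(open("downyS10.fastq")) and print each (index, problem) pair. Note that one missing or extra line shifts every later record, so the first reported index is the one to look at.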

Is there a solution or a way to transform them?

The solution is to find out what wreaks havoc. What I’d do is this:

- take the first 16 reads and test. If it does not work with those, have a
  really hard look at that data.
- take the first 1024 reads.
  o If it works, double the amount until it does not work.
  o As soon as it does not work, isolate a minimum subset (binary search)
    which does not work and then have a hard look.
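
The doubling and binary search steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: it starts at 16 reads and doubles from there (skipping the separate jump to 1024), and it assumes that once a prefix of the file fails, every longer prefix fails too. The function smallest_failing_prefix and the callback fails are hypothetical; in practice fails would write the given reads to a temporary FASTQ file, run MIRA on it, and return True if the run aborts.

```python
from typing import Callable, Optional, Sequence


def smallest_failing_prefix(
    reads: Sequence,
    fails: Callable[[Sequence], bool],
    start: int = 16,
) -> Optional[int]:
    """Return the smallest n such that reads[:n] fails, or None if all work.

    Assumption: failure is monotonic, i.e. if reads[:n] fails then every
    longer prefix fails as well (one bad read is enough to break the run).
    """
    total = len(reads)
    if total == 0:
        return None

    good = 0  # longest prefix known (or assumed, for 0) to work
    n = min(start, total)
    # Growth phase: keep doubling until a prefix fails.
    while not fails(reads[:n]):
        good = n
        if n == total:
            return None  # the whole data set works
        n = min(n * 2, total)
    bad = n  # shortest prefix known to fail so far

    # Binary search: reads[:good] works, reads[:bad] fails.
    while bad - good > 1:
        mid = (good + bad) // 2
        if fails(reads[:mid]):
            bad = mid
        else:
            good = mid
    return bad
```

If the function returns n, the read at position n-1 (counting from 0) is the first one that tips the run over, so that is the record to have a hard look at. The number of test runs grows only logarithmically with the number of reads, which matters when each test is a full assembler start.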

I’m sorry, but I see no other good way atm.

B.



