On Monday 01 August 2011 11:26:59 Lionel Guy wrote:
> In case you missed it, Karch et al. used MIRA to assemble (by reference and
> de novo) IonTorrent reads from the E. coli outbreak strain, and from a
> historical related strain. I had a look at the reads from the first
> strain, and I'm not convinced by the quality yet. Searching for SNPs, it
> requires to ignore any homopolymer with length > 3 if you want to avoid
> errors.

Oh, I had not seen that paper, nice.

Current IonTorrent data indeed feels very much like the first 454 GS20 data
from 2005/2006: problems with homopolymers all over the place. However,
454 was able to crank up quality to current levels within something like 12 to
18 months, so I have some hope that Ion will be able to do that too.

> And even in the shorter homopolymers, there are some errors that
> are obvious. They say that after the mapping assembly, they had to correct
> manually 51 out of the 1144 "core" genes, (~5%), which is not impressive.

Yep, I would not want to use Ion data without complementing it with Solexa
data, which with the current Ion read length of ~100 bp is a somewhat futile
exercise. But as soon as Ion gets its reads into the 250 bp range, things
become interesting again from a read length perspective if Illumina stays
<= 150 bp.

> BTW, in my opinion, they got the MIRA citation wrong (Genome Res 2004
> paper).

I cannot really blame them for that: it's the only MIRA paper in a journal
(admittedly a high-ranking one). The other "more-or-less-paper" is from the
GCB 99 conference and is not indexed by PubMed.

But as "Bioinformatics" reviewers somehow always rejected the papers I
submitted on MIRA or specific algorithms, I prefer to spend my spare time
developing MIRA instead of chasing wild geese and some cranky ideas of
reviewers. Whether or not a new paper on MIRA will appear is pretty
uncertain :-)

B.