Hi Sven,

Thanks for your suggestions; I'll implement them soon. About the chromatogram names: is it enough to give the name and positions in the phd file? Don't you need an actual file? Does it work for Sanger reads too (otherwise, I guess I could link the actual abi files there)?
Do the numbers (15, 19) in the calculation of $peakpos come from empirical data?
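To make the question concrete, here is a minimal sketch of the values the $peakpos expression would produce. It is written in Python rather than the Perl of the script, and it assumes that $basepos starts at 0 before the first base, so that the pre-increment yields 1 for the first base. The function names and the "base quality position" line layout are illustrative assumptions, not taken from the actual script.

```python
def synthetic_peak_positions(n_bases, spacing=19, offset=15):
    """Evenly spaced, synthetic peak positions for bases 1..n_bases.

    Mirrors the Perl expression (++$basepos - 1)*19 + 15, assuming
    $basepos is 0 before the first base: the first base gets `offset`,
    and every following base is `spacing` trace points further on.
    """
    return [(basepos - 1) * spacing + offset
            for basepos in range(1, n_bases + 1)]


def dna_lines(bases, quals, spacing=19, offset=15):
    """Format one 'base quality position' line per base.

    The three-column layout is an assumption about the DNA block of a
    phd entry; adjust it if your files differ.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    positions = synthetic_peak_positions(len(bases), spacing, offset)
    return [f"{b.lower()} {q} {p}"
            for b, q, p in zip(bases, quals, positions)]


if __name__ == "__main__":
    for line in dna_lines("ACG", [30, 25, 40]):
        print(line)
```

With these defaults the positions only fake a regular peak spacing; they do not correspond to any real trace.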
Cheers,
Lionel

On 25 Oct 2009, at 17:11, Sven Klages wrote:
Hi Lionel,

if I find some time I'll test it as well.

We even have phd.balls of almost 30G(!), for more or less historical reasons, as consed has supported loading more than one phd.ball since v17, AFAIK. We started using phd.balls quite early (we also wrote our own predPhrap), because we were not able to (efficiently) handle 400,000 or more single phd files in a single filesystem.

You should think about distinguishing Sanger and 454 data, as for 454 data you can probably omit the following tags:

CALL_METHOD:
QUALITY_LEVELS:

I'd also think about adding real chromatogram names to the phd.ball, as only this option lets you edit single reads (and thus lets you change the consensus).

If you do so, you need to calculate the peak positions as well:

$peakpos = (++$basepos - 1)*19 + 15;

Just some thoughts,
Sven

2009/10/23 Lionel Guy <guy.lionel@xxxxxxxxx>

Hi there,

Following my message from yesterday, I changed my original idea and finally parsed the mira-produced caf file to obtain a phd.ball file to be used with consed. The idea behind that is to have qualities associated with reads when editing mira assemblies within consed. This is very important, for example, when merging or tearing contigs, because the consensus is recalculated in a very, very bad way if you don't have qualities (especially because mira doesn't physically trim the vector sequences from the reads).

The result is a small perl script that works for my data, but I would be glad if others could test it to see if it works with other types of data. All comments are welcome!

CAVEAT: this script produces huuuuge files, because it writes one line per base, plus headers. For example, I have 350'000 reads, some of them long Sanger reads, and I get a file which is 1.4 Gb...

Cheers,
Lionel
============================================
Lionel Guy
Thunmansgatan 25, SE-75421 Uppsala
phone: +46 (0)18 245596
mobile: +46 (0)73 9760618
email: guy.lionel@xxxxxxxxx
============================================