> as soon as a read is edited in a way that bases are
> inserted or deleted, the mapping between the sequence in the ACE and the
> quality values in the PHD-ball will be completely bogus. That is, unless
> consed alters the PHD files or does some other funny things.

When things are edited in consed, edited copies of the .phd files are created
for reads that have changed; the original phd copy is *.phd.1 and subsequent
versions are *.phd.2, *.phd.3 and so on, making sure different versions of the
ace file always relate back to the right phd versions. I am not entirely sure
at the moment what happens to the phd.ball file, but I would be very surprised
if consed ended up with bogus quality values.

(But yes, the whole idea of .phd files instead of a decent complete alignment
file is quite bad.)

Björn

On Sun, 27 Sep 2009 14:45:15 +0200
Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:

> On Friday 25 September 2009 Sven Klages wrote:
> > as there are people who want to use the ace output of MIRA3 for further
> > editing in consed, wouldn't it be a good idea to (optionally) write
> > distinct phd.balls for each chemistry used in assembly? I mean you got
> > everything together during assembly ... doing this afterwards is always
> > some kind of hassle (for some people simply not possible).
> > What do you think?
>
> Hi Sven,
>
> I'm pretty much for it, but there's one big problem which has stopped me from
> doing so in the past: the ACE format.
>
> In fact, besides missing base quality values for reads, it also misses another
> vital part: information regarding inserted or deleted bases. In the
> documentation to consed I have seen no way to describe in ACE the fact that a
> base has been deleted (be it automatically by the assembler or manually in a
> finishing program).
>
> Take this simple example of a read with three bases.
>
>   ATA
>
> When deleting the "T", the read is stored as
>
>   AA
>
> in the ACE ...
> and there's no information there ever was a base between the two A.
> In other formats (for example CAF), there is adjustment information
> pertaining to the read which shows that there was "something" between the A.
>
> If you now combine the above facts (no quality values in ACE and no
> adjustment information) with the MIRA editors for Sanger, 454 and Solexa
> data, you certainly see the problem: as soon as a read is edited in a way
> that bases are inserted or deleted, the mapping between the sequence in the
> ACE and the quality values in the PHD-ball will be completely bogus. That
> is, unless consed alters the PHD files or does some other funny things.
>
> If you have an idea how this should be handled ... I'm all ears :-)
>
> > Hopefully alignment format will change in the future ... :-)
>
> In my despair (ACE is no good, CAF too complicated/slow to parse, BAF not
> ready yet, ASM also not really ideal), I got MIRA to write its own format
> which should be easily parsable ... but whether it was a good idea only the
> future will tell.
>
> Regards,
>   Bastien
>
> PS: "Ceterum censeo: .ace esse delendam."
>
>
> --
> You have received this mail because you are subscribed to the mira_talk
> mailing list. For information on how to subscribe or unsubscribe, please
> visit http://www.chevreux.org/mira_mailinglists.html

--
====================================
Björn Nystedt
PhD Student
Molecular Evolution
EBC, Uppsala University
Norbyv. 18C, 752 36 Uppsala
Sweden
phone: +46 (0)18-471 45 88
email: Bjorn.Nystedt@xxxxxxxxx
====================================
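
For readers who want to see the ATA / AA problem from the quoted mail in concrete terms, here is a minimal Python sketch. It is only an illustration under stated assumptions: the quality values are made up, `delete_base` is a hypothetical helper, and the `adjust` list merely stands in for the kind of per-read adjustment information that the thread says CAF carries. It is not how MIRA, consed or any file format actually implements this.

```python
# Hypothetical toy data: a three-base read with one quality value per base.
bases = "ATA"
quals = [40, 12, 38]  # made-up values, one per original base


def delete_base(seq, pos):
    """Return seq without the base at 0-based position pos.

    This mimics what the thread describes for ACE: only the edited
    sequence survives, with no record that a base was removed.
    """
    return seq[:pos] + seq[pos + 1:]


edited = delete_base(bases, 1)  # the "T" is gone, leaving "AA"

# Naive pairing of the edited sequence with the unedited quality list:
# the second A is paired with the quality of the deleted T.
naive = list(zip(edited, quals))

# With adjustment information (assumed here to be the original index of
# each surviving base), the qualities can be mapped back correctly.
adjust = [0, 2]
correct = [(base, quals[i]) for base, i in zip(edited, adjust)]

print("naive:  ", naive)
print("correct:", correct)
```

In the naive pairing the second A picks up the quality that belonged to the deleted T, which is exactly the "bogus" mapping discussed above; with the extra index list, each surviving base finds its original quality again.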