On Friday 12 March 2010 Martin A. Hansen wrote:
> I get errors of the following type using Mira 3.0.2:
>
> Error: read name FQIBXOY01A4H2K present multiple times in readpool!
>
> Which is funny since there is only ONE sequence in the input with that
> name.

This error was introduced in 3.0.1. Please try out

  http://www.chevreux.org/tmp/mira_3.0.3_prod_linux-gnu_x86_64_static.tar.bz2

where the error should be fixed.

> I don't know why Mira considers non-unique names a problem unless these are
> used as hash keys?. In that case Mira's should perhaps assign a forth
> running ID number to reads, do the assemble magic, and upon finishing the
> output substitute the IDs with the original sequence names?

MIRA has no problem at all with duplicate names. It could do very well
without names at all, as it uses internal IDs to address the reads.

Programs reading ACE, CAF or MAF files, however, will be absolutely unhappy
with duplicate names. That is understandable: in all those formats, the
general way to address reads and to place them is via their name ... and
having duplicate names wreaks havoc with that logic.

That being said, I have yet to see a use case where having the same reads
twice (or even the same read names twice with different data) would be
something useful ... and not due to some handling error by the user.

For both reasons given above, checking for duplicate read names is a
necessity rather than anything else :-)

Regards,
  Bastien

-- 
You have received this mail because you are subscribed to the mira_talk
mailing list. For information on how to subscribe or unsubscribe, please
visit http://www.chevreux.org/mira_mailinglists.html
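
For anyone who wants to check their input for genuinely duplicated read names before starting an assembly, here is a minimal sketch in Python. It is not MIRA's own check, just an illustration of the idea. It assumes the reads are in FASTA format and that the read name is the first whitespace-delimited token after ">". The function names read_names and duplicate_names and the second read name in the example are made up for illustration.

```python
from collections import Counter


def read_names(fasta_lines):
    """Yield the read name of each FASTA header line.

    Assumption: the name is the first whitespace-delimited token after '>'.
    """
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            yield tokens[0] if tokens else ""


def duplicate_names(names):
    """Return a dict mapping every name seen more than once to its count."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


# Tiny in-memory example; the second read name is hypothetical.
example = [
    ">FQIBXOY01A4H2K length=4",
    "ACGT",
    ">FQIBXOY01B7XYZ",
    "TTGA",
    ">FQIBXOY01A4H2K",
    "ACGT",
]

dups = duplicate_names(read_names(example))
print(dups)
```

To run the same check on a real file, pass an open file handle instead of the list, e.g. duplicate_names(read_names(open("reads.fasta"))), where reads.fasta stands in for your own input file.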