On Fri, Jul 2, 2010 at 6:47 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
> On Friday, 2 July 2010, Peter wrote:
>> Funnily enough, when I asked the Tablet team if anyone else was
>> interested in it supporting MAF directly, they also said "Yet another
>> output format?". Certainly SAM/BAM seems to be the most widely
>> supported alignment/mapping format for big genome projects, but
>> I understand that MIRA is more focussed on smaller genomes
>> where the older file formats suffice.
>
> It's not that the "older" file formats suffice, it's that they provide the only
> possibility for MIRA to have the results supported at all.

I would have to agree that SAM/BAM is better suited to mapping onto
existing genomes than to de novo assembly.

>> You mentioned you would like to add tags - the SAM/BAM format
>> does allow for per-read tags, and I think there is some flexibility
>> in the header too. I guess it depends on what exactly you need -
>> I don't see an existing header convention for annotating a region
>> of a contig.
>
> SAM/BAM has no sequencing technology type field, it has no read base tags, no
> consensus tags, no nothing. MIRA needs to pass on information about all those
> things which make life during finishing so easy.

There is a SAM/BAM read group (RG) header which might be suitable for
the technology type (and/or strain information). My impression is that
all these "optional" tags in the SAM/BAM read tags and headers are
only minimally defined in the file format specification.

The paired end stuff is pretty nicely done though (which was what I
was initially interested in here).

Peter

-- 
You have received this mail because you are subscribed to the mira_talk mailing list.
For information on how to subscribe or unsubscribe, please visit
http://www.chevreux.org/mira_mailinglists.html
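
To make the read group suggestion above a little more concrete, here is a rough Python sketch of what such a header line might look like. It only assumes the @RG record with its ID, PL (platform), SM (sample) and DS (description) tags from the SAM specification. Using SM to carry a strain name is my own assumption, and the values "rg454", "strainA" and the description text are made up for illustration; none of this is something MIRA writes today.

```python
# Sketch: build a SAM @RG header line carrying technology and strain info.
# The ID, SM and DS values used below are hypothetical examples.

def make_read_group_line(rg_id, platform, sample, description=None):
    """Return a tab-separated SAM @RG header line."""
    fields = ["@RG", "ID:" + rg_id, "PL:" + platform, "SM:" + sample]
    if description is not None:
        fields.append("DS:" + description)
    return "\t".join(fields)


def parse_header_line(line):
    """Split a SAM header line into its record type and a tag -> value dict."""
    parts = line.rstrip("\n").split("\t")
    record_type = parts[0]
    # Split on the first colon only, since values may contain colons.
    tags = dict(part.split(":", 1) for part in parts[1:])
    return record_type, tags


# Assumption: the strain name is stored in SM, the technology in PL.
rg_line = make_read_group_line("rg454", "LS454", "strainA",
                               "hypothetical 454 library")
record_type, tags = parse_header_line(rg_line)
print(rg_line)
print(tags["PL"], tags["SM"])
```

Each read would then point back at its group with an RG:Z:rg454 optional field. That would cover technology and strain per read, but not the per-base or consensus tags Bastien mentions.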