[mira_talk] Re: mid-tags

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 18 Feb 2009 14:59:58 +0100

On Tuesday 17 February 2009 Oscar Franzén wrote:
> I have a question about MIRA (great software by the way):
> is it possible to assemble 454 data sequenced with MID-tags?

Hello Oscar,

yes and no. You must "remove" the MID tags from the input sequences, as 
otherwise they would wreak havoc in the assembly.

Assuming that in the following read

>demo
tcag ttgccaggtaac ctcgattgagtactatctgacgagcgacgactgtctgcat

the "tcag" is the 'normal' remainder of the 454 adaptor (clipped away by 
sff_extract via a corresponding left clip entry in the ancillary XML data) and 
"ttgccaggtaac" is one of your MID tags, you can:

1) physically remove the whole stretch (I do not recommend this), leading to
>demo
ctcgattgagtactatctgacgagcgacgactgtctgcat

2) mask the MID tag (and perhaps also the remainder of the adaptor) and use 
-CL:mbc
>demo
xxxx xxxxxxxxxxxx ctcgattgagtactatctgacgagcgacgactgtctgcat

3) (preferred) keep the whole sequence as is and use a script that sets the 
correct left clip values in the XML file with ancillary data.
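
Options 1 and 2 could be sketched roughly as follows in Python. This is only 
an illustration under a few assumptions: the read is a plain string without 
the spaces used above for readability, the MID tag sits directly after the 
"tcag" key, and the function names (find_mid, trim_mid, mask_mid) are made up 
for this example and are not part of MIRA or sff_extract.

```python
def find_mid(read, mids, key="tcag"):
    """Return the MID tag that directly follows the key, or None if absent."""
    seq = read.lower()
    if not seq.startswith(key):
        return None
    for mid in mids:
        # str.startswith accepts a start offset, so no slicing is needed.
        if seq.startswith(mid.lower(), len(key)):
            return mid.lower()
    return None


def trim_mid(read, mids, key="tcag"):
    """Option 1: physically remove key + MID tag (not recommended)."""
    mid = find_mid(read, mids, key)
    if mid is None:
        return read
    return read[len(key) + len(mid):]


def mask_mid(read, mids, key="tcag"):
    """Option 2: replace key + MID tag with 'x' so they can be clipped as masked bases."""
    mid = find_mid(read, mids, key)
    if mid is None:
        return read
    n = len(key) + len(mid)
    return "x" * n + read[n:]


# The demo read from above, written without spaces.
demo = "tcag" + "ttgccaggtaac" + "ctcgattgagtactatctgacgagcgacgactgtctgcat"
mids = ["ttgccaggtaac"]

print(trim_mid(demo, mids))
print(mask_mid(demo, mids))
```

Reads that do not start with the key followed by a known MID tag are returned 
unchanged, so they can be inspected separately instead of being silently 
altered.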

The problem with all three possibilities above: even though a number of people 
have previously asked by mail about this topic, I have not yet received any 
script that performs this kind of data mangling[*]. Feel free to be the 
first :-)
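
A starting point for option 3 might look like the Python sketch below. It 
rests on assumptions that need checking against your own files: that the 
ancillary XML contains <trace> elements with a <trace_name> and a 
<clip_vector_left> child, and that you already know the new left clip value 
for each read. The function name set_left_clips, the example XML and the 
numbers 5 and 17 are purely illustrative.

```python
import xml.etree.ElementTree as ET


def set_left_clips(xml_text, new_clips, tag="clip_vector_left"):
    """Raise the left clip of the named reads; never lower an existing clip.

    new_clips maps read names to the new left clip value. The element names
    are assumptions about the ancillary XML layout, not a verified schema.
    """
    root = ET.fromstring(xml_text)
    for trace in root.iter("trace"):
        name = trace.findtext("trace_name")
        if name not in new_clips:
            continue
        node = trace.find(tag)
        if node is None:
            # No left clip entry yet: create one.
            node = ET.SubElement(trace, tag)
        old = int(node.text) if node.text and node.text.strip() else 0
        node.text = str(max(old, new_clips[name]))
    return ET.tostring(root, encoding="unicode")


# Hypothetical ancillary data for the demo read.
example = """<trace_volume>
  <trace>
    <trace_name>demo</trace_name>
    <clip_vector_left>5</clip_vector_left>
  </trace>
</trace_volume>"""

# 17 is an example value only; which number correctly excludes the 4 key
# bases plus the 12 MID bases depends on the clip convention of your XML.
updated = set_left_clips(example, {"demo": 17})
print(updated)
```

Taking the maximum of the old and new value means the script cannot 
accidentally undo a clip that was already set further into the read.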

Regards,
  Bastien

[*] I would assume that this belongs to the "normal" data processing that the 
Roche software should perform, but so far it is not part of their software 
pipeline.


--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
