[mira_talk] Re: mid-tags
- From: Bastien Chevreux <bach@xxxxxxxxxxxx>
- To: mira_talk@xxxxxxxxxxxxx
- Date: Wed, 18 Feb 2009 14:59:58 +0100
On Tuesday 17 February 2009 Oscar Franzén wrote:
> I have a question about MIRA (great software by the way):
> is it possible to assemble 454 data sequenced with MID-tags?
Hello Oscar,
yes and no. You must "remove" the MID tags from the input sequences, as
otherwise they'd wreak havoc.
Assuming that in the following read
>demo
tcag ttgccaggtaac ctcgattgagtactatctgacgagcgacgactgtctgcat
the "tcag" is the 'normal' remainder of the 454 adaptor (clipped away by
sff_extract via a corresponding left clip entry in the ancillary XML data) and
"ttgccaggtaac" is one of your MID tags, you can:
1) physically remove the whole stretch (I do not recommend this), leading to
>demo
ctcgattgagtactatctgacgagcgacgactgtctgcat
2) mask the MID tag (and perhaps also the remainder of the adaptor) and use
-CL:mbc
>demo
xxxx xxxxxxxxxxxx ctcgattgagtactatctgacgagcgacgactgtctgcat
3) (preferred) keep the whole sequence as is and use a script that sets correct
values in the XML file with ancillary data.
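
A minimal sketch of options 2 and 3 in Python, not a tested pipeline step. It
assumes that each read starts with the "tcag" adaptor remainder directly
followed by one of a known list of MID tags, and that the ancillary XML has one
<trace> element per read with a <trace_name> child and a left clip stored in
<clip_vector_left> as the 1-based position of the first base to keep. The
function names and the default field name are illustrative assumptions; check
them against your own sff_extract output before relying on them.

```python
import xml.etree.ElementTree as ET

ADAPTOR_KEY = "tcag"  # remainder of the 454 adaptor, as in the demo read


def mid_end(seq, mid_tags, key=ADAPTOR_KEY):
    """Return the 0-based index where the real insert starts,
    or None if the read does not begin with key + a known MID tag."""
    s = seq.lower()
    if not s.startswith(key):
        return None
    for tag in mid_tags:
        if s.startswith(tag.lower(), len(key)):
            return len(key) + len(tag)
    return None


def mask_mid(seq, mid_tags, key=ADAPTOR_KEY):
    """Option 2: replace adaptor remainder and MID tag with 'x'.
    Reads without a recognised MID tag are returned unchanged."""
    end = mid_end(seq, mid_tags, key)
    if end is None:
        return seq
    return "x" * end + seq[end:]


def set_left_clip(xml_text, clips, field="clip_vector_left"):
    """Option 3: write new left clip values into the ancillary XML.
    clips maps read name -> clip value (assumed here: 1-based first good base).
    The element names are assumptions about the XML layout."""
    root = ET.fromstring(xml_text)
    for trace in root.iter("trace"):
        name = trace.findtext("trace_name")
        if name in clips:
            node = trace.find(field)
            if node is None:
                node = ET.SubElement(trace, field)
            node.text = str(clips[name])
    return ET.tostring(root, encoding="unicode")
```

For the demo read above (written without the spaces), mid_end returns 16, so
mask_mid yields 16 "x" characters followed by the untouched insert, and under
the 1-based assumption the left clip for option 3 would be 17.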
The problem with all three possibilities above: even though a number of people
have asked about this topic by mail before, I have not yet received any script
that performs this kind of data mangling[*]. Feel free to be the first :-)
Regards,
Bastien
[*] I would assume that this belongs to "normal" data processing that the
Roche software should perform, but until now this is not part of their
software pipeline.