[mira_talk] Re: Vector screen

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 14 Mar 2011 17:44:43 +0100

On Monday 14 March 2011 17:15:30 hikaru wrote:
> I am performing a assemble of 454 data with clipping the cloning vector
> sequences using SSAHA2 or SMALT.

Hello Hikaru,

When opening a new thread on MIRA talk: please do not hit reply in your mail 
reader and change the subject of the mail. While this may seem practical, 
there is "hidden" information passed through mail headers which allows mail 
readers to correctly thread the mails. As you "replied" to a mail from 
Stephanie from last Thursday, your thread is now a subthread of "Help with 
assembling multiple strains, XML file" ... which you will agree is hardly 
related to your problem and could make it difficult for me to find it again in 
the future.

No harm done, just for information :-)

> SSAHA2 and SMALT certainly recognized vector regions in each 454 read
> sequences. The Mira assembly process was done completely without any
> error. But the vector sequence were still remained in assembles; Checked
> by SSAHA2 and BLAST search. The yielded contigs contain the vector
> sequences at the (one or both) terminal of them and sometimes at the
> middle of contigs (chimera?).

First things first: is there any special reason why you do a vector screen 
yourself? Normally the 454 pre-processing software is quite good at that task 
and the clipping points in the SFF are solid enough ... at least to my 
knowledge.

Note: anyone having other experiences, please comment.
 
> Has anyone had a similar problems? Is there a any way to correct the
> problem?

Probably yes. You used SSAHA2 and SMALT? You are the second person I know of 
to do that ... and I'm not sure how well-defined MIRA's behaviour is when it 
encounters results from both (see it as a buggy feature which will be removed 
from future releases).

To cut things short: do not use SSAHA2 for this task anymore. It was a 
makeshift solution and its author does not recommend doing that. Go with SMALT 
(and remove all SSAHA2 output files from the MIRA working directory).

Once you have done that and re-run the assembly, check whether there are still 
vector/adaptor remnants. These could come from two sources: an error in SMALT 
(something which was not recognised) or an error in MIRA (wrongly interpreted 
SMALT results). To tell the two apart, it will be important to find out exactly 
what happened. For that, have a look at the contigs with contaminant in an 
assembly finishing tool (gap4, gap5 or consed) and find the read in which the 
contaminant is not clipped. Then look in the SMALT result file to see whether 
that region is masked there. If not: SMALT problem; if it is: MIRA problem.
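
If you have many reads to check, the decision above can be scripted. The 
following is only a minimal sketch in Python, not something MIRA or SMALT 
provides: it assumes you have first converted the SMALT results into a simple 
tab-separated list of "read name, start, end" (a hypothetical format chosen for 
illustration, not SMALT's native output), and the names parse_hits and triage 
are made up for this example.

```python
from typing import Dict, Iterable, List, Tuple

Interval = Tuple[int, int]  # (start, end) on the read, inclusive


def parse_hits(lines: Iterable[str]) -> Dict[str, List[Interval]]:
    """Parse hypothetical 'read<TAB>start<TAB>end' lines (an assumed format)."""
    hits: Dict[str, List[Interval]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, start, end = line.split("\t")[:3]
        lo, hi = sorted((int(start), int(end)))  # tolerate reversed coordinates
        hits.setdefault(name, []).append((lo, hi))
    return hits


def triage(read_name: str, contaminant: Interval,
           hits: Dict[str, List[Interval]]) -> str:
    """Apply the rule from the text to one read with unclipped contaminant."""
    c_lo, c_hi = contaminant
    for lo, hi in hits.get(read_name, []):
        if lo <= c_hi and c_lo <= hi:  # screen hit overlaps the contaminant
            return "MIRA problem"      # SMALT found it, the clip was not applied
    return "SMALT problem"             # SMALT never reported that region
```

Here "contaminant" is the region of the read where you saw the vector in the 
finishing tool. Any overlap counts as "SMALT found it"; a stricter rule (for 
example, requiring that most of the region is covered) may suit your data 
better.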

In both cases: please report back to the list what you find out so that we can 
look at how best to proceed.

Best,
  Bastien

-- 
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
