[mira_talk] Re: Why so many sequences in debris file?

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 27 May 2011 23:09:14 +0200

On Monday 23 May 2011 21:55:47 jyotsna guleria wrote:
> After taking singlets into account, I am still getting one fourth of my
> sequences in debris. I couldn't figure this out as quality scores for those
> sequences are good and read length as well.
> 
> Can anyone help me here?

The section of the manual at

http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html#sect_ref_contigs_singlets_debris

explains why reads end up in debris. Also have a look at the tmp/log file
called <projectname>_int_clippings.0.txt and grep for the names of your debris
reads there.
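
If you have more than a handful of read names, a small script may be easier
than grepping one by one. The following is only a sketch: the function name
find_reads_in_log is made up for illustration, and because the layout of the
clippings log is not described here, it does nothing smarter than a plain
substring match per line (like grep -F), so "read1" would also match "read10".

```python
from collections import defaultdict


def find_reads_in_log(log_path, read_names):
    """Return {read_name: [matching log lines]} for every name found in the log.

    Hypothetical helper: it assumes nothing about the log format and simply
    reports each line that contains a read name as a substring.
    """
    hits = defaultdict(list)
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            for name in read_names:
                # Plain substring match, comparable to grep -F.
                if name in line:
                    hits[name].append(line.rstrip("\n"))
    # Names that never occur in the log are simply absent from the result.
    return dict(hits)
```

Calling it with the path to <projectname>_int_clippings.0.txt and a list of
debris read names gives back, per read, the log lines that mention it.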

> Could someone please suggest what I can do to avoid getting so many
> sequences in debris, or help me understand the debris file in detail?
 
Have you looked at the debris "by hand" and judged whether those reads would be
useful to the assembly? The easiest way is to run BLAST or fasta36 (the
program) with a couple of debris sequences against some of the MIRA contigs
and see whether you deem them useful. They probably are not, but maybe MIRA
was set up too strictly, and feedback on that is always welcome.
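
To pull out "a couple of sequences" for such a check, something like the
sketch below could do. It assumes that you have the debris read names as a
list and the input reads as a FASTA file; the function name sample_fasta and
all file names are hypothetical and not something MIRA provides.

```python
def sample_fasta(fasta_path, wanted_names, out_path, limit=5):
    """Copy up to `limit` FASTA records whose ID is in wanted_names.

    Hypothetical helper: returns the list of record IDs written to out_path.
    """
    wanted = set(wanted_names)
    written = []
    keep = False
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                if len(written) >= limit:
                    break  # enough records collected, stop reading
                # The record ID is the first whitespace-delimited token after ">".
                fields = line[1:].split()
                read_id = fields[0] if fields else ""
                keep = read_id in wanted
                if keep:
                    written.append(read_id)
            if keep:
                # Writes the header and all sequence lines of a kept record.
                dst.write(line)
    return written
```

The resulting small FASTA file can then serve as the query for BLAST or
fasta36 against the contigs. With BLAST+ that would be along the lines of
"blastn -query debris_sample.fasta -subject contigs.fasta -outfmt 6", where
both file names are placeholders.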

B.
