[mira_talk] Re: 454 cleaning

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Tue, 30 Nov 2010 22:17:09 +0100

On Tuesday 30 November 2010 Robin Kramer wrote:
> It is hard to tell if SMALT is doing any better with those settings.

MIRA will probably not be of much help in deciding which filtering was better. 
What you could do is: run the filtered library again through some search 
program (BLAST or FASTA3) and count hits (above a certain threshold).
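
A minimal sketch of the counting step, assuming the search was run with BLAST+ and written as a tab-separated table (-outfmt 6, whose default columns are qseqid, sseqid, pident, length, ...). The function name and both thresholds are made up for illustration; pick values that fit your adapter length.

```python
import csv


def count_reads_with_hits(blast_tab, min_identity=90.0, min_length=15):
    """Count distinct reads with at least one hit passing both thresholds.

    blast_tab is any iterable of lines in BLAST+ tabular format
    (-outfmt 6): column 1 is the query id, column 3 the percent
    identity, column 4 the alignment length.
    The threshold defaults are illustrative assumptions, not MIRA values.
    """
    reads = set()
    for row in csv.reader(blast_tab, delimiter="\t"):
        # Skip blank lines and comment lines (as written by -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        qseqid, pident, length = row[0], float(row[2]), int(row[3])
        if pident >= min_identity and length >= min_length:
            reads.add(qseqid)
    return len(reads)
```

Run it once per filtered library against the same adapter/vector database and compare the two counts: the filtering that leaves fewer reads with hits did the better job. The table could come from something like "blastn -query reads.fasta -subject adapters.fasta -outfmt 6" (file names hypothetical), read with open(...) and passed straight to the function.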

> I am wondering if MIRA is reconstructing noisy adapters, that the aligners
> aren't able to find, almost doing too good of a job being able to assemble
> even the faintest of adapter sequence signal.

*rotfl* That made my day.

> In any case the vector is still making chimera looking things by the
> cat SRR054580_Asha_assembly/SRR054580_Asha_d_results/SRR054580_Asha_out.unpadded.fasta | seqs_filter_by_len -s 100 | grep AAGCAGTGGTATCAACGCAGAGTACGGGGG | wc -l
>
> 3498
> The adapter is highlighted in this chimera.
> CAACTCCAACGCATGAATGCCCTCAAGCAGTGGTATCAACGCAGAGTACGGG
> GGGTGGGTTCATGAGACATGGAACCCTA

Hmmm ... have you tried to track down how this sequence came to be? Just by 
looking at this very contig in an assembly viewer ... gap4 or tablet? If need 
be, use "convert_project -f maf -t txt" (or even "-t html"). Is it two 
reads with partial adaptors assembled together, or just one read with an 
adaptor that was not found? In any case, you should then look up the SSAHA2 / 
SMALT hit files for that read (those reads) to see what happened. Maybe the 
hits were found, but the SSAHA2 / SMALT filtering routines in MIRA were too 
strict in their parameters and did not filter them out? That would be 
interesting (and important) to know, so that I could take some countermeasures.
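
To pick which contigs to open in the viewer first, it may help to list where the adapter sits in each one. A rough sketch, assuming plain FASTA input and exact forward-strand matches only (the same thing your grep sees); the function names and the "margin" cutoff for calling a hit internal are made up for illustration.

```python
ADAPTER = "AAGCAGTGGTATCAACGCAGAGTACGGGGG"  # adapter string from the grep above


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def adapter_hits(lines, adapter=ADAPTER, margin=10):
    """Return (name, position, internal) for every exact adapter match.

    A hit counts as internal when at least `margin` bases flank it on
    both sides, which is what a chimera would look like. The margin is
    an arbitrary illustrative cutoff. Reverse-complement matches are
    NOT searched.
    """
    hits = []
    for name, seq in read_fasta(lines):
        seq = seq.upper()
        pos = seq.find(adapter)
        while pos != -1:
            left = pos
            right = len(seq) - (pos + len(adapter))
            hits.append((name, pos, left >= margin and right >= margin))
            pos = seq.find(adapter, pos + 1)
    return hits
```

Contigs with an internal hit are the ones worth tracing back to their reads and to the SSAHA2 / SMALT hit files.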

B.

-- 
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
