[mira_talk] quantity vs quality of SNP predictions

  • From: Jorge.DUARTE@xxxxxxxxxxxx
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 6 Apr 2009 15:00:29 +0200

Hi all,

Following the advice from Bastien that I saw on this list earlier,
I used very strict parameters to assemble 454 data in order to detect
potential SNPs with good confidence.

The problem is that, with these parameters, the share of reads falling
into the debrislist reaches 60%, and of the remaining 40% of reads used
during assembly, only 25% end up in contigs with good enough coverage for
SNP detection.

In the end, only 10% of the reads were effectively used to predict
potential SNP positions...
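
For reference, the 10% figure is just the product of the two fractions
above. A minimal sketch of that arithmetic in Python (the variable names
are chosen here for illustration and do not come from MIRA):

```python
# Fractions reported in this message; names are illustrative only.
debris_fraction = 0.60                     # reads sent to the debrislist
assembled_fraction = 1 - debris_fraction   # reads actually used in the assembly
well_covered_fraction = 0.25               # assembled reads in contigs with enough coverage

# Share of ALL reads that ends up supporting SNP prediction.
usable_fraction = assembled_fraction * well_covered_fraction
print(f"{usable_fraction:.0%} of all reads contribute to SNP prediction")
```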

Has anyone else seen similar results? Or are my reads simply of low
quality?

Is it really worth losing 90% of the reads in order to gain confidence
in the SNPs discovered?

Has anyone tried different parameter settings in order to evaluate the
sensitivity vs. specificity of SNP detection with MIRA?

The species I'm working with is a polyploid eukaryote, and the sequences
are PCR amplicons which were ligated before sonication
and 454 Titanium sequencing on 4 different cultivars.

I've developed a script to detect and split potential chimeras, and
although it worked quite well, I'm pretty sure I didn't detect all
chimeric sequences. So if anyone knows of a tool that does this kind of
clipping, I'd also like to hear about it!

Thanks for any comments or thoughts you may have on these topics,

jorge.
--- 
Jorge Duarte
Bioinformatics Research Engineer
BIOGEMMA - Upstream Genomics Group
Z.I. Du Brézet
8, Rue des Frères Lumière
63028 CLERMONT FERRAND Cedex 2
FRANCE
Tel : +33 (0)4 73 39 60 73
Fax : +33 (0)4 73 39 60 71
E-mail : jorge.duarte@xxxxxxxxxxxx
