[mira_talk] denovo assembly of genome with a lot of indels

  • From: AYeroslaviz <Assa.Yeroslaviz@xxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 17 Oct 2012 15:40:45 +0200

Hi,

We have a paired-end (PE) experiment on a mouse genome with loads of mutations
in the sequence. Mapping with bowtie and/or tophat gave only ~2% mapping, so we
decided to go with a de novo assembly to see what happens.

The problem is that we are also expecting to find very large deletions,
several kb long (we saw traces of them in the tophat results, but with a very
low read number).

I am wondering whether it is a good idea to run the de novo assembly with
paired-end reads, or to just put both fastq files into one and run it as if it
were just one big library.

So my two commands will probably look like this.
For PE with an insert size of 500:
mira --fastq --project=G_AGTCAA_10989142 -DI:trt=/tmp \
  --job=denovo,genome,accurate,solexa SOLEXA_SETTINGS -GE:tismin=250:tismax=750 \
  >& log_assembly_G.txt


For the two files together as one big library of single reads:
mira --fastq --project=G_AGTCAA_10989142 -DI:trt=/tmp \
  --job=denovo,genome,accurate,solexa >& log_assembly_G.txt
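
Merging the two fastq files for this second variant could be sketched roughly
as below. The read file names are placeholders (not our real file names), the
two one-read files exist only so the sketch runs on its own, and the
<project>_in.solexa.fastq input name is an assumption about how MIRA 3 picks
up its input with --fastq, so please check it against the manual.

```shell
#!/bin/sh
set -e

# Project name taken from the commands above.
project=G_AGTCAA_10989142

# Hypothetical locations and names; replace with the real read files.
workdir=$(mktemp -d)
reads1="$workdir/reads_1.fastq"
reads2="$workdir/reads_2.fastq"

# Two tiny stand-in files (one read each) so the sketch is self-contained.
printf '@r1/1\nACGT\n+\nIIII\n' > "$reads1"
printf '@r1/2\nTGCA\n+\nIIII\n' > "$reads2"

# Assumed MIRA 3 input naming: <project>_in.solexa.fastq
combined="$workdir/${project}_in.solexa.fastq"
cat "$reads1" "$reads2" > "$combined"

# A fastq record is 4 lines, so number of reads = lines / 4.
nreads=$(( $(wc -l < "$combined") / 4 ))
echo "combined reads: $nreads"
```

The read count of the combined file should equal the sum of the two input
files; a mismatch would point to a truncated or malformed fastq file.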

What will happen to inserts bigger than the given maximum distance of 750?
Will they be ignored completely?

Thanks for the help.

cu,
Assa

-- 

Assa Yeroslaviz
Max Planck Institute for Biology of Ageing / Max-Planck-Institut für Biologie 
des Alterns 
Application service, Bioinformatics group / Bioinformatische Servicegruppe
office: ZMMK, Robert-Koch-Str. 21 D-50931 Cologne
Postal address: Postfach 41 06 23, D-50866 Köln / Cologne 
+49 (0)221 47889795
Assa.Yeroslaviz@xxxxxxxxxx
www.age.mpg.de



