[mira_talk] assembly to identify a large insert

  • From: Thomas Goldman <tomgoldman@xxxxxxxxxxx>
  • To: <mira_talk@xxxxxxxxxxxxx>
  • Date: Thu, 25 Oct 2012 11:08:55 -0700

Hello,

First, I would like to thank Bastien for MIRA and for all the support he
provides. I use it quite often.

Second, I have a problem that I hope someone here has run into before. I am
trying to identify the location of a large insert (~7.2 kb) in a genome
assembled with MIRA 3.4, for which I have MiSeq reads mapped against a
reference. Unfortunately, the reads from the inserted region (the cassette)
do not assemble to the reference; I assume this is simply because they have
no counterpart in the reference and are thrown out. I also tried a de novo
assembly. In that case the cassette is assembled, but it ends up in a
scaffold of its own without flanking reference sequence, so I still cannot
determine where the cassette is inserted. I think this is because the 5' and
3' regions of the cassette have homology to other parts of the genome. Is
there something in the mapping parameters I can change to force the cassette
reads into the reference? Or is there something else I can try to determine
where the large insertion is located?

I used the following parameters for the mapping and de novo assemblies,
respectively:

-job=mapping,genome,accurate,solexa -GE:kpmf=30:not=8 -SK:mmhr=90
-MI:somrnl=0 -OUT:orc=on:ora=on:ors=off -AS:nop=1 -SB:bft=fasta
SOLEXA_SETTINGS -CO:msr=no:mrpg=2 -GE:uti=no:tismin=250:tismax=550 -SK:pr=90
-AL:mrs=90

-job=genome,denovo,accurate,solexa -GE:kpmf=30:not=8 -SK:mmhr=90
-MI:somrnl=0 -OUT:orc=on:ora=on:ors=off SOLEXA_SETTINGS
-GE:tismin=250:tismax=550 -CO:mrpg=2 -LR:lsd=yes -SK:pr=90 -AL:mrs=90
-OUT:sssip=on:stsip=on

Thanks in advance,

Tom
