[mira_talk] Re: help with mira parameters of repeated sequences.

  • From: jyotsna guleria <jyotsna.guleria@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 20 May 2011 15:14:55 -0400

Hello,

Thank you very much!

I will use that, but since "uniform distribution" is not very clear to me, do
you think I should use this parameter in my case?
-AS:urd=yes

Also, has anyone ever used MIRA for antibody data? (This is a new project
in the pipeline, and everybody wants to know whether we could analyze the data
before we start sequencing.) Please share if you know from your experience.

Thank you very much again!!

Jyo




On Fri, May 20, 2011 at 3:05 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:

>  On Friday 20 May 2011 20:46:33 jyotsna guleria wrote:
>
> > I am working with Roche 454 sequences that contain repeated sequences. We
> > sequenced a region in a viral genome and the parameters I am using are:
> >
> > mira --project=reads --job=denovo,genome,accurate,454 454_SETTINGS
> > -LR:mxti=no COMMON_SETTINGS -SK:mnr=yes:nrr=10 -highlyrepetitive
> >
> > The reason I used "LR:mxti=no" is that MIRA was complaining that it could
> > not find the XML file. I used sff_extract and had an XML file. Can you
> > please explain this? Am I missing something?
>
>
> I suppose the file has the wrong name ... look in the error message or
> the log output for what MIRA searched for and name your file
> accordingly.
>
>
> > Also, I got many sequences in debris and I have not used -AS:urd=yes
> > (uniform distribution).
> >
> > I want contigs with singlets as well; how can I get that? If I use
> > -AS:mrl=2, will that be okay, as I know the default for Roche 454 is 5?
>
>
> I never had viral sequences to assemble myself, but from what I gather from
> users, it is not always trivial: variation in the population, extreme
> GC content, sometimes even repeats, etc.
>
>
> If you want most reads in the assembly afterwards (even as singlets), you
> should use -AS:mrpc=1 -OUT:stssip=yes:sssip=yes
>
>
> > Also, is there any way that I can combine the contigs where there is a
> > difference of one or two mismatches or a gap? I mean, how can I tune MIRA
> > to output contigs with a little flexibility for mismatches and gaps? We
> > are not looking for SNPs; we just want to target the repeat pattern of
> > sequences in that region of the genome.
>
>
> You could try to convince MIRA that all differences it sees are SNPs and
> not bases marking a repeat: -CO:asir=yes
>
>
> > For example: I got 11 contigs and I can combine them into 3 groups after
> > aligning them. So, I just want MIRA to handle that; is it possible?
>
>
> If I were you, I'd have a closer look at finishing programs like "gap4".
> It's probably better to finish a project by a couple of manual tidying steps
> than to risk the assembler doing bad things because it was forced to.
>
>
> > What parameters do you suggest I use for such data?
>
>
>
> Try the ones above and see what comes out of it. It may work well enough.
>
>
> B.
>
>
>
