[mira_talk] Re: --highlyrepetitive vs normal

  • From: Juan Daniel Montenegro Cabrera <jdmontenegroc@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Sat, 13 Oct 2012 10:29:43 -0500

Thank you Bastien,
I can see that I was overriding some of the changes made by the flag with
other parameters. When would you consider an assembly "highly repetitive"?
Does it depend on the k-mer distribution? Do you have a threshold for it,
or would you recommend one?
Regards,

Juan Montenegro

2012/10/13 Bastien Chevreux <bach@xxxxxxxxxxxx>

> On Oct 11, 2012, at 18:43 , Juan Daniel Montenegro Cabrera wrote:
>
> [...]
> and the Y axis is the result of adding the --highlyrepetitive flag.
>
> The assemblies are quite similar, but for some reason there is one huge
> misassembly in the first reference contig. The MIRA manual does not say
> much about the flag, and I wanted to know exactly what this flag changes
> in the assembly process.
>
>
> The changes made by "highlyrepetitive" are quite varied. Once the flag is
> encountered, a whole set of standard parameters is changed by default and
> some are dynamically adjusted. First, the default settings applied:
>
> "\n_COMMON_SETTINGS"
> "\n\t-AS:sep=yes"
> "\n\t-CO:mr=yes:mroir=false"
> "\n\t-SK:mnr=yes:nrr=10"
> "\n"
> "\n_SANGER_SETTINGS"
> "\n\t-GE:uti=yes"
> "\n\t-AS:urdcm=1.2"
> "\n\t-CL:pvlc=yes:pvcmla=10"
> "\n\t-DP:ure=yes:feip=0:leip=0"
> "\n\t-CO:emea=15:amgb=yes:amgbemc=yes:amgbnbs=yes"
> "\n"
> "\n_454_SETTINGS"
> "\n\t-DP:ure=no"
> "\n\t-AS:urdcm=1.4"
> "\n\t-CL:pvlc=yes:pvcmla=10"
> "\n\t-CO:emea=5:amgb=no"
> "\n"
> // TODO: what for IonTorrent? atm like 454
> "\n_IONTOR_SETTINGS"
> "\n\t-DP:ure=no"
> "\n\t-AS:urdcm=1.4"
> "\n\t-CL:pvlc=yes:pvcmla=10"
> "\n\t-CO:emea=5:amgb=no"
> "\n"
> "\n_PCBIOHQ_SETTINGS"
> "\n\t-DP:ure=no"
> "\n\t-AS:urdcm=1.4"
> "\n\t-CL:pvlc=yes:pvcmla=10"
> "\n\t-AL:egp=no"
> "\n\t-CO:emea=5:amgb=no"
> "\n"
> // TODO: PacBio LQ
> "\n_PCBIOLQ_SETTINGS"
> "\n\t-DP:ure=no"
> "\n\t-AS:urdcm=1.4"
> "\n\t-CL:pvlc=yes:pvcmla=10"
> "\n\t-AL:egp=no"
> "\n\t-CO:emea=5:amgb=no"
> "\n"
> "\n_SOLEXA_SETTINGS"
> "\n\t-DP:ure=no"
> "\n\t-AS:urdcm=1.9"
> "\n\t-CL:pvlc=no"
> "\n\t-CO:emea=5:amgb=yes:amgbemc=yes:amgbnbs=yes"
> // for Text "technology", nothing
>
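Each string in the listing above names a parameter group (for example -CO)
followed by colon-separated name=value pairs. A minimal sketch of how one
such string breaks down, assuming a hypothetical helper splitSetting() that
is not part of MIRA and only illustrates the notation:

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical container for one parsed setting string; not a MIRA type.
struct ParsedSetting {
    std::string group;                          // e.g. "CO"
    std::map<std::string, std::string> values;  // e.g. {"emea": "5"}
};

// Hypothetical helper: splits "-CO:emea=5:amgb=no" into group and pairs.
ParsedSetting splitSetting(const std::string& setting) {
    ParsedSetting parsed;
    // Drop the leading '-' if there is one.
    const bool hasDash = !setting.empty() && setting[0] == '-';
    std::istringstream in(hasDash ? setting.substr(1) : setting);

    std::string token;
    bool first = true;
    while (std::getline(in, token, ':')) {
        if (first) {
            // The first colon-separated token is the parameter group.
            parsed.group = token;
            first = false;
            continue;
        }
        // Every later token is expected to look like name=value.
        const std::string::size_type eq = token.find('=');
        if (eq == std::string::npos) {
            parsed.values[token] = "";
        } else {
            parsed.values[token.substr(0, eq)] = token.substr(eq + 1);
        }
    }
    return parsed;
}
```

Calling splitSetting("-CO:emea=5:amgb=no") yields the group "CO" with the
pairs emea=5 and amgb=no.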
> After that, -AS:nop and -AS:rbl are adjusted; the code and its output
> should speak for themselves:
>
>       if(tmpactpar->mp_assembly_params.as_numpasses<6){
>         cout << "  - increassing number of passes (-AS:nop) ";
>         tmpactpar->mp_assembly_params.as_numpasses++;
>         if(tmpactpar->mp_assembly_params.as_numpasses<=5){
>           tmpactpar->mp_assembly_params.as_numpasses++;
>           cout << "by two.\n";
>         }else{
>           cout << "by one.\n";
>         }
>       }
>       if(tmpactpar->mp_assembly_params.as_numrmbbreakloops<3){
>         cout << "  - increasing maximum of RMB break loop (-AS:rbl).\n";
>         tmpactpar->mp_assembly_params.as_numrmbbreakloops++;
>       }
>
>
> B.
>
>
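The net effect of the quoted -AS:nop / -AS:rbl snippet can be sketched as a
small standalone function. The struct and function names here are
hypothetical stand-ins, not MIRA identifiers; only the comparisons and
increments are taken from the quoted code:

```cpp
// Hypothetical stand-in for the two MIRA parameters the snippet touches.
struct PassSettings {
    int numPasses;         // corresponds to -AS:nop
    int numRmbBreakLoops;  // corresponds to -AS:rbl
};

// Mirrors the quoted logic, minus the console output.
PassSettings adjustForHighlyRepetitive(PassSettings s) {
    if (s.numPasses < 6) {
        ++s.numPasses;          // always at least one extra pass
        if (s.numPasses <= 5) {
            ++s.numPasses;      // a second extra pass ("by two")
        }
    }
    if (s.numRmbBreakLoops < 3) {
        ++s.numRmbBreakLoops;   // one more RMB break loop, capped at 3
    }
    return s;
}
```

Under this reading, a starting -AS:nop of 4 or less goes up by two, a value
of 5 goes up by one to 6, and 6 or more is left alone. -AS:rbl goes up by
one only while it is below 3.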
