[mira_talk] Re: mmhr problem
- From: "Clayton Coffman" <clayton.coffman@xxxxxxxxx>
- To: mira_talk@xxxxxxxxxxxxx
- Date: Fri, 21 Nov 2008 04:46:29 -0600
Thanks for the quick reply! I'll try that and let you know how it works!
I really like this software BTW, thanks so much for making it. I am not a
big bioinformatics person and I feel like I am getting along with it very
well. I can tell it was made by someone who needed to use it, and not just
someone who wanted to sell it.
Cheers,
C
On Thu, Nov 20, 2008 at 5:54 PM, Bastien Chevreux <bach@xxxxxxxxxxxx> wrote:
> On Thursday 20 November 2008 23:59, Clayton Coffman wrote:
> > I am having a problem running a de novo 454 EST assembly. Essentially it
> > tells me I have lots of megahubs, and I am trying to track down what these
> > are. To do that I am trying to force it to go ahead with the assembly by
> > setting mmhr=1, but it always aborts anyway, saying the ratio is greater
> > than 0, even though I set it to the maximum of 1. I could be doing it wrong;
> > here's what I do:
> >
> > mira -fasta -project=Px -SK:mmhr=1 -job=denovo,est,draft,454
>
> Hello Clayton,
>
> I need to clarify in my docs that the quick switches (--job=... and friends)
> should be used towards the front of the command line, as they overwrite almost
> every other option of MIRA.
>
> So, use: "mira -fasta -project=Px -job=denovo,est,draft,454 -SK:mmhr=1"
>
> and you're good to go.
>
> > Is there a better way to find out what the megahub is? My sequences aren't
> > paired-end and I set sff_extract to trim an appropriate number of bases on
> > the left to account for an adapter which I know is supposed to be there.
>
> To see whether the adaptor is consistently on the left side of your reads, use
> sff_extract once without a left clip. If the adaptor sequence is consistently
> there, sff_extract will report that.
>
> The following is a short guide on how to find the really nasty repeats in
> your reads. I admittedly need to smooth out a few things with MIRA, but at
> least it works :-)
>
> Run the assembly in a separate directory once with
> "-SK:mnr=yes:rt=<some-int-between-5-and-10>"
>
> MIRA will then mask the <int>% most frequently occurring k-mers in your reads
> and report these to a file in the log directory.
>
> This happens almost directly after loading, so you can CTRL-C the program once
> you've seen these lines:
>
>
> ---------------------------------------------------------------------------
>
> Skimming for repeats (1/3)
> [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|....
> [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|....
> [80%] ....|.... [90%] ....|.... [100%]
> Compressing hash histogram ... done. Sorting ... (this may take a while) ... done.
> Used hashes: 6105868
> Unused hashes: 262329588
> Median hashes: 14
> Alternative median hashes: 21
> Max hashes: 265
> Masking starts at: 210
>
> Skimming for repeats (2/3)
> [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|....
> [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|....
> [80%] ....|.... [90%] ....|.... [100%]
>
> Skimming for repeats (3/3)
> [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|....
> [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|....
> [80%] ....|.... [90%] ....|.... [100%]
> Localtime: Fri Nov 21 00:48:02 2008
>
> ---------------------------------------------------------------------------
>
> The file you should look for is named
> "*_int_skimmarknastyrepeats_nastyseq_preassembly.0.lst" and is in
> tab-delimited format (name, masked sequence):
>
> nGGMAW54TR GGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGG
> nGGMAY80TF GGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGG
> nGGMB067TR GGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGG
> nGGMBD71TR GGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGGCTTCGG
>
> Hope it helps,
> Bastien
>
>
> --
> You have received this mail because you are subscribed to the mira_talk
> mailing list. For information on how to subscribe or unsubscribe, please
> visit http://www.chevreux.org/mira_mailinglists.html
>
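
The "nasty repeats" list quoted above can run to many lines, so it may help to summarise it. The following is only a rough sketch, not part of MIRA: it assumes each line holds a read name and a single masked sequence separated by whitespace (as in the excerpt), and the function names `read_masked`, `smallest_period` and `tally_units` are made up for illustration. It reduces each masked sequence to its shortest repeating unit and counts how often each unit appears.

```python
import sys
from collections import Counter


def read_masked(path):
    """Yield (read_name, masked_sequence) pairs from a whitespace/tab-delimited list."""
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:  # skip blank or malformed lines
                yield fields[0], fields[1]


def smallest_period(seq):
    """Return the shortest prefix whose repetition reproduces seq.

    The last copy of the unit may be truncated, e.g. GGCTTCGGCTTCGG -> GGCTTC.
    A sequence with no internal repeat is returned unchanged.
    """
    for size in range(1, len(seq) + 1):
        unit = seq[:size]
        if (unit * (len(seq) // size + 1)).startswith(seq):
            return unit
    return seq


def tally_units(pairs):
    """Count how many masked sequences reduce to each repeat unit."""
    counts = Counter()
    for _name, seq in pairs:
        counts[smallest_period(seq.upper())] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (the path is whatever *.lst file MIRA wrote to its log directory):
    #   python tally_repeats.py <path-to-lst-file>
    for unit, n in tally_units(read_masked(sys.argv[1])).most_common(10):
        print(f"{n}\t{unit}")
```

On the four lines shown in the excerpt, every masked sequence reduces to the same unit, GGCTTC, which suggests a simple hexamer repeat rather than an adaptor. Masked stretches that are not pure tandem repeats are counted under their full sequence.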