[mira_talk] megahubs ?

From: Jan van Haarst <jan@xxxxxxxxxxxxx>
To: mira_talk@xxxxxxxxxxxxx
Date: Sat, 28 Feb 2009 21:05:01 +0100

Hello All,

I just ran my first assembly; it consists only of 454 FLX data, paired
and unpaired.
I used the manual at
http://www.chevreux.org/uploads/media/mirav2939_454dev.html.

I now have the following problem: the assembly stopped because of
"megahubs".
The relevant part of the output says:

Total megahubs: 28773


MIRA has detected megahubs in your data.This may not be a problem, but most probably is, especially for eukaryotes.



You have more than 0.0% of your reads found to be megahubs.

You should sheck the following:

        1) for Sanger sequences: are all the sequencing vectors masked / clipped?
        2) for 454 sequences: are all the adaptors masked / clipped?

To learn more on the types of repeats you have ... (to be extended)
*ONLY* when you are sure that no (or only a very negligible number) of sequencing
vector / adaptor sequence is remaining, try this:

        3) for organisms with complex repeats (eukaryots & some bacteria):
                - use -SK:mnr=yes
                - reduce the -SK:rt parameter by 2 and try again (iteratively, down to 4)

*ONLY* if the above fails, try increasing the -SK:mmhr parameter



You have 0.5% of your reads as megahubs.
You have set a maximum allowed ratio of: 0.0

Ending the assembly because the maximum ratio has been reached/surpassed.
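
For reference, the iteration suggested in step 3 of the output can be sketched as below. This is only an illustration of "reduce by 2, down to 4": the starting value of 10 is a hypothetical placeholder (the value actually in effect is not shown in the output), and the helper name `rt_schedule` is made up for this sketch. It only prints the option strings quoted in the message and does not run MIRA.

```python
def rt_schedule(start, floor=4, step=2):
    """Return the successive -SK:rt values to try, reducing by `step`
    each time and never going below `floor`."""
    values = []
    rt = start - step
    while rt >= floor:
        values.append(rt)
        rt -= step
    return values


# Hypothetical starting value; replace with the -SK:rt actually in use.
for rt in rt_schedule(10):
    print(f"-SK:mnr=yes -SK:rt={rt}")
```

Each printed line corresponds to one retry of the assembly, to be attempted only after adaptor clipping has been checked as the message advises.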

What should I do to get the assembly to finish?
0.5% bad reads doesn't sound that bad, does it?
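
As I read the output, the stop condition is a plain comparison of the two numbers it reports. A minimal sketch of that reading follows; it assumes both values are percentages, that "reached/surpassed" means greater than or equal, and that the limit is what -SK:mmhr sets. The function name is made up for illustration and is not MIRA's code.

```python
def assembly_stops(megahub_percent, max_allowed_percent):
    """True if the share of megahub reads reaches or surpasses the limit.

    Assumption: the check only matters when megahubs were found at all.
    """
    return megahub_percent > 0 and megahub_percent >= max_allowed_percent


print(assembly_stops(0.5, 0.0))  # values from the output above
print(assembly_stops(0.5, 1.0))  # with a hypothetical higher limit
```

With a limit of 0.0, any non-zero share of megahubs ends the run, however small it looks.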
-- 
Bye,
Jan

Follow-Ups:
- [mira_talk] Re: megahubs ?
  - From: Bastien Chevreux
