[mira_talk] Re: Total consensus length is double ?

From: Lionel Guy <guy.lionel@xxxxxxxxx>
To: mira_talk@xxxxxxxxxxxxx
Date: Fri, 9 Jul 2010 12:44:13 +0200

Hi Jan,

From your plot it seems that you have a lot of very small contigs that are not aligned. These are likely to be spurious. I would recommend filtering these out. As a rule of thumb, any contig that is only slightly longer than the read length and has a coverage lower than 1/3 or 1/2 of the average coverage may be discarded.

You can use convert_project to do that. With an assembly of 454 Titanium data (read length ~400) and ~60X average coverage, I use:

convert_project -f caf -t caf -x 500 -y 20 raw_assembly.caf filtered_assembly
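
To make the rule of thumb concrete, here is a minimal Python sketch using the same numbers as above (read length 400, average coverage 60X, so cutoffs of 500 bases and 20X). It is only an illustration of the rule, not what convert_project does internally, and convert_project may combine its two cutoffs differently. The function name, the 1.25 length factor and the example contigs are all made up for this sketch.

```python
def is_spurious(length, coverage, read_length, avg_coverage,
                length_factor=1.25, coverage_fraction=1 / 3):
    """Return True if a contig looks spurious under the rule of thumb.

    Assumptions (hypothetical, for illustration only):
    - "slightly longer than the read length" means up to
      length_factor * read_length (1.25 * 400 = 500);
    - "low coverage" means below coverage_fraction * avg_coverage
      (1/3 * 60 = 20).
    """
    max_short_length = read_length * length_factor
    min_coverage = avg_coverage * coverage_fraction
    return length <= max_short_length and coverage < min_coverage


# Made-up contigs: (name, length in bases, average coverage)
contigs = [
    ("c1", 450, 8.0),     # short and thinly covered
    ("c2", 480, 55.0),    # short but well covered
    ("c3", 12000, 15.0),  # long, lower coverage
    ("c4", 90000, 62.0),  # long and well covered
]

kept = [name for name, length, cov in contigs
        if not is_spurious(length, cov, read_length=400, avg_coverage=60)]
print(kept)  # ['c2', 'c3', 'c4']
```

With these values only c1 is dropped; using 1/2 instead of 1/3 for coverage_fraction gives the stricter variant of the rule.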




On 9 Jul 2010, at 11:39 , Jan van Haarst wrote:

Dear All,
I'm working on some benchmarks of assemblers [1], and one of those is MIRA. For that purpose, I have downloaded 3 datasets [2], and put those into MIRA after using sff_extract.
I have run the assembly using the parameters as mentioned on the MIRA website:
mira --project=$PROJECT --job=denovo,genome,accurate,454
I have used mira_3.2.0rc1_dev_linux-gnu_x86_64_static for this.
What I see is that the resulting consensus is twice the size of the reference E. coli genome! If I do a mummerplot of the consensus versus the reference (hopefully attached), I see that the complete reference is present, but also a lot of other data.
I would like to know what I can do to get MIRA to give results about the same as (or better than) newbler or CABOG using this dataset.
--
Dag,
Jan

[1] https://wiki.nbic.nl/index.php/Raw_results_of_NGS_de_novo_assembly
[2] ENA SRR00086, ENA SRR000870, ENA SRR001028
<mira_vs_reference_filtered_SNP.png>


============================================
Lionel Guy
Thunmansgatan 25, SE-75421 Uppsala

phone: +46 (0)18 245596
mobile: +46 (0)73 9760618
email: guy.lionel@xxxxxxxxx
============================================


--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html

References:
- [mira_talk] Total consensus length is double ?
  - From: Jan van Haarst
