[mira_talk] Re: Total consensus length is double ?

  • From: Lionel Guy <guy.lionel@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 9 Jul 2010 12:44:13 +0200

Hi Jan,

From your plot, it seems that you have a lot of very small contigs that do not align to the reference. These are likely to be spurious, and I would recommend filtering them out. As a rule of thumb, any contig that is only slightly longer than the read length and has a coverage below 1/3 to 1/2 of the average coverage may be discarded.

You can use convert_project to do that. For an assembly of 454 Titanium data (read length ~400 bases) with ~60X average coverage, I use:

convert_project -f caf -t caf -x 500 -y 20 raw_assembly.caf filtered_assembly
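If you want to check which contigs the rule of thumb would flag before filtering, here is a minimal Python sketch. It is only an illustration and not part of MIRA: the function names, the (name, length, coverage) tuple layout and the two factors are assumptions, chosen so that a read length of 400 and an average coverage of 60 give the same cut-offs as above (500 and 20). It flags a contig only when it is both short and thinly covered, as in the rule of thumb; how convert_project combines its two thresholds is not covered here.

```python
def is_probably_spurious(length, coverage, read_length, avg_coverage,
                         length_factor=1.25, coverage_fraction=1 / 3):
    """Rule of thumb: short AND thinly covered contigs are suspect.

    length_factor and coverage_fraction are illustrative assumptions;
    with read_length=400 and avg_coverage=60 they give cut-offs of
    about 500 bases and 20X.
    """
    is_short = length < read_length * length_factor
    is_thin = coverage < avg_coverage * coverage_fraction
    return is_short and is_thin


def filter_contigs(contigs, read_length, avg_coverage):
    """Keep the contigs that the rule of thumb does not flag.

    contigs is a list of (name, length, average_coverage) tuples,
    a hypothetical layout used only for this example.
    """
    return [
        contig for contig in contigs
        if not is_probably_spurious(contig[1], contig[2],
                                    read_length, avg_coverage)
    ]


if __name__ == "__main__":
    # Made-up contigs, for illustration only.
    example = [
        ("c1", 450, 8.0),      # short and thin: flagged
        ("c2", 450, 55.0),     # short but well covered: kept
        ("c3", 12000, 10.0),   # thin but long: kept
        ("c4", 30000, 62.0),   # long and well covered: kept
    ]
    for name, length, coverage in filter_contigs(example, 400, 60):
        print(name, length, coverage)
```

The per-contig lengths and coverages would have to come from your own assembly statistics; the values in the example are invented.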



On 9 Jul 2010, at 11:39 , Jan van Haarst wrote:

Dear All,

I'm working on some benchmarks of assemblers [1], and one of those is MIRA. For that purpose, I have downloaded 3 datasets [2] and fed them to MIRA after running sff_extract.

I ran the assembly using the parameters given on the MIRA website:
mira --project=$PROJECT --job=denovo,genome,accurate,454
I have used mira_3.2.0rc1_dev_linux-gnu_x86_64_static for this.

What I see is that the resulting consensus is twice the size of the reference E. coli genome! If I do a mummerplot of the consensus versus the reference (hopefully attached), I see that the complete reference is present, but also a lot of other data.

I would like to know what I can do to get MIRA to give results about as good as (or better than) those of newbler or CABOG on this dataset.

--
Regards,
Jan

[1] https://wiki.nbic.nl/index.php/Raw_results_of_NGS_de_novo_assembly
[2] ENA SRR00086, ENA SRR000870, ENA SRR001028
<mira_vs_reference_filtered_SNP.png>

============================================
Lionel Guy
Thunmansgatan 25, SE-75421 Uppsala

phone: +46 (0)18 245596
mobile: +46 (0)73 9760618
email: guy.lionel@xxxxxxxxx
============================================

