Below are two assemblies I did:
1. I gave MIRA 90x adaptor-free reads; 62x of them were used, producing
754 contigs.
Length assessment:
------------------
Number of contigs: 754
Total consensus: 7705193
Largest contig: 102257
N50 contig size: 22587
N90 contig size: 5455
N95 contig size: 2524
2. I gave MIRA 100x raw reads; 70x of them were used, producing 366
contigs:
Length assessment:
------------------
Number of contigs: 366
Total consensus: 7261126
Largest contig: 152181
N50 contig size: 45496
N90 contig size: 12905
N95 contig size: 6847
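
For anyone comparing the two length assessments above, here is a minimal Python sketch of how Nx values are usually computed. It assumes the common definition (the length of the contig at which the running sum of contigs, sorted longest first, reaches x% of the total consensus). MIRA's own calculation may differ in details. The function name `nx` and the toy lengths are made up for illustration and are not taken from either assembly.

```python
def nx(lengths, percent):
    """Return the Nx contig size for a list of contig lengths.

    Assumed definition: sort contigs from longest to shortest and return the
    length of the contig at which the running sum first reaches `percent`
    percent of the total length. Integer arithmetic avoids rounding issues.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 100 >= percent * total:
            return length
    return min(lengths)  # only reached if percent > 100


# Toy example with made-up lengths (total = 300)
toy = [100, 80, 60, 40, 20]
print(nx(toy, 50), nx(toy, 90), nx(toy, 95))  # prints: 80 40 20
```

Under this definition, a higher N50 at a similar total consensus means the assembly is less fragmented, which is the sense in which run 2 looks better than run 1.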
I think using raw data is okay for MIRA, but I don't understand why the
trimmed data gave a worse result than the raw data. It should be better,
right? Or is something wrong with my settings?
Here is the config file I used:
project = 90xmiradenovo
job = genome,denovo,accurate
parameters = -DI:trt=/tmp -NW:cmrnl=no SOLEXA_SETTINGS -CL:pec=yes
readgroup = DataForautopairing
data = ./45x_1.fastq ./45x_2.fastq
technology = solexa
template_size = 300 700 autorefine
strain = B14
Thanks!
Yi
2015-11-11 21:57 GMT+08:00 Huang Yi <huang.y.hy@xxxxxxxxx>:
thank you!
2015-11-11 21:54 GMT+08:00 Bastien Chevreux <bach@xxxxxxxxxxxx>:
On 11 Nov 2015, at 8:49 , Huang Yi <huang.y.hy@xxxxxxxxx> wrote:
[…]
My other question is: there is a "Num. reads assembled" entry in the file
"info_assembly.txt", which I think shows how many reads were used for the
MIRA assembly. This number is lower than the number of reads I fed to MIRA.
Is it possible to pull those unused reads out?
Yes. Reads mapped/assembled are tracked in the file
“*projectname*_info_contigreadlist.txt”; reads not mapped/assembled (and the
reason why) are in the file “*projectname*_info_debrislist.txt”.
B.
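
As a follow-up to the answer above, pulling the unassembled reads back out of the input FASTQ could look roughly like the Python sketch below. It is not part of MIRA and rests on assumptions: that each non-comment line of the debris list starts with the read name followed by whitespace, that the FASTQ uses plain four-line records, and that the read names in the debris list match the FASTQ headers exactly (any pairing suffixes added or removed during loading would need extra handling). The function names are hypothetical.

```python
def load_debris_names(path):
    """Collect read names from a debris list file.

    Assumption: the read name is the first whitespace-separated field on each
    line; blank lines and lines starting with '#' are skipped.
    """
    names = set()
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields and not line.startswith("#"):
                names.add(fields[0])
    return names


def filter_fastq(fastq_in, fastq_out, wanted):
    """Copy the FASTQ records whose read name is in `wanted`.

    Assumption: strict four-line FASTQ records (header, sequence, '+', quality).
    Returns the number of records written.
    """
    kept = 0
    with open(fastq_in) as src, open(fastq_out, "w") as dst:
        while True:
            record = [src.readline() for _ in range(4)]
            if not record[0]:
                break  # end of file
            parts = record[0][1:].split()  # drop leading '@', keep the name
            name = parts[0] if parts else ""
            if name in wanted:
                dst.writelines(record)
                kept += 1
    return kept
```

For the project in this thread, one would call `load_debris_names` on the 90xmiradenovo_info_debrislist.txt file and then run `filter_fastq` once per input file (45x_1.fastq and 45x_2.fastq), writing to output names of one's own choosing.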