[mira_talk] Re: adaptor trim

  • From: Andrej Benjak <abenjak@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 18 Nov 2015 09:29:15 +0100

I guess it depends on the bug, IMHO.
I work mostly with /Mycobacterium tuberculosis/ (4.4 Mb, several identical repeats plus some nasty areas) and /M. leprae/ (3.268 Mb, several repeats and a couple of duplicated genes or gene fragments), and with Illumina only (either SE or PE) the best I normally get is a few dozen contigs. My last assembly was of /M. leprae/ using MIRA 4.9.5_2 with default parameters, Illumina HiSeq SE, 114x average coverage: 56 contigs (3.26 Mb) plus a bunch of short contigs. Granted, /M. leprae/ is tricky because it cannot be cultured in vitro, so it always has some host DNA (the above example had only a few % though).
I have not assembled any /Mtb/ with the newest MIRA (which I think has improved a lot), but I would not be surprised if I got, say, 50 contigs.
I know that library prep, merging PEs, changing parameters and subsampling all influence the assembly, but getting only one or a few contigs for these bugs using only Illumina would be difficult, to say the least.

Then there are the nasty bugs like /Dactylosporangium/: 10 Mb, super high GC, probably very repetitive. Illumina SE gave >1000 contigs, and we even had trouble with PacBio a couple of years ago (hundreds of contigs). Finally we did a new PacBio run just recently, 1 cell, newest chemistry: 1 contig :)

Hope this gives some perspective.

Andrej

On 18-Nov-15 07:52, Veljo Kisand wrote:

> On 11/18/2015 06:23 AM, Bastien Chevreux wrote:
>> On 17 Nov 2015, at 23:20, Huang Yi <huang.y.hy@xxxxxxxxx> wrote:
>>> Then what is "today's standard"?
>> For bacteria? 1 contig per chromosome for 80 to 90% of the bugs.
>>
>> Sometimes a couple of contigs more in case of really nasty repeat structures,
>> but that's more or less it.
>>
>> B.
> Bastien, do you mean such a standard is based on Illumina reads only?
>
> -v-

