[mira_talk] Re: PacBio for scaffolding?

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 18 Jul 2011 22:54:14 +0200

On Jul 16, 2011, at 18:11 , 000.calabi.yau.000@xxxxxxxxxxxxxx wrote:
> I have seen that PacBio released some E. coli datasets
> (http://www.pacbiodevnet.com/share/datasets/EColiOutbreak).

Yep, seen them too.

> I wonder what your opinion is on using reads of this length for scaffolding 
> in larger genome projects.

Reads of length 3kb? Wonderful.

> I mean, the error rate still seems pretty high, but with such long reads this 
> shouldn't be too big a problem, should it?

Unfortunately, it is. A 15% error rate means an error every 6 to 7 bases on 
average. That's way too much for my liking. The normal MIRA workflow would also 
not work well with such data, but I had plans to test a couple of things.
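
As a quick check of that figure, here is a minimal sketch. It assumes errors are spread evenly along a read, which is a simplification for illustration and not a measured property of the PacBio data. The function name is hypothetical.

```python
def mean_bases_between_errors(error_rate: float) -> float:
    """Average number of bases per error, assuming evenly spread errors."""
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in (0, 1]")
    return 1.0 / error_rate


# 15% error rate, as quoted for the raw reads
spacing = mean_bases_between_errors(0.15)
print(f"about one error every {spacing:.1f} bases")

# Expected number of errors in a single 3 kb read at the same rate
read_length = 3000
print(f"about {read_length * 0.15:.0f} errors per {read_length} bp read")
```

The reciprocal of 0.15 lies between 6 and 7, which matches the "every 6 to 7 bases" estimate above.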

> So I am wondering: if one were to go in that direction, would it make sense 
> to do a MIRA hybrid assembly, or would this need more specialized assembly 
> routines?
> And if so, are you thinking about adding support like this to MIRA?

I am. PacBio probably also realised that they would not get much momentum if 
many of the available tools do not work with their data. At least that is my 
interpretation of their recent efforts to present long CCS reads (circular 
consensus sequence reads), which they say have 93% accuracy. Now that is 
something MIRA can start to work with. Not perfect, but not bad either.
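
To give a rough feel for the difference between 85% and 93% accuracy, the sketch below estimates how often a short stretch of a read comes out completely error-free. It assumes errors are independent and uniformly distributed, and the 20-base window is an arbitrary illustrative choice, not a MIRA parameter. The function name is hypothetical.

```python
def error_free_window_probability(accuracy: float, window: int) -> float:
    """Probability that `window` consecutive bases are all correct,
    assuming independent per-base errors (a simplifying assumption)."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be in [0, 1]")
    if window < 0:
        raise ValueError("window must be non-negative")
    return accuracy ** window


window = 20  # illustrative window length only
for label, accuracy in (("raw reads, 85%", 0.85), ("CCS reads, 93%", 0.93)):
    p = error_free_window_probability(accuracy, window)
    print(f"{label}: P(error-free {window}-mer) = {p:.3f}")
```

Under these assumptions, error-free 20-base stretches are several times more frequent at 93% accuracy than at 85%, which is one way to see why the CCS reads are a more workable starting point.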

Will need some time though.

B.