[mira_talk] Re: How does MIRA use paired end information?

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 15 Sep 2011 23:09:56 +0200

On Sep 15, 2011, at 20:27 , Cleo HC Ho wrote:
> Does anyone know how MIRA uses 454 paired end information in the assembly? 

I suppose I do.

> When Newbler assembles, it considers PE info and therefore joins contigs 
> within the assembly to produce both scaffolds and contigs as output. 
> Since PE data can be input to MIRA, I'm just wondering where the PE info 
> eventually goes, and whether it's ever used to extend/join contigs, other than 
> being assembled into contigs. 

Unlike other assemblers, MIRA does not build non-repetitive and repetitive 
contigs separately if it can avoid it.

Here's a very brief overview of what it does:

1) start a contig at a read estimated to contain non-repetitive sequence
2) extend the contig with reads containing non-repetitive sequence, or with 
reads where the repetitive sequence is flanked by non-repetitive sequence 
within the read. PE information is already used here: if read distance or 
direction does not match when the second read of a pair is entered, that read 
is rejected (and perhaps placed somewhere else later on)
3) using the reads from step 2, cautiously venture into repetitive ends of a 
contig by placing fully repetitive reads which have their non-repetitive 
partner in the contig. Here too, PE information is used to assess whether the 
pairs are OK. If not, the repetitive read gets rejected (to be placed perhaps 
later on). If this strategy bridges a repeat, go back to 2.

4) if step 1 yielded no starting read, then a contig gets built which very 
probably consists only of repetitive reads. PE information is still used to 
assess whether reads are being placed at the right distance.
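
The pair check mentioned in steps 2 to 4 can be illustrated with a small 
Python sketch. This is not MIRA's code: the names Placement, 
pair_is_consistent and place_or_defer are made up for illustration, and the 
expected mate orientation and the insert size bounds are assumptions which 
depend on the library. It only shows the idea that the second read of a pair 
is rejected when distance or direction does not fit.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Placement:
    start: int     # leftmost contig position covered by the read
    end: int       # rightmost contig position covered by the read (exclusive)
    forward: bool  # True if the read lies on the forward strand of the contig


def pair_is_consistent(first: Placement, second: Placement,
                       min_insert: int, max_insert: int,
                       expected: Tuple[bool, bool] = (True, False)) -> bool:
    """Check direction and distance of two placed mates.

    `expected` is the (left mate forward?, right mate forward?) pattern.
    The default, inward-facing mates, is an assumption; use whatever
    orientation the sequencing library actually produces.
    """
    # Order the two mates by their position in the contig.
    left, right = (first, second) if first.start <= second.start else (second, first)
    if (left.forward, right.forward) != expected:
        return False  # direction does not match
    span = max(left.end, right.end) - left.start
    return min_insert <= span <= max_insert  # distance must be in range


def place_or_defer(placed: Dict[str, Placement], read_id: str, mate_id: str,
                   candidate: Placement, min_insert: int, max_insert: int,
                   expected: Tuple[bool, bool] = (True, False)) -> bool:
    """Place a read unless its already placed mate contradicts the placement.

    Returns False when the read is rejected, so the caller can try to place
    it somewhere else later on.
    """
    mate = placed.get(mate_id)
    if mate is not None and not pair_is_consistent(
            mate, candidate, min_insert, max_insert, expected):
        return False
    placed[read_id] = candidate
    return True
```

A read whose partner is not yet in the contig is accepted without a check; 
the check only fires when the second read of the pair is entered.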

Things are - should I say of course? - a lot more complicated than that. E.g., 
earlier on in the assembly, the routines which reduce overlap graphs apply 
different rules to reads with PE information than to reads without it. Also, 
the routines slashing their way through the constructed graph (the pathfinder 
routines interacting with the contig currently being built) treat reads with 
PE info slightly differently than reads without. And so on.

I agree that MIRA lacks a scaffolder of its own.

B.



--
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html
