[mira_talk] Re: Where is my assembly at?

  • From: Artemus Harper <subanark@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 15 Sep 2011 15:08:28 -0700

Yep, that's the plan. I got a box with 512G of memory, which hopefully
should be enough. I'm only putting in 1/7 of the data though.

On Thu, Sep 15, 2011 at 2:58 PM, Robert Bruccoleri <
bruc@xxxxxxxxxxxxxxxxxxxxx> wrote:

> Dear Artemus,
>     Is this a plant genome? Are you sequencing the whole thing?
>
>     Cheers,
>     Bob
>
>
> Bastien Chevreux wrote:
>
> On Sep 15, 2011, at 23:02 , Robert Bruccoleri wrote:
>
>
>      For the benefit of all mira users, could you explain these one letter 
> codes in more detail? Specifically, what do they all mean and what can be 
> done about them?
>
>
>  Probably, but not atm, I'm a bit short on time.
>
>
>
>      I've looked in the source code, and I understand some of them (like 
> 'G', which means repetitive sequence), but I don't understand what 'a' really 
> means.
>
>
>  'a' == Align problem
>
> Specifically, there was an align overlap in pairwise comparison between reads 
> r1 and r2 which could be computed during the Smith-Waterman screening. But 
> during contig building, one of the reads (say, r1) got inserted in the contig 
> and when the pathfinder told the contig to use the align overlap of r1 & r2 
> as template to insert read r2, the contig suddenly did not find any overlap 
> anymore. Often happens at repetitive sites or when reads inserted in-between 
> bring in too much noise through sequencing errors.
>
> But the somewhat larger number of 'a' codes Artemus posted isn't really what 
> made me gasp ... it was more the x / y / z numbers at the end of each line: 
> these are timings MIRA keeps track of, showing how much time it spends where. 
> The 'x' component is the one for the pathfinder and is generally in the 
> single- or two-digit range. Repetitive areas spike it up to higher numbers 
> (three, very rarely four digits), but these normally then go back down more 
> or less quickly.
>
> The numbers posted are 6-digit! Meaning that for considerable stretches it 
> takes 10,000 times longer than it should. I'd now like to find out what 
> triggers this.
>
> B.
>
>
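
A rough sketch for locating the stretches Bastien describes in a saved log. The exact layout of the progress lines is not shown in this thread, so the pattern below is an assumption: it only expects each line to end in three integers written as "x / y / z", with 'x' being the pathfinder timing. The names pathfinder_time and slow_lines are made up for illustration and are not part of MIRA.

```python
import re

# ASSUMPTION: a progress line ends with three integers "x / y / z".
# The real MIRA log layout may differ; adjust the pattern if so.
TIMING_RE = re.compile(r"(\d+)\s*/\s*(\d+)\s*/\s*(\d+)\s*$")


def pathfinder_time(line):
    """Return the 'x' (pathfinder) component of a line, or None if absent."""
    match = TIMING_RE.search(line)
    return int(match.group(1)) if match else None


def slow_lines(lines, min_digits=5):
    """Return (line_number, x) pairs whose x has at least min_digits digits.

    The thread describes one to two digits as typical and three to four
    as repeat-induced spikes, so five or more is treated as abnormal here.
    """
    threshold = 10 ** (min_digits - 1)
    hits = []
    for number, line in enumerate(lines, start=1):
        x = pathfinder_time(line)
        if x is not None and x >= threshold:
            hits.append((number, x))
    return hits
```

Running slow_lines over the lines of a log file gives the line numbers where the pathfinder timing stays high, which makes it easier to see whether the spikes are isolated or cover long stretches.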


-- 
Artemus Harper
