[mira_talk] Re: repeat clusters

From: Bastien Chevreux <bach@xxxxxxxxxxxx>
To: mira_talk@xxxxxxxxxxxxx
Date: Wed, 9 Jun 2010 00:32:45 +0200

On Tuesday 08 June 2010 bio5yz wrote:
>     Here are 2 examples, from our previous pairwise alignment (done after
> known repeats were masked):
> 
> Read : SETTMV_2SR9W01AWCRM  obtained 4675 hits in region (107,289) and 4488
> hits in region (329,400).
> This was assembled to 4 member contig SETTMV_rep_c7735 by MIRA.
> 
> Read: SETTMV_2SR9W01B13M0 obtained 5171 hits in region (0,132) and 7138
>  hits in region (155,472).
> This read was assembled to 2 member contig SETTMV_rep_c7339 by MIRA.
> 
> The purpose of this exercise was originally to determine chimeras in a
> dataset that would be used for expression analysis later.
> 
> A large number of the reads that these 2 hit were found in the debris
>  files. There are definitely repeat regions within all these sequences 

This is the nasty repeat masking at work, I'm pretty sure. Here you have it:

> RD      SETTMV_2SR9W01AWCRM
> [...]
> RT      MNRr 41 99
> RT      MNRr 117 170
> RT      MNRr 172 406
> [...]

Of the 406 bases, only a couple of stretches are not masked as nasty (the first 
40 bases, 18 bases at pos 99 and 2 bases at 170). The rest is masked. I suppose 
that the non-masked areas contain rare splice variants, sequencing errors or 
adaptor remnants (at the front).
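
To see which stretches of a read escape the mask, the gaps between the MNRr 
tags can be computed directly. The following is only a minimal sketch: the 
function names are hypothetical, and it assumes the tag coordinates are 
1-based and inclusive, so the stretch lengths it reports may differ by a base 
from the rough figures given above.

```python
def parse_mnrr_tags(lines):
    """Collect (start, end) pairs from 'RT MNRr <start> <end>' lines.

    Leading '> ' quote markers, as seen in the mail above, are tolerated.
    """
    masked = []
    for line in lines:
        fields = line.lstrip("> ").split()
        if len(fields) >= 4 and fields[0] == "RT" and fields[1] == "MNRr":
            masked.append((int(fields[2]), int(fields[3])))
    return masked


def unmasked_stretches(read_length, masked):
    """Return the (start, end) gaps not covered by any masked interval.

    Assumption: coordinates are 1-based and inclusive.
    """
    gaps = []
    next_free = 1
    for start, end in sorted(masked):
        if start > next_free:
            gaps.append((next_free, start - 1))
        next_free = max(next_free, end + 1)
    if next_free <= read_length:
        gaps.append((next_free, read_length))
    return gaps


tags = [
    "> RT      MNRr 41 99",
    "> RT      MNRr 117 170",
    "> RT      MNRr 172 406",
]
for start, end in unmasked_stretches(406, parse_mnrr_tags(tags)):
    print(start, end, end - start + 1)
```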

> RD      SETTMV_2SR9W01CBHGG
> RT      MNRr 15 80
> RT      MNRr 98 403
> RT      MNRr 405 446

Same thing, almost completely masked, and the same goes for the remaining 
reads. MIRA doesn't skim masked areas (but does run Smith-Waterman alignment 
on them), so if some reads have rare events (SNPs, splice variants, errors, 
etc.), it will find an overlap in those unmasked stretches, but not on the rest.
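
As a toy illustration of why masking has this effect (this is not MIRA's 
actual skim code, just a sketch with hypothetical function names and 0-based, 
half-open intervals): if overlap candidates are seeded only from k-mers that 
lie entirely outside the masked areas, two reads can only be paired when they 
share something in their few unmasked bases.

```python
def unmasked_kmers(seq, masked, k):
    """Return the set of k-mers lying entirely outside the masked intervals.

    Assumption: intervals are 0-based and half-open, i.e. (start, end).
    """
    is_masked = [False] * len(seq)
    for start, end in masked:
        for i in range(start, min(end, len(seq))):
            is_masked[i] = True
    kmers = set()
    for i in range(len(seq) - k + 1):
        if not any(is_masked[i:i + k]):
            kmers.add(seq[i:i + k])
    return kmers


def shares_seed(read_a, masked_a, read_b, masked_b, k=4):
    """True if the two reads have at least one unmasked k-mer in common."""
    seeds_a = unmasked_kmers(read_a, masked_a, k)
    seeds_b = unmasked_kmers(read_b, masked_b, k)
    return bool(seeds_a & seeds_b)


# First 8 bases masked in both reads: only the unmasked tails can seed a hit.
print(shares_seed("ACGTACGTTTGGCCAA", [(0, 8)], "ACGTACGTAAGGCCTT", [(0, 8)]))
# Fully masked reads never seed a hit, even when they are identical.
print(shares_seed("ACGTACGTTTGGCCAA", [(0, 16)], "ACGTACGTTTGGCCAA", [(0, 16)]))
```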

Look in the "*_assembly/*_d_log/*nasty*" files for more information on what 
was masked in which reads.
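
For convenience, the same pattern can be expanded from a script. This is just 
a small sketch: the function name and the idea of passing a project directory 
are assumptions, only the glob pattern itself comes from the path above.

```python
import glob
import os


def find_nasty_logs(project_dir="."):
    """List files matching *_assembly/*_d_log/*nasty* below project_dir."""
    pattern = os.path.join(project_dir, "*_assembly", "*_d_log", "*nasty*")
    return sorted(glob.glob(pattern))


for path in find_nasty_logs("."):
    print(path)
```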

> I  was wondering if there was a way to trace what reads were debried and for
>  what reason.

That is a long-standing feature request, but I currently have no time to 
implement it.

B.

