[mira_talk] Re: estimation number of genes

  • From: Jordi Durban <jordi.durban@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Tue, 24 Apr 2012 12:33:29 +0200

Yes, of course, I can explain it in more detail!
I ran an NGS analysis on 454 data from a non-model organism for which no
genome data are available. I did a keyword selection from our BLAST
results and then, since the data came from a transcriptome, I aligned the
results to an open reading frame reference sequence. That let me separate
the 454 reads according to the known proteins they came from, so I ended
up with different groups of related 454 reads, each group belonging to a
given protein.
All of the aligned reads (that is, the coding ones) were then assembled
with MIRA in order to recover as many different coding sequences as there
were for a given protein.
On that basis, the number of resulting contigs could give an idea of the
number of expressed genes for a given protein.
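
A minimal sketch of that counting step, assuming the contigs of each
protein group are available as FASTA text. The group names and sequences
below are made up for illustration, and the count is only a rough proxy
for the number of expressed genes, not a validated estimate.

```python
from io import StringIO


def count_fasta_records(handle):
    """Count sequences in FASTA text by counting '>' header lines."""
    return sum(1 for line in handle if line.startswith(">"))


# Hypothetical contig FASTA text for two protein groups (invented data).
# In practice each entry would be read from that group's contig FASTA file.
groups = {
    "protein_A": ">contig_1\nATGGCC\n>contig_2\nATGGCA\n",
    "protein_B": ">contig_1\nATGAAA\n",
}

# One contig count per protein group.
contig_counts = {
    name: count_fasta_records(StringIO(text)) for name, text in groups.items()
}

for name, n in sorted(contig_counts.items()):
    print(f"{name}\t{n}")
```

Counting header lines avoids any dependency on a FASTA parsing library.
With real files, `StringIO(text)` would be replaced by an opened file.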
My doubts concern the debris file. What should I do with those sequences?
They really do encode protein and they are not included in any contig, so
I think they should be considered putative different "isoforms"? But why
are they included in the debris file? Are they perhaps singletons?
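
One way to pull those reads out for separate inspection is sketched
below. It assumes the debris file is a plain-text list whose first
whitespace-separated token on each line is a read name; that layout is an
assumption and should be checked against the file MIRA actually wrote.
The read names are invented, and the helper functions are hypothetical,
not part of MIRA.

```python
def read_debris_names(lines):
    """Return read names from a debris list.

    Assumption: the first token of each non-empty, non-comment line
    is the read name; anything after it is ignored.
    """
    names = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.append(line.split()[0])
    return names


def split_reads(all_reads, debris_names):
    """Split reads into (not in debris list, in debris list), keeping order."""
    debris = set(debris_names)
    assembled = [r for r in all_reads if r not in debris]
    unassembled = [r for r in all_reads if r in debris]
    return assembled, unassembled


# Invented read names for one protein group and its debris list.
all_reads = ["read_01", "read_02", "read_03", "read_04"]
debris_lines = ["read_02\n", "\n", "read_04\n"]

assembled, unassembled = split_reads(all_reads, read_debris_names(debris_lines))
print("in contigs:", assembled)
print("in debris:", unassembled)
```

The reads listed under "in debris" are the ones to examine individually,
for example to see whether they are short, low quality, or simply lack
overlaps with the other reads of the group.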
I hope I have explained the issue better.
Thanks a lot.
2012/4/23 Bastien Chevreux <bach@xxxxxxxxxxxx>

> On Apr 20, 2012, at 12:49 , Jordi Durban wrote:
> > [...]
> > What do you think about such an approach??
>
> Hi Jordi,
>
> I, uh, am not sure I completely understood what you are trying to do,
> sorry for that. Care to explain in more detail?
>
> B.
>
>
> --
> You have received this mail because you are subscribed to the mira_talk
> mailing list. For information on how to subscribe or unsubscribe, please
> visit http://www.chevreux.org/mira_mailinglists.html
>



-- 
Jordi
