[mira_talk] Re: Getting coverage by each technology per each contig

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Tue, 20 May 2014 20:32:12 +0200

On 20 May 2014, at 12:27 , Martin MOKREJŠ <mmokrejs@xxxxxxxxx> wrote:
>  did anybody try to write some script to calculate how many reads were used 
> for each resulting contig? Or even better, getting coverage by each 
> technology for a given contig? I think it would be helpful to have these in 
> the resulting FASTA files. I think it was already mentioned on this list but 
> I just cannot find it.

I do not remember seeing this on the list. I hope that this isn’t a sign of my 
memory starting to fail me … :-)

>  What I am really after is to split the resulting contigs into those specific 
> to one or another technology. It looks like mira_convert cannot do this right 
> away, but I hope to collect the contig names first and then ask just for them. 
> So two executions could do the job, I think.

There is indeed nothing you could use out of the box from the MIRA package.

The easiest solution I can come up with at the moment is to parse the 
contigreads file in the info directory and categorise reads using a hopefully 
common name identifier per technology. That would enable you to create lists 
of contigs which are formed by one technology only (or predominantly by one 
technology if you want).
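
A rough sketch of that idea in Python, not a tested tool. It assumes the 
contigreads file has one "contig name, read name" pair per line separated by 
whitespace, with comment lines starting with '#'. Check your own file first. 
The read-name prefixes in PREFIX_TO_TECH are made up for illustration, as are 
all function names here.

```python
from collections import Counter, defaultdict

# Hypothetical read-name prefixes; replace them with whatever
# identifies each technology in your own read names.
PREFIX_TO_TECH = {"sxa_": "Solexa", "454_": "454", "sgr_": "Sanger"}


def classify_read(read_name):
    """Return the technology for a read name, or 'unknown'."""
    for prefix, tech in PREFIX_TO_TECH.items():
        if read_name.startswith(prefix):
            return tech
    return "unknown"


def count_reads_per_contig(lines):
    """Count reads per technology for every contig.

    Assumption: each data line is 'contig_name <whitespace> read_name'
    and lines starting with '#' are comments.
    """
    counts = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # skip lines that do not look like contig/read pairs
        contig, read = fields[0], fields[1]
        counts[contig][classify_read(read)] += 1
    return counts


def contigs_dominated_by(counts, tech, min_fraction=1.0):
    """List contigs where at least min_fraction of the reads are from tech."""
    selected = []
    for contig, per_tech in counts.items():
        total = sum(per_tech.values())
        if total and per_tech[tech] / total >= min_fraction:
            selected.append(contig)
    return sorted(selected)
```

Pass an open file handle to count_reads_per_contig(), then call 
contigs_dominated_by(counts, "Solexa") for contigs built from one technology 
only, or lower min_fraction (e.g. 0.9) for "predominantly". Note that this 
gives read counts per technology, not coverage.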

If that does not work for you, then you would need to parse the MAF file. The 
documentation for it in the MIRA docs is only for MAF v1, but MAF v2 has not 
changed much: it just added the concept of readgroups and a couple of other, 
quite minor changes. Just ask if you need help with that.

B.


