On 21/05/2014 21:07, Martin MOKREJŠ wrote:
Hi Bastien,

thank you for encouraging me to dive into this myself. ;) It was quite simple. Attached is what I stitched together in half an hour; it works for my purpose, although some bits should be moved into separate functions. Anyway, if you want, you can include it in the MIRA bundle, unless you are going to write the "whole" thing yourself (which would be a good idea).

The numpy dependency is not ideal: pulling it in just for median() and average() is not pretty. But overall, it works. If *_contigstats.txt contained one more column with the abbreviated sequencing technology, things would have been even easier.

Martin

Bastien Chevreux wrote:
> On 20 May 2014, at 12:27, Martin MOKREJŠ <mmokrejs@xxxxxxxxx> wrote:
> > Did anybody try to write a script to calculate how many reads were used for each resulting contig? Or even better, to get the coverage by each technology for a given contig? I think it would be helpful to have these in the resulting FASTA files. I think it was already mentioned on this list but I just cannot find it.
>
> I do not remember seeing this on the list. I hope that this isn't a sign of my memory starting to fail me … :-)
>
> > What I am really after is to split the resulting contigs into those specific to one or another technology. It looks like mira_convert cannot do this right away, but I hope to collect the contig names first and then ask just for them. So two executions could do the job, I think.
>
> There is indeed nothing you could use out of the box from the MIRA package. The easiest solution I can come up with atm is to parse the contigreads file in the info directory and categorise reads using a (hopefully common) name identifier per technology. That would enable you to create lists of contigs which are formed by one technology only (or predominantly by one technology if you want).
>
> If that does not work for you, then you would need to parse the MAF file. The documentation for it in the MIRA docs covers only MAF v1, but MAF v2 has not changed much: it just added the concept of readgroups and a couple of other, quite minor changes. Just ask if you need help with that.
>
> B.
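For readers following the thread, here is a rough sketch of the contigreads approach suggested above. It is not the attached script, and it rests on assumptions: that each data line of the contigreads file holds a contig name and a read name separated by a tab, that lines starting with "#" are comments, and that reads of one technology share a name prefix. The prefixes in TECH_PREFIXES and all function names are hypothetical and only serve as illustration.

```python
from collections import Counter, defaultdict

# Hypothetical read-name prefixes; replace with the naming scheme of your own reads.
TECH_PREFIXES = {"ILLU": "Solexa", "FLX": "454", "SANG": "Sanger"}


def classify_read(read_name, prefixes=TECH_PREFIXES):
    """Return the technology whose prefix starts the read name, else 'unknown'."""
    for prefix, tech in prefixes.items():
        if read_name.startswith(prefix):
            return tech
    return "unknown"


def count_reads_per_contig(lines, prefixes=TECH_PREFIXES):
    """Map contig name -> Counter of technology -> number of reads.

    Assumes each data line is 'contig<TAB>read' and '#' starts a comment line.
    """
    counts = defaultdict(Counter)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip lines that do not match the assumed layout
        contig, read = fields[0], fields[1]
        counts[contig][classify_read(read, prefixes)] += 1
    return counts


def contigs_by_dominant_tech(counts, min_fraction=1.0):
    """Group contigs by the technology that contributes at least min_fraction of reads.

    With the default of 1.0 only single-technology contigs are reported.
    """
    groups = defaultdict(list)
    for contig, per_tech in counts.items():
        tech, n = per_tech.most_common(1)[0]
        if n / sum(per_tech.values()) >= min_fraction:
            groups[tech].append(contig)
    return groups
```

Passing an open file handle of the contigreads file to count_reads_per_contig() gives the number of reads per technology for each contig. contigs_by_dominant_tech() then yields the contig name lists per technology, and lowering min_fraction (say to 0.9) gives the "predominantly one technology" variant. Those name lists could then be used in a second step to extract just those contigs.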
Very useful script, thank you Martin.

-- 
Laurent MANCHON
Email: lmanchon@xxxxxx