[mira_talk] Re: mirabait reuse of hashstat

  • From: "Walter, Mathias" <mathias@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 22 Nov 2013 23:30:18 +0100

Hi,

2013/11/22 Bastien Chevreux <bach@xxxxxxxxxxxx>:
> On 22 Nov 2013, at 20:14 , Walter, Mathias <mathias@xxxxxxxxx> wrote:
>> is there any way to reuse the created hashstat.bin?
>
> Not yet, and suggesting mirabait to be able to reuse it is akin to preaching 
> to the choir here. Indeed, the only reason I did not implement it (it’s just a 
> couple of lines) is because the current routines make use of 
> unprotected/unchecked MIRA structures, i.e., it would be too easy for users 
> to use it wrongly. E.g. creating a hashstat with a given k-mer size and then 
> using it with a different one would yield, um, interesting results.

But this could be addressed very easily by storing the k-mer size in a
tiny header of the hashstat file. The point is that it takes a long
time (even on an SSD) to create the hashstat for larger eukaryotic
genomes, and it is very I/O-intensive because it creates these stat
temp files and overwrites them quite often.
I don't know which use case is more frequent: using different k-mer
sizes or using the same bait for different data sets. I assume the
latter.
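
To illustrate the header idea, here is a minimal sketch in Python. It
is not MIRA's actual hashstat format: the magic bytes, the version
field, the field widths and the function names are all assumptions made
up for this example. It only shows how a small fixed-size header would
let a reader refuse a file that was built with a different k-mer size.

```python
import struct

# Hypothetical header layout (NOT MIRA's real hashstat format):
# 4-byte magic, 2-byte format version, 2-byte k-mer size, little-endian.
MAGIC = b"HSTA"
VERSION = 1
HEADER = struct.Struct("<4sHH")


def write_hashstat(path, kmer_size, payload):
    """Write the header followed by the opaque hash statistics payload."""
    with open(path, "wb") as fh:
        fh.write(HEADER.pack(MAGIC, VERSION, kmer_size))
        fh.write(payload)


def read_hashstat(path, expected_kmer_size):
    """Return the payload, or raise ValueError if the header does not match."""
    with open(path, "rb") as fh:
        raw = fh.read(HEADER.size)
        if len(raw) != HEADER.size:
            raise ValueError("file too short to contain a header")
        magic, version, kmer_size = HEADER.unpack(raw)
        if magic != MAGIC or version != VERSION:
            raise ValueError("not a recognised hashstat file")
        if kmer_size != expected_kmer_size:
            raise ValueError(
                f"hashstat was built with k={kmer_size}, "
                f"but k={expected_kmer_size} was requested"
            )
        # Everything after the header is the statistics data itself.
        return fh.read()
```

With something like this, reusing a file with the wrong k-mer size
would fail with a clear error message instead of silently producing
wrong results.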

--
Regards,
Mathias

