[mira_talk] mirabait reuse of hashstat

  • From: "Walter, Mathias" <mathias@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Fri, 22 Nov 2013 20:14:53 +0100

Hi,

is there any way to reuse the created hashstat.bin?

As far as I understand, hashstat.bin is just the kmer hash created
from the bait.
Sometimes I have to perform the same filtering on different sequencing
results, i.e. from different strains. Or I would like to get both the
reads hitting the bait and those not hitting the bait. It does not seem
to be possible to get them simultaneously, does it?

Use case:
1. filtering (inverse hits) various sequence data sets against a large
reference genome (i.e. bacteria grown on BGM or THP1 cells)
2. filtering (inverse hits) the resulting reads against bacterial contaminants
3. filtering (true hits) the resulting reads against bacterial contaminants
4. matching the "contaminant" reads against a closely related genome
5. reintegrating the output of 4. into the output of 2., because rRNAs
(and some other loci) have highly conserved kmers which were filtered
out in 2.

The outputs of 2. and 3. could be written simultaneously, if mirabait
supported this.
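
A minimal Python sketch of the requested behaviour, for illustration
only. It is not mirabait's code and assumes nothing about the format of
hashstat.bin: the kmer index is built once from the bait, kept in
memory, and then applied to several read sets, with hits and misses
collected in the same pass. All function names, the toy kmer size K and
the min_hits threshold are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Read = Tuple[str, str]  # (read name, sequence)

K = 5  # hypothetical toy kmer size, chosen only for this example

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct kmers of seq (empty set if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_bait_index(baits: Iterable[str], k: int) -> Set[str]:
    """Build the kmer set once; it can then be reused for any read set."""
    index: Set[str] = set()
    for bait in baits:
        bait = bait.upper()
        index |= kmers(bait, k)
        index |= kmers(revcomp(bait), k)  # also match the reverse strand
    return index


def partition_reads(
    reads: Iterable[Read], index: Set[str], k: int, min_hits: int = 1
) -> Tuple[List[Read], List[Read]]:
    """Split reads into (hits, misses) in a single pass over the input."""
    hits: List[Read] = []
    misses: List[Read] = []
    for name, seq in reads:
        shared = len(kmers(seq, k) & index)
        (hits if shared >= min_hits else misses).append((name, seq))
    return hits, misses


# Build the index once ...
index = build_bait_index(["ACGTACGTAGCTAGCTAG"], K)

# ... and reuse it for read sets from different strains.
strain_a = [("r1", "TTACGTACGTTT"), ("r2", "GGGGGGGGGGGG")]
strain_b = [("r3", "CTAGCTAGCT")]

hits_a, misses_a = partition_reads(strain_a, index, K)
hits_b, misses_b = partition_reads(strain_b, index, K)
```

In this sketch, steps 2. and 3. from the use case correspond to the
"misses" and "hits" lists of one call against the contaminant bait, so
the reads only have to be scanned once.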

--
Regards,
Mathias
