[mira_talk] Re: Parallel Implementation of MIRA

From: Bastien Chevreux <bach@xxxxxxxxxxxx>
To: mira_talk@xxxxxxxxxxxxx
Date: Tue, 1 Oct 2013 20:39:09 +0200

On Oct 1, 2013, at 11:05 , Habeeb Syed <habeeb.syed@xxxxxxx> wrote:
> I was wondering if anybody tried on large machines, something like few
> hundred cores? How is the scale up?

Someone (maybe Jorge? I don't remember) made a couple of test runs on larger
machines. He found that on his machine, performance improvements plateaued at
around 40 cores, iirc. I suppose that either the bus or the disk was the
bottleneck, but this was never investigated.

MIRA itself has only one part left which could "easily" be multithreaded and is
not yet: the Smith-Waterman matrix calculation and recursive alignment
generation. Incidentally, I suppose large speed-ups there could be achieved by
using SIMD instructions (SSE2/3/4 family) combined with more intelligent memory
usage and plain old parallelism on multiple cores. Of those, I may attack the
last one (multi-core parallelism) sometime this year.

Almost all other parts of MIRA which can be parallelised have been
parallelised. However, one of the key steps in MIRA - the contig building -
cannot be parallelised unless MIRA does what other assemblers do and stops at
repeats. I don't plan to implement that because, well, a repeat is only a true
repeat if it is 100% identical and not, as other assemblers think, just
almost identical.

> GPU acceleration will be nice; have you tried?
> What about GPU implementation of MIRA. Can anybody share some insight on
> this?

As far as I am aware, GPU implementations of bioinformatics algorithms have
not been terribly successful in the wild, either because they didn't deliver
that much of a speed-up or, equally important, because it's too much work to
implement several different GPU algorithms for all the different cards out
there. Even more vexing, quite a number of larger servers have no GPU at all.
SSE instructions, on the other hand, are quite common nowadays; one would be
hard pressed to find machines which have no SSE4 (let alone 3 or 2).

As a last point, the paper you cited in your initial posting (if it's the one I
have in mind, I did not check yet) came as a complete surprise to me when I
discovered it, totally by chance, a couple of months ago. Someone had been
working on MIRA code without contacting me to find out whether I could make
use of the changes or integrate them back into MIRA. That was quite a strange
experience at the time.

B.
