[mira_talk] Re: Parallel Implementation of MIRA

  • From: Habeeb Syed <habeeb.syed@xxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 3 Oct 2013 00:21:08 -0400

Thanks for the long mail, Bastien. It was very helpful to get your 
insight into the parallelisation of MIRA.

Laurent, thanks for sharing your analysis.

Just wondering where we can find pseudocode for computations like the fast 
read comparison, the Smith-Waterman step, and the partial path finder algorithm.

-H Syed.




From: Bastien Chevreux <bach@xxxxxxxxxxxx>
To: mira_talk@xxxxxxxxxxxxx
Date: 10/01/2013 02:38 PM
Subject: [mira_talk] Re: Parallel Implementation of MIRA
Sent by: mira_talk-bounce@xxxxxxxxxxxxx



On Oct 1, 2013, at 11:05 , Habeeb Syed <habeeb.syed@xxxxxxx> wrote:
> I was wondering if anybody tried it on large machines, something like a 
> few hundred cores? How does it scale up?

Someone (maybe Jorge? I don't remember) made a couple of test runs on 
larger machines. He found that on his machine, performance improvements 
plateaued at around 40 cores, iirc. I suppose that either the bus or the 
disk was a bottleneck, but this was never investigated.

MIRA itself has only one part left which could "easily" be multithreaded 
and is not yet: the Smith-Waterman matrix calculation and recursive 
alignment generation. Incidentally, I suspect large speed-ups there could 
be achieved by using SIMD instructions (the SSE2/3/4 family) combined with 
more intelligent memory usage and plain old parallelism across multiple 
cores. Of those, I may attack the latter sometime this year.
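As an illustration (this is a hedged sketch, not MIRA's actual code), the 
reason the Smith-Waterman matrix is amenable to SIMD is that all cells on 
one anti-diagonal depend only on the two previous anti-diagonals, so they 
are independent of each other and can be filled with vector instructions:

```python
# Illustrative sketch only -- not MIRA's implementation. The inner loop
# walks one anti-diagonal of the DP matrix; in a real SSE version those
# independent cells would be computed by a handful of vector operations.

def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b (scores assumed)."""
    rows, cols = len(a), len(b)
    # H[i][j] = best score of a local alignment ending at a[i-1] / b[j-1].
    H = [[0] * (cols + 1) for _ in range(rows + 1)]
    best = 0
    # Sweep anti-diagonals: every (i, j) with i + j == d is independent.
    for d in range(2, rows + cols + 1):
        for i in range(max(1, d - cols), min(rows, d - 1) + 1):
            j = d - i
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores never drop below zero.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

On real hardware each anti-diagonal sweep would map onto SSE integer 
max/add intrinsics; striped layouts (as in Farrar's well-known SSE2 
formulation) reorder the cells further to keep the vectors full.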

Almost all other parts of MIRA which can be parallelised have been 
parallelised. However, one of the key steps in MIRA - the contig building 
- cannot be parallelised unless MIRA does what other assemblers do and 
stops at repeats. I don't plan to implement that because, well, a repeat 
is only a true repeat if it is 100% identical and not, as other 
assemblers assume, just almost identical.

> GPU acceleration would be nice; have you tried it?
> What about a GPU implementation of MIRA? Can anybody share some insight 
> on this?

As far as I am aware, GPU implementations of bioinformatics algorithms 
have not had terribly much success in the wild. Either because they didn't 
deliver that much of a speed-up or, equally important, because it's too 
much work to implement several different GPU algorithms for all the 
different cards out there. Even more vexing, quite a number of larger 
servers have no GPU at all. SSE instructions, on the other hand, are quite 
common nowadays; one would be hard pressed to find machines which have no 
SSE4 (let alone SSE3 or SSE2).


As a last point, the paper you cited in your initial posting (if it's the 
one I have in mind, I did not check yet) came as a complete surprise to me 
when I discovered it, totally by chance, a couple of months ago. Someone 
had been working on MIRA code without contacting me to find out whether I 
could make use of the changes or integrate them back into MIRA. That was a 
rather strange experience at the time.

B.


-- 
You have received this mail because you are subscribed to the mira_talk 
mailing list. For information on how to subscribe or unsubscribe, please 
visit http://www.chevreux.org/mira_mailinglists.html




Other related posts: