[mira_talk] Re: Mira mailing list

  • From: Bastien Chevreux <bach@xxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Mon, 17 Jan 2011 22:12:42 +0100

On Thursday 13 January 2011 15:08:16 Sandrine Moreira wrote:
> From Jonathan:
> My name is Jonathan and I am a scientific computing analyst at RQCHP. My
> role is to help scientists do science using HPC platforms in Canada. So
> I will be responsible for helping Sandrine with her project. I can provide
> help to implement either parallelization or checkpoint restart, but in the
> present situation, it makes more sense to go with checkpoint restart.
> 
> In order to help me catch up on that, could you summarize where your CR
> implementation stands, and what still needs to be done? What kind of things
> are missing, and what can I do to help?

Hello Jonathan,

MIRA talk is subscription-only to keep out as much spam as possible. If you
would rather not subscribe, we can also take the discussion off-list.

MIRA effectively works in several passes, and the plan is to write a
checkpoint after each pass.
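
To make the idea concrete, here is a minimal C++ sketch of "checkpoint after
each pass". It is not MIRA code: run_passes, the marker file and its format
are made up for illustration, and do_pass stands in for whatever one assembly
pass really does.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical sketch, not MIRA code. A small marker file records the
// number of the last pass that finished completely.

// Returns the last completed pass, or 0 if there is no usable marker.
int read_last_completed_pass(const std::string& marker_path) {
    std::ifstream in(marker_path);
    int pass = 0;
    if (!(in >> pass) || pass < 0) return 0;
    return pass;
}

// Writes the marker to a temporary file and renames it into place, so an
// interrupted write does not leave a half-written marker behind.
// (Assumes POSIX rename semantics, where an existing target is replaced.)
bool write_last_completed_pass(const std::string& marker_path, int pass) {
    const std::string tmp = marker_path + ".tmp";
    {
        std::ofstream out(tmp, std::ios::trunc);
        out << pass << '\n';
        out.flush();
        if (!out) return false;
    }
    return std::rename(tmp.c_str(), marker_path.c_str()) == 0;
}

// Runs only the passes that have not been completed yet and checkpoints
// after each one. Returns the number of passes executed in this call.
int run_passes(int num_passes, const std::string& marker_path,
               const std::function<void(int)>& do_pass) {
    assert(num_passes >= 0);
    int executed = 0;
    for (int pass = read_last_completed_pass(marker_path) + 1;
         pass <= num_passes; ++pass) {
        do_pass(pass);  // placeholder for the real work of one pass
        if (!write_last_completed_pass(marker_path, pass)) {
            throw std::runtime_error("could not write checkpoint marker");
        }
        ++executed;
    }
    return executed;
}
```

If the program is killed during pass 3, the marker still says 2, so the next
call to run_passes() starts again at pass 3. The data needed by that pass
would have to be saved and reloaded alongside the marker.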

Checkpoint restart is in its infancy and stuck at a very crucial point. What
has been done so far:
- saving the parameters MIRA was started with (simply by logging the
  command-line parameters)
- saving the current reads, including all changes made to them up to that
  point.
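
For illustration only, logging the start parameters and reading them back
could look roughly like the C++ sketch below. serialize_args and parse_args
are made-up names, not MIRA functions, and the one-argument-per-line format
is an assumption of this sketch.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helpers, not MIRA functions.
// Format: one command-line argument per line. Arguments that themselves
// contain a newline are not supported by this simple format.

// Turns argv into the text that would be written to the parameter log.
std::string serialize_args(int argc, const char* const* argv) {
    std::ostringstream out;
    for (int i = 0; i < argc; ++i) {
        assert(argv[i] != nullptr);
        out << argv[i] << '\n';
    }
    return out.str();
}

// Reads the logged text back into a list of arguments, in original order.
std::vector<std::string> parse_args(const std::string& text) {
    std::vector<std::string> args;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) args.push_back(line);
    return args;
}
```

Keeping one argument per line means arguments containing spaces survive the
round trip without any quoting rules.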

What needs to be done:
- writing some structure dump and load functions to also save the information
  which MIRA amasses during the run but which is not directly related to the
  reads themselves
- changing the main assembly function to allow loading/saving the above data
  and jumping to the right point in the assembly
- or hooking the restart into the main mira program, making it load the
  restart data and jump to the above-mentioned assembly function.
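
As a rough idea of what such dump and load functions might look like, here is
a small C++ sketch. RunState, its fields, the "RUNSTATE" tag and the text
format are all invented for this example; the real data MIRA collects during
a run is more complex.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for non-read data collected during a run.
struct RunState {
    int completed_pass = 0;
    std::vector<std::uint64_t> counters;  // placeholder for real statistics
};

const char* const kMagic = "RUNSTATE";  // made-up file tag
const int kVersion = 1;                 // bump when the layout changes

// Writes the state as plain text: tag and version first, then the fields.
void dump_state(const RunState& s, std::ostream& out) {
    assert(s.completed_pass >= 0);
    out << kMagic << ' ' << kVersion << '\n'
        << s.completed_pass << '\n'
        << s.counters.size() << '\n';
    for (std::uint64_t c : s.counters) out << c << '\n';
}

// Reads a state written by dump_state(). Returns false and leaves `s`
// untouched if the tag, the version or any field does not parse.
bool load_state(std::istream& in, RunState& s) {
    std::string magic;
    int version = 0;
    if (!(in >> magic >> version)) return false;
    if (magic != kMagic || version != kVersion) return false;

    RunState tmp;
    std::size_t n = 0;
    if (!(in >> tmp.completed_pass >> n)) return false;
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t c = 0;
        if (!(in >> c)) return false;
        tmp.counters.push_back(c);
    }
    s = tmp;
    return true;
}

// Convenience wrapper for in-memory round trips.
std::string dump_to_string(const RunState& s) {
    std::ostringstream out;
    dump_state(s, out);
    return out.str();
}
```

On restart, the caller would load the state first and then continue with pass
completed_pass + 1. The version number lets a newer program refuse a
checkpoint written with an older, incompatible layout.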

Especially the latter two are a bit tricky, as the function in question has
grown over the past 10 years and has become somewhat unreadable. It has been
scheduled for cleanup and rewrite for quite some time, but I never got around
to doing it. I'm not quite sure what the best approach would be; if I were,
I'd be further along in the implementation.

If you want to have a look: src/progs/mira.C contains the main caller function,
where I suppose the loading should be done, and src/mira/assembly.C contains
the assemble() function, which should be changed to know about restart. You
might see some comments and the beginnings of stubs regarding this, but they
are not functional yet (apart from saving reads as checkpoint data).

Best,
  Bastien

-- 
You have received this mail because you are subscribed to the mira_talk mailing 
list. For information on how to subscribe or unsubscribe, please visit 
http://www.chevreux.org/mira_mailinglists.html