Wednesday, 12 December 2012

Further DMTCP Testing

While awaiting a reply from the creators of DMTCP, some additional testing is being performed.
Checkpointing and restarting regular Java sockets worked without much difficulty.
To mimic our use case even more closely, an EC2 image was created after checkpointing.
This image was then relaunched on different instances, and the IP addresses in the restart script were changed to match.
After making sure the SSH information was all correct, a dmtcp_coordinator was launched.
The restart script was then executed and ran without any issues.
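
Changing the addresses by hand gets tedious with more than a couple of instances, so the substitution could be scripted. Below is a minimal sketch in Java, assuming the restart script is plain text that contains the old addresses literally. The class name, the command-line format and the example addresses are made up for illustration and are not part of DMTCP.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: swaps old instance addresses for new ones in the text
// of a restart script. It assumes the addresses appear literally in the file.
class RestartScriptIpRewriter {

    // Replaces every old address with its new counterpart in a single pass,
    // so a freshly inserted address is never replaced a second time.
    static String rewrite(String script, Map<String, String> oldToNew) {
        if (oldToNew.isEmpty()) {
            return script;
        }
        StringBuilder alternation = new StringBuilder();
        for (String oldAddress : oldToNew.keySet()) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(oldAddress));
        }
        // The lookarounds keep 10.0.0.1 from matching inside 10.0.0.12 or 110.0.0.1.
        Pattern pattern = Pattern.compile("(?<![\\d.])(" + alternation + ")(?!\\d)");
        Matcher matcher = pattern.matcher(script);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String replacement = oldToNew.get(matcher.group(1));
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    // Usage (hypothetical): java RestartScriptIpRewriter <script> <old>=<new> [<old>=<new> ...]
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: RestartScriptIpRewriter <script> <old>=<new> ...");
            return;
        }
        Map<String, String> oldToNew = new LinkedHashMap<>();
        for (int i = 1; i < args.length; i++) {
            String[] pair = args[i].split("=", 2);
            if (pair.length != 2) {
                System.err.println("ignoring malformed mapping: " + args[i]);
                continue;
            }
            oldToNew.put(pair[0], pair[1]);
        }
        Path scriptPath = Paths.get(args[0]);
        String script = new String(Files.readAllBytes(scriptPath), StandardCharsets.UTF_8);
        Files.write(scriptPath, rewrite(script, oldToNew).getBytes(StandardCharsets.UTF_8));
    }
}
```

The replacement is done in one pass on purpose: with chained mappings (A becomes B, B becomes C) a naive sequence of string replacements would turn A into C.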

Even though the new instances have new network addresses, the connections were still recovered.
This works because DMTCP uses a cluster-wide discovery service to find the new addresses.
(See "DMTCP: Transparent Checkpointing for Cluster Computations and the Desktop".)

There have also been some new attempts to automatically trigger an EC2 snapshot when DMTCP takes a checkpoint.

So far this has been tried through dmtcpplugin.
It provides a couple of points during checkpointing at which you can run custom code.
Problems arise, however, because many calls cannot be made while a checkpoint is in progress.
As a result, a system() call made right after checkpointing is refused.
I hope to solve this through further communication with the creators of DMTCP.
