Monday, 12 November 2012

DMTCP and Java RMI

Initial Testing on Amazon


Initial tests of DMTCP with Java RMI proved quite difficult.
Out of the box it simply did not work.
The first tests were done on standard Ubuntu 12.04 instances from Amazon, where the following problem was observed:

When an RMI program is started under JDK 1.6, DMTCP fails to launch it.
To test this, the RMIRegistry was run, but it failed to start with the following error:

[1988] ERROR at connection.cpp:274 in onListen; REASON='JASSERT(tcpType() == TCP_BIND) failed'
     tcpType() = 4098
     id() = 2e03157ea2e7ccf4-1988-508e466b(99006)
Message: Listening on a non-bind()ed socket????
rmiregistry (1988): Terminating...

Using DMTCP's debugging capabilities, I traced the failure back to a call into one of Java's internal libraries that contains a socket implementation. Not really knowing what to do next, I installed a newer version of the Java JDK and hoped for the best.

Somehow this solved the issue. I decided to perform the next stages of testing on my local network, which contains two Ubuntu 12.10 computers with OpenJDK 1.7 and DMTCP-1.2.6 installed.

Test Setup

First phase:

I created a small RMI program consisting of a server that is called by a client. It performs a simple hello world and keeps track of the number of calls made.
To test file handling right away, everything is logged on both the server and the client.
In addition, there is an RMIRegistry that handles remote calls and a small HTTP server for remote class loading.
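
To give an idea of what such a test program looks like, here is a minimal sketch and not the actual code used in these tests. Only the remote method name say() is taken from the stack trace further down; the class names, the reply text, the port and the log file are hypothetical. For brevity the sketch starts the registry inside the same JVM and skips the HTTP server for remote class loading, so it is simpler than the setup described above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.concurrent.atomic.AtomicInteger;

class HelloRmiSketch {

    /** Remote interface; only the method name say() is known from the post. */
    public interface Hello extends Remote {
        String say() throws RemoteException;
    }

    /** Server object: counts the calls and appends every call to a log file. */
    static class HelloImpl implements Hello {
        private final AtomicInteger calls = new AtomicInteger();
        private final Path log;

        HelloImpl(Path log) {
            this.log = log;
        }

        @Override
        public String say() throws RemoteException {
            int n = calls.incrementAndGet();
            String reply = "Hello world, call #" + n; // hypothetical reply text
            appendLine(log, "server: " + reply);
            return reply;
        }
    }

    /** Appends one line to the log file, creating the file if needed. */
    static void appendLine(Path file, String line) throws RemoteException {
        try {
            Files.write(file,
                    (line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new RemoteException("could not write log", e);
        }
    }

    /** Starts a registry, binds the server, makes `count` client calls, returns the last reply. */
    static String runDemo(int port, int count, Path log) throws Exception {
        // Assumption: everything runs on one machine, so stubs point at loopback.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");

        Registry registry = LocateRegistry.createRegistry(port);
        HelloImpl impl = new HelloImpl(log);
        Hello stub = (Hello) UnicastRemoteObject.exportObject(impl, 0);
        try {
            registry.rebind("Hello", stub);

            // Client side: look the object up over TCP and call it.
            Hello client = (Hello) LocateRegistry.getRegistry("127.0.0.1", port).lookup("Hello");
            String last = null;
            for (int i = 0; i < count; i++) {
                last = client.say();
                appendLine(log, "client: " + last);
            }
            return last;
        } finally {
            // Unexport so the RMI threads stop and the JVM can exit.
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        Path log = Files.createTempFile("hello-rmi", ".log");
        System.out.println(runDemo(21099, 3, log));
        System.out.println("log written to " + log);
    }
}
```

In the real tests the server, the client and the registry are separate processes, which is what makes checkpointing the open sockets and log files interesting.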

Without checkpointing the program runs as expected without any difficulties.

Second phase:

A first trial was to start the program through the DMTCP framework using the dmtcp_checkpoint command. This went well and the program still functioned as expected. The next step was to take a snapshot of the current state. This did not go well and produced the following errors:


exception: java.rmi.ConnectIOException: error during JRMP connection establishment; nested exception is: 
 java.net.SocketException: Connection reset

java.rmi.ConnectIOException: error during JRMP connection establishment; nested exception is: 
 java.io.EOFException
 at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:304)
 at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
 at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:128)
 at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(RemoteObjectInvocationHandler.java:194)
 at java.rmi.server.RemoteObjectInvocationHandler.invoke(RemoteObjectInvocationHandler.java:148)
 at $Proxy0.say(Unknown Source)
 at HelloClient.main(HelloClient.java:24)
Caused by: java.io.EOFException
 at java.io.DataInputStream.readByte(DataInputStream.java:267)
 at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:246)

The problem seemed to lie in restarting the program after a snapshot had been taken.
A simple reboot of the system made this exception go away. It was now possible to take snapshots of the running RMI program.

The following phases will be explained in my next blog post.

