Forum: Managing VoltDB

Post: ERROR VEMCORE: Snapshot manager failed to connect to [SERVER IP] JMX endpoint

Bernardo
May 1, 2014
Hi all.

When I start the database, the following exception occurs for both clusters. Here is the message for one of them:


2014-04-29 08:24:12,960   ERROR VEMCORE: Snapshot manager failed to connect to 10.0.1.202 JMX endpoint.
java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 10.0.1.202; nested exception is: 
	java.net.ConnectException: Connection refused]
	at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:369)
	at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:268)
	at org.voltdb.management.SnapshotManager.connectToServer(SnapshotManager.java:179)
	at org.voltdb.management.SnapshotManager.addServer(SnapshotManager.java:227)
	at org.voltdb.management.server.VEMCore.monitorServerForSnapshots(VEMCore.java:647)
	at org.voltdb.management.StartActionBase.startServers(StartActionBase.java:254)
	at org.voltdb.management.StartDatabaseAction.call(StartDatabaseAction.java:69)
	at org.voltdb.management.StartDatabaseAction.call(StartDatabaseAction.java:18)
	at org.voltdb.management.ActionBase.run(ActionBase.java:139)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at org.voltcore.utils.CoreUtils$4$1.run(CoreUtils.java:328)
	at java.lang.Thread.run(Thread.java:744)
Caused by: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 10.0.1.202; nested exception is: 
	java.net.ConnectException: Connection refused]
	at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:118)
	at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:203)
	at javax.naming.InitialContext.lookup(InitialContext.java:411)
	at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1936)
	at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1903)
	at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:286)
	... 14 more
Caused by: java.rmi.ConnectException: Connection refused to host: 10.0.1.202; nested exception is: 
	java.net.ConnectException: Connection refused
	at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
	at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
	at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
	at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:341)
	at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source)
	at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:114)
	... 19 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:579)
	at java.net.Socket.connect(Socket.java:528)
	at java.net.Socket.<init>(Socket.java:425)
	at java.net.Socket.<init>(Socket.java:208)
	at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
	at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:147)
	at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
	... 24 more
java.lang.InterruptedException
	at java.lang.Object.wait(Native Method)
	at java.lang.Thread.join(Thread.java:1280)
	at java.lang.Thread.join(Thread.java:1354)
	at org.voltdb.management.RemoteLauncher$Runner.run(RemoteLauncher.java:253)
	at java.lang.Thread.run(Thread.java:744)


There are no jobs running in the clusters yet, since this exception occurred during the database start.

Any ideas?
VoltDB 4.2 Enterprise

Regards,

Bernardo
nshi
May 1, 2014
Hi Bernardo,

The exceptions you showed are from the VoltDB Enterprise Manager (VEM). They indicate that VEM cannot connect to the VoltDB server located at 10.0.1.202, possibly because of firewall settings or other network-related issues. By default, VEM tries to connect to port 9090 on the servers. Please make sure that the VEM node can access that port on the servers.

This error is not indicative of the health of the server process. The server may very well be up and running normally. You can check by connecting to the server with a VoltDB client or by visiting 10.0.1.202:8080 in your browser. If you have changed the HTTP port of the VoltDB server, use that port number instead of 8080, which is the default.
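
As a quick way to test reachability from the VEM node, here is a minimal Python sketch that attempts a plain TCP connection to each port. The function names `port_open` and `check_server` are made up for illustration, and the port list simply reuses the defaults mentioned above (9090 for JMX, 8080 for HTTP). A successful TCP connect only shows that the port is reachable; it does not exercise the JMX handshake itself.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_server(host, ports=(9090, 8080)):
    """Map each port to whether it accepted a TCP connection from this machine."""
    # Assumed defaults: 9090 (JMX port VEM connects to), 8080 (HTTP port).
    return {port: port_open(host, port) for port in ports}
```

Running `check_server("10.0.1.202")` on the VEM node returns a dictionary with one True/False entry per port. A False for 9090 together with a True for 8080 would point toward a firewall or JMX issue rather than a server that is down.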
Bernardo
May 2, 2014
Hi Ning,

Now I believe I missed the point.
Isn't the Enterprise Manager Server responsible for executing the catalog and starting the processes (like JMX) on the cluster nodes?
So when the database is created on the leader node, shouldn't it do the rest of the job for the clusters (such as executing the jar and starting JMX for remote monitoring by VEM)?

Regards,

Bernardo
nshi
May 2, 2014
Hi Bernardo,

When you start a database in VEM, it will launch the VoltDB processes on all the cluster nodes. The servers will load the catalog and set up JMX automatically.

In your previous reply, you mentioned that VEM warned about an older version of Java on the cluster machines. That would prevent VEM from starting the VoltDB servers. If you have a newer version of Java installed on those machines, you can set the JAVA_HOME environment variable on the cluster machines to point to it. VEM will then start the servers using that version of Java. For example, if VEM uses your user account to SSH into the cluster nodes, you can export JAVA_HOME in your .bash_profile.
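
To see which Java a given account would pick up on a cluster machine, a small Python sketch like the following could be run there under that account. The helper names are hypothetical, and the assumption that the binary lives at `$JAVA_HOME/bin/java` is the usual JDK layout rather than something specific to VEM.

```python
import os
import re
import subprocess


def java_command(env=None):
    """Return $JAVA_HOME/bin/java if JAVA_HOME is set, otherwise plain 'java'."""
    env = os.environ if env is None else env
    home = env.get("JAVA_HOME")
    return os.path.join(home, "bin", "java") if home else "java"


def parse_java_version(banner):
    """Extract the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', banner)
    return match.group(1) if match else None


def describe_java(env=None):
    """Run `java -version` for the resolved binary and return (command, version)."""
    cmd = java_command(env)
    try:
        # `java -version` prints its banner to stderr, so capture both streams.
        done = subprocess.run([cmd, "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return cmd, None
    return cmd, parse_java_version(done.stderr + done.stdout)
```

Calling `describe_java()` returns the resolved command together with the version it reports, or `None` for the version if the binary is missing or prints no recognizable banner.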