Forum: Building VoltDB Applications

Post: TPC-C and Elastic join

nkatsip
Dec 3, 2013
Good Evening,

I am currently working on a research project and have set up VoltDB Enterprise Edition v3.7 on a server. I took the TPC-C implementation from the VoltDB Community Edition, then built the catalog and started the server with the following commands:


./run.sh catalog
./run.sh server


The server initializes properly (its IP is vm.host1.ip). On another machine, after building the catalog (tpcc.jar), I issue the command:


voltdb add -H vm.host1.ip -l ~/voltdb-ent-3.7/voltdb/license.xml -d deployment.xml 


and I get the following output on the secondary host I attempt to add:


user@snf-185219:~/voltdb-ent-3.7/tpcc$ voltdb add -H vm.host1.ip -l ~/voltdb-ent-3.7/voltdb/license.xml -d deployment.xml 
Initializing VoltDB...

 _    __      ____  ____  ____ 
| |  / /___  / / /_/ __ \/ __ )
| | / / __ \/ / __/ / / / __  |
| |/ / /_/ / / /_/ /_/ / /_/ / 
|___/\____/_/\__/_____/_____/

--------------------------------

Build: 3.7 voltdb-3.7.0.1-0-gacb9c60-local Enterprise Edition
Connecting to the VoltDB cluster leader /vm.host1.ip:3021
2 Notified of host 0
Host id of this node is: 2
WARN: Running without redundancy (k=0) is not recommended for production use.
Server completed initialization.
FATAL: Failed to join the cluster
FATAL: Fatal exception
java.lang.RuntimeException: Legacy hashinator doesn't support predecessors
	at org.voltdb.LegacyHashinator.pPredecessors(LegacyHashinator.java:71)
	at org.voltdb.TheHashinator.predecessors(TheHashinator.java:390)
	at org.voltdb.join.ElasticJoinUtils.findRemotePredecessorsForPartition(ElasticJoinUtils.java:60)
	at org.voltdb.join.ElasticJoinUtils.findRemotePredecessorsForPartitions(ElasticJoinUtils.java:41)
	at org.voltdb.join.ElasticJoinUtils.calculateRemoteDataSources(ElasticJoinUtils.java:119)
	at org.voltdb.join.ElasticJoinCoordinator.startJoinOnLeader(ElasticJoinCoordinator.java:265)
	at org.voltdb.join.ElasticJoinCoordinator.startJoin(ElasticJoinCoordinator.java:245)
	at org.voltdb.RealVoltDB.run(RealVoltDB.java:1503)
	at org.voltdb.VoltDB.main(VoltDB.java:871)
VoltDB has encountered an unrecoverable error and is exiting.
The log may contain additional information.
   ERROR: Command "/usr/lib/jvm/java-7-oracle/bin/java ..." failed with return code 65280.

   FATAL: Exiting.


Also, on the initial host I get the following output:


WARN: Attempted delivery of message to failed site: 2:-1
WARN: Attempted delivery of message to failed site: 2:-1
WARN: Attempted delivery of message to failed site: 2:-1
WARN: Attempted delivery of message to failed site: 2:-1
WARN: Received remote hangup from foreign host snf-185219.vm.okeanos.grnet.gr/vm.host2.ip:54235
WARN: Host 2 failed
WARN: Host 2 failed
FATAL: K-Safety violation: No replicas found for partition: 3
FATAL: K-Safety violation: No replicas found for partition: 2
FATAL: Some partitions have no replicas.  Cluster has become unviable.
VoltDB has encountered an unrecoverable error and is exiting.
The log may contain additional information.


The IP of the second host is vm.host2.ip. The last part of the log on vm.host1.ip contains the following information:


2013-12-03 04:41:30,471   INFO  [ZooKeeperServer] ZK-SERVER: Processed session termination for sessionid: 0x15c20e8138000001
2013-12-03 04:42:19,752   INFO  [Socket Joiner] HOST: Received request type REQUEST_HOSTID
2013-12-03 04:42:19,790   INFO  [ZooKeeperServer] REJOIN: Joining site 2:-1 known  active sites 0:-1, 2:-1
2013-12-03 04:42:20,072   INFO  [ZooKeeperServer] REJOIN: Shipping ZK snapshot from 0:-1 to 2:-1
2013-12-03 04:42:26,828   INFO  [LeaderAppointer-Babysitters] TM: Noticed partition change [partition_16383, partition_1, partition_2, partition_0], currenctly watching [0, 1]
2013-12-03 04:42:26,829   INFO  [LeaderAppointer-Babysitters] TM: Done [0, 1, 2]
2013-12-03 04:42:26,960   INFO  [LeaderAppointer-Babysitters] TM: Appointing HSId 2:1 as leader for partition 2
2013-12-03 04:42:27,246   INFO  [Iv2ExecutionSite: 0:2] TM: MP leader repair 0:2 found 3 surviving leaders to repair.  Survivors: 0:0, 0:1, 2:1
2013-12-03 04:42:27,253   INFO  [Iv2ExecutionSite: 0:2] TM: MP leader repair 0:2 finished repair.
2013-12-03 04:42:27,335   INFO  [LeaderAppointer-Babysitters] TM: Noticed partition change [partition_3, partition_16383, partition_1, partition_2, partition_0], currenctly watching [0, 1, 2]
2013-12-03 04:42:27,336   INFO  [LeaderAppointer-Babysitters] TM: Done [0, 1, 2, 3]
2013-12-03 04:42:27,437   INFO  [LeaderAppointer-Babysitters] TM: Appointing HSId 2:2 as leader for partition 3
2013-12-03 04:42:27,680   INFO  [Iv2ExecutionSite: 0:2] TM: MP leader repair 0:2 found 4 surviving leaders to repair.  Survivors: 0:0, 0:1, 2:1, 2:2
2013-12-03 04:42:27,684   INFO  [Iv2ExecutionSite: 0:2] TM: MP leader repair 0:2 finished repair.
2013-12-03 04:42:27,814   INFO  [Leader elector-/db/leaders/globalservice] EXPORT: Attempting to boot export client due to rejoin or other cluster topology change
2013-12-03 04:42:30,221   WARN  [ZooKeeperServer] org.voltdb.messaging.impl.HostMessenger: Attempted delivery of message to failed site: 2:-1
2013-12-03 04:42:30,229   WARN  [ZooKeeperServer] org.voltdb.messaging.impl.HostMessenger: Attempted delivery of message to failed site: 2:-1
2013-12-03 04:42:30,234   WARN  [ZooKeeperServer] org.voltdb.messaging.impl.HostMessenger: Attempted delivery of message to failed site: 2:-1
2013-12-03 04:42:30,240   WARN  [ZooKeeperServer] org.voltdb.messaging.impl.HostMessenger: Attempted delivery of message to failed site: 2:-1
2013-12-03 04:42:30,244   WARN  [Volt Network - 0] HOST: Received remote hangup from foreign host snf-185219.vm.okeanos.grnet.gr/83.212.97.38:54235
2013-12-03 04:42:30,245   WARN  [Volt Network - 0] NETWORK: Host 2 failed
2013-12-03 04:42:30,245   INFO  [ZooKeeperServer] REJOIN: Agreement, Processing FaultMessage {failed: 2:-1, reporting: 0:-1, witnessed: true}
2013-12-03 04:42:30,248   INFO  [ZooKeeperServer] REJOIN: Agreement, Sending survivors SITE_FAILURE_UPDATE from site: 0:-1 survivors: [0:-1] failed: [2:-1]
2013-12-03 04:42:30,255   INFO  [ZooKeeperServer] REJOIN: Agreement, Adding 2:-1 to failed sites history
2013-12-03 04:42:30,255   INFO  [ZooKeeperServer] REJOIN: Agreement, handling site faults for newly failed sites 2:-1 initiatorSafeInitPoints {2:-11567832091444379650}
2013-12-03 04:42:30,255   INFO  [ZooKeeperServer] ZK-SERVER: Initiating close of session 0x15c20ee8fc800002
2013-12-03 04:42:30,256   WARN  [ZooKeeperServer] NETWORK: Host 2 failed
2013-12-03 04:42:30,268   INFO  [ZooKeeperServer] ZK-SERVER: Processed session termination for sessionid: 0x15c20ee8fc800002
2013-12-03 04:42:30,287   INFO  [Leader elector-/db/leaders/globalservice] EXPORT: Attempting to boot export client due to rejoin or other cluster topology change
2013-12-03 04:42:30,304   FATAL [LeaderAppointer-Babysitters] TM: K-Safety violation: No replicas found for partition: 3
2013-12-03 04:42:30,352   FATAL [LeaderAppointer-Babysitters] TM: K-Safety violation: No replicas found for partition: 2
2013-12-03 04:42:30,890   FATAL [LeaderAppointer-Babysitters] HOST: Some partitions have no replicas.  Cluster has become unviable.


The following information is found at the end of the log file on the vm.host2.ip server:


2013-12-03 04:42:29,453   INFO  [main] HOST: About to list cluster interfaces for all nodes with format [ip1 ip2 ... ipN] client-port:admin-port:http-port
2013-12-03 04:42:29,454   INFO  [main] HOST:   Host id: 0 with interfaces: 83.212.97.127 2001:648:2ffc:1112:a80c:eaff:fef4:44b5%2 21212,21211,8080 [PEER]
2013-12-03 04:42:29,456   INFO  [main] HOST:   Host id: 2 with interfaces: 83.212.97.38 2001:648:2ffc:1112:a80c:eaff:fe96:4e31%2 21212,21211,8080 [SELF]
2013-12-03 04:42:29,456   WARN  [main] HOST: Running without redundancy (k=0) is not recommended for production use.
2013-12-03 04:42:29,552   INFO  [Thread-8] HOST: Logging config info
2013-12-03 04:42:29,553   INFO  [main] CONSOLE: Server completed initialization.
2013-12-03 04:42:29,617   INFO  [main] JOIN: Start joining host 2 with new partitions to add [2, 3]
2013-12-03 04:42:29,713   FATAL [main] HOST: Failed to join the cluster
2013-12-03 04:42:29,720   FATAL [main] HOST: Fatal exception
java.lang.RuntimeException: Legacy hashinator doesn't support predecessors
        at org.voltdb.LegacyHashinator.pPredecessors(LegacyHashinator.java:71)
        at org.voltdb.TheHashinator.predecessors(TheHashinator.java:390)
        at org.voltdb.join.ElasticJoinUtils.findRemotePredecessorsForPartition(ElasticJoinUtils.java:60)
        at org.voltdb.join.ElasticJoinUtils.findRemotePredecessorsForPartitions(ElasticJoinUtils.java:41)
        at org.voltdb.join.ElasticJoinUtils.calculateRemoteDataSources(ElasticJoinUtils.java:119)
        at org.voltdb.join.ElasticJoinCoordinator.startJoinOnLeader(ElasticJoinCoordinator.java:265)
        at org.voltdb.join.ElasticJoinCoordinator.startJoin(ElasticJoinCoordinator.java:245)
        at org.voltdb.RealVoltDB.run(RealVoltDB.java:1503)
        at org.voltdb.VoltDB.main(VoltDB.java:871)


What is the issue with the Legacy hashinator? Why do I get this error when I attempt to elastically add a new node?

Thank you and sorry for the long post.

Sincerely,

Nick K.
pmartel
Dec 3, 2013
Nick,
In release 3.7, the elastic feature must be enabled in the original configuration, when the cluster is first started, by setting the elastic attribute in the deployment.xml file:

<deployment>
    <cluster hostcount="3"
             sitesperhost="4"
             kfactor="0"
             elastic="enabled"/>

[ . . . ]

Without the "elastic" option enabled, VoltDB 3.7 runs in its backward-compatible, non-elastic "legacy mode".
The error message indicates that the original cluster was started in that legacy mode.
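
Before starting the first node, it may help to confirm that the deployment file actually carries the attribute. The following is only a rough sketch in Python, not a VoltDB tool: the helper name elastic_enabled and the two sample documents are made up for illustration, and it assumes the attribute sits on the <cluster> element with the value "enabled", as in the snippet above.

```python
import xml.etree.ElementTree as ET


def elastic_enabled(deployment_xml: str) -> bool:
    """Return True if the <cluster> element carries elastic="enabled".

    Hypothetical helper: it only inspects the XML text and does not
    contact a running cluster.
    """
    root = ET.fromstring(deployment_xml)
    cluster = root.find("cluster")
    if cluster is None:
        return False
    return cluster.get("elastic", "").lower() == "enabled"


# Illustrative documents only; real deployment files contain more elements.
SAMPLE = """<deployment>
    <cluster hostcount="3" sitesperhost="4" kfactor="0" elastic="enabled"/>
</deployment>"""

LEGACY = """<deployment>
    <cluster hostcount="1" sitesperhost="2" kfactor="0"/>
</deployment>"""

if __name__ == "__main__":
    print("sample elastic:", elastic_enabled(SAMPLE))
    print("legacy elastic:", elastic_enabled(LEGACY))
```

To check a real file, read its contents (for example with open("deployment.xml").read()) and pass the string to the helper. A False result for the file used to start the original cluster would be consistent with the "Legacy hashinator" error above.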

This feature is more fully explained in our system documentation at http://voltdb.com/docs/UsingVoltDB/UpdateHw.php.

--paul
nkatsip
Dec 5, 2013
Oh, thank you. I made the change in the deployment.xml file and the node is now added properly. However, I am not sure that the client sends queries to both nodes. How can I make the client send queries to both of them?