Oct 24, 2014
The LocalCluster class is part of our test framework; it is used to test different aspects and modes of the product.
NATIVE_EE_JNI means that we are testing the default "ee" (execution engine) C++ component of the product in the normal mode -- by accessing the compiled C++ code through its normal JNI interface.
Depending on other options passed to the LocalCluster, this JNI interface may be called from a thread inside the test executable that Eclipse is running.
OR the JNI interface may be called from a separate "VoltDB server" jvm process -- see below.
The jar file actually comes from the LocalCluster as a side effect of using it to build and launch a server configuration for testing. The string argument tells LocalCluster what to name that jar file -- usually the file gets created, used, and then destroyed by the LocalCluster object, so the name is not very important.
Three hosts means that three different jvm processes will be set up as VoltDB servers (each using JNI). At least two of these must be separate from the test executable; the third may either share the jvm of the test executable or be a stand-alone server process like the other two.
This depends on other options in the LocalCluster. This multi-server JNI configuration sometimes gets the label "// CLUSTER" in the test code to distinguish it from the single-server JNI case.
In a real user environment, each VoltDB server usually gets launched on its own physical or virtual host machine, but for testing convenience, LocalCluster launches all of the server processes on the local machine.
This makes LocalCluster useful for testing correctness of basic database functionality but not as useful for testing performance or networking.
Nov 14, 2014
Yes, LocalCluster accepts 3 parameters that match settings in the normal deployment.xml on a production system:
the number of hosts (server processes, usually on separate VMs or machines)
the number of sites per host (threads in the server process, each dedicated to serving a "partition", a subset of the database data -- this is the same value for all hosts, since they are all assumed to be "the same size")
the k factor -- a measure of data redundancy -- k=0 means no redundancy, every site in every host in the cluster serves a unique partition of the data, so all must be available. k=1 means every site has 1 twin site that manages the same partition of the data, but runs on a different host, in case 1 host is stopped or becomes disconnected.
Two different sites on the same host will typically have their twins on two different other hosts. k=2 means every site has 2 twin sites running on 2 other hosts, in case 2 hosts are stopped or disconnected, and so on.
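To make the replica placement concrete, here is a minimal sketch (not actual LocalCluster code -- the class and method names are hypothetical) of one simple round-robin scheme that puts the k+1 replicas of each partition on distinct hosts, so that losing up to k hosts never loses a whole partition:

```java
import java.util.*;

public class ReplicaPlacement {
    // Hypothetical helper: map each partition id to (k+1) distinct host ids,
    // assigned round-robin across the cluster. Assumes k + 1 <= hosts.
    static Map<Integer, List<Integer>> place(int hosts, int sitesPerHost, int k) {
        int partitions = hosts * sitesPerHost / (k + 1);
        Map<Integer, List<Integer>> placement = new LinkedHashMap<>();
        int nextHost = 0;
        for (int p = 0; p < partitions; p++) {
            List<Integer> replicas = new ArrayList<>();
            for (int r = 0; r <= k; r++) {
                // Consecutive host ids mod 'hosts' are distinct while k+1 <= hosts.
                replicas.add(nextHost);
                nextHost = (nextHost + 1) % hosts;
            }
            placement.put(p, replicas);
        }
        return placement;
    }

    public static void main(String[] args) {
        // 3 hosts, 2 sites per host, k=1 -> 3 partitions, each on 2 different hosts.
        for (Map.Entry<Integer, List<Integer>> e : place(3, 2, 1).entrySet()) {
            System.out.println("partition " + e.getKey() + " -> hosts " + e.getValue());
        }
    }
}
```

Note that round-robin also keeps the load balanced: each host ends up serving exactly sitesPerHost sites, since partitions * (k+1) equals hosts * sitesPerHost.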
A partition is the subset of the rows of a partitioned table that is served by (k+1) redundant sites. Each row of the table belongs to exactly one partition, so a copy of each row exists on each of the k+1 redundant sites that all serve that same piece of the table. The number of partitions is determined by the formula:
(number of hosts) * (sites per host) / (k factor + 1)
The values for number of hosts, sites per host, and k factor must be chosen so that this formula has a whole number result.
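The formula and its divisibility constraint can be sketched as a small helper (the class and method names here are hypothetical, not part of LocalCluster):

```java
public class PartitionMath {
    // partitions = hosts * sitesPerHost / (k + 1); the division must come out even.
    static int partitionCount(int hosts, int sitesPerHost, int k) {
        int totalSites = hosts * sitesPerHost;
        if (totalSites % (k + 1) != 0) {
            throw new IllegalArgumentException(
                "hosts * sitesPerHost must be divisible by k + 1");
        }
        return totalSites / (k + 1);
    }

    public static void main(String[] args) {
        // 3 hosts * 2 sites per host / (k=1 + 1) = 3 partitions
        System.out.println(partitionCount(3, 2, 1)); // prints 3
        // 3 hosts * 4 sites per host / (k=0 + 1) = 12 partitions, no redundancy
        System.out.println(partitionCount(3, 4, 0)); // prints 12
    }
}
```

For example, 3 hosts with 2 sites per host and k=2 would also be legal (2 partitions), but 3 hosts with 3 sites per host and k=1 would not, since 9 is not divisible by 2.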