Kfactor controls the number of extra copies (replicas) of the data the cluster keeps, which provides availability when a node goes down. With kfactor=0 there are no replicas, so a single node failure brings down the cluster. With kfactor=1 every partition has one replica, so the cluster survives a single node failure.
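To make that concrete, here is a minimal sketch of the K-safety arithmetic. The class and method names are mine, not a VoltDB API; it just encodes "kfactor extra replicas means kfactor + 1 copies, and up to kfactor failed nodes can be tolerated":

```java
// Illustration only: these helpers are hypothetical, not part of any VoltDB API.
public class KSafetyMath {
    // kfactor is the number of extra replicas, so each partition exists in kfactor + 1 copies.
    static int copiesOfData(int kfactor) {
        return kfactor + 1;
    }

    // A k-safe cluster keeps serving as long as no more than kfactor nodes have failed.
    static boolean survivesNodeFailures(int kfactor, int failedNodes) {
        return failedNodes <= kfactor;
    }

    public static void main(String[] args) {
        System.out.println(copiesOfData(0) + " copy, survives 1 node failure? "
                + survivesNodeFailures(0, 1));   // 1 copy, false
        System.out.println(copiesOfData(1) + " copies, survives 1 node failure? "
                + survivesNodeFailures(1, 1));   // 2 copies, true
    }
}
```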
To explain the difference kfactor makes, let's assume sites/host=2, nodes=2, and kfactor=0. Your data is split into 4 unique partitions ((2 nodes * 2 sites/host) / (0 kfactor + 1)) across the 2 nodes. VoltDB partitions the data automatically, but for simplicity let's say partitions 1 and 2 are on node 1 and partitions 3 and 4 are on node 2. When you execute a query with no WHERE clause, every partition has to return its data to the coordinator, which then returns the combined result to the user. All 4 partitions are actively processing the workload.
Now let's change to kfactor=1. Your unique partitions = 2 ((2 nodes * 2 sites/host) / (1 kfactor + 1)), but each one has a copy. The layout is now 1/1/2/2: VoltDB automatically places a replica of partitions 1 and 2 on both node 1 and node 2. When you execute a query with no WHERE clause, only 2 partitions need to return their data to the coordinator before the result goes back to the user. This actually takes longer, because 2 partitions are each working on 50% of the data instead of 4 partitions each working on only 25%. That is what you're experiencing.
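Here is a small sketch of that arithmetic. Again, the class and method names are hypothetical, not a VoltDB API; it only encodes the formula unique partitions = (nodes * sites/host) / (kfactor + 1) and the share of a full scan each unique partition handles:

```java
// Sketch of the partition arithmetic used above; not a VoltDB API, just the formula.
public class PartitionMath {
    static int uniquePartitions(int nodes, int sitesPerHost, int kfactor) {
        return (nodes * sitesPerHost) / (kfactor + 1);
    }

    public static void main(String[] args) {
        int k0 = uniquePartitions(2, 2, 0); // 4 unique partitions
        int k1 = uniquePartitions(2, 2, 1); // 2 unique partitions

        // Share of a full-table scan each unique partition has to process.
        System.out.printf("kfactor=0: %d partitions, %.0f%% of the data each%n", k0, 100.0 / k0); // 25%
        System.out.printf("kfactor=1: %d partitions, %.0f%% of the data each%n", k1, 100.0 / k1); // 50%
    }
}
```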
Assuming your hardware can handle it, you can get back to the kfactor=0 latency by doubling sites/host. That is, sites/host=4, nodes=2, and kfactor=1 yields 4 unique partitions ((2 nodes * 4 sites/host) / (1 kfactor + 1)).
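Running the hypothetical helper above with those numbers gives uniquePartitions(2, 4, 1) = 4, so a full scan is again spread across 4 partitions at roughly 25% of the data each, while you still keep a replica of every partition.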
May I ask what kind of application you are building? Typically, VoltDB applications would not require a full table scan of multiple tables.