Forum: VoltDB Architecture

Post: Is there fault protection on the table partition level?

ccherng
Feb 14, 2014
This page, http://voltdb.com/docs/UsingVoltDB/KsafeNetPart.php, says that in the event of a network partition, fault protection keeps the larger segment of the cluster running.

Is there a similar notion at the level of a single table partition, i.e. keeping up whichever side holds the larger share of that partition's replicas?

One could imagine a table partitioned so that partitions A are located on machines in the United States and partitions B are located on machines in, say, India. Assume the queries at each site (India and the United States) are mostly independent of each other.

Then, in the event of a network split between the two sites, would it make sense to keep both sites up instead of keeping up only the site with the most machines in the cluster? The idea is that each site could continue to handle queries for its own partitions of the table, while queries that need information from, or depend on the state of, the other half of the table would be forbidden.
nshi
Mar 3, 2014
Hi ccherng,

Partition (fault) detection relates to the k-safety feature, where you have k+1 copies of the data in the database. When machines are separated from each other by a network partition and both halves still contain at least one copy of ALL partitions, their data will diverge if both continue to run. In this case, the partition detection feature prevents the database from diverging by shutting down the smaller half.

However, if the cluster is separated in a way that one part doesn't have a complete copy of all partitions, those machines will be shut down because they are no longer complete.

In the scenario you described, letting a partial database continue running would result in inconsistent data during recovery.
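
The rules above can be sketched roughly as follows. This is a simplified model of the behavior described in this reply, not VoltDB's actual implementation: the function names, the `placement` map (partition name to the set of hosts holding a copy), and the host names are all made up for illustration, and the even-split case is not covered by the reply, so the sketch simply reports no survivor for it.

```python
def has_all_partitions(hosts, placement):
    """True if `hosts` together hold at least one copy of every partition."""
    return all(replicas & hosts for replicas in placement.values())


def surviving_side(side_a, side_b, placement):
    """Return "a", "b", or None for which side of a network split keeps running.

    Hypothetical model of the rules described in this thread only.
    """
    a_complete = has_all_partitions(side_a, placement)
    b_complete = has_all_partitions(side_b, placement)

    if a_complete and b_complete:
        # Both halves could run and diverge, so the smaller half is shut down.
        if len(side_a) == len(side_b):
            return None  # assumption: tie-breaking is not modeled here
        return "a" if len(side_a) > len(side_b) else "b"

    # A side without a complete copy of all partitions is shut down.
    if a_complete:
        return "a"
    if b_complete:
        return "b"
    return None


# Every host holds every partition: the larger half survives.
full = {"p0": {"h1", "h2", "h3"}, "p1": {"h1", "h2", "h3"}}
print(surviving_side({"h1", "h2"}, {"h3"}, full))

# The US/India layout from the question: neither site has all partitions.
geo = {"A": {"us1", "us2"}, "B": {"in1", "in2"}}
print(surviving_side({"us1", "us2"}, {"in1", "in2"}, geo))
```

In this model the US/India layout leaves no surviving side, because neither site holds a complete copy of the database after the split.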