Forum: VoltDB Architecture

Post: Network Partitioning

Network Partitioning
May 18, 2010
Thank you for all the answers so far!

What is the strategy VoltDB uses to recover from network partitioning?

If a mirror node has taken over, is it phased back into the background once the logical partition's master is back and, possibly, resynched? And most of all, how does the master learn that it was partitioned away from the music, that the music hadn't stopped playing but it had missed some beats?

As far as I know, these are almost philosophical issues that need practically minded approaches to get resolved. What were your decisions?

Thanks in advance,
Network Partitions
May 18, 2010
First, note that small clusters can be networked so as to make a network partition extremely unlikely, if not impossible: causing one would require a partial internal failure of a multi-port Ethernet switch, or multiple concurrent failures when using a redundant switched network.

Also, I'd like to clarify that VoltDB doesn't have mirror nodes or master partitions. Nodes are not the unit of replication - partitions are replicated. All partition replicas (aka sites) are equal - there is no master-slave relationship between the sites replicating the same logical partition.

What is the strategy VoltDB uses to recover from network partitioning?

VoltDB currently will not recover a network partition automatically. If your cluster suffers a network partition failure, it must be restored from a previous snapshot or manually recovered.

VoltDB does not currently support, but could relatively easily support, a feature that only allowed the majority set of a partitioned network to continue operation. This may be desirable on larger clusters (10s of nodes). It would, however, make configurations of k=1 with 2 nodes or k=2 with 3 nodes impossible: in each case, the k'th tolerated failure would leave only a minority of nodes alive, which would itself trigger the majority-set shutdown.
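To make the arithmetic concrete, here is a minimal sketch of a strict-majority continuation rule of the kind described above. This is illustrative only - the function name and shape are assumptions, not VoltDB internals.

```python
# Hypothetical majority-set rule: after a partition, a surviving group
# of nodes may continue serving only if it holds a strict majority of
# the original cluster. Illustrative sketch, not a VoltDB API.

def may_continue(surviving_nodes: int, cluster_size: int) -> bool:
    """True only if the surviving group is a strict majority."""
    return surviving_nodes * 2 > cluster_size

# k=1 with 2 nodes: the one tolerated failure leaves 1 of 2 nodes,
# which is not a strict majority, so the survivor must shut down.
print(may_continue(1, 2))   # False
# k=2 with 3 nodes: the 2nd failure leaves 1 of 3 nodes - same problem.
print(may_continue(1, 3))   # False
# On a larger cluster, e.g. 10 nodes, losing 4 still leaves a majority.
print(may_continue(6, 10))  # True
```

This is why the rule only makes sense on clusters large enough that the k tolerated failures still leave a strict majority standing.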

As an aside, there are some interesting special cases with a small number of nodes that are impossible to network partition when sites are assigned to nodes in an offset, round robin fashion. For example, the three node cluster with partition replica assignments {AB}, {BC}, {CA} cannot be network partitioned. The problem simply doesn't exist in this configuration.
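The {AB}, {BC}, {CA} claim can be checked exhaustively: split-brain requires both sides of a split to hold a full copy of every partition, and in this layout every one-node side is missing a partition. The sketch below builds an offset round-robin layout and brute-forces all splits; the helper names are my own, not VoltDB code.

```python
from itertools import combinations

# Sketch of why the offset round-robin layout {AB}, {BC}, {CA} cannot
# split-brain: no bisection of the nodes leaves BOTH sides with a full
# copy of every partition. Illustrative only, not VoltDB internals.

def round_robin_layout(nodes: int, partitions: str, k: int):
    """Assign each partition to k+1 consecutive nodes, wrapping around."""
    layout = [set() for _ in range(nodes)]
    for i, p in enumerate(partitions):
        for r in range(k + 1):
            layout[(i + r) % nodes].add(p)
    return layout

def split_brain_possible(layout, partitions: str) -> bool:
    """True if some bisection gives both sides every partition."""
    all_parts = set(partitions)
    nodes = range(len(layout))
    for size in range(1, len(layout)):
        for side in combinations(nodes, size):
            a = set().union(*(layout[n] for n in side))
            b = set().union(*(layout[n] for n in nodes if n not in side))
            if a >= all_parts and b >= all_parts:
                return True
    return False

# Three nodes, partitions A, B, C, k=1 -> the same shape as the
# {AB}, {BC}, {CA} example (up to relabeling).
layout = round_robin_layout(3, "ABC", k=1)
print(layout)                              # e.g. [{A,C}, {A,B}, {B,C}]
print(split_brain_possible(layout, "ABC"))  # False
```

Any two nodes together hold all three partitions, while a lone node never does, so a split can never leave two viable sub-clusters.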

Resynching a network partitioned cluster

Allowing a partitioned subset to keep operating, even with the prospect of merging data once the partition is resolved, does not meet VoltDB's consistency requirements: the resulting divergence would violate them.

VoltDB's philosophy is that you can solve your OLTP problem with a relatively small number of nodes (10s, not 1000s) that can be networked to be partition-resilient. Consequently, your OLTP application can reasonably choose consistency over network partition tolerance.

Mike Stonebraker recently wrote about this problem in the context of the CAP theorem on his CACM blog.