Forum: Building VoltDB Applications

Post: Questions on failover

prashanth
Dec 6, 2013
We have a typical read-only Active/Standby/Subscriber deployment using TimesTen in a 3-node setup at one site. Data provisioning happens in the background (like a batch process).
Only the Active accepts writes and replicates the data to the other nodes. When the Active fails, we receive SNMP traps and invoke procedures on the Standby so that it takes over as the new Active. Something similar happens when the Standby fails. In summary, we use the data replication and failover procedures provided by TimesTen.

Now, VoltDB recommends using k-safety over partitions so that if a node fails, the data is still available on other nodes as part of the k-safety configuration. This k-safety model may also require us to declare the deployment as Active/Active/Active VoltDB within a site.

Q1. Does this mean that the data can be provisioned into any of the nodes since all are active ?
Q2. But when we read data from any node, if the requested partitions are not available on that node, do they get lazily copied to that node? Please explain.
Q3. If the data partitions are spread across different nodes, how do we benchmark the performance on a single node consistently as there can be delays in loading the remote partitions?
Q4. We also have to support multi-site deployment. Should we use k-safety within a site and data replication across sites?
Q5. What about using no k-safety at all, but using data replication even within a site, so that we can continue to deploy VoltDB in Active/Standby/Subscriber mode? What are the disadvantages?
jpiekos
Dec 6, 2013
> Q1. Does this mean that the data can be provisioned into any of the nodes since all are active ?

In general, you can treat the cluster as "the database". Connect to any node, and VoltDB will route the transaction to the appropriate partition(s) for execution. So for example, if you wanted to load 100M rows into a partitioned table, you could connect to any node, invoke INSERT for each row, and VoltDB would make sure the INSERT executed on the proper partition (as identified by the partition key in the row you were inserting).
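To make the routing idea concrete, here is a toy Python sketch (not VoltDB code, and not the VoltDB client API). It assumes a hypothetical cluster with four partitions and uses a CRC32 hash as a stand-in for whatever hashing VoltDB actually applies to the partition key. The point is only that the entry node does not matter: the partition key decides where the row lives.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical partition count, for illustration only


def partition_for(key: str) -> int:
    """Map a partition-key value to a partition id (toy hash, not VoltDB's)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


class ToyCluster:
    """Toy model: any node accepts a request and routes it by partition key."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        # One dict of rows per partition, keyed by partition-key value.
        self.partitions = {p: {} for p in range(NUM_PARTITIONS)}

    def _check_node(self, via_node: str) -> None:
        if via_node not in self.node_names:
            raise ValueError(f"unknown node: {via_node}")

    def insert(self, via_node: str, key: str, row: dict) -> int:
        """Insert through any node; return the partition that stored the row."""
        self._check_node(via_node)
        p = partition_for(key)
        self.partitions[p][key] = row
        return p

    def lookup(self, via_node: str, key: str) -> dict:
        """Read through any node; the request is routed, no data is copied."""
        self._check_node(via_node)
        return self.partitions[partition_for(key)][key]


cluster = ToyCluster(["node1", "node2", "node3"])
p = cluster.insert("node1", "customer-42", {"balance": 10})
row = cluster.lookup("node3", "customer-42")  # different entry node, same row
```

Inserting through node1 and reading through node3 hits the same partition, because only the key is used to pick it.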

> Q2. But when we read data from any node, if the requested partitions are not available on that node, do they get lazily copied to that node? Please explain.

Nothing is copied lazily. As in Q1, the request is routed to a node that holds the partition. All copies of a partition in a k-safe cluster are updated synchronously (essentially 2-phase commit) within the transaction. Check out our Volt University online training module (a great resource!) for more details: http://voltdb.com/resources/volt-university/tutorials/section-1-5/
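As a rough illustration of "updated synchronously", here is a minimal Python sketch. It is a toy model, not VoltDB internals: the class name `ReplicatedPartition` and its methods are made up for this example. With a k-factor of K, the partition keeps K+1 copies, and a write is only reported as done once every copy has applied it, so a read from any copy returns the same value.

```python
class ReplicatedPartition:
    """Toy model of one partition with k_factor + 1 synchronized copies."""

    def __init__(self, k_factor: int):
        if k_factor < 0:
            raise ValueError("k_factor must be >= 0")
        # K-safety of K means K + 1 copies of the partition in total.
        self.copies = [dict() for _ in range(k_factor + 1)]

    def write(self, key, value) -> bool:
        """Apply the write to every copy before reporting success."""
        for copy in self.copies:
            copy[key] = value
        return True

    def read(self, key, copy_index: int = 0):
        """Read from any one copy; all copies hold the same data."""
        return self.copies[copy_index][key]


part = ReplicatedPartition(k_factor=1)  # two copies of the partition
part.write("k", "v")
```

Because the write touches all copies inside one call, there is never a window where one copy lags and has to be filled in later.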

> Q3. If the data partitions are spread across different nodes, how do we benchmark the performance on a single node consistently as there can be delays in loading the remote partitions?


There are no delays in loading remote partitions.

> Q4. We also have to support multi-site deployment. Should we use k-safety within a site and data replication across sites?


Yes, generally you would deploy a highly available (k-safe) cluster in your primary data center. Then, for disaster recovery, use database replication to maintain a read-only replica cluster in another location, in case the primary data center becomes unavailable.

> Q5. What about using no k-safety at all, but using data replication even within a site, so that we can continue to deploy VoltDB in Active/Standby/Subscriber mode? What are the disadvantages?

Definitely possible. However, be aware that if you lose a node in a non-k-safe cluster, VoltDB no longer has a complete set of data in memory. It cannot respond with correct and consistent answers, so the cluster will terminate.
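A small Python sketch can show why k-safety makes the difference here. This is a toy model under stated assumptions: the round-robin placement in `place_copies` is hypothetical and is not how VoltDB actually assigns partition copies to nodes. It only checks whether every partition still has at least one surviving copy after some nodes fail.

```python
def place_copies(num_nodes: int, num_partitions: int, k_factor: int) -> dict:
    """Hypothetical round-robin placement: each partition gets k_factor + 1
    copies on distinct nodes."""
    if k_factor + 1 > num_nodes:
        raise ValueError("need at least k_factor + 1 nodes")
    return {
        p: {(p + i) % num_nodes for i in range(k_factor + 1)}
        for p in range(num_partitions)
    }


def survives(placement: dict, failed_nodes) -> bool:
    """True if every partition still has a copy on some non-failed node."""
    failed = set(failed_nodes)
    return all(nodes - failed for nodes in placement.values())


no_k = place_copies(num_nodes=3, num_partitions=3, k_factor=0)
k_one = place_copies(num_nodes=3, num_partitions=3, k_factor=1)

lost_without_k = not survives(no_k, failed_nodes=[0])   # partition 0 is gone
ok_with_k = survives(k_one, failed_nodes=[0])           # every partition has a copy left
lost_two_nodes = not survives(k_one, failed_nodes=[0, 1])  # K=1 covers one failure only
```

In this model a k-factor of 0 cannot tolerate the loss of any node, while a k-factor of 1 tolerates one node failure but not two nodes that together hold both copies of a partition.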

If you'd like to talk in further detail about your deployment, feel free to reach out to me directly (jpiekos, voltdb)

John