Forum: Managing VoltDB

Post: VoltDB doesn't recover

phifty
Nov 28, 2013
Hi all,

We have a 4-node cluster that automatically generates a snapshot every 5 minutes. We tried to simulate a node failure by killing the Java process on one node. Our k-factor is 2. The cluster kept working, but when we tried to rejoin the node with

voltdb rejoin host $NODE deployment $DEPLOYMENT_FILE


the whole cluster crashed with the message

FATAL: Stored procedure EndVisit generated different SQL queries at different partitions. Shutting down to preserve data integrity.


How can this happen? The corresponding procedure has a fixed number of fixed statements that don't change at all.

What is even more problematic is that the recovery didn't work. The command

voltdb recover host $RUNNING_NODE deployment $DEPLOYMENT_FILE


issued the messages


VoltDB has encountered an unrecoverable error and is exiting.
Message: Found multiple transactions ids during restore for a partition.


Have we done something wrong? Was there a misunderstanding?

We are using VoltDB 3.7 and 64-bit OpenJDK Java 1.6.

Best regards
dremella
Nov 28, 2013
Hi,

Thanks for reaching out. I am sorry that you ran into problems in your testing. Since this is not the expected behavior, could you please send us the VoltDB log files from all the machines in your cluster so that we can determine the root cause? You can reach me directly at dremella@voltdb.com.

Thanks.
Monika
Dec 2, 2013
Hi,

Thanks for your answer.
I am working with phifty on that project.
Unfortunately, we did not keep the log file. I will try to reproduce the error and send the log file to you.

Best regards