Forum: VoltDB Architecture

Post: Multi-partition transaction

Multi-partition transaction
sriramsrinivasan
Jun 5, 2010
Considering that each core runs a single main loop, and that lock-less concurrency implies run-to-completion semantics, does it mean that while a multi-sited transaction is in progress, the node doing the transaction coordination cannot put that piece of work on hold and attend to another item in the queue? In other words, is the node effectively blocked until the transaction finishes?


I'm expecting the answer to be "yes, but it is no big deal" for OLTP apps. Any clarification will be appreciated. Thanks much.
Your description is accurate.
rbetts
Jun 5, 2010
A lot of multi-site transactions we see are what we describe as "one shot" transactions. They need exactly one round trip from the coordinator to each participant node. There are several optimizations we are planning for these transactions.


For example, many one-shot transactions have no transactional work to do after the round-trip replies are gathered at the coordinator. They may be doing a multi-partition read and then unioning, ordering, grouping, or limiting the intermediate results to produce the client's response. Because the final aggregation doesn't read or write a persistent table, the coordinator can advance to other transactions.
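
As a rough illustration of the one-shot pattern described above, here is a minimal Python sketch. It is a toy in-memory model, not VoltDB code: the `PARTITIONS` data, `partition_fragment`, and `one_shot_top_n` are hypothetical names invented for this example. Each partition is visited exactly once, and the final merge only touches the gathered intermediate results.

```python
import heapq
from itertools import islice

# Hypothetical in-memory partitions: each holds rows of (customer_id, balance).
PARTITIONS = [
    [(1, 250), (4, 90), (7, 410)],
    [(2, 300), (5, 120)],
    [(3, 75), (6, 520)],
]

def partition_fragment(rows, limit):
    """Work done at one partition: its local top-`limit` rows by balance."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:limit]

def one_shot_top_n(partitions, limit):
    """Coordinator: one round trip to each partition, then a final merge."""
    # The single round trip: every partition runs its fragment exactly once.
    intermediate = [partition_fragment(rows, limit) for rows in partitions]
    # The final aggregation reads only the gathered results,
    # never a persistent table.
    merged = heapq.merge(*intermediate, key=lambda r: r[1], reverse=True)
    return list(islice(merged, limit))

top3 = one_shot_top_n(PARTITIONS, 3)
print(top3)
```

Once `intermediate` has been gathered, nothing in the merge step needs the partitions again, which is why a coordinator could in principle move on to other work at that point.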


There are three or four other such situations that can be optimized to remove blocking.
If nodes are blocked while
eriohl
Jun 28, 2010
If nodes are blocked while waiting for other nodes, how do you prevent deadlocks?


I mean, suppose there are 2 nodes and 2 partitions, and a request to node A is blocked waiting for data from partition B while node B is blocked waiting for data from partition A.


Or are the nodes only blocked for executing stored procedures and can execute other queries while waiting or something like that?
Global ordering is key.
jhugg
Jun 28, 2010
Since all transactions execute at each partition in a global order, if nodes A and B are both part of transactions X and Y, then both A and B will order X and Y exactly the same. If the order is XY, then X must complete before any work is done on Y. Thus there are no deadlocks in VoltDB.


There's a little white lie above though. If nodes are idle waiting for a response from a multi-partition transaction, sometimes they can sneak in a single-partition transaction if they can verify it will not conflict. In the future, we plan to really improve these kinds of tricks to almost eliminate CPU idleness at partitions.
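
The global-ordering argument can be sketched with a small simulation. This is a toy model under stated assumptions, not how VoltDB is implemented: `run` is a hypothetical function, each partition is just a queue, and a transaction is assumed to complete only when it is at the head of every queue it appears in. The single-partition "sneak in" trick is deliberately left out.

```python
from collections import deque

def run(partition_queues):
    """Simulate partitions that each execute their queue strictly in order.

    Returns the completion order, or raises RuntimeError if no transaction
    can make progress (a deadlock).
    """
    queues = {name: deque(txns) for name, txns in partition_queues.items()}
    completed = []
    while any(queues.values()):
        heads = {q[0] for q in queues.values() if q}
        # A transaction is ready if it heads every queue that contains it.
        ready = [t for t in heads
                 if all(q[0] == t for q in queues.values() if t in q)]
        if not ready:
            raise RuntimeError("deadlock: no transaction heads all its queues")
        txn = min(ready)  # deterministic pick among ready transactions
        for q in queues.values():
            if q and q[0] == txn:
                q.popleft()
        completed.append(txn)
    return completed

# Same global order at both partitions: X finishes, then Y.
print(run({"A": ["X", "Y"], "B": ["X", "Y"]}))

# Different local orders (which global ordering rules out) would deadlock.
try:
    run({"A": ["X", "Y"], "B": ["Y", "X"]})
except RuntimeError as err:
    print(err)
```

The second call shows the situation the question worried about. It can only arise when partitions disagree on the order, which is exactly what a single global order prevents.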
So then if all transactions
eriohl
Jun 28, 2010
So then if all transactions were multi-partition, you would only be able to execute "instances of single partition" number of transactions at the same time (and that would probably mean a lot of CPU idleness, because you wouldn't know which queries to execute before running the stored procedure), right? Or is it more clever than that?
Pretty much.
jhugg
Jun 28, 2010
At the moment, an all multi-partition workload would run one transaction at a time, and would have a fair bit of idleness, though for slightly different reasons than you suggest. This is a worst-case workload for VoltDB. Though we'll continue to improve the performance of multi-partition transactions, we don't expect our model to be a great fit for workloads that aren't mostly single-partition.


There may be additional details on ratios of single/multi workloads in this research paper: http://cs-www.cs.yale.edu/homes/dna/papers/hstore-cc.pdf
Ok. I understand. Thank you.
eriohl
Jun 28, 2010
Ok. I understand. Thank you.
App Side Architecture
henning
Jul 6, 2010
I expected the cost of this to be so high that I architected the app side to effectively emulate multi-partition transactions: it executes two one-shot, single-partition VoltDB transactions and takes care that both are executed (thus 'emulating' a transaction).


In my case I can do this because I can be sure, from the domain logic, that both single-partition transactions will complete on their respective partitions. VoltDB can't 'know' this, and as far as I can see it could not by any logic, since it is true only by implicit business logic.


I am not really building in overhead for this, but merely separating one call into two. Obviously, the VoltDB API could do this for me, and I would appreciate that. It would have to accept my word for the fact that both parts of the transaction won't violate integrity.
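
One possible reading of the approach described above, as a minimal Python sketch. Everything here is hypothetical: `FakeClient` and `call_single_partition` are stand-ins invented for the example and are not the VoltDB client API, and the balance transfer is only a placeholder for whatever the real business operation is. The sketch assumes each step is idempotent so that it can be retried safely.

```python
class FakeClient:
    """Hypothetical stand-in for a database client; not the VoltDB API."""

    def __init__(self, fail_first=0):
        self.balances = {"alice": 100, "bob": 20}
        self.applied = set()            # (key, request_id) pairs already applied
        self.failures_left = fail_first

    def call_single_partition(self, key, request_id, delta):
        if self.failures_left > 0:      # simulate a transient failure
            self.failures_left -= 1
            raise ConnectionError("transient failure")
        if (key, request_id) not in self.applied:   # idempotent apply
            self.balances[key] += delta
            self.applied.add((key, request_id))

def call_until_done(client, key, request_id, delta, max_attempts=5):
    """Retry one single-partition call until it succeeds or attempts run out."""
    for _ in range(max_attempts):
        try:
            client.call_single_partition(key, request_id, delta)
            return
        except ConnectionError:
            continue
    raise RuntimeError(f"step for {key!r} did not complete")

def emulated_transfer(client, src, dst, amount, request_id):
    # Two independent single-partition calls. There is no atomicity or
    # isolation across them; this is only safe when business rules guarantee
    # that both steps will succeed and intermediate states are acceptable.
    call_until_done(client, src, request_id, -amount)
    call_until_done(client, dst, request_id, +amount)

client = FakeClient(fail_first=2)
emulated_transfer(client, "alice", "bob", 30, request_id="t1")
print(client.balances)
```

Between the two calls another reader could observe the debit without the credit, which is the integrity guarantee the application is vouching for on its own.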


Am I correct that there is no switch or method for this to be handled by the VoltDB API yet? Will there be one?


Also, I'd propose that the essence of this thread should go into the VoltDB manual to clarify the basic mechanics and the status quo.