Forum: VoltDB Architecture

Post: Threading

Threading
monster
Sep 29, 2010
I remember reading something that said that VoltDB is "single-threaded"/serial, when it come to executing stored procedures. Are the stored procedures themselves running serially, or only the calls to the API, like voltExecuteSQL()? The point of my question is, if my stored procedure did IO between calls to the API, for example to store a BLOB to a file or something, would that block other stored procedures?
"Single threaded" procedure execution explained
rbetts
Sep 29, 2010
Single partiton stored procedures are run serially to completion by each partition replica ("site" in VoltDB parlance). Yes, if you block a stored procedure on IO, you will block all progress at that partition replica.


Note carefully: the java procedure runs at *each* replica. Procedures must be deterministic. We do not encourage users to write procedures with non-database side-effects. However, if such procedures are written, all side-effects must be idempotent.


Multi-partition procedures are run to completion at a single selected partition replica that coordinates the distributed transaction with the participating remote partition replicas. There are existing optimizations to eliminate blocking at the remote replicas (and more being investigated / under development). But in the most general case, multi-partition procedures will stall the cluster until committed.


My architectural suspicion is that if you want to couple a second data store with VoltDB, you may desire a looser coupling than direct access via VoltDB stored procedures.


Ryan
What about multi-transactions procedures?
monster
Oct 2, 2010
Single partiton stored procedures are run serially to completion by each partition replica ("site" in VoltDB parlance). Yes, if you block a stored procedure on IO, you will block all progress at that partition replica.


Note carefully: the java procedure runs at *each* replica. Procedures must be deterministic. We do not encourage users to write procedures with non-database side-effects. However, if such procedures are written, all side-effects must be idempotent.


Multi-partition procedures are run to completion at a single selected partition replica that coordinates the distributed transaction with the participating remote partition replicas. There are existing optimizations to eliminate blocking at the remote replicas (and more being investigated / under development). But in the most general case, multi-partition procedures will stall the cluster until committed.


My architectural suspicion is that if you want to couple a second data store with VoltDB, you may desire a looser coupling than direct access via VoltDB stored procedures.


Ryan


Firstly, I would like to say that I'm not sure if I understand the sentence "Single partiton stored procedures are run serially to completion by each partition replica ("site" in VoltDB parlance)." properly. Does this apply only to stored procedure that make changes, so that all replica keep in sync, or also to ones that only do selects? I cannot see a reason to perform the same select on the same data in several JVMs...


So, the idea is that stored procedure represents a single transaction. Since some processes are single-partition and others are not, and multi-partition transactions block the whole cluster, one might get a better performance by splitting multi-partition transactions in several single-partition transactions, if possible. Also, by running the whole stored procedure serially in a single-thread, you save yourself synchronization, which makes your code very fast, but only if the stored procedure does nothing that "takes long", and is kept to the absolute minimum.


Having stored procedures run in a (bound) thread pool, and allowing them to perform long-running operations would make stored procedure much more "versatile". It would save having many round-trips to the client, since socket communication is many times slower then in-process communication, even on the same machine. What I'm trying to achieve is some practical way of doing processing inside the VoltDB JVM, but outside of a transaction.


I think this could be best realized with a “second type” of stored procedures which would allow multiple transactions. Those stored procedures would run in a thread pool, outside of the VoltDB thread and the only API call they would be able perform would be to call other stored procedures, which would run in the VoltDB thread. Like that, they could perform pre-processing, post-processing, and even processing between transactions, without blocking the whole cluster. They could take long without hurting performance, but they would look like normal stored procedure to the client, just “slower”. They would also reduce the number of multi-partition transactions, therefore increasing the scalability. They could be defined “implicitly” by simply having no SQL statements in the class. Since all SQL has to be defined statically in the class, having no SQL is a hint that the stored procedure cannot perform anything in the VoltDB thread, and therefore can be run in some other thread pool.


They might be programmed in such a way that they take care of errors themselves, so it might then be interesting to also add the ability to call them without waiting for the results, if they don't have to return anything useful. After that, the next logical step would be to integrate a task scheduler (like Quartz) in VoltDB. It could store it's own internal state in VoltDB itself, and move the processing of tasks to the appropriate VoltDB partition, assuming the VoltDB API offers helpers for that. The tasks would be like agents that move to where the data is.


I think this would be generally useful, and is probably realizable with little code. What do you think?


One particular use would be a stored procedure that updates every row in a big table while the system is online. The work could be divided into many small "batches" that are performed by calls to a "real" stored procedure, while keeping the database responsive by allowing other stored procedures to execute between the batches. Is their some kind of cursor implementation that could be used?
I cannot see a reason to
rbetts
Oct 2, 2010
Firstly, I would like to say that I'm not sure if I understand the sentence "Single partiton stored procedures are run serially to completion by each partition replica ("site" in VoltDB parlance)." properly. Does this apply only to stored procedure that make changes, so that all replica keep in sync, or also to ones that only do selects? I cannot see a reason to perform the same select on the same data in several JVMs...


So, the idea is that stored procedure represents a single transaction. Since some processes are single-partition and others are not, and multi-partition transactions block the whole cluster, one might get a better performance by splitting multi-partition transactions in several single-partition transactions, if possible. Also, by running the whole stored procedure serially in a single-thread, you save yourself synchronization, which makes your code very fast, but only if the stored procedure does nothing that "takes long", and is kept to the absolute minimum.


Having stored procedures run in a (bound) thread pool, and allowing them to perform long-running operations would make stored procedure much more "versatile". It would save having many round-trips to the client, since socket communication is many times slower then in-process communication, even on the same machine. What I'm trying to achieve is some practical way of doing processing inside the VoltDB JVM, but outside of a transaction.


I think this could be best realized with a “second type” of stored procedures which would allow multiple transactions. Those stored procedures would run in a thread pool, outside of the VoltDB thread and the only API call they would be able perform would be to call other stored procedures, which would run in the VoltDB thread. Like that, they could perform pre-processing, post-processing, and even processing between transactions, without blocking the whole cluster. They could take long without hurting performance, but they would look like normal stored procedure to the client, just “slower”. They would also reduce the number of multi-partition transactions, therefore increasing the scalability. They could be defined “implicitly” by simply having no SQL statements in the class. Since all SQL has to be defined statically in the class, having no SQL is a hint that the stored procedure cannot perform anything in the VoltDB thread, and therefore can be run in some other thread pool.


They might be programmed in such a way that they take care of errors themselves, so it might then be interesting to also add the ability to call them without waiting for the results, if they don't have to return anything useful. After that, the next logical step would be to integrate a task scheduler (like Quartz) in VoltDB. It could store it's own internal state in VoltDB itself, and move the processing of tasks to the appropriate VoltDB partition, assuming the VoltDB API offers helpers for that. The tasks would be like agents that move to where the data is.


I think this would be generally useful, and is probably realizable with little code. What do you think?


One particular use would be a stored procedure that updates every row in a big table while the system is online. The work could be divided into many small "batches" that are performed by calls to a "real" stored procedure, while keeping the database responsive by allowing other stored procedures to execute between the batches. Is their some kind of cursor implementation that could be used?


I cannot see a reason to perform the same select on the same data in several JVMs


Running read only procedures everywhere is easier, does not harm performance much, allows consistency checking when responses are gathered, and simplifies node-failure scenarios.


Having stored procedures run in a (bound) thread pool, and allowing them to perform long-running operations would make stored procedure much more "versatile".


We've discussed this internally for multi-partition transactions. There are advantages, possibly, in the multi-partition case. In the single partition case, however, there are severe disadvantages. Single partition procedures have *very* efficient access to the SQL storage layer as implemented.


I think this could be best realized with a “second type” of stored procedures which would allow multiple transactions.


I understand this as meaning that you wish to run a longer-running junk of java in VoltDB's JVM that can issue multiple independent transactions. I'd measure multiple stored procedure invocations from a client first to determine if such a structure is really necessary.


Also, you'd have to consider how/if the state of this multi-transaction procedure would be replicated. What happens if its owning Java process fails or terminates?


This change would make VoltDB more of an "in-process" store (but a distributed one, which is a little strange to think about). We haven't really put a lot of thought into that paradigm.


Is there some kind of cursor implementation that could be used?


VoltDB does not include a cursor implementation.
"really, really fast" AND "scalable"
monster
Oct 2, 2010
I cannot see a reason to perform the same select on the same data in several JVMs


Running read only procedures everywhere is easier, does not harm performance much, allows consistency checking when responses are gathered, and simplifies node-failure scenarios.


Having stored procedures run in a (bound) thread pool, and allowing them to perform long-running operations would make stored procedure much more "versatile".


We've discussed this internally for multi-partition transactions. There are advantages, possibly, in the multi-partition case. In the single partition case, however, there are severe disadvantages. Single partition procedures have *very* efficient access to the SQL storage layer as implemented.


I think this could be best realized with a “second type” of stored procedures which would allow multiple transactions.


I understand this as meaning that you wish to run a longer-running junk of java in VoltDB's JVM that can issue multiple independent transactions. I'd measure multiple stored procedure invocations from a client first to determine if such a structure is really necessary.


Also, you'd have to consider how/if the state of this multi-transaction procedure would be replicated. What happens if its owning Java process fails or terminates?


This change would make VoltDB more of an "in-process" store (but a distributed one, which is a little strange to think about). We haven't really put a lot of thought into that paradigm.


Is there some kind of cursor implementation that could be used?


VoltDB does not include a cursor implementation.


allows consistency checking when responses are gathered


I didn't know you did that. Sounds like a good idea after all.


you wish to run a longer-running junk of java in VoltDB's JVM


My main use-case is simply to offload large BLOBs (images) to an external storage system. Memory might be cheaper nowadays, but it's still not that cheap. And since the images should be cached by the CDN, they will be accessed rarely and won't need to be wasting VoltDB's memory constantly. What I don't want is forcing the client code to deal with distributed transactions. I am assuming that you do not support the two-phase commit, which is required for distributed transactions anyway. Therefore it would be nice if this could be implemented as a kind of plug-in inside VoltDB. VoltDB already supports exporting, but what I want is a bidirectional, transparent and transactional access. I am trying to find a way of doing this in the least "intrusive" way possible. The way I see it, it's normally the database that is meant to implement all the ugly, mind-boggling code that insures ACID transactions, so that the client code can be kept clean and simple.


an "in-process" store (but a distributed one, which is a little strange to think about)


Yes! "in-process store" means really, really fast (compared to out-of-process memory-store, which is "just" really (once) fast), and "distributed" means scalable. Getting "really, really fast" AND "scalable" together is like having your birthday and Christmas at the same time. Now, add "flexible" to the mix, and we would be in Heaven. Terracotta "feels" like an "in-process" store, which makes it really pleasant to use, but it is only scalable in the commercial version. This is the main reason why I'm looking for alternatives to it.


We haven't really put a lot of thought into that paradigm.


I understand you. I might want to go against the flow, but I'm not trying to make anyone follow me against their will. I could just get the code, and bend it to my will, but this would mean that I would have to do a merge with every new version you bring out. Not so good. Maybe we can find a solution that appeals to everyone, and then it can be merged into the main codebase.
I could just get the code,
rbetts
Oct 3, 2010
allows consistency checking when responses are gathered


I didn't know you did that. Sounds like a good idea after all.


you wish to run a longer-running junk of java in VoltDB's JVM


My main use-case is simply to offload large BLOBs (images) to an external storage system. Memory might be cheaper nowadays, but it's still not that cheap. And since the images should be cached by the CDN, they will be accessed rarely and won't need to be wasting VoltDB's memory constantly. What I don't want is forcing the client code to deal with distributed transactions. I am assuming that you do not support the two-phase commit, which is required for distributed transactions anyway. Therefore it would be nice if this could be implemented as a kind of plug-in inside VoltDB. VoltDB already supports exporting, but what I want is a bidirectional, transparent and transactional access. I am trying to find a way of doing this in the least "intrusive" way possible. The way I see it, it's normally the database that is meant to implement all the ugly, mind-boggling code that insures ACID transactions, so that the client code can be kept clean and simple.


an "in-process" store (but a distributed one, which is a little strange to think about)


Yes! "in-process store" means really, really fast (compared to out-of-process memory-store, which is "just" really (once) fast), and "distributed" means scalable. Getting "really, really fast" AND "scalable" together is like having your birthday and Christmas at the same time. Now, add "flexible" to the mix, and we would be in Heaven. Terracotta "feels" like an "in-process" store, which makes it really pleasant to use, but it is only scalable in the commercial version. This is the main reason why I'm looking for alternatives to it.


We haven't really put a lot of thought into that paradigm.


I understand you. I might want to go against the flow, but I'm not trying to make anyone follow me against their will. I could just get the code, and bend it to my will, but this would mean that I would have to do a merge with every new version you bring out. Not so good. Maybe we can find a solution that appeals to everyone, and then it can be merged into the main codebase.


I could just get the code, and bend it to my will.


We absolutely hope others can take, use, bend, spindle and mutilate the VoltDB code base to their purposes. We encourage you to do so - and would be very happy to see the work put to good purpose.


Re-absorbing changes back into the VoltDB mainline depends on the quality and nature of the specific changes - we would evaluate that on a change-by-change basis.
re: Threading
tcallaghan
Sep 29, 2010
Stored procedures execute fully single-threaded within a VoltDB partition, not just portions of the stored procedure.


While a stored procedure is executing all other stored procedures for that partition are queued and waiting to execute. For that reason you should avoid operations that can cause the execution to stall, like file IO.


-Tim