Forum: Building VoltDB Applications

Post: Can I use serializable level transactions to mimic stored procedure local temp tables?

Can I use serializable level transactions to mimic stored procedure local temp tables?
sarikan
Dec 18, 2012
Greetings,
I've asked about temp tables created from stored procedures and was kindly informed that VoltDB does not support them at the moment.
Thinking on the issue a bit, it occured to me that I should be able to achieve the same outcome using the I of the ACID. My approach, which I've also posted as a question here ( http://dba.stackexchange.com/questions/30665/delete-all-rows-within-transaction-what-would-happen-to-other-transactions#comment53413_30665 ) is as follows:

I start a transaction in a stored procedure (or a sp may be a transaction automatically as in postgres)
I delete all rows in the table (called A) that all other executions of this sp will use
I insert some rows to A
I perform queries, and write results to another table (B)
I delete all the rows in A
I commit the transaction.

Would this approach give me a private/clean view of the table A for all executions of my stored procedure, effectively the same result as postgresql's function local temp tables?

Regards
Seref
awilson
Dec 18, 2012
Greetings,
I've asked about temp tables created from stored procedures and was kindly informed that VoltDB does not support them at the moment.
Thinking on the issue a bit, it occured to me that I should be able to achieve the same outcome using the I of the ACID. My approach, which I've also posted as a question here ( http://dba.stackexchange.com/questions/30665/delete-all-rows-within-transaction-what-would-happen-to-other-transactions#comment53413_30665 ) is as follows:

I start a transaction in a stored procedure (or a sp may be a transaction automatically as in postgres)
I delete all rows in the table (called A) that all other executions of this sp will use
I insert some rows to A
I perform queries, and write results to another table (B)
I delete all the rows in A
I commit the transaction.

Would this approach give me a private/clean view of the table A for all executions of my stored procedure, effectively the same result as postgresql's function local temp tables?

Regards
Seref


Assuming that you are using the same partition within the stored procedure then your idea should work well.
sarikan
Dec 18, 2012
Thanks. Actually, the original system design is very much focused on parallel processing, which seems like a perfect fit for VoltDB nodes with stored procedures working in their own partitions.

I have some uses cases though where I'm supposed to aggregate all outcomes written to table B by separate executions of the same sp (possibly running in multiple nodes). That would let me parallel process multiple chunks and then join all the results. With the approach above, would this be possible, would it cause a problem? (I can't see any)

In the aggregation scenario, I also have to clean outputs written to multi-partition table B once the complete outputs are read from B. I can use a process id column for B though, since it won't have any joins running on it.
awilson
Dec 18, 2012

I have some uses cases though where I'm supposed to aggregate all outcomes written to table B by separate executions of the same sp (possibly running in multiple nodes). That would let me parallel process multiple chunks and then join all the results. With the approach above, would this be possible, would it cause a problem? (I can't see any)

In the aggregation scenario, I also have to clean outputs written to multi-partition table B once the complete outputs are read from B. I can use a process id column for B though, since it won't have any joins running on it.


Hi,

Your aggregation process is probably fine. I would strive to limit multi-partition aggregations until you are ready to make a major rollup and only then do the multi-partition aggregation as infrequently as possible. However, this may be a case for a materialized view and then performing your multi-partition against the view.

Take a look at the voltdb-labs Demographics example on github. It is simulating a large number of user's looking at ad across multiple sites. It is able to perform real time analytics against a number of criteria, rolling up stats, using a few tables and a lot of materialized views. You may find that this example provides some ideas about how to go about collecting information in a very raw form and then continuously refining it to the point that you get precisely the information you are looking for. The sample stops short of creating segmentation groups and summary variable.

Thanks,
Andy
sarikan
Dec 18, 2012
Thanks Andy,
This is really helpful. Actually I have a meeting in the morning and thanks to your input, I can at least discuss this approach as a promising one then.
Is your advice regarding limiting multi-partition aggregations related to performance or issues of data integrity? As long as I don't have issues with data consistency/integrity, I'll certainly try various approaches and report back results here.
awilson
Dec 18, 2012
Hi,

Single partition queries impact performance only. All queries are ACID.
Single partition queries execute against only one partition. That means that the other partitions can continue executing in parallel.

A multi-partition query will block all the partitions until the query completes. Multi-partition queries are often unavoidable in a real-world application, the goal is to call them as infrequently as possible to maximize the work the cluster can perform.

A single-partition query may lookup a user name and password using the table's user_name as the partition column. If you had ten partitions then ten user lookup queries can run in parallel.

A multi-partition query may try to count all the user names that begin with the letter "w" in the table. That query would have to block all the other partitions to search for all those user names. That multi-partition query may still run very fast, but it prevents any other queries from executing until it is done.

I'm looking forward to hearing about how things go.

- Andy
sarikan
Dec 18, 2012
Brilliant. I'll certainly provide an update once I give this architecture a try. It'll be quite interesting to try to replace pgsql implementaiton of this mechanism with VoltDB.
Thanks again Andy.