Forum: Building VoltDB Applications

Post: Can a partitioned stored procedure run a query against data in all partitions

Can a partitioned stored procedure run a query against data in all partitions
pricesj
Apr 28, 2015
I know this example is contrived, this could be resolve by changing the partitioning strategy, but presume there are more business requirements to keep the partitioning different.

Consider a stored procedure "ReadAllOrderLinesForOrder" which is partitioned on ClientId, in the call we pass in the ClientId and OrderId, then run a select query against the OrderLine table, where the ClientId and OrderId match. If the OrderLine table is partitioned by StockId, the select query within the stored procedure would need to execute as a multi-partition query.

Is this possible?
Would it have any worrying impact?

Secondly, if say there was a table OrderLinesInOrder, which contains the StockId for the OrderLine, would it be possible to queue up individual queries for each OrderLine with StockId, ClientId and OrderId in the where clause, which effectively queues up a collection of partitioned select queries.

Would this be possible? To call a partitioned stored procedure which calls a collection of partitioned SQL statements onto other partitions?

Thanks in advance,
Stephen
bballard
Apr 28, 2015
Hi Stephen,

You cannot access data in other partitions from within a single-partition procedure. If you partitioned the procedure on ClientId, it runs only on the partition for the given ClientId value passed in. If within the procedure you run a query on a table that is partioned by StockId, you would get an incorrect response, only the subset of records that happen to be in the same partition.

There is an explanation about this, in Using VoltDB section 5.3.3.2 Queries in Single-Partitioned Stored Procedures (http://docs.voltdb.com/UsingVoltDB/designappprocinstall.php#designappsinglesitesp) that I think will be helpful to you.

You could choose to not partition the procedure, and then you could run multi-partition SQL in the procedure, however you want to minimize the percentage of your database workload that runs as multi-partition. Sometimes it may be better to break the work into separate single-partition transactions. For example, you might use a single-partition procedure to query Order and OrderLine partitioned by ClientID. That could return the list of StockIds for the Order lines. Then for each StockId, you might call a single-partition procedure partitioned by StockId to get more data about the stock.

You may have a use case where choosing the ideal partitioning columns is not easy and involves trade-offs. If you'd like to talk it over and get help choosing the best options, you can contact sales@voltdb.com to get connected with an engineer.

Best regards,
Ben
pricesj
Apr 29, 2015
Thanks for the clarification, it was really to help clarify the correct meaning of sentence from the "The Anatomy of a VoltDB Stored Procedure" chapter (http://docs.voltdb.com/UsingVoltDB/DesignProcAnatomy.php#DesignProcQuery)

"Queuing multiple SQL statements improves performance when the SQL queries execute because it minimizes the amount of network traffic within the cluster"

I presume that this statement only applies to none-partitioned stored procedures.

Thanks again,
Stephen