Forum: Building VoltDB Applications

Post: Third-party libs in stored procedures

Third-party libs in stored procedures
Anonymous
Apr 26, 2012
I'm finding that I routinely need to do in-memory aggregation of intermediate results in my stored procedures. I generally like to use the GNU Trove collections for this kind of thing, since they're faster, less memory-intensive, and somewhat easier to use than java.util.*. I found I could just drop the trove jar into $VOLTDB_HOME/lib and volt seems to be happy to load it up. Are there any hidden downsides to doing this?
Hi, Generally it is safe to
aweisberg
Apr 26, 2012
Hi,

Generally it is safe to do if the library is providing functionality that is deterministic (see http://www.afewmoreamps.com/2011/07/keep-those-procedures-deterministic.html). The only thing to keep in mind is that you can't update the library while the server is running and it is critical that the same version of the library is loaded at each server. If multiple versions of a library are in the classpath the order will matter.

You also shouldn't store any state in an instance or static field of a procedure since that will be non-deterministic in a variety of cases because it that state will span transactions.

-Ariel
Thanks, these constraints are
mkoehrsen
Apr 27, 2012
Hi,

Generally it is safe to do if the library is providing functionality that is deterministic (see http://www.afewmoreamps.com/2011/07/keep-those-procedures-deterministic.html). The only thing to keep in mind is that you can't update the library while the server is running and it is critical that the same version of the library is loaded at each server. If multiple versions of a library are in the classpath the order will matter.
-Ariel


Thanks, these constraints are completely in line with my usage.
example?
jpiekos
Apr 26, 2012
> I'm finding that I routinely need to do in-memory aggregation of intermediate results in my stored procedures
Can you post an example/description of the types of queries/operations that require you to do this routinely?
Thanks,
John Piekos
"Routinely" is maybe an
mkoehrsen
Apr 27, 2012
> I'm finding that I routinely need to do in-memory aggregation of intermediate results in my stored procedures
Can you post an example/description of the types of queries/operations that require you to do this routinely?
Thanks,
John Piekos


"Routinely" is maybe an overstatement since I've only written a half-dozen or so stored procedures so far, but I'm definitely seeing this as a regular need in order to work around a couple of limitations in Volt's SQL subset.

First, the lack of self-joins. My use-case intrinsically requires a self-join which I understand is not supported. So I'm doing what amounts to a nested-loops self-join in code and aggregating as I go.

Second, I've run into problems trying to join a partitioned table to a view derived from the same table. Details are slightly involved and maybe require a separate thread, but the upshot is I decided to maintain an aggregation table directly from the stored procedure that modifies the base table, so again I'm doing the aggregation in code in my stored procedure.