Forum: Building VoltDB Applications

Post: Helper functions

Helper functions
David Best
Sep 22, 2010
Several of our stored procedures share some common logic including some SQL statements. Is there a way to declare a function which all the procs can call?

Thanks
Yes and No
jhugg
Sep 23, 2010
Here are a couple of rules and approaches dealing with code re-use in VoltDB stored procedures.

1. Any SQL you want to call must be declared in the stored procedure calling it. This is due to the fact that our application catalog compiler reads each procedure file and scans it for SQL to optimize specifically for that procedure.

2. Beyond that restriction, you can call any code anywhere.

3. Note that our application migration tools that allow you to change procedures on a running system will transactionally update stored procedure code only. We can't update (transactionally) other java code in your classpath. If your procedure calls other java code, it will be difficult to modify that code without restarting VoltDB.

4. Ariel suggests that you can create a dummy procedure full of shared code that isn't ever actually called directly, but can be updated transactionally.

5. As always, whether your procedure calls shared code or not, all procedures should be deterministic and fast. They should avoid using timestamps generated internally (except by asking VoltDB), generating random numbers (except by asking VoltDB) and doing any kind of blocking I/O.

Making code reuse easier that what's described above will almost certainly come in a future VoltDB version, but it's not currently scheduled for a 2010 release.
Ok, but...
David Best
Sep 24, 2010
Here are a couple of rules and approaches dealing with code re-use in VoltDB stored procedures.

1. Any SQL you want to call must be declared in the stored procedure calling it. This is due to the fact that our application catalog compiler reads each procedure file and scans it for SQL to optimize specifically for that procedure.
...


That all makes sense...I hadn't thought about the upgrading issues.

4. So, I can create a stored procedure class with a dummy run() function and some helper functions. However, I can't call the functions because I don't have an object instance; and I can't create one because it requires the Volt runtime to do that. I suppose if there was a way to get access to the instance the runtime already has...
Java Witchcraft
jhugg
Sep 24, 2010
That all makes sense...I hadn't thought about the upgrading issues.

4. So, I can create a stored procedure class with a dummy run() function and some helper functions. However, I can't call the functions because I don't have an object instance; and I can't create one because it requires the Volt runtime to do that. I suppose if there was a way to get access to the instance the runtime already has...


This kind of stuff only reveals how much we need to have a better answer, but...
Each partition on a node has one instance of every procedure class, loaded by our classloader, which is transaction safe. These instances will be replaced on catalog migrations by our classloader.

So a few more thoughts:
1. Create a static nested class in one of your procedures and instantiate it from any of the other procedures. Since the procedures all use our classloader, this nested class should also be loaded from your catalog jar in a transactionally safe way.

2. [From Ariel] Create a base class that inherits from VoltProcedure and have other procedures inherit from it, deepening the hierarchy. You might be able to put SQL in the base class that would get compiled for each of the derived classes too.

I haven't personally testing any of this, so it's all educated conjecture.
Helper functions recap
David Best
Sep 25, 2010
This kind of stuff only reveals how much we need to have a better answer, but...
Each partition on a node has one instance of every procedure class, loaded by our classloader, which is transaction safe. These instances will be replaced on catalog migrations by our classloader...


1. I had already gone down this path a bit...doesn't work because voltQueueSQL/voltExecuteSQL can't be called from a static function. Perhaps you meant that I would pass this() to it from the instance procedure, so the static function could call m_this.voltQueueSQL etc. This might work if the scope of a static variable is the same as the scope of a partition. Also, compilation of the static SQL statements would have to be independent of the caller (single partition vs multi partition). Seems fragile assumptions so...

2. This works and handles my current needs.

Thank you.
Good to hear
jhugg
Sep 26, 2010
1. I had already gone down this path a bit...doesn't work because voltQueueSQL/voltExecuteSQL can't be called from a static function. Perhaps you meant that I would pass this() to it from the instance procedure, so the static function could call m_this.voltQueueSQL etc. This might work if the scope of a static variable is the same as the scope of a partition. Also, compilation of the static SQL statements would have to be independent of the caller (single partition vs multi partition). Seems fragile assumptions so...

2. This works and handles my current needs.

Thank you.


Good to hear that #2 works. We'll keep that in mind when we get around to solidifying a set of best practices on code re-use in stored procedures.
Improvements for 1.2
jhugg
Sep 29, 2010
1. I had already gone down this path a bit...doesn't work because voltQueueSQL/voltExecuteSQL can't be called from a static function. Perhaps you meant that I would pass this() to it from the instance procedure, so the static function could call m_this...Thank you.


In subversion revisions 1075/1076, I improved the ability to use inheritance for code reuse in stored procedures.

1. You can use non-public SQLStmts in VoltProcedure classes.
2. You can use public/protected/package-private SQLStmts from a superclass.
3. Your VoltProcedure-derived intermediate base class needs to have a run method, but it no longer needs to have the same signature as any derived stored procedure classes.
4. All stored procedures need to have a run() method; it cannot be inherited.

This code will be in the 1.2 release.
Timestamps
David Best
Sep 24, 2010
Here are a couple of rules and approaches dealing with code re-use in VoltDB stored procedures.

1. Any SQL you want to call must be declared in the stored procedure calling it. This is due to the fact that our application catalog compiler reads each procedure file and scans it for SQL to optimize specifically for that procedure...



Are Volt generated timestamps stable in the presence of K-safety? i.e. Are they guaranteed to be the same between the originating node and all the replicants?
For example, if I have a timestamp column as a primary key, and called Volt's current timestamp function, is this safe, or could this give different values on the replica's?
Yep.
jhugg
Sep 24, 2010
Are Volt generated timestamps stable in the presence of K-safety? i.e. Are they guaranteed to be the same between the originating node and all the replicants?
For example, if I have a timestamp column as a primary key, and called Volt's current timestamp function, is this safe, or could this give different values on the replica's?


VoltProcedure.getTransactionTime() and VoltProcedure.getSeededRandomNumberGenerator() are both deterministic. A procedure running on two replicas within a transaction will always get the same answer.