Forum: Building VoltDB Applications

Post: Conditional insert / update

Conditional insert / update
henning
Feb 8, 2010
How would I formulate the Java calls for

"if record of id = X does not exist, create one and set X.V = Y; if it does exist, set its X.V = X.V + Y"


Thanks!
Henning
Conditional Insert / Update Code
tcallaghan
Feb 8, 2010
Henning, the following code implements your requested insert/update logic. It first attempts to do the update, if the update fails (meaning a row with x.id does not exist) then it performs an insert.

*** BEGIN CODE ***

public class updateOrInsert extends VoltProcedure
{

public final SQLStmt updateRow = new SQLStmt("update x set x.v = x.v + ? where x.id = ?;"); public final SQLStmt insertRow = new SQLStmt("insert into x (id, v) values (?, ?);");

public VoltTable[] run( long id, long v

) {

// attempt to update row in x voltQueueSQL(updateRow, v, id); VoltTable results1[] = voltExecuteSQL();

if (results1[0].asScalarLong() == 0) { // no existing row in x, we need to insert voltQueueSQL(insertRow, id, v); VoltTable results2[] = voltExecuteSQL();

}

return null; }}

*** END CODE ***
Latency
henning
Feb 8, 2010
Henning, the following code implements your requested insert/update logic. It first attempts to do the update, if the update fails (meaning a row with x.id does not exist) then it performs an insert.

*** BEGIN CODE ***

public class updateOrInsert extends VoltProcedure
{...

Thanks a lot!

So this is one procedure with two statements.

In case I were waiting on the client side to check for success or failure (of the entire procedure) would this procedure show double latency compared to simpler two statement procedures, since there is a result tested midway?

If I need to be sure that these statement have been executed before triggering the next such statement, is there a mechanism in VoltDB to help me ensure sequentiality? Other than waiting for the procedure to return before triggering the next call?

Henning
Latency and dependent stored procedures
tcallaghan
Feb 10, 2010
Thanks a lot!

So this is one procedure with two statements.

In case I were waiting on the client side to check for success or failure (of the entire procedure) would this procedure show double latency compared to simpler two statement procedures, since there is a result tested midway?

If I need to be sure that these statement have been executed before triggering the next such statement, is there a mechanism in VoltDB to help me ensure sequentiality? Other than waiting for the procedure to return before triggering the next call?

Henning

In case I were waiting on the client side to check for success or failure (of the entire procedure) would this procedure show double latency compared to simpler two statement procedures, since there is a result tested midway?

Latency in VoltDB is much more a product of our optimizations for transactional throughput than the number of SQL statements you execute in a stored procedure. VoltDB is capable of executing many SQL statements in under a millisecond, so I doubt you'd be able to measure the latency difference if your stored procedure had a single SQL statement versus two SQL statements.

If I need to be sure that these statement have been executed before triggering the next such statement, is there a mechanism in VoltDB to help me ensure sequentiality? Other than waiting for the procedure to return before triggering the next call?

You have two options:

1. Place the second stored procedure call inside the callback from the first stored procedure invocation.
2. Place the additional statements inside a single stored procedure.

In VoltDB if you are calling stored procedures asynchronously they are not guaranteed to run sequentially unless they are initiated on the same node in the cluster (which is not how you would normally configure a VoltDB client application, you usually connect the client to all nodes). Even with that in mind, if the first invocation fails the second invocation would have no way of knowing this.

Synchronous stored procedure invocations can be used to ensure transactional ordering but they delivery far fewer transactions per second on a single client machine.
Latency Point
henning
Feb 10, 2010
In case I were waiting on the client side to check for success or failure (of the entire procedure) would this procedure show double latency compared to simpler two statement procedures, since there is a result tested midway?

Latency in VoltDB is much more a product of our optimizations for transactional throughput than the number of SQL statements you execute in a stored procedure. VoltDB is capable of executing many SQL statements in under a millisecond, so I doubt you'd be able to measure the latency difference if your stored procedure had a single SQL statement versus two SQL statements...

Thank you Tim!

The clarification regarding sequentially was enlightening.

With your example code as given above, at what point would latency occur,
* waiting for the individual voltExecuteSQL() to return, or
* waiting for the entire procedure to return ("after the return null"), and/or even
* wait time *before* the entire procedure is even entered?

Thanks!
Henning
latency
tcallaghan
Feb 10, 2010
Thank you Tim!

The clarification regarding sequentially was enlightening.

With your example code as given above, at what point would latency occur,
* waiting for the individual voltExecuteSQL() to return, or
* waiting for the entire procedure to return ("after the return null"), and/or even
* wait time *before* the entire procedure is even entered?

Thanks!
Henning

Henning,

The major component of latency in VoltDB comes from the fact that it is optimized for throughput over latency. Transactions are placed in a work queue at the partition and executed single threaded. It also takes time to move the transaction and its result through the network.
Outside the Procedure
henning
Feb 11, 2010
Henning,

The major component of latency in VoltDB comes from the fact that it is optimized for throughput over latency. Transactions are placed in a work queue at the partition and executed single threaded. It also takes time to move the transaction and its result through the network.

So I understand that it's "before" and "after" but not within the procedure that latency occurs.

Maybe the distinction between "outside" and "inside" the procedure is canceled out when digging deeper into how VoltDB works. For now it'll probably help me to think about what's possible.

Thanks!
Henning