Forum: Building VoltDB Clients

Post: Prepared Statements / Array binding / C/C++ Client API

Prepared Statements / Array binding / C/C++ Client API
Piotr Kozaczewski
Jun 1, 2010
Is it possible to use Prepared Statements in VoltDB?
To be more specific, I mean SQL statements defined at Client Side, prepared and held accessible during some longer period of time. Of particular interst for me, would be Array Binding, i.e. associating arrays of values with Placeholders/Parameters in SQL DML Statements.
I am not going to create/modify Schemas from Client Side, as I am also not doing this when working with Oracle/MSSQL/Sybase/Teradata etc. What I need, is to enable creating some objects, that encapsulate SQL Statements, deploy this object to my runtime environment and make them run. As part of my execution, I would like to prepare the statements and use Array Binding for DML Statements, which greatly accelerates their execution.
I would like to use VoltDB as a fast, in-memory, transactional caching layer, above a more traditional RDBMS.
Heavy OLTP load would be performed in-memory by VoltDB and the results would be then, asynchronously, off-loaded to that RDBMS.
In particular, the results of some reasonably wide time window would be easily accessible, as they would temporarily reside in memory.
Moreover, static or slowly-varying data, e.g. reference data, would be loaded by separate service to a Reference Data Schemas to accelerate their access by the OLTP part.
Is above scenario possible at all?
Is a C/C++ API planned?
Is a partial implementation of ODBC interface possible or something like prepareStatement / setParameter / execute / fetchData / get Result ?
Prepared statements. Array binding.
rbetts
Jun 1, 2010
Is it possible to use Prepared Statements in VoltDB?
VoltDB does not support prepared statements. VoltDB does not allow transactions that span multiple rounds of communication with a client. Any hypothetical prepared statement implementation would have to use auto-commit.
Part of VoltDB's underlying premise is that workloads are known in advance. We would structure these prepared statements as stored procedures.
I would like to prepare the statements and use Array Binding for DML Statements, which greatly accelerates their execution.
You can pass arrays to VoltDB stored procedures and then iterate those arrays in the stored procedure logic to supply different parameter sets to one of the stored procedure's SQL statements.
Executing multiple inserts in a loop in a single partition stored procedure is very fast. Note the caveat - the array(s) of parameters passed to the procedure must all correspond to a single partition. Mixing data for multiple partitions in the same array set will not result in a performant VoltDB solution.
Your desire to compress a lot of transactional work into one round trip from client to database is exactly the kind of observation that inspired VoltDB's architecture.
Is a C/C++ API planned?
We have not yet written a C/C++ API. One is needed. We would happily collaborate with a community member willing to start implementation. Or, raise the priority of this work internally if a missing C API was a roadblock to a promising VoltDB deployment.
ODBC?
We only support access to VoltDB via clients that implement our wire protocol. At the moment, there are two implementations: a Java client we wrote and an Erlang client a community member wrote.
The Getting Started Guide (section 3.5) describes using our Java client. You can access the guide via http://community.voltdb.com/sites/default/files/1.0.01-docs_pdf.zip
Heavy OLTP load would be performed in-memory by VoltDB and the results would be then, asynchronously, off-loaded to that RDBMS
VoltDB exposes an export connector for workload patterns like this. It is described in chapter 10 of the UsingVoltDB guide, available in the zip archive linked above.
Thank you for interest,
Ryan.
Prepared statements. Array binding. -- continued
Piotr Kozaczewski
Jun 2, 2010
Is it possible to use Prepared Statements in VoltDB?
VoltDB does not support prepared statements. VoltDB does not allow transactions that span multiple rounds of communication with a client. Any hypothetical prepared statement implementation would have to use auto-commit.
Part of VoltDB's underlying premise is that workloads are known in advance. We would structure these prepared statements as stored procedures.
I would like to prepare the statements and use Array Binding for DML Statements, which greatly accelerates their execution.
You can pass arrays to VoltDB stored procedures and then iterate those arrays in the stored procedure logic to supply different parameter sets to one of the stored procedure's SQL statements.
Executing multiple inserts in a loop in a single partition stored procedure is very fast. Note the caveat - the array(s) of parameters passed to the procedure must all correspond to a single partition. Mixing data for multiple partitions in the same array set will not result in a performant VoltDB solution.
Your desire to compress a lot of transactional work into one round trip from client to database is exactly the kind of observation that inspired VoltDB's architecture.
Is a C/C++ API planned?
We have not yet written a C/C++ API. One is needed. We would happily collaborate with a community member willing to start implementation. Or, raise the priority of this work internally if a missing C API was a roadblock to a promising VoltDB deployment.
ODBC?
We only support access to VoltDB via clients that implement our wire protocol. At the moment, there are two implementations: a Java client we wrote and an Erlang client a community member wrote.
The Getting Started Guide (section 3.5) describes using our Java client. You can access the guide via http://community.voltdb.com/sites/default/files/1.0.01-docs_pdf.zip
Heavy OLTP load would be performed in-memory by VoltDB and the results would be then, asynchronously, off-loaded to that RDBMS
VoltDB exposes an export connector for workload patterns like this. It is described in chapter 10 of the UsingVoltDB guide, available in the zip archive linked above.
Thank you for interest,
Ryan.


Ryan,
thank you for your answer.
In our software, we design/write/test-execute SQL statements AND stored procedures, which could be then deployed
as parts of Sources, Sinks and ReferenceObcects.
From our perspective, we could, theoretically, wrap these statements/procedures in appropriate java stuff and use the VoltDB tools to deploy them, am I right? This would be a custom, VoltDB-specific part of our deployment process,
wouldn't it?
You mentioned, that a transaction must not span multiple round trips. Does it mean, that a dataflow from Source to Sink, through multiple Graph Nodes (including Reference Objects that access VoltDB data somewhere between Source and Sink(s)) should be entirely moved to VoltDB, i.e. should be converted to Java and embedded in VoltDB Stored Procedures, unless it would not be transactional?
Transaction scope
rbetts
Jun 2, 2010
Ryan,
thank you for your answer.
In our software, we design/write/test-execute SQL statements AND stored procedures, which could be then deployed
as parts of Sources, Sinks and ReferenceObcects.
From our perspective, we could, theoretically, wrap these statements/procedures in appropriate java stuff and use the VoltDB tools to deploy them, am I right? This would be a custom, VoltDB-specific part of our deployment process,
wouldn't it?
You mentioned, that a transaction must not span multiple round trips. Does it mean, that a dataflow from Source to Sink, through multiple Graph Nodes (including Reference Objects that access VoltDB data somewhere between Source and Sink(s)) should be entirely moved to VoltDB, i.e. should be converted to Java and embedded in VoltDB Stored Procedures, unless it would not be transactional?


Yes, you can definitely create Java stored procedures that wrap a single sql statement. See section 3.2 of the UsingVoltDB.pdf document for more details.
We support a short hand for stored procedures that are exactly one SQL statement with no user-written Java - the statement can be written directly in the project configuration file without the Java source code boilerplate. Unfortunately, procedures written in this style would not support the array binding approximation - as that would require code to iterate the array. (See section 3.2.3 of the UsingVoltDB.pdf, "Shortcut for Defining Simple Stored Procedures".) Implementing array bindings for these simple procedures is an interesting idea. I captured that in our issue tracking system though it may be some time before it is prioritized. (https://issues.voltdb.com/browse/ENG-583).
A transaction must not span multiple round trips
That's correct. A stored procedure invocation is the unit of a transaction in VoltDB. I can't answer your specific question about graph manipulation without understanding your application more precisely. I can only offer a few general observations that might help.
1. If the entire dataflow must be atomic AND isoloated from other updates (ie, it can not be done in multiple commutative transactions), the only way to achieve this is to embed the full logic in a single stored procedure invocation.
2. If portions of the dataflow calculations are commutative with other concurrent transactions, then those portions can be broken into multiple VoltDB transactions by your application.
3. The vast majority (almost 100%) of your stored procedure invocations must be single partition for VoltDB to be useful to you. If, for example, you partitioned by graph node, updating multiple nodes in a single transaction/procedure will not be a single partition transaction.
Sorry for the naive and
itistoday
Jun 1, 2010
Sorry for the naive and ignorant question (as I haven't had time to research VoltDB in depth yet), but does lack of prepared statements mean that we have to make sure to do string escaping and all that to avoid SQL injection attacks?
If so that's rather disappointing...
SQL injection
rbetts
Jun 1, 2010
Sorry for the naive and ignorant question (as I haven't had time to research VoltDB in depth yet), but does lack of prepared statements mean that we have to make sure to do string escaping and all that to avoid SQL injection attacks?
If so that's rather disappointing...


SQL injection is not possible with VoltDB, at least to the extent that I can imagine possible attack vectors.
All VoltDB SQL is planned/processed/compiled when the VoltDB administrator prepares an application for deployment to a VoltDB cluster. It is not possible for a client or end-user to supply SQL to be interpreted by the VoltDB engine. In VoltDB, all SQL is contained in the parametrized stored procedures that compose a VoltDB application. Parameters to stored procedures are strongly typed.
If you are used to using a predictable set of prepared statements (that is, if you prepare the same set of statements each time you start your application), then you might find VoltDB's approach pretty natural.
If you have time, check out some of the VoltDB documentation. We have some surprisingly good docs for a project as young as VoltDB - a credit to Andrew, our resident author. Checkout the getting started guide. Or, perhaps one of the simple examples in the distribution kit. Hopefully these resources will help understand how VoltDB works without too much difficulty.
http://www.voltdb.com/community/downloads.php
If you read these, and have suggestions on how to better or more concisely explain the nature of VoltDB, feedback is always appreciated.
Thank you,
Ryan.
Thanks for the reply, and I
itistoday
Jun 2, 2010
SQL injection is not possible with VoltDB, at least to the extent that I can imagine possible attack vectors.
All VoltDB SQL is planned/processed/compiled when the VoltDB administrator prepares an application for deployment to a VoltDB cluster. It is not possible for a client or end-user to supply SQL to be interpreted by the VoltDB engine. In VoltDB, all SQL is contained in the parametrized stored procedures that compose a VoltDB application. Parameters to stored procedures are strongly typed.
If you are used to using a predictable set of prepared statements (that is, if you prepare the same set of statements each time you start your application), then you might find VoltDB's approach pretty natural.
If you have time, check out some of the VoltDB documentation. We have some surprisingly good docs for a project as young as VoltDB - a credit to Andrew, our resident author. Checkout the getting started guide. Or, perhaps one of the simple examples in the distribution kit. Hopefully these resources will help understand how VoltDB works without too much difficulty.
http://community.voltdb.com/downloads
If you read these, and have suggestions on how to better or more concisely explain the nature of VoltDB, feedback is always appreciated.
Thank you,
Ryan.


Thanks for the reply, and I will definitely check that documentation out, this just jumped out at me and I unfortunately am too swamped right now, I just wanted to make sure.
Thanks again!