Results 1 to 6 of 6

Thread: Stored procedure cross-partition results

  1. #1
    New Member
    Join Date
    Jan 2014
    Posts
    7

    Stored procedure cross-partition results

    Hi,
    I have a problem getting correct data from a Stored Procedure, using VoltDB 4.0 0-gab8313d-local on 64-bit Linux.
    I can't give you the real VoltDB configs because they can't be published, but they boil down to the following.
    The relevant table is:

    CREATE TABLE TABLE1 (
    DEVICE VARCHAR(23) NOT NULL,
    USER VARCHAR(23) NOT NULL);

    The table is configured with:
    <partition table="TABLE1" column="DEVICE"/>

    I then have a stored procedure which executes the following query:

    SELECT COUNT(DEVICE) FROM TABLE1 WHERE DEVICE=? AND USER=?
    having taken 2 String params.
    The procedure is declared with:
    PARTITION PROCEDURE MyProcedure ON TABLE TABLE1 COLUMN DEVICE;

    I then add some data:
    INSERT INTO TABLE1 VALUES('d1','u1');
    INSERT INTO TABLE1 VALUES('d2','u1');

    When I put this in a partitioned VoltDB cluster of 3 servers with K-Safety=0 and run the stored procedures as follows:
    exec MyProcedure 'd1' 'u1';
    exec MyProcedure 'd2' 'u1';
    I get a result of 0 for one and 1 for the other.

    Each of those rows has been put into a different partition (verified with "exec @Statistics TABLE 0"), which is fine, but I don't understand why I get a different result, given the query that it's running.
    If I change the data to force both rows to be in the same partition, I get result of 1 for both, which is what I would expect, but in a cluster it seems to behave differently.

    Sorry I can't give you the real data but does this suggest anything to anyone? Am I missing some subtlety about querying over partitioned databases?
    Thanks in advance,
    Paul

  2. #2
    Super Moderator
    Join Date
    Apr 2014
    Posts
    37
    Paul,

    Thanks for your question. I've engaged an engineer for some help on this issue. We are investigating and will get back to you shortly.
    As for 'are you missing anything?', you are definitely not.

    -Peter Zhao

  3. #3
    Super Moderator
    Join Date
    Feb 2012
    Posts
    42
    I am having trouble reproducing this.
    Here's the DDL-centered try I made:

    CREATE TABLE table1 (
    DEVICE VARCHAR(23) NOT NULL,
    USERNAME VARCHAR(23) NOT NULL,
    );

    PARTITION TABLE table1 ON COLUMN DEVICE;


    CREATE PROCEDURE DO AS SELECT COUNT(DEVICE) FROM table1 WHERE DEVICE=? and USERNAME=?;

    PARTITION PROCEDURE DO ON TABLE table1 COLUMN DEVICE;

    I got a volt compile error when I tried to use the SQL function name USER as a column name.
    Does the behavior correct itself when you rename the USER column to avoid the SQL keyword?
    Here's what I tried in sqlcmd:

    sqlcmd
    SQL Command :: localhost:21212
    1> insert into table1 values ('d1', 'u1');


    (1 row(s) affected)
    2> insert into table1 values ('d2', 'u1');


    (1 row(s) affected)
    3> exec DO 'd2' 'u1';
    C1
    ---
    1



    (1 row(s) affected)
    4> exec DO 'd1' 'u1';
    C1
    ---
    1



    (1 row(s) affected)

  4. #4
    New Member
    Join Date
    Jan 2014
    Posts
    7
    Hi
    thanks for your replies. The "USER" column name is not actually what I had in my schema, I was just trying to avoid posting the real DDL.
    Did you try to run this in a cluster, and confirm that the 2 rows were in different partitions?
    Also I notice you created the SP using "CREATE PROCEDURE AS..." - mine is created as a Java class and precompiled as part of a catalogue JAR...
    Sorry, I didn't mention that before, I don't know if it makes a difference to how SPs work. The source code for it would be:

    Code:
    [...]
    public class MyProcedure extends VoltProcedure {
    
        public final SQLStmt selectSQL = new SQLStmt("SELECT COUNT(DEVICE) FROM TABLE1 WHERE DEVICE=? AND USERNAME=?");
        public VoltTable[] run(String user, String device) throws VoltAbortException {
            voltQueueSQL(selectSQL,device,user);
            VoltTable[] selectedData = voltExecuteSQL(true);
            return selectedData;
        }
    }
    One thing I've just realized is that my params to the run() method are "user,device", but the SELECT params are "DEVICE" then "USERNAME".
    Could I be running into the note in section 3.24 "Partitioning Stored Procedures" on this page which states that
    "the PARTITION PROCEDURE statement assumes that the partitioning column value is the first parameter to the procedure"
    "DEVICE" is my partitioning column. I will rearrange my params to adhere to that. I had assumed this refers to the ordering of params to the SQL but maybe that's a wrong assumption.

    Any thoughts? Are there any Java SP configurations, hints or annotations I should be adding?
    Best regards
    Paul
    Last edited by pmilner; 06-17-2014 at 03:18 AM.

  5. #5
    New Member
    Join Date
    Jan 2014
    Posts
    7
    I've just rejigged things to take account of the thing mentioned above - that the partitioning column needs to be the FIRST param to the stored procedure (unless you specifically say otherwise in the PARTITION PROCEDURE statement). The problem has now gone away. Thanks repliers for highlighting this area and prompting a review of the Java stored procedure.

  6. #6
    Super Moderator
    Join Date
    Apr 2014
    Posts
    37
    Paul,

    Thats great! Just to clarify, the default partitioning column value is the first parameter. However, it does not need to be the first parameter.

    So in your original code,
    ...
    public VoltTable[] run(String user, String device) throws VoltAbortException {
    ...
    the partitioning column is user. Your PARTITION PROCEDURE statement in your DDL should look like the following (as taken from other Paul's code).
    PARTITION PROCEDURE DO ON TABLE table1 COLUMN DEVICE;
    OR
    PARTITION PROCEDURE DO ON TABLE table1 COLUMN DEVICE PARAMETER 0;
    In order to define the partitioning column other then the first parameter in your stored procedure,
    PARTITION PROCEDURE DO ON TABLE table1 COLUMN DEVICE PARAMETER 1;
    will allow for your original code to work properly. Just a reminder, the index used in the PARTITION PROCEDURE is zero-based index.

    If you need further assistance, please let us know.

    -Peter Zhao

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •