Announcement

Collapse
No announcement yet.

How does partitioning work

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How does partitioning work

    How does voltdb find out that a transaction is for a particular partition?

  • #2
    Single partition procedures

    Nitika,


    Please see:


    DesignProc#designappsinglesitesp


    There is some additional information on writing procedures here:


    UsingVoltDB/DesignProc


    In brief, when you define the procedure, you include metadata that identifies a partitioning parameter for the procedure. For example, if your database has a "flights" table partitioned on flights.flightId, your procedure might require a flightId parameter. You would identify flightId as the procedure's partitioning parameter. VoltDB would then route execution of an invocation of this procedure to the partition corresponding to the supplied flightId value.


    Thank you,
    Ryan

    Comment


    • #3
      choice of partition

      Ryan, Thanks for your reply.
      This explains how application can instruct that stored procedure is for single partition.
      What I want to know more is that incase a table is partitioned into 10 different partitions, now when we call a procedure with particular value
      how it knows that it has to go which partition out of the 10 partitions.
      1. Does SQL Engine evaluates at execution time or it is the job of optimizer?
      2. How this logic works?

      Comment


      • #4
        Dispatch of single partition procedures to partitions

        Originally posted by Nitika View Post
        Ryan, Thanks for your reply.
        This explains how application can instruct that stored procedure is for single partition.
        What I want to know more is that incase a table is partitioned into 10 different partitions, now when we call a procedure with particular value
        how it knows that it has to go which partition out of the 10 partitions.
        1. Does SQL Engine evaluates at execution time or it is the job of optimizer?
        2. How this logic works?
        Going back to my summary example:


        If your database has a "flights" table partitioned on flights.flightId, your procedure might require a flightId parameter. You would identify flightId as the procedure's partitioning parameter. VoltDB would then route execution of an invocation of this procedure to the partition corresponding to the supplied flightId value.


        The procedure identifies which of its parameters is the partitioning function input. In this example, let's say flightId is the first parameter to the stored procedure. F(flightId) = partition_id.


        When the procedure invocation arrives at the cluster, the cluster extracts the first parameter (flightId) from the stored procedure invocation. It calculates F(flightId) and routes the procedure to the replicas of the resulting partition_id.


        Is that clearer?
        Ryan.

        Comment


        • #5
          Getting into more internals

          I understood that there is function to which value of partitioning parameter is passed and you get the partition on which stored procedure needs to run.


          How does this function calculate?
          1. Is there a system table which maintains the lower and upper bounds of each partition tables?
          2. If yes, how these bounds are decided, does it get decided when the table gets created?

          Comment


          • #6
            Sorry, I misunderstood your initial question.

            Originally posted by Nitika View Post
            I understood that there is function to which value of partitioning parameter is passed and you get the partition on which stored procedure needs to run.


            How does this function calculate?
            1. Is there a system table which maintains the lower and upper bounds of each partition tables?
            2. If yes, how these bounds are decided, does it get decided when the table gets created?
            Currently, the mapping function is a simple hash.


            You can find the code here:
            https://source.voltdb.com/browse/Eng...r.java?hb=true


            ==
            EDIT: We've since switched to GitHub. New link: https://github.com/VoltDB/voltdb/blo...ashinator.java

            Comment


            • #7
              functioning of '>' operator

              Originally posted by rbetts View Post
              Currently, the mapping function is a simple hash.


              You can find the code here:
              https://source.voltdb.com/browse/Eng...r.java?hb=true


              ==
              EDIT: We've since switched to GitHub. New link: https://github.com/VoltDB/voltdb/blo...ashinator.java
              1. Understood that data will be distributed in partitions for all inserts if there in procedure.
              2. Now queries (Select statements) in Java procedure with '=' in where clause for Partition key can also select the particular partition.

              Doubts
              ------------
              1. How the partition selection decision is done for queries which use operaters like '>' or '<'. For '>' the data can span across partitions, so will the query has to span across multiple partitions?

              Amit Kapila

              Comment


              • #8
                re: How does partitioning work

                Originally posted by Nitika View Post
                1. Understood that data will be distributed in partitions for all inserts if there in procedure.
                2. Now queries (Select statements) in Java procedure with '=' in where clause for Partition key can also select the particular partition.

                Doubts
                ------------
                1. How the partition selection decision is done for queries which use operaters like '>' or '<'. For '>' the data can span across partitions, so will the query has to span across multiple partitions?

                Amit Kapila
                Amit,


                There is a good description of single-partition and multi-partition stored procedures in our online documentation at DesignProc#designappsinglesitesp


                Your SQL will determine if a stored procedure can be single-partition. When SELECTing from a partitioned table, if your SQL doesn't have an exact equality check in the WHERE clause ("... where partition_column = ? ...") VoltDB must check all partitions in the system. Therefore the '>' and '<' operators on partitioned tables will result in multi-partition stored procedures.


                -Tim

                Comment


                • #9
                  Originally posted by tcallaghan View Post
                  Amit,


                  There is a good description of single-partition and multi-partition stored procedures in our online documentation at DesignProc#designappsinglesitesp


                  Your SQL will determine if a stored procedure can be single-partition. When SELECTing from a partitioned table, if your SQL doesn't have an exact equality check in the WHERE clause ("... where partition_column = ? ...") VoltDB must check all partitions in the system. Therefore the '>' and '<' operators on partitioned tables will result in multi-partition stored procedures.


                  -Tim

                  So all the other SQL may be not that fast in large concurrent situation ? for example :group by ? order by ? join ?

                  Thank you .

                  Comment

                  Working...
                  X