Announcement

Collapse
No announcement yet.

How does partitioning work

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Nitika
    started a topic How does partitioning work

    How does partitioning work

    How does voltdb find out that a transaction is for a particular partition?

  • guotong1988
    replied
    Originally posted by tcallaghan View Post
    Amit,


    There is a good description of single-partition and multi-partition stored procedures in our online documentation at DesignProc#designappsinglesitesp


    Your SQL will determine if a stored procedure can be single-partition. When SELECTing from a partitioned table, if your SQL doesn't have an exact equality check in the WHERE clause ("... where partition_column = ? ...") VoltDB must check all partitions in the system. Therefore the '>' and '<' operators on partitioned tables will result in multi-partition stored procedures.


    -Tim

    So all the other SQL may be not that fast in large concurrent situation ? for example :group by ? order by ? join ?

    Thank you .

    Leave a comment:


  • tcallaghan
    replied
    re: How does partitioning work

    Originally posted by Nitika View Post
    1. Understood that data will be distributed in partitions for all inserts if there in procedure.
    2. Now queries (Select statements) in Java procedure with '=' in where clause for Partition key can also select the particular partition.

    Doubts
    ------------
    1. How the partition selection decision is done for queries which use operaters like '>' or '<'. For '>' the data can span across partitions, so will the query has to span across multiple partitions?

    Amit Kapila
    Amit,


    There is a good description of single-partition and multi-partition stored procedures in our online documentation at DesignProc#designappsinglesitesp


    Your SQL will determine if a stored procedure can be single-partition. When SELECTing from a partitioned table, if your SQL doesn't have an exact equality check in the WHERE clause ("... where partition_column = ? ...") VoltDB must check all partitions in the system. Therefore the '>' and '<' operators on partitioned tables will result in multi-partition stored procedures.


    -Tim

    Leave a comment:


  • Nitika
    replied
    functioning of '&gt;' operator

    Originally posted by rbetts View Post
    Currently, the mapping function is a simple hash.


    You can find the code here:
    https://source.voltdb.com/browse/Eng...r.java?hb=true


    ==
    EDIT: We've since switched to GitHub. New link: https://github.com/VoltDB/voltdb/blo...ashinator.java
    1. Understood that data will be distributed in partitions for all inserts if there in procedure.
    2. Now queries (Select statements) in Java procedure with '=' in where clause for Partition key can also select the particular partition.

    Doubts
    ------------
    1. How the partition selection decision is done for queries which use operaters like '>' or '<'. For '>' the data can span across partitions, so will the query has to span across multiple partitions?

    Amit Kapila

    Leave a comment:


  • rbetts
    replied
    Sorry, I misunderstood your initial question.

    Originally posted by Nitika View Post
    I understood that there is function to which value of partitioning parameter is passed and you get the partition on which stored procedure needs to run.


    How does this function calculate?
    1. Is there a system table which maintains the lower and upper bounds of each partition tables?
    2. If yes, how these bounds are decided, does it get decided when the table gets created?
    Currently, the mapping function is a simple hash.


    You can find the code here:
    https://source.voltdb.com/browse/Eng...r.java?hb=true


    ==
    EDIT: We've since switched to GitHub. New link: https://github.com/VoltDB/voltdb/blo...ashinator.java

    Leave a comment:


  • Nitika
    replied
    Getting into more internals

    I understood that there is function to which value of partitioning parameter is passed and you get the partition on which stored procedure needs to run.


    How does this function calculate?
    1. Is there a system table which maintains the lower and upper bounds of each partition tables?
    2. If yes, how these bounds are decided, does it get decided when the table gets created?

    Leave a comment:


  • rbetts
    replied
    Dispatch of single partition procedures to partitions

    Originally posted by Nitika View Post
    Ryan, Thanks for your reply.
    This explains how application can instruct that stored procedure is for single partition.
    What I want to know more is that incase a table is partitioned into 10 different partitions, now when we call a procedure with particular value
    how it knows that it has to go which partition out of the 10 partitions.
    1. Does SQL Engine evaluates at execution time or it is the job of optimizer?
    2. How this logic works?
    Going back to my summary example:


    If your database has a "flights" table partitioned on flights.flightId, your procedure might require a flightId parameter. You would identify flightId as the procedure's partitioning parameter. VoltDB would then route execution of an invocation of this procedure to the partition corresponding to the supplied flightId value.


    The procedure identifies which of its parameters is the partitioning function input. In this example, let's say flightId is the first parameter to the stored procedure. F(flightId) = partition_id.


    When the procedure invocation arrives at the cluster, the cluster extracts the first parameter (flightId) from the stored procedure invocation. It calculates F(flightId) and routes the procedure to the replicas of the resulting partition_id.


    Is that clearer?
    Ryan.

    Leave a comment:


  • Nitika
    replied
    choice of partition

    Ryan, Thanks for your reply.
    This explains how application can instruct that stored procedure is for single partition.
    What I want to know more is that incase a table is partitioned into 10 different partitions, now when we call a procedure with particular value
    how it knows that it has to go which partition out of the 10 partitions.
    1. Does SQL Engine evaluates at execution time or it is the job of optimizer?
    2. How this logic works?

    Leave a comment:


  • rbetts
    replied
    Single partition procedures

    Nitika,


    Please see:


    DesignProc#designappsinglesitesp


    There is some additional information on writing procedures here:


    UsingVoltDB/DesignProc


    In brief, when you define the procedure, you include metadata that identifies a partitioning parameter for the procedure. For example, if your database has a "flights" table partitioned on flights.flightId, your procedure might require a flightId parameter. You would identify flightId as the procedure's partitioning parameter. VoltDB would then route execution of an invocation of this procedure to the partition corresponding to the supplied flightId value.


    Thank you,
    Ryan

    Leave a comment:

Working...
X