
Thread: Benchmark: MULTI HOSTS IS SLOWER THAN SINGLE HOST??

  1. #1
    New Member
    Join Date
    Feb 2014
    Posts
    7

    Benchmark: MULTI HOSTS IS SLOWER THAN SINGLE HOST??

    Hi all,
    I am new to VoltDB.

    I wrote a new Java app, modeled on the helloworld example, that inserts data into a table and then looks up data in that table.
    - Table: SUB_INFOR (SUB_ID, ISDN, PRODUCT_CODE, STATUS), partitioned by ISDN. The table has 100,000 rows.
    - A text file with 100,000 ISDNs; for each ISDN, the app looks up the additional information in the table above.

    With a single host A, the run takes ~30 seconds.
    With 3 hosts (same hardware as A), the run takes ~100 seconds.

    I tried both single-partition and multi-partition setups for table SUB_INFOR; the result is the same :(

    Deployment file:
    <?xml version="1.0"?>
    <deployment>
    <cluster hostcount="3" sitesperhost="2" kfactor="0" />
    <httpd enabled="true">
    <jsonapi enabled="true" />
    </httpd>
    </deployment>


    Can anyone explain why??

    Thanks in advance!

  2. #2
    Super Moderator
    Join Date
    Feb 2010
    Posts
    95
    Hi Leobon,

    It looks like the procedure you are using may not have been partitioned. Were you using a stored procedure or ad hoc queries? Can you share the queries?

    What exactly were you measuring in the 30-second and 100-second durations?

    Thanks,
    Ning

  3. #3
    New Member
    Join Date
    Feb 2014
    Posts
    7
    Hi nshi,
    Here is my DDL:
    CREATE TABLE SUB_INFOR (
    SUB_ID VARCHAR(10),
    ISDN VARCHAR(15) NOT NULL,
    PRODUCT_CODE VARCHAR(20) NOT NULL,
    STATUS VARCHAR(1),
    PRIMARY KEY (ISDN)
    );

    PARTITION TABLE SUB_INFOR ON COLUMN ISDN;

    CREATE PROCEDURE FROM CLASS voltdbapptest.procedures.Insert;
    CREATE PROCEDURE FROM CLASS voltdbapptest.procedures.Select;
    CREATE PROCEDURE FROM CLASS voltdbapptest.procedures.Delete;

    PARTITION PROCEDURE Insert ON TABLE SUB_INFOR COLUMN ISDN;
    PARTITION PROCEDURE Select ON TABLE SUB_INFOR COLUMN ISDN;


    I used procedures, eg: response = client.callProcedure("Select", isdn);

    Here are my queries:
    INSERT INTO SUB_INFOR VALUES (?, ?, ?, ?);
    SELECT SUB_ID, PRODUCT_CODE, STATUS FROM SUB_INFOR WHERE ISDN = ?;


    The time is measured from just before the first VoltDB call to just after the last one (it includes the time to read from and write to the file, but that time is not the problem).
    Last edited by leobon; 02-17-2014 at 09:46 PM.

  4. #4
    Super Moderator
    Join Date
    Feb 2010
    Posts
    95
    Hi leobon,

    The procedures and tables are properly partitioned, so they should be fine.

    Originally posted by leobon:
    I used procedures, eg: response = client.callProcedure("Select", isdn);
    You were calling the procedure synchronously, which means that the call blocks until the response is received. If your client has only one thread calling Select, this essentially limits the throughput to one call at a time, and the database is idle most of the time. For example, if the latency of one Select call is 1 ms, then you can make at most 1 second / 1 ms = 1000 calls per second.

    With multiple hosts, some percentage of your requests may be rerouted to different hosts. This adds another network round-trip to each rerouted request, which increases the latency a little. I think that is why your client got slower with multiple hosts.

    There are two things you can do to improve the performance dramatically,
    1. use asynchronous invocation or multiple threads on the client, e.g. http://voltdb.com/docs/PerfGuide/Hello2Async.php
    2. connect to all hosts using the same client instance. This will route requests to the proper hosts, saving a network round-trip.
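
The arithmetic behind the 1000-calls-per-second ceiling can be sketched as a small calculation. This is a toy model rather than VoltDB code: the class and method names are made up for illustration, and it assumes every call takes the same fixed latency and each client thread has exactly one call outstanding at a time.

```java
// Toy model (hypothetical names, not part of the VoltDB client API):
// an upper bound on throughput when every client thread issues
// blocking calls one at a time.
final class SyncThroughputModel {

    // Each thread completes at most (1000 / latencyMs) calls per second,
    // so the ceiling scales with the number of calling threads.
    static double maxCallsPerSecond(int threads, double latencyMs) {
        if (threads <= 0 || latencyMs <= 0) {
            throw new IllegalArgumentException("threads and latencyMs must be positive");
        }
        return threads * (1000.0 / latencyMs);
    }

    // Lower bound on the wall-clock time needed to issue `calls` blocking calls.
    static double minSecondsFor(long calls, int threads, double latencyMs) {
        return calls / maxCallsPerSecond(threads, latencyMs);
    }

    public static void main(String[] args) {
        // One thread, 1 ms per call (the example latency from the post above).
        System.out.println("calls/s ceiling: " + maxCallsPerSecond(1, 1.0));
        System.out.println("seconds for 100,000 calls: " + minSecondsFor(100_000, 1, 1.0));
    }
}
```

Under these assumptions the ceiling depends only on the per-call latency and the number of calling threads, not on how busy the database is, which is why a small increase in latency shows up directly in the total run time of a single-threaded synchronous client.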
    Ning

  5. #5
    New Member
    Join Date
    Feb 2014
    Posts
    7
    Thanks for your response, nshi!

    Originally posted by nshi:
    There are two things you can do to improve the performance dramatically,
    1. use asynchronous invocation or multiple threads on the client, e.g. http://voltdb.com/docs/PerfGuide/Hello2Async.php
    2. connect to all hosts using the same client instance. This will route requests to the proper hosts, saving a network round-trip.
    For:
    1. I'll try
    2. What do you mean "connect to all hosts using the same client instance"??

  6. #6
    Super Moderator
    Join Date
    Feb 2010
    Posts
    95
    Originally posted by leobon:
    2. What do you mean "connect to all hosts using the same client instance"??
    That means calling createConnection multiple times on the same client instance, once per host, to connect to all the hosts in your cluster. You can see an example at https://voltdb.com/docs/PerfGuide/Hello2Connect.php.
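
To see why connecting to every host matters, here is a rough model of the extra hop. It is an illustration under stated assumptions, not a description of VoltDB internals: it assumes keys are spread evenly across hosts, that a client connected to a single host sends every call there, and that a client connected to all hosts always reaches the owning host directly. The class name, method names, and latency figures are hypothetical.

```java
// Toy model (hypothetical, not VoltDB code) of the extra network hop
// paid when a request lands on a host that does not own its partition.
final class RoutingModel {

    // Fraction of single-partition calls that need to be forwarded.
    // Assumes keys are spread evenly over `hosts` hosts.
    static double reroutedFraction(int hosts, boolean connectedToAllHosts) {
        if (hosts <= 0) {
            throw new IllegalArgumentException("hosts must be positive");
        }
        if (connectedToAllHosts) {
            return 0.0; // the client can send each call straight to the owner
        }
        return (hosts - 1) / (double) hosts;
    }

    // Average per-call latency: the base round trip plus the forwarding hop
    // for the share of calls that are rerouted.
    static double expectedLatencyMs(double baseMs, double hopMs, double reroutedFraction) {
        return baseMs + reroutedFraction * hopMs;
    }

    public static void main(String[] args) {
        double baseMs = 0.3; // assumed round trip to the first host
        double hopMs = 0.6;  // assumed cost of one forwarding hop
        double single = reroutedFraction(3, false);
        double all = reroutedFraction(3, true);
        System.out.println("one connection:  " + expectedLatencyMs(baseMs, hopMs, single) + " ms");
        System.out.println("all connections: " + expectedLatencyMs(baseMs, hopMs, all) + " ms");
    }
}
```

In this model a single-host cluster never reroutes, so the penalty only appears once the cluster grows, which is consistent with the slowdown reported at the top of the thread.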
    Ning

  7. #7
    New Member
    Join Date
    Feb 2014
    Posts
    7
    Hi nshi,
    I tried the two approaches you recommended.
    The results are good.
    Thanks!

  8. #8
    New Member
    Join Date
    Mar 2014
    Posts
    6
    Hello,

    This thread was really helpful to me.
    But my question is: why is asynchronous so much faster than synchronous even with one server? I get 12,199 transactions/s with async and 2,325 transactions/s with sync. It's a simple Java program that imports ~900,000 data items. In the asynchronous program I also wait until the client has received the last callback, because otherwise some data is lost. But I don't understand the large time difference.
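
As a back-of-the-envelope check on these figures, the sketch below takes the two rates as 12,199 and 2,325 transactions per second and works out what they imply. The class and method names are made up for illustration, and the last step assumes the per-call latency is the same in both runs, which may not hold in practice.

```java
// Back-of-the-envelope arithmetic (hypothetical helper names) on the
// sync and async rates quoted in the post above.
final class SpeedupEstimate {

    // Average time per blocking call implied by a single-threaded sync rate.
    static double impliedLatencyMs(double syncTps) {
        return 1000.0 / syncTps;
    }

    // How many times faster the async run was.
    static double speedup(double asyncTps, double syncTps) {
        return asyncTps / syncTps;
    }

    // Little's law: average requests in flight = throughput * latency.
    static double averageInFlight(double tps, double latencyMs) {
        return tps * latencyMs / 1000.0;
    }

    public static void main(String[] args) {
        double syncTps = 2325.0;
        double asyncTps = 12199.0;
        double latencyMs = impliedLatencyMs(syncTps);
        System.out.println("implied sync latency (ms): " + latencyMs);
        System.out.println("async speedup: " + speedup(asyncTps, syncTps));
        // Assumes latency is unchanged between the sync and async runs.
        System.out.println("avg requests in flight (async): " + averageInFlight(asyncTps, latencyMs));
    }
}
```

A single-threaded synchronous client at 2,325 transactions per second is spending roughly 0.43 ms per round trip, so the measured difference is about a factor of 5.2.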

    Thanks,
    Sabrina

  9. #9
    Super Moderator
    Join Date
    Dec 2011
    Posts
    105
    Hi Sabrina,

    There is an explanation of Synchronous vs. Asynchronous procedure calls in the Performance Guide here: https://voltdb.com/docs/PerfGuide/Hello2Async.php

    Take a look especially at Figures 2.1 and 2.2. Synchronous calls are blocking, so there is a complete round trip before the next call is sent (unless you have a multi-threaded client, but even then each thread waits, and it can take a lot of threads to generate a continuous high-velocity stream of requests to the database). Asynchronous calls allow a single-threaded client to do just that: to continuously send requests and continuously receive responses on the callback thread.
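
The difference between the two calling styles can be reproduced without a database. The sketch below uses only the Java standard library: a scheduled executor stands in for the server and completes every request a fixed 5 ms after it is issued. None of this is the VoltDB client API; `PipeliningDemo`, `call`, `runSync`, `runAsync`, and the 5 ms latency are all assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Stand-in for a remote database (hypothetical, not the VoltDB API):
// every call completes LATENCY_MS after it is issued.
final class PipeliningDemo {
    static final long LATENCY_MS = 5;

    static CompletableFuture<Integer> call(ScheduledExecutorService server, int key) {
        CompletableFuture<Integer> response = new CompletableFuture<>();
        server.schedule(() -> { response.complete(key); }, LATENCY_MS, TimeUnit.MILLISECONDS);
        return response;
    }

    // Blocking style: wait for each response before sending the next request.
    static long runSync(ScheduledExecutorService server, int calls) {
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            call(server, i).join();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Pipelined style: send every request, then wait for the outstanding responses.
    static long runAsync(ScheduledExecutorService server, int calls) {
        long start = System.nanoTime();
        List<CompletableFuture<Integer>> pending = new ArrayList<>();
        for (int i = 0; i < calls; i++) {
            pending.add(call(server, i));
        }
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        ScheduledExecutorService server = Executors.newScheduledThreadPool(4);
        try {
            System.out.println("sync  ms: " + runSync(server, 50));
            System.out.println("async ms: " + runAsync(server, 50));
        } finally {
            server.shutdown();
        }
    }
}
```

In the blocking loop the latencies add up, so 50 calls take at least 50 × 5 ms. In the pipelined version the waits overlap, so the total is close to a single latency plus overhead. A real client would also need to cap the number of outstanding requests, which this sketch does not attempt.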

    Ben

  10. #10
    New Member
    Join Date
    Mar 2014
    Posts
    6
    Hello Ben,

    Thanks for your reply. Your explanation is clear, but when I am using 1 server with 3 partitions, I can only be 3 times faster with async, right? But in my case asynchronous is 6 times faster than synchronous ;) It is not clear to me why.

    Can you give me a detailed explanation?

    Thanks,
    Sabrina
