Forum: Building VoltDB Applications

Post: Problem with voltdb response time for Client

Oleg
Jan 28, 2013
We are developing a pilot of a high-performance system on a VoltDB database, and we have a problem with VoltDB performance.
A task gets an object from a queue and sends it to VoltDB.
First we generate an ID for the object, as recommended in this post: http://www.simonbuckle.com/2012/04/29/simulating-auto-increment-in-voltdb/.
Then we insert the object into VoltDB with the generated ID.
We print the response time.
When the task starts, performance is very good.
Response times in msec:

37 - db ID gen: 0
37 - db ID gen: 0
37 - db ID gen: 1

but some time later we see long response times.
For example (in msec):

37 - db ID gen: 1910
37 - db ID gen: 1890
37 - db ID gen: 1860

Sometimes the response time is longer than 10 or 20 seconds.
During these periods we also can't get a response from VoltDB in other applications (Aqua Data Studio, for example), even for a simple SQL query like "select count(*) from"

We tried both sync and async client calls, and the result is the same.
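
The two call styles we tried look roughly like this (a simplified sketch; the host is a placeholder, and GenerateUniqueIdentifier is one of our procedures):

import org.voltdb.client.Client;
import org.voltdb.client.ClientFactory;
import org.voltdb.client.ClientResponse;
import org.voltdb.client.ProcedureCallback;

// Simplified sketch of the two call styles we tried; host is a placeholder.
public class CallStyles {
    public static void main(String[] args) throws Exception {
        Client client = ClientFactory.createClient();
        client.createConnection("localhost");

        // Synchronous: blocks until the response arrives, so the elapsed
        // time around the call is the response time we print.
        long start = System.currentTimeMillis();
        client.callProcedure("GenerateUniqueIdentifier", "INVOICE");
        System.out.println("db ID gen: " + (System.currentTimeMillis() - start));

        // Asynchronous: returns immediately; we measure in the callback.
        final long asyncStart = System.currentTimeMillis();
        client.callProcedure(new ProcedureCallback() {
            public void clientCallback(ClientResponse response) {
                System.out.println("db ID gen: "
                        + (System.currentTimeMillis() - asyncStart));
            }
        }, "GenerateUniqueIdentifier", "INVOICE");

        client.drain();
        client.close();
    }
}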

Our VoltDB configuration:
OS: RedHat 6

VoltDB Version 3.0
Buildstring: voltdb-3.0-0-g77209f0
Running on a cluster of 1 hosts with 8 sites (8 per host).
System has 64392,0 mb of memory.
Using 4,2% of system memory for VoltDB
20,5% of system memory has been reserved for VoltDB
Rss: 2722,5mb
Java Max Heap 14563,6mb
Java Used: 777,7mb
Java Claimed: 4101,8mb
rbetts
Jan 28, 2013
Oleg,

Can you share your DDL, deployment, and project files? I'm curious whether you are using partitioned tables. Also, you can verify that your queries are planned as you expect by using the @Explain system procedure, as described here: http://voltdb.com/docs/UsingVoltDB/sysprocexplain.php
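
For example, a quick check like this sketch should print the plan (the host and the query text are placeholders):

import org.voltdb.client.Client;
import org.voltdb.client.ClientFactory;
import org.voltdb.client.ClientResponse;

// Sketch: ask the planner to explain an ad-hoc query. The plan comes back
// as text in the first result table.
public class ExplainQuery {
    public static void main(String[] args) throws Exception {
        Client client = ClientFactory.createClient();
        client.createConnection("localhost");
        ClientResponse response = client.callProcedure("@Explain",
                "SELECT COUNT(*) FROM INVOICE;");
        System.out.println(response.getResults()[0]);
        client.close();
    }
}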

I read through that example - it appears correct at first glance.

Ryan.
Oleg
Jan 28, 2013
Ryan,
Thank you!

I've shared my DDL, deployment, and project files below.
The main table has about 56 columns now, and it has two related tables. I'm not sure this is best practice.
deployment.xml

<?xml version="1.0"?>
<deployment>
    <cluster hostcount="1" sitesperhost="8" />
    <httpd enabled="true">
        <jsonapi enabled="true"/>
    </httpd>
</deployment>


project.xml

<?xml version="1.0"?>
<project>
    <database>
        <schemas>
            <schema path='esf.sql'/>
        </schemas>
        <procedures>
            <procedure class='ru.uss.esf.voltdb.GenerateUniqueIdentifier'/>
            <procedure class='ru.uss.esf.voltdb.insert.InsertInvoice'/>
            <procedure class='ru.uss.esf.voltdb.insert.InsertService'/>
            <procedure class='ru.uss.esf.voltdb.insert.InsertParticipant'/>
            <procedure class='ru.uss.esf.voltdb.select.SelectInvoice'/>
            <procedure class='ru.uss.esf.voltdb.select.SelectService'/>
            <procedure class='ru.uss.esf.voltdb.select.SelectParticipant'/>
        </procedures>
    </database>
</project>


DDL

CREATE TABLE INVOICE (
   ID BIGINT DEFAULT 0 NOT NULL,
   NUM VARCHAR(200),
   DATE VARCHAR(200),
   TURN_DATE VARCHAR(200),
   CORRECT_NUM VARCHAR(200),
   CORRECT_DATE VARCHAR(200),
   CORRECT_COUNT VARCHAR(200),
   ADD_NUM VARCHAR(200),
   ADD_DATE VARCHAR(200),
   SEL_NAME VARCHAR(200),
   SEL_STATUS VARCHAR(200),
   SEL_TIN VARCHAR(200),
   SEL_ADDRESS VARCHAR(1000),
   SEL_CERT_SERIES VARCHAR(200),
   SEL_CERT_NUM VARCHAR(200),
   SEL_SHIP_DOC_NUM VARCHAR(200),
   SEL_SHIP_DOC_DATE VARCHAR(200),
   SEL_TRAILER VARCHAR(1000),
   PAY_KBE VARCHAR(200),
   PAY_BIK VARCHAR(200),
   PAY_IIK VARCHAR(200),
   PAY_BANK VARCHAR(200),
   DEL_CONT_NUM VARCHAR(200),
   DEL_CONT_DATE VARCHAR(200),
   DEL_TERM VARCHAR(200),
   DEL_DEST VARCHAR(200),
   DEL_WARRANT VARCHAR(200),
   DEL_EXC_WAY VARCHAR(200),
   DEL_DOC_NUM VARCHAR(200),
   DEL_DOC_DATE VARCHAR(200),
   CONSR_NAME VARCHAR(200),
   CONSR_TIN VARCHAR(200),
   CONSR_ADDRESS VARCHAR(1000),
   CONSE_NAME VARCHAR(200),
   CONSE_TIN VARCHAR(200),
   CONSE_ADDRESS VARCHAR(1000),
   CUST_NAME VARCHAR(200),
   CUST_STATUS VARCHAR(200),
   CUST_TIN VARCHAR(200),
   CUST_ADDRESS VARCHAR(1000),
   CUST_TRAILER VARCHAR(1000),
   PO_POCODE VARCHAR(200),
   PO_CODE VARCHAR(200),
   PO_SUBDIV VARCHAR(200),
   PO_BIK VARCHAR(200),
   PO_IIK VARCHAR(200),
   PO_FUND_SOURCE VARCHAR(200),
   PO_BUDG_TYPE VARCHAR(200),
   PO_KBK_CHARGE VARCHAR(200),
   PO_GOODS_CODE VARCHAR(200),
   PO_PAY_STATUS VARCHAR(200),
   PO_PAY_PURPOSE VARCHAR(1000),
   PO_KNP VARCHAR(200),
   PO_KBK_INPAY VARCHAR(200),
   ADD_INF VARCHAR(1000),
   APPROVED SMALLINT,
   INPUTDATE TIMESTAMP,
   PRIMARY KEY (ID)
);

CREATE TABLE SERVICE (
   INVOICE_ID BIGINT NOT NULL,
   DESCRIPTION VARCHAR(200),
   UNIT_CODE VARCHAR(200),
   UNIT_NUMENC VARCHAR(200),
   QUANTITY VARCHAR(200),
   UNITPRICE VARCHAR(200),
   PRICE_NO_TAX VARCHAR(200),
   EXCISE_RATE VARCHAR(200),
   EXCISE_AMOUNT VARCHAR(200),
   TURN_SIZE VARCHAR(200),
   NDS_RATE VARCHAR(200),
   NDS_AMOUNT VARCHAR(200),
   PRICE_WITH_TAX VARCHAR(200),
   ADDITIONAL VARCHAR(200)
);

CREATE TABLE PARTICIPANT (
   INVOICE_ID BIGINT NOT NULL,
   PARTICIPANTS VARCHAR(200),
   SHARE VARCHAR(200),
   PRICE_NO_TAX VARCHAR(200),
   EXCISE_RATE VARCHAR(200),
   EXCISE_AMOUNT VARCHAR(200),
   TURN_SIZE VARCHAR(200),
   NDS_RATE VARCHAR(200),
   NDS_AMOUNT VARCHAR(200),
   PRICE_WITH_TAX VARCHAR(200)
);

CREATE TABLE IDENTIFIER (
   TABLE_NAME VARCHAR(100) NOT NULL,
   CURRENT_VALUE INTEGER DEFAULT 1 NOT NULL,
   PRIMARY KEY (TABLE_NAME)
);

PARTITION TABLE IDENTIFIER ON COLUMN TABLE_NAME;
PARTITION TABLE INVOICE ON COLUMN ID;
PARTITION TABLE SERVICE ON COLUMN INVOICE_ID;
PARTITION TABLE PARTICIPANT ON COLUMN INVOICE_ID;
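
Our GenerateUniqueIdentifier follows the pattern from the post linked in my first message; roughly like this (a sketch -- the real class may differ slightly):

import org.voltdb.ProcInfo;
import org.voltdb.SQLStmt;
import org.voltdb.VoltProcedure;

// Rough sketch of GenerateUniqueIdentifier, following the auto-increment
// pattern from the linked post: read the counter row for the given table,
// bump it, and return the old value as the new ID.
@ProcInfo(partitionInfo = "IDENTIFIER.TABLE_NAME: 0", singlePartition = true)
public class GenerateUniqueIdentifier extends VoltProcedure {

    public final SQLStmt select = new SQLStmt(
            "SELECT CURRENT_VALUE FROM IDENTIFIER WHERE TABLE_NAME = ?;");
    public final SQLStmt update = new SQLStmt(
            "UPDATE IDENTIFIER SET CURRENT_VALUE = CURRENT_VALUE + 1" +
            " WHERE TABLE_NAME = ?;");

    public long run(String tableName) {
        voltQueueSQL(select, tableName);
        long current = voltExecuteSQL()[0].asScalarLong();
        voltQueueSQL(update, tableName);
        voltExecuteSQL(true);
        return current;
    }
}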
Oleg
Jan 29, 2013
My tests showed that VoltDB has good performance when inserting small rows (5 columns per row), but not good performance for big rows (50-60 columns per row).
We planned to recommend the VoltDB Enterprise Edition to our customers, but now we're not sure.

Maybe our VoltDB configuration is not optimal for this task.
Can anyone help?
rbetts
Jan 29, 2013
A table with 50-60 VARCHAR(200) columns is not optimal -- we store VARCHARs out-of-line from the table data, so looking up each VARCHAR adds a level of indirection. If you can give your columns more precise types, you might see an improvement.

Your SERVICE and PARTICIPANT tables are unindexed -- if they have a moderate number of rows, you will be doing table scans against them, which will also not be fast.

Can you give us some idea of what your benchmark is doing? Or perhaps share the benchmark? I'd be interested in running it locally. If you don't wish to post it to the forums, you can email me at rbetts@voltdb.com. I'd certainly be interested in knowing the performance numbers you're measuring and some of the details of your measurement.

The wide table rows look like they'd be roughly 15 KB each. Does your benchmark submit full-sized varchar fields? If so, and you are using gigabit Ethernet, you will saturate the link at roughly 100 MB/s, which at 15 KB per row is approximately 6.5k inserts per second.

Ryan.
Oleg
Jan 29, 2013
Ryan, thank you.
As I wrote above, we get good performance inserting data into a table with 5-6 columns and have a problem inserting data into a table with 50-60 columns.
Now I'm trying to skip the relational tables and just put the object into a single table as JSON.
What do you think about storing the object as JSON?
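
For illustration, the single-table variant I'm testing looks roughly like this (INVOICE_DOC and InsertInvoiceDoc are hypothetical names, not in the schema above):

import org.voltdb.client.Client;
import org.voltdb.client.ClientFactory;
import org.voltdb.client.ClientResponse;

// Sketch only: store a whole invoice as one JSON string in a single table.
// INVOICE_DOC (ID BIGINT, DOC VARCHAR) and InsertInvoiceDoc are hypothetical.
public class JsonInsert {
    public static void main(String[] args) throws Exception {
        Client client = ClientFactory.createClient();
        client.createConnection("localhost");

        long id = 1; // would come from GenerateUniqueIdentifier
        String doc = "{\"num\":\"42\",\"date\":\"2013-01-29\",\"selName\":\"ACME\"}";

        // Hypothetical procedure: INSERT INTO INVOICE_DOC (ID, DOC) VALUES (?, ?);
        ClientResponse response = client.callProcedure("InsertInvoiceDoc", id, doc);
        System.out.println("status: " + response.getStatus());
        client.close();
    }
}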

Briefly, about the pilot:
I use 4 servers for this task, connected by gigabit Ethernet.
The purpose of the task is to transfer invoices from one customer to another.
Apache Tomcat as the web layer.
Mule ESB as the integration layer.
VoltDB as the operational database.
Vertica as the storage of invoices.
I use JMeter for the performance test. JMeter generates HTTP requests to the web server, and the web server transfers the invoices to Mule. Mule puts the invoices into a temporary queue (Apache ActiveMQ).
A consumer of the queue (only a single consumer for now) gets each invoice from the queue and transfers it to VoltDB through the async client.
First the async client generates an ID, and then in the callback method it calls the insert procedure (sketched below).
In other words, I am testing inserts only, and the problem is with inserts. We measure the response time for generating the ID and inserting the row into VoltDB.
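
The measured path looks roughly like this (a sketch; the invoice columns are abbreviated to two placeholders, while the real InsertInvoice takes all of them):

import org.voltdb.client.Client;
import org.voltdb.client.ClientResponse;
import org.voltdb.client.ProcedureCallback;

// Sketch of the measured call chain: generate an ID asynchronously, then
// call the insert procedure from the callback and print the total time.
public class InvoicePipeline {
    private final Client client;

    public InvoicePipeline(Client client) {
        this.client = client;
    }

    public void send(final String num, final String date) throws Exception {
        final long start = System.currentTimeMillis();
        client.callProcedure(new ProcedureCallback() {
            public void clientCallback(ClientResponse idResponse) throws Exception {
                final long id = idResponse.getResults()[0].asScalarLong();
                client.callProcedure(new ProcedureCallback() {
                    public void clientCallback(ClientResponse insertResponse) {
                        // This is the number printed as "db ID gen: <msec>".
                        System.out.println("db ID gen: "
                                + (System.currentTimeMillis() - start));
                    }
                }, "InsertInvoice", id, num, date /* ... remaining columns ... */);
            }
        }, "GenerateUniqueIdentifier", "INVOICE");
    }
}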

To make a test for you, I need to build a standalone local insert test. I'll make it today and send it to you.

I know about the unindexed tables, but right now our problem is with the insert procedure.
Oleg
Jan 29, 2013

Ryan, no, we don't submit full-sized varchar fields in the test, and our network isn't saturated.