Forum: Building VoltDB Applications

Post: VoltDB performance issue between development and production servers

cfuser
Feb 20, 2013
I'm trying to figure out why I'm seeing such a large performance disparity between my local Mac and our production servers (where the same test runs significantly slower).

Running locally (using the modified AsyncBenchmark)...


Running benchmark...
00:00:05 Throughput 8260/s, Aborts/Failures 0/0, Avg/95% Latency 8.52/34ms
00:00:10 Throughput 8919/s, Aborts/Failures 0/0, Avg/95% Latency 5.00/8ms


This is a single node.

Running on a server class machine (8 core hyperthreaded), single node


Running benchmark...
00:00:05 Throughput 2067/s, Aborts/Failures 0/0, Avg/95% Latency 989.42/2000ms
00:00:10 Throughput 1826/s, Aborts/Failures 0/0, Avg/95% Latency 1662.62/2000ms


Running with a three node cluster


Running benchmark...
00:00:05 Throughput 154/s, Aborts/Failures 0/0, Avg/95% Latency 2327.57/2000ms
00:00:10 Throughput 159/s, Aborts/Failures 0/0, Avg/95% Latency 7005.41/2000ms


My k-factor is 0 in all cases. I experimented with the sites-per-host setting and didn't see any dramatic improvement.

The DDL is as follows

create TABLE TEST_DATA (
  time_long BIGINT NOT NULL,
  dim1  INTEGER NOT NULL,
  dim2  INTEGER NOT NULL,
  dim3  INTEGER NOT NULL,
  dim4  INTEGER NOT NULL,
  dim5  INTEGER NOT NULL,
  dim6  INTEGER NOT NULL,
  dim7  INTEGER NOT NULL,
  measure1 BIGINT NOT NULL,
  measure2 BIGINT NOT NULL,
  measure3 BIGINT NOT NULL,
  measure4 BIGINT NOT NULL,
  measure5 BIGINT NOT NULL,
  measure6 BIGINT NOT NULL,
  measure7 BIGINT NOT NULL
);


PARTITION TABLE TEST_DATA ON COLUMN time_long;
CREATE INDEX IDX_TIME_LONG ON TEST_DATA(time_long);

: .. procs



So, I'm not really sure how to best troubleshoot this in order to figure out what I'm doing wrong (as, clearly, I am). Any advice?
rbetts
Feb 20, 2013
Is it possible that you are running all multi-partition procedures? Can you post the output of the voltdb compile (catalog creation) command?

Ryan.
cfuser
Feb 21, 2013
Thanks for the reply. Note that for now, I'm just loading data that is randomly distributed over the past 7 days. Although this is not necessarily how it would be loaded in my production environment, it is consistent between the different test environments I'm using.



Successfully created benchmark.jar
Includes schema: testData.sql


[MP][RW] InsertTestData
  insert into TEST_DATA  (time_long, dim1, dim2, dim3, dim4, dim5,...

[SP][RW] TEST_DATA.insert
  INSERT INTO TEST_DATA VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)...




Even if it were multi-partitioned, shouldn't a more powerful machine running in single-node mode perform at least as well as my laptop, rather than at roughly 25% of its throughput?
rbetts
Feb 22, 2013
My experiment with replying from my phone failed, apparently...

[MP][RW] InsertTestData -- this insert is Multi-partition (that's the MP abbreviation in the report ... sorry there isn't a legend on this report yet.)

Note that we auto-generate CRUD procedures for partitioned tables. You can use the insert we auto-generate instead (which is single partition) if you don't have any business logic in your InsertTestData procedure beyond the straight insert.

[SP][RW] TEST_DATA.insert

MP transactions aren't bottlenecked on CPU, so a faster CPU won't really matter. Use the SP insert and you should see the result you expect.

Thanks,
Ryan.
cfuser
Feb 26, 2013
I waited to respond because I wanted to try this on my server machines, but the auto-generated procedure definitely runs more quickly than my stored procedure on my laptop. (Now I'm having an issue starting Enterprise Manager to test this across a cluster, but that is likely a different thread.) My table is defined as partitioned; however, I missed the fact that, by default, procedures are multi-partition.

As a general rule, our data is never updated after insertion and is (for the most part) sequential by timestamp, and our searches will almost always include the timestamp (along with the other dimensions). As a 'best practice', does it make sense to keep this data in a single partition?

Thanks for your help.
ChrisJacob
Jan 3, 2014
Is this a common issue with multi-partition procedures? I am also seeing VoltDB performance differences between servers, and, as in the case rbetts asked about, I am running multi-partition procedures. I have never read anywhere that VoltDB has such an issue. Is there a way to fix this?








jpiekos
Jan 3, 2014
Multi-partition procedures run on all partitions at the same time, so each one occupies every partition while it executes. Throughput is therefore usually lower than with a single-partition workload, where transactions on different partitions can be (and are) executed in parallel. Our free short video training lessons at Volt University help illustrate the differences (see here: http://voltdb.com/resources/volt-university/tutorials/).

Ideally you would partition your data such that your high velocity (high volume) operations can be single-partition transactions. This would achieve optimal performance. If you'd like some help, or review, with your application design, drop a note to info@voltdb.com.
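
For the schema posted earlier in this thread, one way to do that might look like the DDL sketch below. It assumes InsertTestData is a Java stored procedure class declared in the schema file and that its first parameter is the time_long value; neither detail is confirmed in the thread, and the exact syntax can differ between VoltDB versions, so treat this as an illustration and check the documentation for your release.

```sql
-- Sketch only; not taken from the original poster's project.
-- The table is already partitioned on time_long (as in the posted DDL).
PARTITION TABLE TEST_DATA ON COLUMN time_long;

-- Assumption: InsertTestData is a Java class in the catalog jar.
-- A real declaration would normally use the fully qualified class name.
CREATE PROCEDURE FROM CLASS InsertTestData;

-- Declare the procedure as single-partition, routed by the table's
-- partitioning column. Assumption: the partitioning value is passed
-- as the procedure's first parameter.
PARTITION PROCEDURE InsertTestData ON TABLE TEST_DATA COLUMN time_long;
```

After recompiling, the catalog report should list the procedure as [SP] rather than [MP] if the declaration took effect. If the procedure contains nothing beyond the plain insert, the auto-generated TEST_DATA.insert mentioned above avoids the extra declaration entirely.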

John