Forum: Managing VoltDB

Post: Performance crushed when second node introduced.

karlh
Feb 3, 2012
Hi,

After introducing K-safety with a 2-node cluster, the TPS rate I had achieved with a single node dropped dramatically.

I have been running some benchmarks on the community edition using the voltcache example. I have 2 HP ProLiant DL380s with dual 6-core processors with hyperthreading. The deployment.xml is:
On a single node, I tweaked the client and the server settings to get a max TPS rate of approx 50,000.

I then updated the deployment.xml file to:
and copied the entire voltdb folder to the second server.
The deployment started up fine; I saw the peer negotiate with the leader and kicked off the client again. I monitored both instances with Volt Studio. The TPS rate was approximately 7,000.
Such a drop-off in performance does not seem like something that would be explained by introducing K-safety alone.

Any ideas?
Thanks, Karl.
karlh
Feb 3, 2012
Apologies, I should have previewed. The deployment.xml file had the following config for the 1-node tests:

<cluster hostcount="1" sitesperhost="22" kfactor="0" />

and the following config when I moved to 2 nodes:

<cluster hostcount="2" sitesperhost="22" kfactor="1" />
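For context, a complete deployment.xml containing that cluster element would look roughly like the sketch below (the surrounding deployment wrapper is the standard form; the exact schema may vary by VoltDB version):

```xml
<?xml version="1.0"?>
<deployment>
    <!-- hostcount: nodes in the cluster; sitesperhost: execution sites
         (partitions) per node; kfactor: number of replica copies
         (0 = no K-safety, 1 = each partition stored on two nodes) -->
    <cluster hostcount="2" sitesperhost="22" kfactor="1" />
</deployment>
```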
aweisberg
Feb 6, 2012
Hi Karl,

22 sites per host is a very large number. We are working on improving scale-up right now. I would start by trying 6-8 sites per host and then revisit with the next release.

What application are you using to measure performance? Do you get the same dropoff when using an example application like VoltKV or voter?
There are two things to be aware of when stepping up to a multi-node cluster. First, latency increases, so synchronous clients will see a drop in performance because they don't generate enough load; generating load asynchronously (see 3.3.3 DesignAppSync) helps with this. Second, you need to make sure that NTP is running and that the clocks are synced (see Using NTP to Manage Time).
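To put rough numbers on the latency point, here is a back-of-the-envelope sketch using Little's law (throughput ≈ requests in flight / latency). All latencies and counts below are illustrative assumptions, not measurements from Karl's cluster:

```java
// Little's law sketch: a synchronous client keeps exactly one request
// in flight per thread, so its throughput is capped at 1 / latency.
// An asynchronous client keeps many requests in flight and hides latency.
public class SyncVsAsync {

    // Throughput in transactions per second for a given number of
    // outstanding requests and per-request round-trip latency (ms).
    static double tps(int outstanding, double latencyMs) {
        return outstanding / (latencyMs / 1000.0);
    }

    public static void main(String[] args) {
        // Single node, sync client, assumed ~0.2 ms round trip.
        System.out.printf("sync, 1 node:   %.0f TPS%n", tps(1, 0.2));
        // Two-node K-safe cluster: assumed latency rises to ~1 ms,
        // and sync throughput collapses with it.
        System.out.printf("sync, 2 nodes:  %.0f TPS%n", tps(1, 1.0));
        // Async client with 200 requests in flight at the same latency.
        System.out.printf("async, 2 nodes: %.0f TPS%n", tps(200, 1.0));
    }
}
```

The async case is only bounded by how much work the servers can actually absorb, which is why the async examples (voter, VoltKV) show far higher TPS than the synchronous voltcache client.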

The strict dependency on NTP synchronization is also something we are working on removing.

-Ariel
karlh
Feb 7, 2012
Thank you for the response. Having investigated the issue a bit further, it would appear that the voltcache example uses the synchronous interface from the client, and this does not give good throughput when replicated (as you suggested). I tested the voter example, using the asynchronous client mode, on one node and got 200,000 TPS.

When I moved it out onto two nodes with the asynchronous client it was doing approximately 80,000 TPS, which is more like the kind of drop-off you would expect to see once replication is introduced. I will work on writing my own example with an asynchronous client and see how that goes.

Thanks,

Karl.
aweisberg
Feb 7, 2012
Hi Karl,

The voter example uses small stored procedure invocations. VoltCache can be bandwidth limited depending on what settings you are using. Have you tried VoltKV? That uses the asynchronous interface. Both VoltCache and VoltKV allow you to tune payload and key size.

You can also try adding more threads to VoltCache by modifying the values in run.sh.

-Ariel
karlh
Feb 7, 2012
Ariel,

Using the cache example, I added the second node to the servers list in the client parameters so that requests would be sent to both nodes. Throughput went up to 25,000 TPS.

Reducing sitesperhost also got it going a little faster, to 27,000 TPS.
Thanks for your help today.

Karl.