Forum: VoltDB Architecture

Post: What if the number of cores exceeds the value of sites per host?

beilei sun
Jan 14, 2017
Hi,

I tested VoltDB on a two-socket machine with the following specification:

MEMORY
Kingston DDR3, 1066MHz/72bits, 124GBytes
PROCESSOR
2 SMT/Core 4 Core/Socket 2 Socket
Xeon E5620 @ 2.4GHz
L1d: 32KB L1i: 32KB
L2: 256KB L3: 2288KB


The stored procedure consists of one INSERT and ten SELECT operations.
The TPS and latency of my VoltDB benchmark were as follows:
[Figure: benchmark TPS and latency; image not preserved]

According to the VoltDB "Guide to Performance and Customization", the number of sites per host should be around the number of cores (or hyperthreads) and should not exceed 24.
However, I saw performance increase as the number of sites per host increased, and it only started to decrease once the number of sites per host went over 64. (Is this result reasonable?)
My questions are as follows:
Each site (or partition) is served by one thread. What happens if the number of sites per host is larger than the number of cores (or hyperthreads)?
In that case, a core has to run several threads, one per partition. Does that affect the execution order of transactions, and does thread contention add overhead?
Could the NUMA architecture be making VoltDB's performance on this server unpredictable?
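
To make the oversubscription in these questions concrete: the machine described above exposes 2 sockets x 4 cores x 2 SMT threads = 16 hardware threads, so 64 sites per host works out to about four site threads per hardware thread. A minimal Python sketch of that arithmetic follows. The function name is illustrative, and it counts only site threads, ignoring any other threads the server runs, which is a simplifying assumption.

```python
def site_threads_per_hw_thread(sites_per_host, sockets, cores_per_socket, smt_per_core):
    """Return how many site threads share each hardware thread, on average.

    Simplifying assumption: only site (partition) threads are counted.
    """
    hw_threads = sockets * cores_per_socket * smt_per_core
    return sites_per_host / hw_threads


# The two-socket machine described above: 2 sockets x 4 cores x 2 SMT = 16 hardware threads.
for sites in (8, 16, 24, 64):
    ratio = site_threads_per_hw_thread(sites, sockets=2, cores_per_socket=4, smt_per_core=2)
    print(f"{sites} sites per host -> {ratio:.2f} site threads per hardware thread")
```

A ratio above 1 means the operating system has to time-slice site threads on the same hardware thread, which is where the context-switching cost comes from.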

Thanks a lot.
bballard
Jan 16, 2017
Hi,

Thanks for sharing this. The answer really depends on the use case or workload, as well as the hardware. The rule of thumb in the documentation certainly has exceptions. There is essentially a trade-off between the benefit of additional parallel sites and the cost of CPU cores context-switching between threads, but CPUs have gotten much better at this. Also, many VoltDB workloads are not CPU-bound.

What I tell customers is to start at 8 sitesperhost, because even on a low end machine the additional parallelism outweighs the cost of context-switching. Running only 1 or 2 sitesperhost is just not enough parallelism. But I always recommend running tests as you've done to find the optimum setting.

I would still be cautious about setting it as high as 32 or 64: it will require a larger heap as you add complexity to the schema, and if the workload evolves, the optimum may shift to a lower number of sites. Even in your example, 24 sitesperhost is very close to the optimum point, so beyond that you are getting sharply diminishing returns.
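
One way to act on this is to pick the smallest sitesperhost whose throughput is within a few percent of the best measured value, rather than chasing the absolute peak. The sketch below is a hypothetical helper, not part of VoltDB. The 5% tolerance is an arbitrary assumption, and the numbers in the usage example are placeholders, not measurements from this thread.

```python
def smallest_near_optimal(results, tolerance=0.05):
    """Pick the smallest sites-per-host setting whose TPS is within
    `tolerance` (a fraction) of the best TPS measured.

    results: dict mapping sitesperhost -> TPS from your own benchmark runs.
    """
    if not results:
        raise ValueError("no benchmark results given")
    best_tps = max(results.values())
    threshold = best_tps * (1.0 - tolerance)
    # Among all settings that clear the threshold, prefer the fewest sites.
    return min(sites for sites, tps in results.items() if tps >= threshold)


# Placeholder numbers for illustration only; substitute your own measurements.
example = {8: 60_000, 16: 85_000, 24: 97_000, 32: 99_000, 64: 100_000}
print(smallest_near_optimal(example))
```

Preferring the smallest setting near the peak keeps heap requirements lower and leaves headroom if the workload changes.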

Best regards,
Ben
beilei sun
Jan 17, 2017
Thanks, Ben. Your reply helps a lot.

I am going to run VoltDB on a 16-socket machine. I suspect the NUMA architecture will be the performance bottleneck, since VoltDB is not NUMA-aware...
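
Before benchmarking on such a machine, it may help to record how logical CPUs map to NUMA nodes. A minimal sketch, assuming a Linux host that exposes the standard sysfs files `/sys/devices/system/node/node*/cpulist`. The helper names are illustrative, and this only inspects the topology; it does not change how VoltDB places its threads.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a Linux cpulist string such as '0-3,8-11' into a list of CPU ids."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty list means the node has no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_topology(root="/sys/devices/system/node"):
    """Return {node_id: [cpu ids]}; an empty dict if the sysfs files are absent."""
    topology = {}
    for path in glob.glob(os.path.join(root, "node[0-9]*", "cpulist")):
        node_dir = os.path.basename(os.path.dirname(path))
        node_id = int(re.search(r"node(\d+)$", node_dir).group(1))
        with open(path) as f:
            topology[node_id] = parse_cpulist(f.read())
    return topology


for node, cpus in sorted(numa_topology().items()):
    print(f"node {node}: {len(cpus)} logical CPUs -> {cpus}")
```

Comparing the number of logical CPUs per node with the chosen sitesperhost gives a rough idea of how many site threads are likely to touch memory on a remote node.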