Forum: Managing VoltDB

Post: How to see the data distribution?

tony7889
Aug 5, 2011
Hi,

I tried to insert a huge sample dataset into VoltDB, but oddly I found that only one CPU was maxed out while the others stayed idle. I am wondering whether this is because my data's partition key is not evenly distributed among the VoltDB partitions. My question is: how can I verify this? Is there any statistics information available?

Thanks a lot,
Tony
aweisberg
Aug 5, 2011
Hi Tony,

By calling the @Statistics system procedure you can retrieve several statistics that will help you determine where your data is, as well as where most work is taking place.

With the "PROCEDURE" parameter passed to @Statistics you will get a table back with a row for each procedure/partition pair showing how many invocations there have been. If the invocations aren't evenly distributed, you may have a hot spot where most of the procedure executions are going.

With the "TABLE" parameter passed to @Statistics you will get a table that describes the number of rows in each table at each partition. This will show you what kind of data distribution you have.

Data and work can be skewed independently, but they usually go together.
You can read about statistics in "Using VoltDB".
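
Once you have the per-partition numbers, a quick way to judge skew is to compare the busiest partition against the average. Here is a minimal sketch, assuming you have already copied the counts (row counts from "TABLE" or invocation counts from "PROCEDURE") into a map keyed by partition id. The class name PartitionSkew, the skewRatio helper, and the sample numbers are hypothetical and not part of VoltDB; the code does not call @Statistics itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionSkew {

    /**
     * Returns max count / mean count across partitions.
     * 1.0 means perfectly even; with N partitions, a value near N
     * means almost everything sits on a single partition.
     */
    public static double skewRatio(Map<Integer, Long> countsByPartition) {
        if (countsByPartition.isEmpty()) {
            throw new IllegalArgumentException("no partitions given");
        }
        long total = 0;
        long max = 0;
        for (long count : countsByPartition.values()) {
            total += count;
            max = Math.max(max, count);
        }
        if (total == 0) {
            return 1.0; // nothing stored or executed yet, so nothing is skewed
        }
        double mean = (double) total / countsByPartition.size();
        return max / mean;
    }

    public static void main(String[] args) {
        // Hypothetical sample values standing in for per-partition row counts.
        Map<Integer, Long> rows = new LinkedHashMap<>();
        rows.put(0, 970_000L);
        rows.put(1, 10_000L);
        rows.put(2, 10_000L);
        rows.put(3, 10_000L);
        System.out.println("row skew ratio: " + skewRatio(rows));
    }
}
```

Running the same helper on both the row counts and the invocation counts gives two separate ratios, which is a simple way to check whether data and work are skewed together or independently.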

-Ariel