Forum: Building VoltDB Applications

Post: Replicated table on millions of "users"

javadevmtl
Aug 18, 2016
Hi, so I have 2 types of users: merchants and shoppers.

Merchants are in the millions, but shoppers are in the hundreds of millions. Currently I'm partitioning shoppers and trx on the "shopper" key. This works fantastically and is doing 40k business trx per second. But I need to get some config data from the "merchant" table. Is it OK if there are 5-10 million merchants in a replicated table? It's just standard config like merchant name and merchant currency. It may get updated once in a while, but it's not a high-rate table, i.e. each merchant may change their config once in a while, but nothing crazy...
javadevmtl
Aug 21, 2016
Could this work?
rmorgenstein
Aug 23, 2016
There are 2 concerns. First, it will be slower to update than the partitioned shoppers table. If you only update at a low rate, this isn't a big deal. Second, it may take a lot of memory, because it will be replicated on every site on every host. If you have enough memory, then give that a try.

Ruth
javadevmtl
Aug 24, 2016
According to the sizing worksheet, 10 million "merchants" comes to 5 GB. Not so bad. I don't think it will be anywhere close to 10 million, but it will be in the millions.

Btw, is insertion into a replicated table slower? It's managing about 2K INSERTS/sec, while the partitioned table does 6K INSERTS/SELECTS/sec (full business logic). This is on my iMac quad core with everything running on it.

Though 2K is still pretty good. Just curious...
rmorgenstein
Aug 24, 2016
Just remember that it is 5G (or whatever the real # is) X sitesperhost - because it is replicated to each site.
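
To make that multiplication concrete, here is a rough sketch in Java. It only applies the rule stated above (one full copy of the table per site); the class name, the helper `perHostGb`, and the sitesperhost values of 4, 8 and 16 are made up for illustration and are not taken from the sizing worksheet or from any VoltDB API.

```java
public class ReplicatedTableMemory {

    // Hypothetical helper: memory one host needs for a replicated table,
    // assuming one full copy per site as described in the post above.
    static double perHostGb(double tableSizeGb, int sitesPerHost) {
        if (tableSizeGb < 0 || sitesPerHost < 1) {
            throw new IllegalArgumentException("size must be >= 0 and sitesPerHost >= 1");
        }
        return tableSizeGb * sitesPerHost;
    }

    public static void main(String[] args) {
        // 5 GB is the worksheet estimate quoted in the thread for 10 million merchants.
        double tableSizeGb = 5.0;

        // Illustrative sitesperhost settings (assumed values, not recommendations).
        for (int sites : new int[] {4, 8, 16}) {
            System.out.printf("sitesperhost=%d -> %.1f GB per host%n",
                    sites, perHostGb(tableSizeGb, sites));
        }
    }
}
```

Plugging in the real table size and the sitesperhost value from your own deployment gives the number to compare against the memory available on each host.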

Replicated table inserts on 1 node are pretty fast, but they will slow down as you add servers and have to coordinate with extra network hops. But it's fine to use - as long as it is less frequent than the single-partition (i.e. scalable) part of the workload.

Are you working with one of our sales engineers? They might be able to help you come up with a better scheme for your proof of concept.

Ruth
javadevmtl
Aug 24, 2016
Almost. Not at that step yet.
