Forum: Building VoltDB Applications

Post: Partition column 'design'

Partition column 'design'
chbussler
Apr 6, 2010
Hi,

One possibility when implementing a multi-tenant system is to add a 'tenant' column to every table so that all data access is qualified per tenant. This means that every primary key will be composite.


As only one column per table can be the partition column, I think the documentation suggests combining two or three columns that would 'usually' be separate in a 'traditional' RDBMS into one column and making that the partition column.


As a concrete example, let's assume a user table with a tenant identifier column and a user identifier column, both together being the primary key. Tenant identifiers are 1, 2, 3, ... and for each tenant the user identifiers also start at 1, 2, 3, ... So each tenant has a user identifier 1, a user identifier 2, etc.


Then there are two possibilities: (a) concatenate tenant identifier with user identifier, or (b) concatenate user identifier with tenant identifier.
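
To make the two options concrete, here is a minimal sketch in Java. The class and method names are hypothetical, and the ":" separator is an assumption added for illustration: without some separator, tenant 1 / user 12 and tenant 11 / user 2 would both concatenate to "112".

```java
// Hypothetical helper for illustration only; not part of any VoltDB API.
class CompositeKeys {
    // Option (a): tenant identifier first, then user identifier.
    static String tenantFirst(int tenantId, int userId) {
        return tenantId + ":" + userId;
    }

    // Option (b): user identifier first, then tenant identifier.
    static String userFirst(int tenantId, int userId) {
        return userId + ":" + tenantId;
    }

    // Plain concatenation without a separator, shown only to expose the ambiguity.
    static String noSeparator(int tenantId, int userId) {
        return "" + tenantId + userId;
    }
}
```

Either ordering yields a distinct value per (tenant, user) pair as long as the separator is present.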


My question is, does it matter?


Thanks,


Christoph
Partition Column + Multi-Tenant Systems
tcallaghan
Apr 7, 2010
Christoph,

There are important differences between the partition column and primary key selection:


  • Your choice of partition column (which can be a composite of multiple columns merged into a single column in your table) is used to spread out your workload in the VoltDB cluster. This selection ensures that, for a given value of this column, all rows of data for the table will exist in a particular VoltDB partition. No index is created for this selected column.
  • Your primary key (or other supporting indexes) can be made up of one or more columns.
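
The difference between the two bullets above can be sketched roughly as follows. This is only an illustration: `partitionFor` is a hypothetical hash-modulo stand-in, not VoltDB's actual partitioning function, and the rows are plain `{tenantId, userId}` arrays whose composite primary key is both values while only `tenantId` decides placement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; VoltDB's real placement logic is internal and differs.
class PartitionPlacementSketch {
    // Assumed stand-in: map a partition-column value to a partition number.
    static int partitionFor(int partitionValue, int partitionCount) {
        return Math.floorMod(Integer.hashCode(partitionValue), partitionCount);
    }

    // Groups rows {tenantId, userId} by the partition derived from tenantId only.
    // The primary key (tenantId, userId) plays no role in placement.
    static Map<Integer, List<int[]>> place(List<int[]> rows, int partitionCount) {
        Map<Integer, List<int[]>> partitions = new TreeMap<>();
        for (int[] row : rows) {
            int p = partitionFor(row[0], partitionCount);
            partitions.computeIfAbsent(p, k -> new ArrayList<>()).add(row);
        }
        return partitions;
    }
}
```

With this stand-in, rows that share a `tenantId` always end up in the same list, whatever their `userId` is.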


In a multi-tenant system, as you described, if you select "tenantId" as the partition column for your tables, then a particular tenant/customer will have all their transactions executing in a single partition. Depending on your customer base and workload, this may be a good choice. If a single tenant/customer issues many operations at once, they will all land on that one partition, so you may want to partition on some other column in your table.
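
As a rough illustration of that trade-off, the sketch below counts how many partitions one tenant's users touch under each choice. It assumes a hypothetical hash-modulo `partitionFor` (not VoltDB's real hashing) and a merged "tenantId:userId" string as the alternative partition column.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; the hash function is an assumption for illustration.
class TenantSkewSketch {
    static int partitionFor(String value, int partitionCount) {
        return Math.floorMod(value.hashCode(), partitionCount);
    }

    // Partitions touched by one tenant's users when partitioning on tenantId alone.
    static Set<Integer> byTenant(int tenantId, int userCount, int partitionCount) {
        Set<Integer> used = new TreeSet<>();
        for (int userId = 1; userId <= userCount; userId++) {
            used.add(partitionFor(String.valueOf(tenantId), partitionCount));
        }
        return used;
    }

    // Partitions touched when partitioning on a merged "tenantId:userId" column.
    static Set<Integer> byTenantAndUser(int tenantId, int userCount, int partitionCount) {
        Set<Integer> used = new TreeSet<>();
        for (int userId = 1; userId <= userCount; userId++) {
            used.add(partitionFor(tenantId + ":" + userId, partitionCount));
        }
        return used;
    }
}
```

Under these assumptions, `byTenant` always returns a single partition for a given tenant, while `byTenantAndUser` spreads the same tenant's users over several.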

Let me know if this helps.

-Tim
Good clarification
chbussler
Apr 8, 2010
Hi Tim,

Thanks; that clarifies and confirms what I was expecting. Thanks a lot.

Christoph
Sabrina
Aug 31, 2015
Hello, I'm looking for some information about multi-tenancy and VoltDB, and this was the only post I found. Does VoltDB have no multi-tenant implementation (different databases for different tenants)?

Thanks a lot.

Sabrina
bballard
Aug 31, 2015
Hi Sabrina,

You can have different databases for different tenants. Typically this would be deployed on separate hardware or VMs as with any database. With containers such as Docker, this is much easier to manage today.

Another option for multi-tenancy would be to use a shared set of tables where the tenant ID could be a column in the tables. In this scenario, essentially the application has a single database behind it, and manages the tenants internally.
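
A minimal sketch of that second option, using an in-memory map as a stand-in for a shared table. The class name, method names, and the user-name column are all hypothetical; the point is only that every access is qualified by the tenant ID.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for one shared "users" table.
class SharedUserTable {
    // tenantId -> (userId -> userName); together they act as a composite key.
    private final Map<Integer, Map<Integer, String>> rows = new HashMap<>();

    void put(int tenantId, int userId, String name) {
        rows.computeIfAbsent(tenantId, t -> new HashMap<>()).put(userId, name);
    }

    // Every lookup is qualified by tenantId, so one tenant never reads another's rows.
    String get(int tenantId, int userId) {
        Map<Integer, String> tenantRows = rows.get(tenantId);
        return tenantRows == null ? null : tenantRows.get(userId);
    }
}
```

Here user 1 of tenant 1 and user 1 of tenant 2 are separate rows, and the application is responsible for always passing the right tenant ID.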

Best regards,
Ben