Forum: Building VoltDB Applications

Post: Is VoltDB suitable for an OLAP cube implementation?

adileshem
Nov 2, 2015
Hi,

I have a bunch of files that I want to serve, on demand, to a UI and use as "small" OLAP cubes. When a user opens a file in the UI, I want to load it into VoltDB as quickly as possible and summarize it for the user.
The cube has several dimensions (i.e. the columns in the file), and I want to expose each dimension as a filtering criterion that the user can combine with others as they like.

When the user moves to another file, I want to free that cube from memory and load the next one.

From the documentation, it isn't clear to me whether VoltDB is suitable for this, in terms of both performance and functionality.

My two main questions are:
1. Is VoltDB a good technology for this use case in terms of on-demand performance?
2. How would I write queries that return aggregations across multiple dimensions?


For example, if my data consists of the following lines:

time c-ip method cs-host cs-port cs-fullpath cs-content-type cs-user-agent
12:00:00 10.1.1.1 GET foo.com 80 /path/index.html text/html Mozilla 4.0
12:00:00 10.2.2.2 GET foo.com 80 /path/index.html text/html Mozilla 4.0
12:00:02 10.3.3.3 GET foo.com 80 /path/index.html text/html Mozilla 4.0
12:00:02 10.1.1.1 GET bar.com 80 /path/index.html text/html Mozilla 4.0
12:00:02 10.3.3.3 GET bar.com 80 /path/index.html text/html Chrome 80.1
12:00:59 10.1.1.1 GET bar.com 80 /path/index.html text/html IE 10
12:00:59 10.3.3.3 GET bar.com 80 /path/index.html text/html Mozilla 4.0

I'd like each column to become a widget listing its unique values, each with a count of how many rows (after applying the current filter) contain that value. For example, based on the data above, I'd like VoltDB to return:
for `c-ip`: [{"10.1.1.1": 3}, {"10.2.2.2": 1}, {"10.3.3.3": 3}]
for `cs-host`: [{"foo.com": 3}, {"bar.com": 4}]
… same for all other columns

Upon applying a filter, e.g. `method=="GET" AND cs-host=="foo.com"`, VoltDB should return the same structure, but counting only the rows that match the conditions. Even better would be if it still returned every value, with a `count` of zero where no row matched the filter. For example, after the filter above, I'd ideally want for `cs-host`: [{"foo.com": 3}, {"bar.com": 0}], and for `c-ip`: [{"10.1.1.1": 1}, {"10.2.2.2": 1}, {"10.3.3.3": 1}]
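
To make the desired output concrete, here is a minimal sketch of the aggregation I have in mind. It uses Python's built-in `sqlite3` module as a stand-in, not VoltDB, and I have not checked whether the same SQL runs unchanged on VoltDB. The table name `access_log`, the underscore column names (hyphens are not valid in unquoted SQL identifiers), and the helper `facet_counts` are all made up for illustration.

```python
import sqlite3

# Hypothetical column names: the log fields with hyphens replaced by underscores.
COLUMNS = ("log_time", "c_ip", "method", "cs_host", "cs_port",
           "cs_fullpath", "cs_content_type", "cs_user_agent")

# The seven sample rows from the post.
ROWS = [
    ("12:00:00", "10.1.1.1", "GET", "foo.com", 80, "/path/index.html", "text/html", "Mozilla 4.0"),
    ("12:00:00", "10.2.2.2", "GET", "foo.com", 80, "/path/index.html", "text/html", "Mozilla 4.0"),
    ("12:00:02", "10.3.3.3", "GET", "foo.com", 80, "/path/index.html", "text/html", "Mozilla 4.0"),
    ("12:00:02", "10.1.1.1", "GET", "bar.com", 80, "/path/index.html", "text/html", "Mozilla 4.0"),
    ("12:00:02", "10.3.3.3", "GET", "bar.com", 80, "/path/index.html", "text/html", "Chrome 80.1"),
    ("12:00:59", "10.1.1.1", "GET", "bar.com", 80, "/path/index.html", "text/html", "IE 10"),
    ("12:00:59", "10.3.3.3", "GET", "bar.com", 80, "/path/index.html", "text/html", "Mozilla 4.0"),
]


def facet_counts(conn, column, where="1=1", params=()):
    """Count matching rows per distinct value of `column`.

    The filter lives inside SUM(CASE ...) rather than in a WHERE clause,
    so values with no matching rows are still returned with a count of 0.
    """
    if column not in COLUMNS:  # column names cannot be bound as parameters
        raise ValueError(f"unknown column: {column}")
    sql = (
        f"SELECT {column}, SUM(CASE WHEN {where} THEN 1 ELSE 0 END) "
        f"FROM access_log GROUP BY {column} ORDER BY {column}"
    )
    return dict(conn.execute(sql, params).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE access_log (log_time TEXT, c_ip TEXT, method TEXT, cs_host TEXT, "
    "cs_port INTEGER, cs_fullpath TEXT, cs_content_type TEXT, cs_user_agent TEXT)"
)
conn.executemany("INSERT INTO access_log VALUES (?, ?, ?, ?, ?, ?, ?, ?)", ROWS)

# No filter: plain per-value counts.
all_ips = facet_counts(conn, "c_ip")
all_hosts = facet_counts(conn, "cs_host")

# Filter from the post: method == "GET" AND cs-host == "foo.com".
flt = "method = ? AND cs_host = ?"
args = ("GET", "foo.com")
filtered_hosts = facet_counts(conn, "cs_host", flt, args)
filtered_ips = facet_counts(conn, "c_ip", flt, args)

print(all_ips, all_hosts, filtered_hosts, filtered_ips)
```

This issues one query per column, so a file with eight columns means eight queries per filter change. Whether that is fast enough on VoltDB is exactly what I am unsure about.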


Thank you,
Adi