Forum: Building VoltDB Applications

Post: Counters for a live statistics dashboard

Counters for a live statistics dashboard
lerchmo
Jul 3, 2011
I would like to use VoltDB to keep real-time counters for a live dashboard (streaming with socket.io).

The easiest way for me to do this is to create materialized views on an event table. But once I move that event data into my historical database, I would like to keep the materialized views... materialized.

So basically I want my event table to drain itself into the historical data table (either through VoltDB export or outside of VoltDB), yet not reflect the deletions in the materialized view.

So say... 5 minutes after an event is recorded, it's gone from VoltDB, but all of the materialized views still count it.
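
For context, here's a minimal sketch of the setup I mean (table and column names are just examples; the view includes COUNT(*), which VoltDB views require):

CREATE TABLE events (
  event_time TIMESTAMP NOT NULL,
  publisher VARCHAR(64) NOT NULL,
  impressions INTEGER NOT NULL DEFAULT 0
);

-- VoltDB keeps this aggregate current on every insert, but deleting
-- old rows from events subtracts them here too, which is exactly
-- what I want to avoid.
CREATE VIEW pub_totals (publisher, total_rows, impressions)
AS SELECT publisher, COUNT(*), SUM(impressions)
FROM events
GROUP BY publisher;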
Would be great to have an answer
max
Jul 5, 2011
It would be great to have an answer from the developers because we have the same issue. We already have some thoughts, but the developers' insight into the problem described above would be very useful.
re: counters for a dashboard
tcallaghan
Jul 5, 2011
You can simulate materialized views in regular VoltDB tables, even adding more functionality such as "last value", "last update date", and a running average.
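
Roughly, the pattern looks like this (an untested sketch; table and column names are placeholders):

-- A regular table standing in for the view.
CREATE TABLE pub_stats (
  publisher_id INTEGER NOT NULL,
  impressions BIGINT NOT NULL DEFAULT 0,
  last_value INTEGER,
  last_update TIMESTAMP,
  PRIMARY KEY (publisher_id)
);

-- Queued in the same stored procedure that inserts the event, so the
-- counter update commits atomically with the insert:
UPDATE pub_stats
SET impressions = impressions + ?,
    last_value = ?,
    last_update = ?
WHERE publisher_id = ?;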

Can you provide the schema for your event table and desired materialized views so I can reply with specific code samples?

-Tim
Sure, as an example, I would
lerchmo
Jul 5, 2011
Sure, as an example, I would like to keep aggregates at the lowest granularity of dimensional data to pivot on, and index each dimension.

So for example:

dimensions:
publisher|domain|placement|advertisement|zipcode

metrics:
impressions|clicks|conversions|revenue

Events would come in:
time:timestamp, publisher:xxxmedia, domain:xxxmedia1.com, placement:topbanner1, advertisement:yellowbanner34, zipcode:60657, impression:1, click:0, conversion:0, revenue:0

And I would create a materialized view based on the lowest possible granularity:
hash(hour+publisher+domain+placement+advertisement+zipcode), hour, publisher, domain, placement, advertisement, zip, impressions:103, clicks:33, conversions:21, revenue:0

Then index publishers, domains, and placements.
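
In DDL that would look something like this (a sketch; it assumes the event table stores the hour pre-computed in an event_hour integer column, since grouping by plain columns keeps the view simple):

CREATE VIEW hourly_stats
  (event_hour, publisher, domain, placement, advertisement, zipcode,
   total_rows, impressions, clicks, conversions, revenue)
AS SELECT event_hour, publisher, domain, placement, advertisement, zipcode,
          COUNT(*), SUM(impressions), SUM(clicks), SUM(conversions), SUM(revenue)
FROM events
GROUP BY event_hour, publisher, domain, placement, advertisement, zipcode;

-- Assuming indexes on view columns are supported:
CREATE INDEX hourly_stats_pub ON hourly_stats (publisher);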

I have decided against this particular data structure due to how much memory it takes up as the dimensions and cardinality stack up.

But even for simple counters like I am using now:

CREATE TABLE pub_day (
  publisher_id INTEGER NOT NULL,
  group_day INTEGER NOT NULL,
  impressions INTEGER NOT NULL DEFAULT 0,
  clicks INTEGER NOT NULL DEFAULT 0,
  conversions INTEGER NOT NULL DEFAULT 0,
  revenue FLOAT NOT NULL DEFAULT 0.00,
  PRIMARY KEY (publisher_id, group_day)
);

I would like to have materialized views keep it filled, instead of issuing 20 queued updates (if possible).
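
In other words, something like this in place of the manual updates (a sketch; it assumes the event table carries publisher_id and a pre-computed group_day integer):

CREATE VIEW pub_day_view
  (publisher_id, group_day, total_rows, impressions, clicks, conversions, revenue)
AS SELECT publisher_id, group_day, COUNT(*), SUM(impressions),
          SUM(clicks), SUM(conversions), SUM(revenue)
FROM events
GROUP BY publisher_id, group_day;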
"Snapshot" and re-aggregation
sebc
Jul 29, 2011
Hey,

I think your best shot at doing this is to "snapshot" the view data right before and right after you delete the records, and store the difference in a "delta" table - that difference is exactly the contribution the deleted rows made. In a stored procedure you can then pull the data from the live view and the delta table and re-aggregate them together to feed the historical totals to the client.
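
Roughly like this (an untested sketch; the names just follow the pub_day example above):

-- A "delta" table mirroring the rollup's key and measures.
CREATE TABLE pub_day_delta (
  publisher_id INTEGER NOT NULL,
  group_day INTEGER NOT NULL,
  impressions BIGINT NOT NULL DEFAULT 0,
  clicks BIGINT NOT NULL DEFAULT 0,
  conversions BIGINT NOT NULL DEFAULT 0,
  revenue FLOAT NOT NULL DEFAULT 0.0,
  PRIMARY KEY (publisher_id, group_day)
);

-- The purge procedure runs in one transaction: read the view, delete
-- the old events, read the view again, and then (in Java) fold the
-- before-minus-after difference for each key into the delta table:
SELECT publisher_id, group_day, impressions, clicks, conversions, revenue
FROM pub_day_view;

DELETE FROM events WHERE event_time < ?;

-- (same SELECT again, post-delete)

UPDATE pub_day_delta
SET impressions = impressions + ?,
    clicks      = clicks + ?,
    conversions = conversions + ?,
    revenue     = revenue + ?
WHERE publisher_id = ? AND group_day = ?;

-- The dashboard procedure reads both sides and sums the matching rows
-- in Java to get all-time totals:
SELECT impressions, clicks, conversions, revenue
FROM pub_day_view WHERE publisher_id = ? AND group_day = ?;

SELECT impressions, clicks, conversions, revenue
FROM pub_day_delta WHERE publisher_id = ? AND group_day = ?;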

More work than you'd want to do, but that's unfortunately a classic data warehousing problem. If I just lost you, BTW, let me know - I'll be happy to flesh out the sketch above.

Cheers,
Seb