Forum: Building VoltDB Applications

Post: How does VoltDB utilize partitions for non-single-sited queries?

How does VoltDB utilize partitions for non-single-sited queries?
hayatoa
Dec 4, 2011
Hello, I would like to know how VoltDB utilizes partitions for non-single-sited queries.

Say I have the following table, partitioned by category_id.

CREATE TABLE sample (
created_timestamp timestamp not null,
item_id integer not null,
category_id integer not null
);

Suppose I have the following configuration.

<?xml version="1.0"?>
<deployment>
<cluster hostcount="1"
sitesperhost="4"
/>
<httpd enabled="true">
<jsonapi enabled="true" />
</httpd>
</deployment>


If I run a non-single-partition query like the one below,

select category_id, count(*) from sample group by category_id;

how is the work distributed among the partitions?

Does each partition try to compute count(*) for each category_id and join the results at the end?

or

Something else?

I would like to know how VoltDB utilizes partitions for non-single-sited queries such as the one above.

Hayato
I think it does what you expect.
jhugg
Dec 4, 2011
"Does each partition try to compute count(*) for each category_id and join the results at the end?"

Yes. VoltDB will do the group by on each node, then aggregate the partitioned results on a single node before sending the result to the client.
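
To make the two steps concrete, here is a rough simulation in plain Python. It is only a sketch of the idea described above, not VoltDB's actual planner or executor: `partition_for` is a hypothetical stand-in for however VoltDB hashes the partitioning column, and the four partitions simply mirror sitesperhost="4" from the question.

```python
from collections import Counter

NUM_PARTITIONS = 4  # assumption: mirrors sitesperhost="4" on a single host


def partition_for(category_id: int) -> int:
    # Hypothetical routing rule; VoltDB's real hashing is not shown here.
    return category_id % NUM_PARTITIONS


def local_group_by(rows):
    # Step 1: a partition counts only the rows it owns, per category_id.
    return Counter(category_id for _item_id, category_id in rows)


def distributed_count(rows):
    # Route each row to its partition by the partitioning column.
    partitions = [[] for _ in range(NUM_PARTITIONS)]
    for item_id, category_id in rows:
        partitions[partition_for(category_id)].append((item_id, category_id))

    # Every partition runs the same GROUP BY on its local data.
    partials = [local_group_by(p) for p in partitions]

    # Step 2: a single coordinating site merges the partial results
    # before the final answer goes back to the client.
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return dict(merged)


# (item_id, category_id) pairs standing in for rows of the sample table
rows = [(1, 10), (2, 10), (3, 11), (4, 12), (5, 12), (6, 12)]
print(distributed_count(rows))
```

In this sketch the table is partitioned on the same column it is grouped by, so each category_id lives in exactly one partition and the merge step never has to add two partial counts for the same key. Grouping by a different column would make the merge sum partial counts from several partitions.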
Relevant blog post
rbetts
Dec 5, 2011
You might also see: http://blog.voltdb.com/optimizing-distributed-read-operations-voltdb/

Thank you!
hayatoa
Dec 5, 2011
Thank you for the replies.

The URL you mentioned is especially helpful :)

I have another question regarding the SQL planner. I will post it in a separate entry.