Forum: Managing VoltDB

Post: More than 100 MB of temp table memory used while executing SQL. Aborting.'

More than 100 MB of temp table memory used while executing SQL. Aborting.'
pavangadiya
Mar 10, 2015
Hello All,

I have one table in my VoltDb database. After insertingf the data from csv file the table size is now 1.8 GB.

I have partitioned the table according to a particular column. (However, I am working on a single machine & single node.. I don't know partition will be helpful in my scenario ore not ...since I have a single table in database ).

My SELECT statement query is working properly for other DBs like MySql and PostgreSql. But, when I run the same query in voltdb ..I am getting the error of temp table memory.

Here is my query :

select col1, col2, col3, col4, col5 from TABLE order by col1, col5, col2+col3 ASC;

All columns are of Integer data type.

Here is the Error :

VOLTDB ERROR: SQL ERROR
More than 100 MB of temp table memory used while executing SQL. Aborting.


I have gone through the similar questions in the forum. Some people have suggested to use VIEW ..but, using VIEW is not the good idea in my scenario since VIEWS are materlized and consume more time. In my scenario, I want to execute the database query statements with faster execution time since working in the performance scenario.

So,please tell me the solution to resolve this error.

If I have to break the query ...then how should I breaking up my query ? Does breaking up the query will degrade the performance?

Hoping for immediate reply!!

Thanks!!
pzhao
Mar 10, 2015
Pavangadiya,

Currently, VoltDB has limitations on large data sets but will improve in the coming months. As a workaround you can break up the query and divide the data using filters. Breaking up the query will not degrade performance.

Peter Zhao
pzhao
Mar 10, 2015
Pavagadiya,

In addition to using filters, you can also use limit/offset to divide up the data. Here is the documentation.
pavangadiya
Mar 11, 2015
Hello pzhao,

Thanks a lot for the quick reply! :)

Now, I am using limit/offset to breaking up the data.

I have around 6M rows in the table. So, I am using the query like this.(Please suggest me the LIMIT and OFFSET which will be the best in my scenario)

I am using the query like this :

SELECT col1, col2, col3, col4, col5 FROM TABLE ORDER BY col1, col5, col2+col3 ASC LIMIT 100000 OFFSET 0 ;

I have one more query ..how will I iterate the query in the loop (in order to perform all the data).. ?

I am using Java (JDBC connectivity) to connect with the VOLTDB. So, In my ResultSet I need to read all the rows & have to perform some operations on Resultset.

Thank You!!
bballard
Mar 11, 2015
Hi Pavagadiya,

The basic way to iterate with OFFSET and LIMIT is to increment the OFFSET by the amount of the LIMIT. For example:
int recordcount;
//query the count(*) of the table first to get "recordcount"
int limit = 100000;
for (int i=0; i<recordcount+limit; i+=limit) {
// pass in i as the OFFSET value
}

However, you may run into the same temp table limit as you did before, or simply find that this query gets slower as the OFFSET increases.

There are more efficient ways to break up a query using ranges of actual values on an indexed column, but the client-side code will be a little more complex than OFFSET and LIMIT.

But before you get into all of that, are you sure you want to retrieve the entire contents of the table back to the client? What is the purpose of doing that? If it is for ETL, you could use an export table and do ETL of records on the fly as they are ingested rather than as a one-time batch operation. Or you might find it easier to take a snapshot in CSV format and read the contents of the table from a set of CSV files. More context would help us to advise you on the best approach.

Thanks,
Ben
pavangadiya
Mar 11, 2015
Hi Ben,

Thanks for the quick reply!

Yes, I want to retrieve the entire contents of the table back to the client because I have to perform some operations on the data of the columns. (In my case, there are 5 columns ,on these column's data I have to perform some comparison operations like some mathematical comparison..)

So, I need a ResultSet which will consist the data of all the 5 columns for each row of the table. Then, I can perform my operations by looping on the Resultset (After storing the ResultSet into List).

I hope u would able to get my context. So, please advise me accordingly! :)

As you are suggesting that the query which I used gets slower as the OFFSET increases then please suggest the other alternative.

In the basic way to iterate with OFFSET and LIMIT .. How would I use my desired query statement? (Would you please explain more?)

Thank You!!
bballard
Mar 11, 2015
Hi Pavangadiya,

Why not do the mathematical comparison of columns within a query on the database itself? Also, after this comparison is done, what happens with the results, do you need to do something with every row of data, or are you filtering for a sub-set of the records, or adding it up to some aggregate value(s)?

If there's a way to do it within the database, that would be much faster I think.

Thanks,
Ben
pavangadiya
Mar 12, 2015
Hi Ben,

I can't do the operations in the database itself because first I have to fetch the columns and then I have to perform operation on the data of the columns.In comparison operation ,I will have to store some intermediate values in variables (The values will come from the list of Resultset) & then I will compare those variables to conclude my results).

Well, I am iterating my querying by the variables of limit & offset.

int count;
//Select COUNT(*) FROm Table first to get "count"

int limit = 100000;

for (int i=0; i<count+limit; i+=limit) {
// Here is my String query : SELECT col1, col2, col3, col4, col5 FROM TABLE ORDER BY col1, col5, col2+col3 ASC LIMIT "+limit+" 100000 OFFSET "+i+" ";
}
After iterating for 7 times.. it loop gets break with the same error : VOLTDB ERROR: SQL ERROR
More than 100 MB of temp table memory used while executing SQL. Aborting


The table is containing around 6M lines.. & it looks like the loop is breaking on the Limit 100000 & Offset 600000 ..

So ,Can u please suggest me the alternate solution ? or any effective select query statement?

Thank You!!
bballard
Mar 13, 2015
I put up an example in a branch of one of our demos: https://github.com/VoltDB/app-nbbo/tree/playback

Here is the example java client: https://github.com/VoltDB/app-nbbo/blob/playback/client/src/nbbo/PlaybackClient.java

This is the core loop that retrieves the data in chunks. The steps before that are to estimate the offset value, which you may choose to do manually during your development rather than at runtime.

for (long t=startTime; t<endTime; t+=microsOffset) {
long t1 = t+microsOffset;
String sql = "SELECT * FROM ticks WHERE time >= TO_TIMESTAMP(MICROS," + t + ") AND time < TO_TIMESTAMP(MICROS," + t1 + ") ORDER BY time";
VoltTable v = client.callProcedure("@AdHoc",sql).getResults()[0];
// read the table...
}