Forum: Building VoltDB Applications

Post: MAX() returning more than 100 MB of temp memory used

MAX() returning more than 100 MB of temp memory used
cfuser
Mar 1, 2013
From studio, I'm running
select max(time_long) from TABLE TEST_DATA;


And I receive
VOLTDB ERROR: SQL ERROR More than 100 MB of temp table memory used


Count works fine. I looked at
https://forum.voltdb.com/showthread.php?34-Error-More-than-100-MB-of-temp-table-memory-used&highlight=VOLTDB+ERROR%3A+SQL+ERROR+temp+table+memory
and it was mentioned that AVG isn't distributed--Is MAX not, either?


The DDL is

create TABLE TEST_DATA (
  time_long BIGINT NOT NULL,
  dim1  INTEGER NOT NULL,
  dim2  INTEGER NOT NULL,
  dim3  INTEGER NOT NULL,
  dim4  INTEGER NOT NULL,
  dim5  INTEGER NOT NULL,
  dim6  INTEGER NOT NULL,
  dim7  INTEGER NOT NULL,
  measure1 BIGINT NOT NULL,
  measure2 BIGINT NOT NULL,
  measure3 BIGINT NOT NULL,
  measure4 BIGINT NOT NULL,
  measure5 BIGINT NOT NULL,
  measure6 BIGINT NOT NULL,
  measure7 BIGINT NOT NULL
);


PARTITION TABLE TEST_DATA ON COLUMN time_long;
CREATE INDEX IDX_TIME_LONG ON TEST_DATA(time_long);

and the deployment file (as generated by enterprise manager) is

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<deployment>
    <cluster kfactor="0" sitesperhost="4" hostcount="3"/>
    <paths>
        <voltdbroot path="/export1/cfuser/voltdb/83886090"/>
        <snapshots path="snapshots"/>
        <exportoverflow path="export_overflow"/>
        <commandlog path="command_log"/>
        <commandlogsnapshot path="command_log_snapshot"/>
    </paths>
    <partition-detection enabled="true">
        <snapshot prefix="voltdb_partition_detection"/>
    </partition-detection>
    <admin-mode adminstartup="false" port="21211"/>
    <heartbeat timeout="10"/>
    <httpd port="8080">
        <jsonapi enabled="true"/>
    </httpd>
    <commandlog logsize="1024" enabled="true" synchronous="false">
        <frequency transactions="2147483647" time="200"/>
    </commandlog>
    <systemsettings>
        <temptables maxsize="100"/>
        <snapshot priority="6"/>
    </systemsettings>
</deployment>



TIA
pmartel
Mar 1, 2013
The output of:

EXPLAIN select max(time_long) from TEST_DATA;

shows that MAX is being distributed.

The problem appears to be with the scalability of the "distributed" part of the plan.
The query planner is failing to recognize the opportunity to leverage the index in this case.
I believe it to be an unfortunate side effect of a previously reported issue in the query planner.
It is logged as issue ENG-1565.
I have updated the record for that issue to highlight the fact that it can lead to query failure at higher scale -- as this was not obvious from the original report.
Do you know the approximate row count at which the query begins to fail?

If you are using a recent enough version of VoltDB, you should be able to use the workaround described in that ticket, as in:

select time_long from TEST_DATA ORDER BY time_long DESC limit 1;

Here is what the EXPLAIN command shows (for versions that support this workaround):

RETURN RESULTS TO STORED PROCEDURE
LIMIT 1
ORDER BY (SORT)
RECEIVE FROM ALL PARTITIONS
SEND PARTITION RESULTS TO COORDINATOR
INDEX SCAN of "TEST_DATA" using "IDX_TIME_LONG" (for sort order only)
inline (LIMIT 1)

Thank you for bringing this significant aspect of the issue to our attention.
I hope that the workaround proves useful to you until we can make a fix available.

Paul Martel
VoltDB, Inc.