VoltDB native memory is wasted on invalid SP inserts

  • VoltDB native memory is wasted on invalid SP inserts

    Hi,
    I've encountered some quite strange behaviour that has me stumped.
    Preconditions
    • OS: Ubuntu 14.04.3 LTS
    • Transparent huge pages: off
      Code:
      $ cat /sys/kernel/mm/transparent_hugepage/enabled
      always madvise [never]
      $ cat /sys/kernel/mm/transparent_hugepage/defrag
      always madvise [never]
    • VoltDB sources revision: tag 'voltdb-5.5'
    • VoltDB build line: ant clean dist -Djmemcheck=NO_MEMCHECK
    • VoltDB schema:
      Code:
      CREATE TABLE TABLE2 (
          id INTEGER NOT NULL,
          data VARCHAR(65535) NOT NULL,
          CONSTRAINT pk_TABLE2 PRIMARY KEY (id)
      );
      PARTITION TABLE TABLE2 ON COLUMN id;


    Scenario
    1. Start VoltDB (1 node, 2 partitions, catalog with the schema mentioned above)
    2. Execute the following code iteratively in 10 threads (assuming voltClient is initialized); a fuller driver sketch follows this list:
      Code:
              int key = random.nextInt(1000000);
              voltClient.callProcedure("TABLE2.insert", key, UUID.randomUUID().toString());


    Result
    After the table is filled (@Statistics TABLE reports about 500,000 tuples in each partition), the RES memory reported by 'top' continues to grow gradually.
    If I leave the machine untouched overnight, the VoltDB process is killed by the OOM killer.
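
    For reference, this is roughly how the per-partition tuple counts can be checked programmatically (a minimal sketch using the @Statistics system procedure; the TABLE_NAME, PARTITION_ID and TUPLE_COUNT column names are as I recall them from the docs):
      Code:
      import org.voltdb.VoltTable;
      import org.voltdb.client.Client;
      import org.voltdb.client.ClientFactory;

      public class TableStats {
          public static void main(String[] args) throws Exception {
              Client client = ClientFactory.createClient();
              client.createConnection("localhost");

              // @Statistics TABLE with interval = 0 returns totals rather than deltas
              VoltTable stats = client.callProcedure("@Statistics", "TABLE", 0).getResults()[0];
              while (stats.advanceRow()) {
                  if ("TABLE2".equals(stats.getString("TABLE_NAME"))) {
                      System.out.printf("partition %d: %d tuples%n",
                              stats.getLong("PARTITION_ID"),
                              stats.getLong("TUPLE_COUNT"));
                  }
              }
              client.close();
          }
      }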

    Expected Result
    RES memory should not grow after the table is filled.

    Workaround
    Interestingly, if I execute "select count(*) from TABLE2" from sqlcmd, the extra memory is reclaimed!
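
    The same thing can be triggered from the Java client as well; a minimal sketch (assuming the same voltClient as above), since any multi-partition read seems to have the same effect as running the query from sqlcmd:
      Code:
      // An ad-hoc count(*) goes through the multi-partition path and
      // appears to release the extra native memory, just like sqlcmd does.
      voltClient.callProcedure("@AdHoc", "SELECT COUNT(*) FROM TABLE2");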

    I noticed a number of unreachable DirectByteBuffer instances in the heap dump (the count dropped after running the 'select count...' above). This is probably connected somehow, but I'm not sure, because as far as I know VoltDB calls the buffer's cleaner explicitly...
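
    For context, this is the kind of explicit cleanup I mean (a minimal sketch for the Java 7/8 era using the internal sun.nio.ch.DirectBuffer / sun.misc.Cleaner API; I am only assuming VoltDB does something along these lines, this is not its code):
      Code:
      import java.nio.ByteBuffer;

      public class DirectBufferCleanup {
          // Frees the native memory behind a direct buffer without waiting for GC.
          // Relies on JDK-internal classes, so this is illustrative only.
          static void freeDirectBuffer(ByteBuffer buffer) {
              if (buffer != null && buffer.isDirect()) {
                  sun.misc.Cleaner cleaner = ((sun.nio.ch.DirectBuffer) buffer).cleaner();
                  if (cleaner != null) {
                      cleaner.clean();
                  }
              }
          }

          public static void main(String[] args) {
              ByteBuffer buf = ByteBuffer.allocateDirect(1024 * 1024);
              freeDirectBuffer(buf); // native memory is released immediately
          }
      }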

    Has anyone noticed similar behavior, or does anyone have an explanation of what is happening?
    Thanks!
    Last edited by Dmtry; 09-02-2015, 05:34 AM.

  • #2
    Some new details:
    Part of the memory is reclaimed each time a multi-partition (MP) request causes the ExecutionEngine.nativeExecutePlanFragments method to be executed against each individual partition. It looks like some swollen per-partition buffers get cleared.



    • #3
      When you left the server running overnight, were there any transactions, either MP or SP, running periodically?

      Did you have auto-snapshot turned on?

      What was the RSS after inserting all the tuples and what was the RSS when the process was killed by the OOM killer?
      Ning



      • #4
        Originally posted by nshi
        When you left the server running overnight, were there any transactions, either MP or SP, running periodically?
        Yes, 10 threads continually executed SP inserts into the same partitioned table. Please note that the table has a 'data' VARCHAR column. When I tried inserting into a table that has no column other than the primary key 'id', I did not notice the memory problem.

        Originally posted by nshi
        Did you have auto-snapshot turned on?
        I do not see a 'snapshot' section in the deployment.xml I used, so it seems there were no snapshots.

        Originally posted by nshi
        What was the RSS after inserting all the tuples and what was the RSS when the process was killed by the OOM killer?
        Well, I can't remember the exact figures, but top's %MEM column showed about 3.7% of 16 GB after inserting 1,000,000 unique records. I cannot find the OOM record in the system log at the moment (it happened quite a long time ago), but if needed I can try to reproduce the OOM condition again. Should I? If so, what other metrics can I measure to help with the problem analysis?
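
        If it helps, I could also poll the @Statistics MEMORY selector periodically and log it alongside top's numbers; a minimal sketch (the RSS, JAVAUSED and TUPLEDATA column names are as I recall them from the docs):
          Code:
          import org.voltdb.VoltTable;
          import org.voltdb.client.Client;
          import org.voltdb.client.ClientFactory;

          public class MemoryMonitor {
              public static void main(String[] args) throws Exception {
                  Client client = ClientFactory.createClient();
                  client.createConnection("localhost");

                  // Log VoltDB's own view of memory usage once a minute.
                  while (true) {
                      VoltTable mem = client.callProcedure("@Statistics", "MEMORY", 0).getResults()[0];
                      while (mem.advanceRow()) {
                          System.out.printf("rss=%dkB java=%dkB tupledata=%dkB%n",
                                  mem.getLong("RSS"),
                                  mem.getLong("JAVAUSED"),
                                  mem.getLong("TUPLEDATA"));
                      }
                      Thread.sleep(60000);
                  }
              }
          }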
