JDBCLoader out of java heap memory

  • JDBCLoader out of java heap memory

    I have a table with 90 million rows in Postgres, where each row takes about 60 bytes. When I try to migrate it from Postgres to VoltDB using jdbcloader, the network is busy at first, and then a datasource exception, out of Java heap memory, occurs. Does this mean I must modify the tool and replace

    stmt = conn.prepareStatement("select * from " + m_config.jdbctable, resultSetType, ResultSet.CONCUR_READ_ONLY);

    with

    stmt = conn.prepareStatement("select * from " + m_config.jdbctable + " order by " + indexColumn + " limit " + limit + " offset " + offset + ";");

    Are there any better ways to solve this? Thank you.
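    For what it's worth, the limit/offset rewrite described above amounts to building one bounded query per batch. A hypothetical sketch (names are illustrative, not the actual jdbcloader code) is below; note that offset paging re-scans and discards all rows before the offset, so each page gets slower as the offset grows, whereas keyset paging on the index column (WHERE indexColumn > lastSeen ORDER BY indexColumn LIMIT n) avoids that:

```java
// Sketch of OFFSET-based paging for a bulk extract; hypothetical helper,
// not part of the jdbcloader source.
class PagedQueryBuilder {
    // Builds the SELECT for one page of the source table.
    static String pageQuery(String table, String indexColumn,
                            long limit, long offset) {
        return "select * from " + table
                + " order by " + indexColumn
                + " limit " + limit
                + " offset " + offset;
    }
}
```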
    Last edited by richard; 05-11-2017, 09:38 PM.

  • #2
    Hi Richard,

    The jdbcloader included with VoltDB in the bin directory has a --fetchsize parameter. This limits how many rows it fetches from the source database, which should prevent it from running out of heap. By default the fetchsize is 100 rows, so that would be only 6KB at a time in your case. Are you by any chance setting the fetchsize much higher?
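    For reference, an invocation that sets the fetch size explicitly might look like the following (host, database, and table names are placeholders, not from this thread):

```shell
# Hypothetical jdbcloader invocation; substitute your own connection details.
jdbcloader MY_TABLE \
    --jdbcurl jdbc:postgresql://pghost/sourcedb \
    --jdbcdriver org.postgresql.Driver \
    --jdbctable my_table \
    --fetchsize 100
```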

    Thanks,
    Ben



    • #3
      Hi Ben, I printed m_config.fetchsize, and it shows the fetch size is indeed 100 rows.



      • #4
        By the way, I run jdbcloader on Windows, and the JDBC driver is postgresql-9.4.1212.



        • #5
          Are you getting out of memory on Postgres or in jdbcloader?

          If it is jdbcloader, what is the heap size? It should not need to be very high, given that the fetchsize limits the number of rows held in memory after each batch is fetched from Postgres, but it might not hurt to increase it.



          • #6
            I get out of memory in jdbcloader, and the heap size is more than 2 GB; in fact, when I checked the memory just before the crash, it was using about 3.5 GB.
            I also used jdbcloader to load another table with about 3 million rows, and it worked perfectly.
            Last edited by richard; 05-12-2017, 12:07 AM.



            • #7
              You may be running into something similar to this Stack Overflow question: http://stackoverflow.com/questions/1...-out-of-memory

              Here are guidelines from Postgres for ensuring the driver uses a cursor to retrieve the query results; otherwise, even if you have set a fetch size, it will retrieve the entire result set in one go and send it to jdbcloader, which would then need a very large heap to receive it.

              At this point, without digging into the jdbcloader code, I don't know whether it complies with all of those guidelines. It may require a feature request to address this; otherwise, as a workaround, you would need to either use a very large heap, or use some alternate means of exporting the data as a CSV or TSV file and then load it into VoltDB using csvloader.

              One other alternative: you could try an older prototype JDBC loader, which is open source and a lot simpler (more primitive) than jdbcloader. It might be easier to set the right parameters so that Postgres honors the fetch size, and it would just use individual TABLENAME.insert calls to insert the records into VoltDB. You can take a look at it here: https://github.com/benjaminballard/voltdb-jdbc-loader



              • #8
                Here are the Postgres guidelines: https://jdbc.postgresql.org/document...ry-with-cursor



                • #9
                  The problem is solved. It seems that setFetchSize behavior depends on the driver vendor. With the Postgres JDBC driver, I had to turn off autocommit before the cursor took effect; DB2, however, doesn't require that.
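    For reference, the fix in this last post matches the pgJDBC cursor rules: in practice it comes down to calling conn.setAutoCommit(false) before executing the query. The documented conditions can be summarized as a small predicate (a sketch of the rules, not driver code):

```java
import java.sql.ResultSet;

// Sketch of the conditions under which the Postgres JDBC driver streams
// results through a server-side cursor (per the pgJDBC documentation on
// cursor-based result retrieval). If any condition fails, the driver
// buffers the entire result set in the client heap regardless of fetch
// size. (The query must also be a single statement; not modeled here.)
class CursorCheck {
    static boolean willUseCursor(boolean autoCommit,
                                 int resultSetType,
                                 int fetchSize) {
        return !autoCommit                                      // must be inside a transaction
                && resultSetType == ResultSet.TYPE_FORWARD_ONLY // scrollable result sets disable cursors
                && fetchSize > 0;                               // fetch size 0 means "fetch all rows"
    }
}
```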

