Dynamic SQLStmt inside stored procedure for internal compilation of SQL queries, dynamic reports and explanation of execution (theoretical discussion)


  • Dynamic SQLStmt inside stored procedure for internal compilation of SQL queries, dynamic reports and explanation of execution (theoretical discussion)

    From my own tests, when you create an SQLStmt inside a procedure's run() method (this is a contrived example):

    Code:
    // Contrived: the SQLStmt is constructed at runtime inside run(),
    // rather than declared ahead of time as a final field of the procedure.
    SQLStmt query;
    try {
        query = new SQLStmt("SELECT * FROM depart WHERE segments LIKE '%-" + via + "-%';");
    } catch (Exception e) {
        throw new VoltAbortException(e);
    }

    voltQueueSQL(query, EXPECT_NON_EMPTY);
    return voltExecuteSQL();
    you get a NullPointerException somewhere inside VoltDB while it parses the arguments for executing this statement, which misleads the user into thinking there's a bug in VoltDB itself.

    From the thread "How do you create a sql query of dynamic fields"
    I know VoltDB was designed so that all SQL statements are declared ahead of time.
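
    For reference, my understanding of that intended pattern is to declare the statement up front as a final field and bind the LIKE pattern through a '?' placeholder instead of concatenating strings. A minimal sketch (the FindDepartures class name is illustrative, not from the original thread):

    Code:
    import org.voltdb.SQLStmt;
    import org.voltdb.VoltProcedure;
    import org.voltdb.VoltTable;

    public class FindDepartures extends VoltProcedure {
        // Declared up front, so VoltDB plans the statement when the
        // procedure is loaded, not when it runs.
        public final SQLStmt findByVia =
                new SQLStmt("SELECT * FROM depart WHERE segments LIKE ?;");

        public VoltTable[] run(String via) {
            // The whole LIKE pattern is bound as a parameter at runtime.
            voltQueueSQL(findByVia, EXPECT_NON_EMPTY, "%-" + via + "-%");
            return voltExecuteSQL();
        }
    }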

    I also know about the @ExplainProc system procedure, which provides a neat output of all those precompiled plans.
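
    It can be queried like any other system procedure; a minimal sketch from a Java client (connection details are illustrative):

    Code:
    import org.voltdb.client.Client;
    import org.voltdb.client.ClientFactory;
    import org.voltdb.client.ClientResponse;

    public class ExplainProcDemo {
        public static void main(String[] args) throws Exception {
            Client client = ClientFactory.createClient();
            client.createConnection("localhost");
            // @ExplainProc returns the stored execution plan of every
            // statement declared in the named procedure.
            ClientResponse resp = client.callProcedure("@ExplainProc", "FindDepartures");
            System.out.println(resp.getResults()[0].toFormattedString());
            client.close();
        }
    }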

    When a plan is not valid, the user gets an error at start-up, when the procedure is registered.


    But the procedure can perform each of those SQL executions an arbitrary number of times, based on the results of previous executions, so this static analysis is only partial.

    Questions:
    1. What is the rationale behind such a design decision, if the analysis cannot show the number of executions of each statement?

      I'm not suggesting implementing complicated code analysis, of course.
      It should be possible to detect dynamic SQLStmt creation inside a procedure and add some "dynamic" warning to the @ExplainProc output.
      • Is it possible to implement some [lightweight] dynamic registration of SQLStmt into VoltDB at runtime, throwing a separate exception class if unsuccessful?


    2. Profiling
      I'm also suggesting a kind of "profiling" functionality (e.g. client.callProcedureEx or @execEx) that returns how many executions of each pre-declared SQLStmt were made during one particular invocation of a stored procedure. If designed as an internal SQL function, the profiler could report those statistics plus an @ExplainProc-like output of all dynamically registered SQLStmts.
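
      (A coarse, aggregated version of this may already be available via the @Statistics system procedure; a sketch, assuming the PROCEDUREDETAIL selector exists in your VoltDB version. It reports per-statement counts and timings across all invocations, not per-call as suggested above.)

      Code:
      import org.voltdb.client.Client;
      import org.voltdb.client.ClientFactory;
      import org.voltdb.client.ClientResponse;

      public class StatementStatsDemo {
          public static void main(String[] args) throws Exception {
              Client client = ClientFactory.createClient();
              client.createConnection("localhost");
              // PROCEDUREDETAIL breaks procedure statistics down to the
              // individual statements inside each procedure.
              ClientResponse resp =
                      client.callProcedure("@Statistics", "PROCEDUREDETAIL", 0);
              System.out.println(resp.getResults()[0].toFormattedString());
              client.close();
          }
      }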




    Some more background:

    In one of my projects I was using a DB abstraction layer that generated SQL from a metadata description. The DB was a simple MySQL instance that stored different types (numbers, sets, enums, and JSON) as text. This "DBCachedObject" layer had a description of which strings should be converted from JSON; it checked whether a stored value was a member of the declared enum/set, and whether look-ups could be made from one key to another. Thus, the metadata replicated part of the DB schema in order to make simple runtime checks of operations and to generate look-up and other fetch/save queries dynamically. Such schema replication was useful for uncovering bugs too: when a new developer added a new table but forgot to add an index, and another developer tried to make a look-up in his new method, he got an exception. Otherwise it would have been a slowdown that could have gone unnoticed.
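
    A toy sketch of the kind of checks I mean (all names here are invented for illustration, not from the real project):

    Code:
    import java.util.Map;
    import java.util.Set;

    public class TableMeta {
        private final Set<String> jsonColumns;          // columns stored as JSON text
        private final Map<String, Set<String>> enums;   // column -> allowed values
        private final Set<String> indexedColumns;       // columns safe to look up by

        public TableMeta(Set<String> jsonColumns,
                         Map<String, Set<String>> enums,
                         Set<String> indexedColumns) {
            this.jsonColumns = jsonColumns;
            this.enums = enums;
            this.indexedColumns = indexedColumns;
        }

        // Tell callers which columns must be JSON-decoded after a fetch.
        public boolean needsJsonDecode(String column) {
            return jsonColumns.contains(column);
        }

        // Fail fast instead of silently running an unindexed (slow) look-up.
        public void checkLookup(String column) {
            if (!indexedColumns.contains(column)) {
                throw new IllegalArgumentException(
                        "No index on column '" + column + "'; look-up would be a scan");
            }
        }

        // Reject values that are not members of the declared enum/set.
        public void checkEnumValue(String column, String value) {
            Set<String> allowed = enums.get(column);
            if (allowed != null && !allowed.contains(value)) {
                throw new IllegalArgumentException(
                        "Value '" + value + "' is not allowed for column '" + column + "'");
            }
        }
    }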

    Virtually every "framework"-style project has a similar abstraction layer now. That layer lives at the application level, of course.

    What if I wanted to push similar functionality into a lower layer, i.e. into a smart stored procedure? Would that be overkill?

    I know that in the fast-data approach stored procedures should stay thin to reduce latency, but what if the application requirements allow unpredictably slow requests (up to seconds), even though data ingestion runs at >500 rps?

  • #2
    Originally posted by Artem Pylypchuk
    From my own tests, when you create an SQLStmt inside a procedure's run() method (this is a contrived example)

    you get a NullPointerException somewhere inside VoltDB while it parses the arguments for executing this statement, which misleads the user into thinking there's a bug in VoltDB itself.
    The fact that you don't get a better error seems like a bug, perhaps a regression. We'll look into it.

    Originally posted by Artem Pylypchuk
    Questions:
    1. What is the rationale behind such a design decision, if the analysis cannot show the number of executions of each statement?
    We can still check which tables are going to be read from or written to. In fact, we can tell whether the procedure can modify state at all.

    We can also identify whether any of the SQL is likely to take a while, or whether it's all simpler stuff.

    There's a lot you can do with even just this information.

    Originally posted by Artem Pylypchuk
    • Is it possible to implement some [lightweight] dynamic registration of SQLStmt into VoltDB at runtime, throwing a separate exception class if unsuccessful?
    We added a "VoltProcedure::voltQueueSQLExperimental" method for this purpose some years ago, but there are a few corner case bugs remaining, as well as some real performance issues.

    The biggest issue with this approach is that planning can block execution of other work, and planning SQL can take much longer than executing it in VoltDB land. This gets better with caching, but there are some tradeoffs we need to figure out there as well. I suspect these problems can be overcome, but I'm not sure when we'll revisit this feature. I will note your post as a +1 for the feature.
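
    Roughly, the experimental call looks like this (a sketch; the method is experimental, and I'm assuming a (String, Object...) signature, which may differ across versions):

    Code:
    import org.voltdb.VoltProcedure;
    import org.voltdb.VoltTable;

    public class DynamicLookup extends VoltProcedure {
        public VoltTable[] run(String via) {
            // The SQL string is planned at call time, not at catalog load
            // time; that on-the-fly planning is the blocking concern above.
            voltQueueSQLExperimental(
                    "SELECT * FROM depart WHERE segments LIKE ?;",
                    "%-" + via + "-%");
            return voltExecuteSQL();
        }
    }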

    Originally posted by Artem Pylypchuk
    • Profiling
      I'm also suggesting a kind of "profiling" functionality (e.g. client.callProcedureEx or @execEx) that returns how many executions of each pre-declared SQLStmt were made during one particular invocation of a stored procedure. If designed as an internal SQL function, the profiler could report those statistics plus an @ExplainProc-like output of all dynamically registered SQLStmts.
    That might be interesting.

    Originally posted by Artem Pylypchuk
    I know that in the fast-data approach stored procedures should stay thin to reduce latency, but what if the application requirements allow unpredictably slow requests (up to seconds), even though data ingestion runs at >500 rps?
    I think this is interesting. Today, we're not very focused on thicker layers between VoltDB and the application. In practice, it seems you end up making a lot of compromises, and you might be better off with a more traditional system. But as I said before, taking the long view, we'd like to get there.
