Csvloader gives error: Error connecting to the servers: localhost

  • Csvloader gives error: Error connecting to the servers: localhost

    Hi there,
    I got the error message "Error connecting to the servers: localhost" when I tried to load a CSV file. The database is set up on a three-host cluster (server1, server2, server3).

    This is the command I use: "csvloader --skip 1 -r ../a/b/20160130 --file xxx.csv table1". I ran this command from server2.
    I can ssh to "localhost" from this host without a problem. The only way I have gotten the loader to work so far is by specifying another host in the cluster with "-s server3".

    What am I missing here?

  • #2
    energyd,
    If you specified a hostname or IP address for the internal and external interfaces when you created the VoltDB database, the localhost/loopback interface is no longer used to connect to the database. Could you try using '-s server2'?
    Peter Zhao

    • #3
      Hi Peter,
      Thanks for the reply. I tried "-s server2" but got the same "ERROR: Error connecting to the servers: server2"

      • #4
        energyd,
        Try server2's ip instead of server2. Can you share the startup command for server2?
        Peter Zhao

        • #5
          Hi Peter,
          I tried with the ip address, still got the same error.

          Here is the command line:
          [main] HOST: Command line arguments: org.voltdb.VoltDB recover deployment /path/to/custom_deployment.xml placementgroup 0 host server1 license /path/to/license

          • #6
            energyd,
            Unfortunately, I am unable to reproduce this. Could you tell me which VoltDB version you are running?
            Could you try running sqlcmd on any server and executing the following command:
            exec @SystemInformation overview;
            The output should contain the row HOSTNAME server2. The line above it should show the IPADDRESS associated with server2. Please try this IP address with csvloader.
            Peter Zhao
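Pairing each HOSTNAME row with its IPADDRESS row can be sketched as below. This is a minimal illustration, assuming the overview rows have already been fetched as (HOST_ID, KEY, VALUE) tuples. The function name `host_addresses` and the sample rows are hypothetical, and the addresses are documentation placeholders, not values from this thread.

```python
from collections import defaultdict


def host_addresses(rows):
    """Map each HOSTNAME to its IPADDRESS from (HOST_ID, KEY, VALUE) rows."""
    per_host = defaultdict(dict)
    for host_id, key, value in rows:
        # Group the key/value pairs that belong to the same host.
        per_host[host_id][key] = value
    return {
        info["HOSTNAME"]: info["IPADDRESS"]
        for info in per_host.values()
        if "HOSTNAME" in info and "IPADDRESS" in info
    }


# Made-up sample rows for illustration only.
sample_rows = [
    (0, "IPADDRESS", "192.0.2.11"),
    (0, "HOSTNAME", "server1"),
    (1, "IPADDRESS", "192.0.2.12"),
    (1, "HOSTNAME", "server2"),
]

addresses = host_addresses(sample_rows)
print(addresses["server2"])
```

Grouping by HOST_ID rather than relying on row order avoids depending on the IPADDRESS row always sitting directly above the HOSTNAME row.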

            • #7
              Hi Peter,
              I actually found that I could not even run sqlcmd on server2. So I ran @SystemInformation from server1 and tried using the server2 ip address listed there to run sqlcmd on server2 like this "sqlcmd --servers=10.xx.xxx.xxx" but it shows connection refused.

              I'm running VoltDB 5.8.0.

              ADD: running lsof -iTCP:21212 on server2 I could see something like "TCP localhost:21212->localhost:46006 (CLOSE_WAIT)"
              So the server is not properly listening to 21212? How come?

              Another minor thing I've noticed, though I'm not sure it's related: Volt is still logging to "volt.log.2016-02-01" instead of the existing "volt.log". The timestamps in volt.log are in Eastern time, but those in volt.log.2016-02-01 are in CST (I believe). This happens only on server2. Right now the volt.log on server2 contains only a few lines, which are output from the csvloader run I did last night.
              Last edited by energyd; 02-02-2016, 03:29 PM.

              • #8
                energyd,
                That's a lot of useful information. Let's learn about the state of your cluster.
                To understand which nodes are in the VoltDB cluster:
                Try running the @SystemInformation command as instructed above and take a look at how many unique HOST_ID values (first column) there are. From your details, there should be 3. Alternatively, run 'ps -ef | grep java' on each host to check whether the VoltDB process is running.
                To check connectivity to each node:
                Run sqlcmd against each node. You should be able to connect. If the process is running but you're not able to connect, the node could be hung. If the process is not running, you can rejoin the node with the 'voltdb rejoin' command.

                Since your sqlcmd attempt against server2 was unsuccessful, can you check whether the process is running using the ps command above? You can also check the VoltDB logs in <working directory>/log/volt.log.

                Quoting energyd:
                > ADD: running lsof -iTCP:21212 on server2 I could see something like "TCP localhost:21212->localhost:46006 (CLOSE_WAIT)"
                > So the server is not properly listening to 21212? How come?
                Port 21212 should be listening, and 'lsof -iTCP:21212' should show a line like this:
                java 2759 pzhao 76u IPv6 0xe4059d6292e4b447 0t0 TCP *:21212 (LISTEN)
                What you're pointing out appears to be a client connection that was established to port 21212 (the VoltDB default client port) and is in the process of closing.
                Peter Zhao
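The difference between a listening socket and a leftover client connection can be checked mechanically. The following is a rough sketch, assuming the `lsof -iTCP:21212` output has been captured as text and that the state appears in parentheses as the last field, as in the lines quoted in this thread. The function name `has_listener` is made up for illustration.

```python
def has_listener(lsof_output, port=21212):
    """Return True if any line shows a socket in LISTEN state on the port."""
    for line in lsof_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[-1] != "(LISTEN)":
            continue
        # The field before the state is the address, e.g. "*:21212".
        if fields[-2].endswith(f":{port}"):
            return True
    return False


# Both lines are copied from the posts above.
healthy = "java 2759 pzhao 76u IPv6 0xe4059d6292e4b447 0t0 TCP *:21212 (LISTEN)"
stuck = "TCP localhost:21212->localhost:46006 (CLOSE_WAIT)"

print(has_listener(healthy))
print(has_listener(stuck))
```

A CLOSE_WAIT entry alone does not count as a listener: it describes one established connection whose remote side has closed, not a socket accepting new clients.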

                • #9
                  Hi Peter,
                  Thanks for your response.

                  I used @SystemInformation to check the status of the cluster, and the VoltDB process is running normally on server2. Also, according to the Management Center DB Monitor, it's getting its share of data. I think it's just that the client port (21212) is not listening for some reason. The log did not say anything useful either (and logging is not working properly, per my last post).
                  Last edited by energyd; 02-02-2016, 05:56 PM.

                  • #10
                    energyd,
                    You should be able to connect to server2 via sqlcmd. Can you try pointing a browser at the VMC on server2, i.e. server2:8080, and possibly running a simple SQL statement such as select count(*) from table?
                    Alternatively, if your K-safety is greater than 0, we can potentially stop the VoltDB process on server2 and rejoin it to the cluster.
                    Peter Zhao

                    • #11
                      Peter,
                      I tried server2:8080 in the browser and could run a SQL statement without an issue. It looks like only the sqlcmd interface is not working.

                      • #12
                        energyd,
                        Sounds good! So server2 is up and running, but there appears to be an issue connecting to it. 'sqlcmd' uses port 21212 by default. Can you verify that port 21212 on server2 is listening, via 'netstat -antplo'?
                        Peter Zhao
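A quick way to test the client port from any machine, without sqlcmd or csvloader, is a plain TCP connection attempt. This is a minimal sketch using only the Python standard library. The helper names `port_accepts_connections` and `probe_hosts` are hypothetical, and a successful connect only shows that the TCP handshake completed, not that the database is answering requests.

```python
import socket


def port_accepts_connections(host, port=21212, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def probe_hosts(hosts, port=21212, timeout=3.0):
    """Probe every host and return a {host: reachable} mapping."""
    return {h: port_accepts_connections(h, port, timeout) for h in hosts}
```

For the cluster in this thread, one would call `probe_hosts(["server1", "server2", "server3"])` and expect a False entry for server2 while its client port is not accepting connections.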

                        • #13
                          Got this:
                          tcp 55 0 127.0.0.1:21212 127.0.0.1:46006 CLOSE_WAIT 47678/java off (0.00/0/0)
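To read that netstat line field by field, a small parser helps. This sketch assumes the Linux net-tools column order for `netstat -antplo` (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name, then Timer) and a program name without spaces. The names `NetstatRow` and `parse_netstat_line` are invented for illustration.

```python
from typing import NamedTuple


class NetstatRow(NamedTuple):
    proto: str
    recv_q: int    # bytes received but not yet read by the local process
    send_q: int    # bytes sent but not yet acknowledged by the peer
    local: str
    foreign: str
    state: str
    program: str   # "PID/name"


def parse_netstat_line(line):
    """Split one TCP line of netstat output into named fields."""
    parts = line.split()
    return NetstatRow(
        parts[0], int(parts[1]), int(parts[2]),
        parts[3], parts[4], parts[5], parts[6],
    )


# The line posted above.
row = parse_netstat_line(
    "tcp 55 0 127.0.0.1:21212 127.0.0.1:46006 CLOSE_WAIT 47678/java off (0.00/0/0)"
)
print(row.state, row.recv_q, row.program)
```

Read this way, the line shows a connection from a local client that the java process has not closed, with 55 bytes still unread in its receive queue. It is not a LISTEN entry for port 21212.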

                          • #14
                            energyd,
                            I've sent you a separate email regarding log collection for server2. After gathering the logs, we suggest you try to rejoin server2 (skipping step 2) if K-safety > 0. Alternatively, you can save a snapshot and restore the data after restarting the cluster. These instructions can be found here.
                            Peter Zhao

                            • #15
                              Hi Peter,
                              Unfortunately, our firm's security policy prevents me from sharing documents with outside addresses at the moment. I have already stopped and rejoined server2, and sqlcmd is working properly again. So I guess the problem is solved for now. Thanks for your help.
