Forum: Managing VoltDB

Post: Loadbalancing VoltDB using haproxy

Loadbalancing VoltDB using haproxy
tequilamaya
Mar 13, 2012
Hi folks,
I'm a bit of a newbie to load balancing databases, but I was wondering if
anyone could shed any light on a suitable haproxy config so that the
load on the VoltDB cluster, which uses stored procedure calls, is rotated
amongst the nodes. This might be more of an haproxy mailing-list topic, but
I thought I'd give it a go on the Volt forums first for some initial
advice/thoughts.

My starting config is shown below. I'm getting the following error
and can't find much help online regarding this kind of setup, so I'm
bumbling in the dark at the moment. Any help is most appreciated.

haproxy log file entry:

Mar 13 16:13:41 localhost haproxy[11649]: 192.168.0.42:40596 [13/Mar/2012:16:13:41.912] main voltcluster/<NOSRV> -1/-1/0 187 PR 0/0/0/0/0 0/0

haproxy config:

global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 15000
    user haproxy
    group haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option redispatch
    retries 3
    maxconn 3000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

frontend main
    bind *:21212
    mode tcp
    option tcplog
    default_backend voltcluster

backend voltcluster
    balance roundrobin
    server voltserver1 192.168.0.81:21212 maxconn 512 check
    server voltserver2 192.168.0.82:21212 maxconn 512 check
Log file entry: NOSRV
tequilamaya
Mar 13, 2012
Mar 13 16:13:41 localhost haproxy[11649]: 192.168.0.42:40596 [13/Mar/2012:16:13:41.912] main voltcluster/<NOSRV> -1/-1/0 187 PR 0/0/0/0/0 0/0
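
For anyone else trying to decode a line like this, here is a minimal Python sketch that splits an `option tcplog` entry into named fields. The field order and the flag meanings are my reading of the HAProxy logging documentation, not something confirmed in this thread, and `parse_tcplog` and `explain` are hypothetical helper names. On that reading, `<NOSRV>` means no server was selected, and `PR` means the proxy ended the session while waiting for a valid HTTP request, which may point at part of the config still running in `mode http`.

```python
import re

# Assumed layout of an "option tcplog" line (from the HAProxy docs as I
# understand them): client, accept date, frontend, backend/server,
# Tw/Tc/Tt timers, bytes read, termination state, connection counters, queues.
TCPLOG = re.compile(
    r"(?P<client_ip>[\d.]+):(?P<client_port>\d+) "
    r"\[(?P<accept_date>[^\]]+)\] "
    r"(?P<frontend>\S+) (?P<backend>[^/\s]+)/(?P<server>\S+) "
    r"(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tt>[+-]?\d+) "
    r"(?P<bytes_read>\+?\d+) (?P<term_state>\S\S) "
    r"(?P<actconn>\d+)/(?P<feconn>\d+)/(?P<beconn>\d+)/"
    r"(?P<srv_conn>\d+)/(?P<retries>\+?\d+) "
    r"(?P<srv_queue>\d+)/(?P<backend_queue>\d+)"
)

# Only a few termination flags are listed; the wording is paraphrased.
FIRST_FLAG = {
    "-": "session ended normally",
    "C": "aborted on the client side",
    "S": "aborted or refused by the server",
    "P": "aborted early by the proxy itself",
}
SECOND_FLAG = {
    "-": "no specific phase recorded",
    "R": "proxy was waiting for a valid HTTP request (HTTP mode only)",
    "C": "proxy was waiting for the server connection to establish",
    "D": "session was in the data phase",
}

SAMPLE = (
    "Mar 13 16:13:41 localhost haproxy[11649]: 192.168.0.42:40596 "
    "[13/Mar/2012:16:13:41.912] main voltcluster/<NOSRV> -1/-1/0 187 PR "
    "0/0/0/0/0 0/0"
)


def parse_tcplog(line):
    """Return the log fields as a dict of strings, or None if no match."""
    match = TCPLOG.search(line)
    return match.groupdict() if match else None


def explain(entry):
    """Turn the server name and termination flags into short notes."""
    first, second = entry["term_state"]
    notes = []
    if entry["server"] == "<NOSRV>":
        notes.append("no backend server was selected")
    notes.append(FIRST_FLAG.get(first, "unlisted first flag " + first))
    notes.append(SECOND_FLAG.get(second, "unlisted second flag " + second))
    return notes
```

Calling `explain(parse_tcplog(SAMPLE))` returns the notes for the entry above. The negative Tw and Tc timers fit the same picture: the session was dropped before any server connection was attempted.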
LB configuration
rbetts
Mar 13, 2012
I'm not familiar with haproxy configuration -- I need to make time to play with it -- looks really awesome.

I noticed you don't have a 'listen' section. I wonder (an uninformed guess on my part) if this is problematic? I think something similar to this blog post (http://tenfourty.com/2011/04/09/how-to-load-balance-tcp-connections-with-haproxy/) should work. (I'm not implying that you need to use JDBC or ODBC - just as an example TCP only configuration).

Please follow up if this doesn't answer your question and I can do some more concrete reading / investigation.

Thanks,
Ryan.
LB config & error
tequilamaya
Mar 13, 2012
Hi Ryan,

Thanks for the reply. Yeah that link was the very same one that i used as a starting point (http://tenfourty.com/2011/04/09/how-to-load-balance-tcp-connections-with-haproxy/).

It does indeed look like it would be awesome if i can get it to work -
just don't want to keep hitting the same node from my client app each
time - this seems to offer a handy solution.

Thanks for pointing the "listen" section out. I've modified the
config as you said (shown below; global section same as previous post):

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option redispatch
    retries 3
    maxconn 3000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen main
    bind *:21212
    mode tcp
    option tcplog
    balance roundrobin
    server voltserver1 192.168.0.81:21212 maxconn 512 check
    server voltserver2 192.168.0.82:21212 maxconn 512 check

My setup is Railo on Glassfish 3.1.2 which then calls VoltDB. I've
found that I can register and validate the datasource ok in the Railo
Administrator using the following settings:

Driver class: org.voltdb.jdbc.Driver

Connection String: jdbc:voltdb://192.168.0.61:21212

However, when i run my client app i get the following error:

"ERROR","web-0","03/14/2012","16:23:47","","s1000
at org.voltdb.jdbc.SQLError.get(SQLError.java:50):50
at org.voltdb.jdbc.JDBC4Statement.execute(JDBC4Statement.java:406):406
at org.voltdb.jdbc.JDBC4PreparedStatement.execute(JDBC4PreparedStatement.java:101):101
at railo.runtime.tag.StoredProc.doEndTag(StoredProc.java:471):471
at index_cfm$cf.call(/home/volt/glassfish3/glassfish/domains/domain1/applications/railo/index.cfm:12):12
at railo.runtime.PageContextImpl.doInclude(PageContextImpl.java:734):734
at railo.runtime.listener.ModernAppListener._onRequest(ModernAppListener.java:179):179
at railo.runtime.listener.MixedAppListener.onRequest(MixedAppListener.java:23):23
at railo.runtime.PageContextImpl.execute(PageContextImpl.java:1991):1991
at railo.runtime.PageContextImpl.execute(PageContextImpl.java:1958):1958
at railo.runtime.engine.CFMLEngineImpl.serviceCFML(CFMLEngineImpl.java:297):297
at railo.loader.servlet.CFMLServlet.service(CFMLServlet.java:32):32
at javax.servlet.http.HttpServlet.service(HttpServlet.java:770):770
at org.apache.catalina.core.StandardWrapper.service(StandardWrapper.java:1542):1542
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:281):281
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175):175
at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:655):655
at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:595):595
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:161):161
at org.apache.catalina.connector.CoyoteAdapter.doService(CoyoteAdapter.java:331):331
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:231):231
at com.sun.enterprise.v3.services.impl.ContainerMapper$AdapterCallable.call(ContainerMapper.java:317):317
at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:195):195
at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:849):849
at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:746):746
at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1045):1045
at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:228):228
at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137):137
at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104):104
at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90):90
at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79):79
at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54):54
at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59):59
at com.sun.grizzly.ContextTask.run(ContextTask.java:71):71
at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532):532
at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513):513
at java.lang.Thread.run(Thread.java:722):722
"
Railo and VoltDB
rbetts
Mar 14, 2012
I have, in the not-so-distant past, used railo and VoltDB
together with the same driver configuration you use here. Can you get
railo to work without haproxy in between it and VoltDB?
Some more testing
rbetts
Mar 14, 2012
I re-verified that railo can (at least in a trivial example) call a VoltDB stored procedure.

I also installed haproxy with your configuration (I disabled the server 'check' option, though) and verified that our shipping example clients can run if pointed at the proxy listener instead of the database port (all running together on localhost). I also ran the jdbc example client that's part of examples/voter.

I then configured railo to point at the proxy and this worked, too. I did make one configuration error here, though - I pointed railo at 127.0.0.1 instead of localhost... which for some inexplicable reason caused railo to not connect.
Unfortunately, the JDBC error logging is not particularly informative. I think there are a few possible errors:

(1) Railo can't connect to your haproxy listener. I couldn't find a good log message for this. Mostly I debugged this by setting VoltDB's log level to debug (using the voltdb/log4j.xml configuration file in the voltdb distribution tarball and restarting the voltdb server).

(2) Haproxy can't connect to volt; presumably haproxy would log this somewhere.

(3) Your stored procedure invocation (I'm using ) is invalid and you are getting an error response that the JDBC driver presents as s1000.
Here's the simple query I was testing with, in combination with the voter example from the distribution:
<cfstoredproc procedure="Vote" datasource="voltdb">
<cfprocparam type="IN" cfsqltype="cf_sql_integer" value="508405">
<cfprocparam type="IN" cfsqltype="cf_sql_integer" value="1">
<cfprocparam type="IN" cfsqltype="cf_sql_integer" value="1">
</cfstoredproc>
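
The three possibilities above can be narrowed down one hop at a time. Below is a minimal Python sketch, assuming a plain TCP connect is a useful first check. `can_connect` and `check_hops` are hypothetical helper names, the addresses are the ones posted earlier in this thread, and treating 192.168.0.61 as the haproxy box is an assumption based on the connection string. It only shows whether a port accepts connections; it says nothing about whether the stored procedure call itself is valid.

```python
import socket

# Addresses taken from earlier posts in this thread. Treating
# 192.168.0.61 as the haproxy host is an assumption.
HOPS = [
    ("haproxy listener", "192.168.0.61", 21212),
    ("voltserver1", "192.168.0.81", 21212),
    ("voltserver2", "192.168.0.82", 21212),
]


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_hops(hops, timeout=2.0):
    """Map each hop label to True (reachable) or False (not reachable)."""
    return {label: can_connect(host, port, timeout)
            for label, host, port in hops}
```

Running `check_hops(HOPS)` on the Railo machine covers case (1); running it on the haproxy machine covers case (2), since that is where the connections to the VoltDB nodes originate. If every hop is reachable, case (3) becomes the more likely suspect.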
VoltDB, haproxy and railo now talking (sometimes)
tequilamaya
Mar 15, 2012
Hi Ryan,

I've had 'some' success after reading your comments. I decided to have another look at my stored procedure invocation and added type="in" (seems it's optional but better to make it clearer i suppose). Before that it seemed (or maybe it was just my tired and weary imagination) to connect the first time then subsequent calls created the error page.

Also had a look at the haproxy documentation and it seems that some of the timeout settings in the defaults section are deprecated so i've updated those as well. Below is my reworked config. As long as i can get it working in a basic setup then i'm happy and i can spend more time getting to know how to tweak it better (and understand it a bit better!).

I'm finding, though, that it works sometimes and then falls over when I change a value in the database.

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option redispatch
    retries 3
    maxconn 3000
    timeout connect 5000
    timeout client 50000
    timeout server 50000

listen main
    bind *:21212
    mode tcp
    option tcplog
    balance roundrobin
    server voltserver1 192.168.0.81:21212 maxconn 512
    server voltserver2 192.168.0.82:21212 maxconn 512
Railo and VoltDB
tequilamaya
Mar 15, 2012
Yeah, Railo and VoltDB play ok together - not an issue there. It's just when I put haproxy in between them that it goes wonky.
Studio.web issue
tequilamaya
Mar 15, 2012
Hi Ryan,

After a bit of frustration thinking i was seeing things (haproxy working with voltdb) i've stumbled upon something.

Like i said in the previous post i had some success:

Railo could successfully register the haproxy datasource

I could sometimes see the page i want; other times i get the s1000 error.

Then i realised that i've got Studio.web running in the background against one of my VoltDB instances just so that i can update database values for test purposes. The main thing is that I've never really thought about unticking the "Allow Admin Mode operations" checkbox - until now!! Unticking this and the errors go away.

Unfortunately, if I update a value in the database via Studio.web just to see if it's reflected in the test page, it falls over again. Any ideas? I could work around it by not using Studio.web, but it's too handy.
Issue report
rbetts
Mar 15, 2012
Glad to hear you're making a little progress.
If you write down a series of steps that cause railo, webstudio and the database to fail, I will file a defect. Also, when you say "the database falls over," what do you mean explicitly? Does your client no longer connect? Or do you mean that the database process terminates?
(You can file a defect directly at http://issues.voltdb.com if you prefer - but I'm happy to do the Jira clicking if you can describe the reproduction steps.)
I'd love to get this working together smoothly and blog/write-up some configuration guidance.
Thanks,
Ryan.
It's going pear-shaped again
tequilamaya
Mar 15, 2012
I thought I had it down to a consistent and reproducible
failed database call, i.e. the s1000 error (due to Studio.web running on one
of the VoltDB instances), but now it's failing all the time (without
Studio.web running and after restarting all processes). I need to spend a
wee bit of time trying to get my head round this because it's no longer
reproducible. If I can get something consistent then I'll pop the
instructions over and let you stick it into Jira. Maybe I've
changed something in my architecture without realising.

BTW, apologies for my lack of clarity: the database didn't go down at
all. The VoltDB process has always been up, no worries there :o) What I
meant was just the s1000 error causing the client to not connect.

Sorry!
Closing channel precedes s1000 error
tequilamaya
Mar 16, 2012
Hi Ryan,

I've had a bit of a strange time with this so I decided to take haproxy out of the equation. By just making Railo hit the VoltDB direct it seemed to be working ok, for a time - and then i noticed the following error messages popping up from the Volt log4j:

[Volt Network] DEBUG NETWORK - Closing channel org.voltdb.network.VoltPort@61ffb7dd:192.168.0.42/192.168.0.42:42580
[Volt Network] DEBUG NETWORK - Closing channel org.voltdb.network.VoltPort@7d8bf453:192.168.0.41/192.168.0.41:50359

Immediately following these messages i then get the s1000 error when Railo tries to connect to VoltDB. Trouble is i'm not 100% sure what's causing the channel close. Any ideas? Could this be an external factor to volt or internal??

I have two Railo boxes (192.168.0.41 and 192.168.0.42) which both directly connect to the VoltDB instance (192.168.0.81).
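
To see which Railo box each close belongs to, the peer address can be pulled out of these DEBUG lines. Here is a minimal Python sketch, assuming the `name@hash:host/ip:port` shape seen in the two lines above holds for other "Closing channel" entries too. `closes_per_client` is a hypothetical helper name, not part of VoltDB.

```python
import re
from collections import Counter

# Assumed shape, based only on the two log lines quoted in this post:
#   Closing channel <class>@<hash>:<host>/<ip>:<port>
CLOSE = re.compile(
    r"Closing channel \S+?:(?P<host>[^/\s]*)/(?P<ip>[\d.]+):(?P<port>\d+)"
)


def closes_per_client(lines):
    """Count 'Closing channel' log entries per client IP address."""
    counts = Counter()
    for line in lines:
        match = CLOSE.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


LOG = [
    "[Volt Network] DEBUG NETWORK - Closing channel "
    "org.voltdb.network.VoltPort@61ffb7dd:192.168.0.42/192.168.0.42:42580",
    "[Volt Network] DEBUG NETWORK - Closing channel "
    "org.voltdb.network.VoltPort@7d8bf453:192.168.0.41/192.168.0.41:50359",
]
```

`closes_per_client(LOG)` tallies one close for each Railo box here. Lining those timestamps up against the Railo error log might show whether every s1000 follows a close from the same client.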
I played around with the JDBC
rbetts
Mar 19, 2012
I played around with the JDBC client some more, testing
various connect / disconnect scenarios. I agree with you that there is
some undesirable interaction between railo and the jdbc driver around
connection timeouts and retries.

I have filed an issue in our backlog to re-test and improve this more methodically (https://issues.voltdb.com/browse/ENG-2650). I'll keep this thread up to date as we prioritize that work.

Thanks,

Ryan.
VoltDB/Railo adhoc sql
tequilamaya
Mar 19, 2012
Cheers Ryan,

Thanks for the update. I also realised the other day that i might have some issue, in the short term hopefully, with using Railo and VoltDB as I expected. Railo has a handy feature whereby when you declare the datasource you can also choose to use it as an automatic session store (really want to use that). Only problem is Railo uses adhoc sql to implement the session storage facility. I'm going to get in contact with the Railo team to see if they can buy into the idea of supporting a Voltdb driver along with "Stored Proc enabled" session storage. I'll let you know how i get on.

RJ
Railo+VoltDB update
scooper
Apr 25, 2012
Hi,

Just letting you know that there has been some work to make the JDBC error messages better than plain s1000. Connection problems now get their own distinct errors and messages. Generic s1000 errors will have a more informative message.

I also added some documentation on setting up and using the Railo/VoltDB pair.

https://github.com/VoltDB/voltdb/wiki/Getting-Started-With-VoltDB,-JDBC-and-Railo

Cheers,
Steve
Railo+VoltDB update
tequilamaya
May 2, 2012
Thanks Steve, i'll have a play with it to see what it now generates
No More Railo Connection Issues
tequilamaya
May 7, 2012
Hi Ryan/Steve,

Thanks for all the help, support and hard work on such a great product. I'm no longer having the connection issues as before. Just need to re-introduce haproxy back into the equation and hopefully it'll all be ok as far as loadbalancing the database goes. Keep up the good work.

Cheers
RJ