Forum: VoltDB Architecture

Post: the architecture of http json api

the architecture of http json api
aris_sety
Jan 19, 2011
Is the HTTP JSON API just a wrapper around the native API? What is its architecture?


How about making the JSON API attachable to a web server other than Jetty (e.g. lighttpd, nginx)? Then I could choose my favorite web server and easily integrate it with my web app via Ajax calls (so that I can bypass the web server).


How does the performance of the JSON API compare to the native API? Are there any benchmarks?
Essentially, yes
jhugg
Jan 19, 2011
VoltDB embeds the Jetty web server, which wraps the native Java client for VoltDB. It caches clients by authenticated role, so it can maintain persistent connections. Also, the client only connects to the local machine, not to the cluster as a whole.


I imagine it wouldn't be too hard to replicate this functionality in other webservers outside of the VoltDB process. Our JSON<->Native conversion code is currently in Java though, so that may be somewhat limiting. You could use a proxy server of course with almost no changes, but I'm not sure that's what you're looking for. If you're serious about getting it working with another external server, we may be able to provide some developer support.


The performance is currently limited by the performance of Jetty, so you can usually get 500 to several thousand requests per machine, depending on your hardware.
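
To make the "caches clients by authenticated role" idea concrete, here is a minimal sketch in Python. It is not VoltDB's code: FakeNativeClient, ClientCache and handle_request are hypothetical names, and the fake client only records calls instead of talking to a database. It just shows how one long-lived client per role can be reused across HTTP requests.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNativeClient:
    """Hypothetical stand-in for a native client with one persistent connection."""
    role: str
    host: str = "localhost"  # per the post, the client only talks to the local node
    calls: list = field(default_factory=list)

    def call_procedure(self, name, *params):
        # A real client would send this over its open connection.
        self.calls.append((name, params))
        return {"procedure": name, "params": list(params)}


class ClientCache:
    """Keeps one client per authenticated role so connections are reused."""

    def __init__(self, factory=FakeNativeClient):
        self._factory = factory
        self._clients = {}

    def get(self, role):
        client = self._clients.get(role)
        if client is None:
            # First request for this role: create the client once.
            client = self._factory(role)
            self._clients[role] = client
        return client


def handle_request(cache, role, procedure, params):
    """What a web handler might do: look up the cached client, then call."""
    return cache.get(role).call_procedure(procedure, *params)


cache = ClientCache()
first = handle_request(cache, "admin", "Vote", [1, 2])
second = handle_request(cache, "admin", "Vote", [3, 4])
```

Both requests above go through the same client object, so only the first one would pay the cost of connecting.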
I want it faster
aris_sety
Jan 20, 2011
> You could use a proxy server of course with almost no changes, but I'm not sure that's what you're looking for. If you're serious about getting it working with another external server, we may be able to provide some developer support.

Thanks.

I have two candidate implementations other than Jetty, but I am still discussing them.


Is the "JSON<->Native conversion" you have implemented an ordinary JSON conversion, or does it need special treatment?


-Aris
More on JSON
jhugg
Jan 21, 2011
As for performance, one of our developers has done some tuning with Jetty and TCP/IP and thinks that the bottleneck is opening sockets. He thinks that, with some tuning of kernel settings, it could get much much faster. He's done some tests and it's probably doable.


We're also looking at supporting persistent HTTP connections that might make this better.


I suspect these optimizations will make a larger difference than switching to another webserver, but it's hard to know for sure.


As for the JSON, it's fairly straightforward to convert between the native rep and JSON, but it's a fair bit of boilerplate code.
Thanks for this initial information
aris_sety
Jan 22, 2011
> As for performance, one of our developers has done some tuning with Jetty and TCP/IP and thinks that the bottleneck is opening sockets. He thinks that, with some tuning of kernel settings, it could get much much faster. He's done some tests and it's probably doable.


Is that about blocking network syscalls?
Can I get the test/benchmark results?
How can I repeat the test?


> I suspect these optimizations will make a larger difference than switching to another webserver, but it's hard to know for sure.


I will use a different web server and programming language. What do you think?


I will set aside some time for this and come back with some numbers, so we have more to discuss.
Linux TCP tunables
aweisberg
Jan 24, 2011
Hi Aris,


The primary limitation with TCP accepts and Linux is that the number of available ports between any two IPs is limited. Another issue is that TCP doesn't release a port for reuse by new connections immediately after the connection is closed. In an environment like a web server where requests are being served to many different IPs this is a non-issue, but when you are benchmarking against localhost it ends up being a serious bottleneck. A trivial app (sorry don't have it anymore) I wrote to create connections as fast as possible can only do around 500 connects with the default settings, but can do around 17k with a few tweaks. I haven't benchmarked to see how this improves the performance of the JSON api.
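
A rough back-of-the-envelope check of the port limit, as a Python sketch. It assumes the usual Linux defaults of that time (an ephemeral port range of 32768 to 61000 and a 60 second TIME_WAIT) and that each closed connection holds its local port for the full TIME_WAIT period; none of these numbers were measured here, and max_connects_per_second is just an illustrative name.

```python
TIME_WAIT_SECONDS = 60  # assumed Linux TIME_WAIT duration


def max_connects_per_second(low_port, high_port, time_wait=TIME_WAIT_SECONDS):
    """Sustained connect rate to one destination if every port sits in TIME_WAIT."""
    ports = high_port - low_port + 1
    return ports / time_wait


# Assumed default range: 32768..61000 (28233 ports).
default_rate = max_connects_per_second(32768, 61000)

# Range after widening ip_local_port_range to 1024..65535 (64512 ports).
widened_rate = max_connects_per_second(1024, 65535)

print(f"default range: ~{default_rate:.0f} connects/s")
print(f"widened range: ~{widened_rate:.0f} connects/s")
```

Under these assumptions the default range works out to roughly 470 connects per second, which is in line with the "around 500" figure. Widening the range alone only about doubles that, so most of the jump to around 17k presumably comes from reusing ports that are still in TIME_WAIT.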


We found out about these tunables from http://code.mixpanel.com/gevent-the-good-the-bad-the-ugly/


Specifically "echo -e '1024\t65535' | sudo tee /proc/sys/net/ipv4/ip_local_port_range" and "echo 1 | sudo tee /proc/sys/net/ipv4/tcp_tw_recycle" made a big difference.
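
Before changing anything it can help to see what a machine is currently set to. The Python sketch below only reads the two files named above; read_port_range and read_flag are hypothetical helper names. Note that tcp_tw_recycle was removed in Linux 4.12, so on newer kernels that file will not exist and the helper returns None.

```python
from pathlib import Path

PROC_IPV4 = Path("/proc/sys/net/ipv4")


def read_port_range(base=PROC_IPV4):
    """Return (low, high) from ip_local_port_range, which holds two integers."""
    low, high = (base / "ip_local_port_range").read_text().split()
    return int(low), int(high)


def read_flag(name, base=PROC_IPV4):
    """Return an integer tunable, or None if this kernel does not have it."""
    path = base / name
    if not path.exists():
        return None
    return int(path.read_text().strip())


# Only report when running on a system that exposes these files.
if (PROC_IPV4 / "ip_local_port_range").exists():
    print("ip_local_port_range:", read_port_range())
    print("tcp_tw_recycle:", read_flag("tcp_tw_recycle"))
```

The base argument exists so the helpers can be pointed at a copy of the files for testing without touching the live system.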


We don't have a well maintained benchmark for JSON API performance. We do have some code in that direction. See https://source.voltdb.com/browse/Engineering/trunk/tests/frontend/org/voltdb/utils/HTTPDBenchmark.java?hb=true
I created https://issues.voltdb.com/browse/ENG-966 to track making a JSON API benchmark and documenting these tunables.


I suspect that the JSON API will always be at least an order of magnitude slower than the native (C++, Java) APIs unless you use persistent connections when benchmarking against localhost. If you want to run a JSON benchmark comparing persistent vs. non-persistent connections with that code as a base, that would be awesome.
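
For anyone who wants a starting point that does not depend on our source tree, here is a small Python sketch of the persistent vs. non-persistent comparison. It does not talk to VoltDB at all: it starts a throwaway local HTTP server that returns a fixed JSON body, then times N requests with a new connection per request against N requests over one kept-alive connection. EchoHandler, run_fresh, run_persistent and N are all made up for the example, and the timings will depend on the machine.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # needed for keep-alive connections
    wbufsize = 65536               # buffer writes so headers and body go out together

    def do_GET(self):
        body = b'{"status": "ok"}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the benchmark output quiet


def run_fresh(host, port, n):
    """Open and close a new TCP connection for every request."""
    ok = 0
    for _ in range(n):
        conn = http.client.HTTPConnection(host, port, timeout=5)
        conn.request("GET", "/", headers={"Connection": "close"})
        resp = conn.getresponse()
        resp.read()
        ok += int(resp.status == 200)
        conn.close()
    return ok


def run_persistent(host, port, n):
    """Send every request over a single kept-alive connection."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    ok = 0
    for _ in range(n):
        conn.request("GET", "/")
        resp = conn.getresponse()
        resp.read()  # the body must be drained before the next request
        ok += int(resp.status == 200)
    conn.close()
    return ok


def timed(fn, *args):
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


server = ThreadingHTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

N = 200
fresh_ok, fresh_s = timed(run_fresh, host, port, N)
persistent_ok, persistent_s = timed(run_persistent, host, port, N)

server.shutdown()
server.server_close()

print(f"new connection per request: {N / fresh_s:.0f} req/s")
print(f"one persistent connection:  {N / persistent_s:.0f} req/s")
```

With a much larger N the per-request-connection run is the one that should start hitting the port limits described above; swapping the local server for a real JSON endpoint is left to whoever runs it.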


-Ariel


==
EDIT: We've since moved our source code to GitHub, so the source link may no longer work. Reply to this post if you'd like more details.