Forum: Building VoltDB Clients

Post: Unable to retrieve non-Latin characters via JSON

Unable to retrieve non-Latin characters via JSON
jhugg
Jul 22, 2010
I don't think this is something we tested and I apologize. I've created a ticket and will try to get it tested and fixed for our next release.
https://issues.voltdb.com/browse/ENG-666
Let us know how much this is holding you up and we'll factor that into our timing.
-John
Unable to retrieve non-Latin characters via JSON
junjun
Jul 22, 2010
It's me again. This time, I'm not sure if I'm doing something wrong.
I'm trying to insert some Japanese characters via the JSON interface, but I can't get them back out again.
Say, using the example Hello World db, I call the Insert procedure using curl:
curl -i -H"content-type: text/plain; charset=utf-8" "http://localhost:8080/api/1.0/?Procedure=Insert&Parameters=%5B%22%5Cu3053%5Cu3093%5Cu306b%5Cu3061%5Cu306f%22%2C%22%5Cu4e16%5Cu754c%22%2C%22Japanese%22%5D"
This returns ok. And using a Ruby wire protocol client I wrote, I can select that row and see the Japanese characters perfectly.
But when I try to retrieve it using the JSON interface such as here:
curl -i -H"content-type: text/plain; charset=utf-8" "http://localhost:8080/api/1.0/?Procedure=Select&Parameters=%5B%22Japanese%22%5D"
all I get are just question marks for those non-Latin characters.
HTTP/1.0 200 OK
Content-Type: text/plain
Date: Thu, 22 Jul 2010 15:21:47 GMT
{"status":1,"appstatus":-128,"statusstring":null,"appstatusstring":null,"exception":null,"results":[{"status":-128,"schema":[{"name":"HELLO","type":9},{"name":"WORLD","type":9}],"data":[["?????","??"]]}]}
I'd like to know whether anyone has gotten something like this to work.
Thanks as always!
Junjun
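For anyone building the same request programmatically: the Parameters value in the curl calls above is a JSON array with the non-ASCII characters written as \uXXXX escapes, then percent-encoded. Below is a minimal Python sketch that reproduces it. The helper name build_call_url is made up for illustration, and the host, port, and path are simply copied from the curl example.

```python
import json
from urllib.parse import quote

# Base URL copied from the curl example in this thread.
BASE_URL = "http://localhost:8080/api/1.0/"

def build_call_url(procedure, params):
    """Hypothetical helper: build a JSON-interface URL for a procedure call."""
    # ensure_ascii=True (the default) writes non-ASCII characters as \uXXXX
    # escapes; compact separators keep spaces out of the query string.
    payload = json.dumps(params, ensure_ascii=True, separators=(",", ":"))
    # safe="" percent-encodes every reserved character, including [ ] " \ and ,
    return (
        f"{BASE_URL}?Procedure={quote(procedure, safe='')}"
        f"&Parameters={quote(payload, safe='')}"
    )

url = build_call_url("Insert", ["こんにちは", "世界", "Japanese"])
print(url)
```

The printed URL should match the one passed to curl for the Insert call, which makes it easy to confirm the request itself is pure ASCII and that any loss happens on the way back.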
Thanks for the reply, John.
junjun
Jul 25, 2010
Thanks for the reply, John. As for my need, I guess it could wait a while. I'm in Hong Kong so the project I'm starting to work on would have to eventually handle Chinese characters. Please lemme know when you push the feature into trunk and I'll use that. I don't even have to wait for the next official release.
Thanks again.
I've reproduced the issue.
jhugg
Jul 26, 2010
I've reproduced the issue in JSON and with our Java client library in a new test case. I've updated the bug report https://issues.voltdb.com/browse/ENG-666 with these comments. Hopefully I can get a fix onto trunk this week if it isn't a huge issue. I'll hack a bit on it in the morning. It seems like a real bug.
Hopefully fixed
jhugg
Jul 27, 2010
Hopefully this got fixed in r833. The limitation is that HTTP variables must be UTF-8 encoded, no matter what the HTTP request says. Let me know if this fixes your problem.
Ticket that will be closed once you say it works for you.
https://issues.voltdb.com/browse/ENG-666
New ticket to detect incoming encoding and not assume UTF-8.
https://issues.voltdb.com/browse/ENG-670
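The question marks in the original report are what you typically get when text is encoded into a charset that cannot represent it. The sketch below only illustrates that general effect in Python. It is an assumption that something similar was happening on the server side; the thread does not say which charset was actually involved, and ISO-8859-1 is used here purely as an example.

```python
text_hello = "こんにちは"
text_world = "世界"

# Encoding to a Latin-only charset with errors="replace" turns every
# character it cannot represent into "?".
lossy_hello = text_hello.encode("iso-8859-1", errors="replace").decode("iso-8859-1")
lossy_world = text_world.encode("iso-8859-1", errors="replace").decode("iso-8859-1")

# UTF-8 can represent every character, so the round trip is lossless.
utf8_hello = text_hello.encode("utf-8").decode("utf-8")

print(lossy_hello, lossy_world)
print(utf8_hello == text_hello)
```

The lossy strings have the same lengths as the "?????" and "??" values in the reported response, which is why a charset mismatch is a plausible first guess for this kind of symptom.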
Again, hopefully fixed.
jhugg
Aug 4, 2010
Hi Junjun,
r867 has additional UTF-8 fixes for the JSON return path. Give it a shot and let me know if it fixes your problem. Thanks for being patient.
-John
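A quick way to check the return path is to parse the response body and look at the returned strings. This sketch assumes the response has the same shape as the one posted earlier in the thread (a results list whose entries carry schema and data). The helper name rows_from_response is made up, and the sample body is hand-written for illustration, not real server output.

```python
import json

def rows_from_response(body):
    """Hypothetical helper: return the first result table as a list of dicts."""
    response = json.loads(body)
    table = response["results"][0]
    names = [column["name"] for column in table["schema"]]
    return [dict(zip(names, row)) for row in table["data"]]

# Hand-written sample in the shape shown earlier in the thread (not server output).
sample = (
    '{"status":1,"results":[{"status":-128,'
    '"schema":[{"name":"HELLO","type":9},{"name":"WORLD","type":9}],'
    '"data":[["\\u3053\\u3093\\u306b\\u3061\\u306f","\\u4e16\\u754c"]]}]}'
)

rows = rows_from_response(sample)
# Flag the symptom from the original report: "?" standing in for real characters.
garbled = any("?" in value for row in rows for value in row.values())
print(rows[0]["HELLO"], rows[0]["WORLD"], garbled)
```

Feeding it the body returned by the Select call, instead of the sample, would show whether the stored characters survive the JSON interface.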
Works now! Thanks a lot,
junjun
Aug 6, 2010
Works now! Thanks a lot, John!
Awesome.
jhugg
Aug 6, 2010
Thanks for your help tracking this issue down, Junjun.
-John