The connector appears to read CJK/Unicode characters in query results correctly, provided they were entered into the database properly as Unicode via MySQL Query Browser. But writing CJK/Unicode text to the database (INSERT, UPDATE) through the connector, using either a plain SQL statement or setString() on a PreparedStatement, appeared to translate each CJK character into a single ? (code 3F).
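As a rough illustration of where a lone ? (3F) can come from, the sketch below encodes one CJK character with a charset that cannot represent it and then with UTF-8. That the connector was doing something similar internally is an assumption on my part, not something verified here; the class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;

class QuestionMarkDemo {
    // Formats bytes as lowercase hex separated by spaces, e.g. "e4 b8 ad".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String cjkword = "\u4e2d"; // the character 中 (U+4E2D)

        // ISO-8859-1 has no mapping for U+4E2D, so Java substitutes '?' (0x3F).
        System.out.println(hex(cjkword.getBytes(StandardCharsets.ISO_8859_1))); // prints "3f"

        // UTF-8 can represent it, using three bytes.
        System.out.println(hex(cjkword.getBytes(StandardCharsets.UTF_8))); // prints "e4 b8 ad"
    }
}
```

If the ? is produced this way, the original character is already lost before the data reaches the server, which would be consistent with reads working while writes fail.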
It appeared that I had to write the text data as bytes instead to get it stored properly as Unicode. I chose UTF-8 as the encoding. I then changed the column-specific character set and collation of the text field that holds the CJK text to UTF-8 (in MySQL, the utf8 character set with a utf8 collation such as utf8_general_ci). Then, to write the text in an insert operation using a PreparedStatement:
Here we assume the database Connection conn is already established and that we are inserting the text held in the String cjkword:
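// Needs java.sql.PreparedStatement and java.io.UnsupportedEncodingException.
PreparedStatement stmt = conn.prepareStatement("INSERT INTO cjktable (cjktext) VALUES ( ? )");
try {
    // Send the text as explicit UTF-8 bytes instead of using setString().
    stmt.setBytes(1, cjkword.getBytes("UTF-8"));
} catch (UnsupportedEncodingException e) {
    // Fallback: platform default encoding, which may not preserve CJK text.
    stmt.setBytes(1, cjkword.getBytes());
}
stmt.executeUpdate(); // actually run the insert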
The try block is there because getBytes() with an encoding name throws java.io.UnsupportedEncodingException if that encoding is not supported. UTF-8 is one of the encodings every Java implementation is required to support (the Sun JDK certainly does), so the exception should not occur in practice.
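On Java 7 or later, the try/catch can be avoided by passing a Charset constant instead of an encoding name, since that overload of getBytes() declares no checked exception. The sketch below is one way to write it, assuming the same cjktable/cjktext table as above; the class and method names (CjkInsert, utf8Bytes, insert) are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class CjkInsert {
    // Encodes the text as UTF-8. No UnsupportedEncodingException is possible here.
    static byte[] utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Inserts cjkword into cjktable.cjktext as raw UTF-8 bytes and
    // returns the number of rows affected.
    static int insert(Connection conn, String cjkword) throws SQLException {
        // try-with-resources closes the statement even if the insert fails.
        try (PreparedStatement stmt =
                 conn.prepareStatement("INSERT INTO cjktable (cjktext) VALUES ( ? )")) {
            stmt.setBytes(1, utf8Bytes(cjkword));
            return stmt.executeUpdate();
        }
    }

    public static void main(String[] args) {
        // Two CJK characters (中文), three UTF-8 bytes each.
        System.out.println(utf8Bytes("\u4e2d\u6587").length); // prints 6
    }
}
```

With an open connection, the call would look like CjkInsert.insert(conn, cjkword). This also removes the fallback to the platform default encoding, which could silently store the wrong bytes.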
--Edited per comments