How to support the UTF-8 encoding format?

Hey @JoyJava could you please share the query profile for this query?
Click on Download Profile on the screen you pasted above to get the profile.

This has nothing to do with the query plan; it is caused by Calcite. I tried the following, but without success:

We added the file saffron.properties to the org\apache\calcite\util directory of calcite-core-1.12.0-201711022309440460-bd4149e.jar

The saffron.properties file is as follows:
saffron.default.charset = UTF-8
saffron.default.nationalcharset = UTF-8
saffron.default.collation.name = UTF-8$zh_CN

thx

Glad you were able to work around the issue!
FYI, query profiles include full stack traces. We suspected the issue was in Calcite, so one of our engineers (who happens to be a Calcite committer) asked that we verify using the full trace – hence our ask.

Thanks,
Can

Yes. If Dremio is meant for users in China and around the world, I recommend that it use UTF-8 as its encoding everywhere.

We followed this line of thought, but it did not solve the problem. Do we need to consult the Calcite community?

@JoyJava my bad, I misread your earlier comment. It looks like we don’t have the Calcite fix that allows you to set the encoding using the parameters you specified. We’ve filed an internal ticket for this and will reach out when this is available.

It seems we can only modify the Calcite source

So what’s the result now?

Hi,

You can change Calcite's default encoding via a custom Java option in the dremio-env configuration file.
Here is an example of how to set the encoding to UTF-16:
DREMIO_JAVA_SERVER_EXTRA_OPTS='-Dsaffron.default.charset=UTF-16LE -Dsaffron.default.nationalcharset=UTF-16LE -Dsaffron.default.collation.name=UTF-16LE$en_US'
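
In case it helps anyone checking that the options are actually reaching the JVM, here is a minimal sketch that prints the three properties as the JVM sees them. The class name SaffronCharsetCheck and the describe helper are hypothetical names made up for this post; they are not part of Dremio or Calcite. It only reads standard Java system properties and does not prove that Calcite honors them.

```java
// Hypothetical helper, not part of Dremio or Calcite: it only reports
// which saffron.* system properties the current JVM was started with.
final class SaffronCharsetCheck {

    // The three property names used in the dremio-env example.
    static final String[] KEYS = {
        "saffron.default.charset",
        "saffron.default.nationalcharset",
        "saffron.default.collation.name"
    };

    // Returns "key = value", or "key = (not set)" when the -D option is missing.
    static String describe(String key) {
        String value = System.getProperty(key);
        return key + " = " + (value == null ? "(not set)" : value);
    }

    public static void main(String[] args) {
        for (String key : KEYS) {
            System.out.println(describe(key));
        }
    }
}
```

With Java 11 or later you can run it with the same -D flags you put in dremio-env, for example `java -Dsaffron.default.charset=UTF-16LE SaffronCharsetCheck.java`. Any property reported as "(not set)" was not passed to the JVM.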


@asm That resolves the problem, thank you very much.