How to support UTF16 encoding format?

We found that Calcite does not support the UTF-8 encoding format, so we now want to try UTF-16 instead. How can Dremio be configured to support the UTF-16 encoding format?

@JoyJava Dremio only supports UTF-8 encoding for VARCHARs. We’re introducing a few improvements in our upcoming release that make it easy for users to strip away or substitute non-UTF-8 characters, so that queries work without issues.

We have modified the Calcite source to support UTF-8 encoding.

Thank you for your answer

Wondering if this has been committed/fixed, as I am getting a LOT of UTF-8 related errors with almost anything I do in Dremio 2.x.

You may want to check CONVERT_FROM or IS_UTF8 functions we’ve added to deal with this:
http://docs.dremio.com/sql-reference/sql-functions/type-conversion.html

Hi, could you please explain how you added the UTF-8 support? Is it possible on the Windows version?

Hi, IMHO this is not a solution at all.

I think I’m facing the same UTF-8 problem with the REGEXP_SPLIT function and Portuguese strings. If the problem is with Calcite, why not implement a solution like this:

https://issues.apache.org/jira/browse/DRILL-5772

Hi, any news regarding the UTF query support?
Our BI tool has started failing a lot after we started dealing with “non-western” languages…

Hi,

You can change the Calcite default encoding via a custom Java option in dremio-env.
Here is an example of how to set the encoding to UTF-16:
DREMIO_JAVA_SERVER_EXTRA_OPTS='-Dsaffron.default.charset=UTF-16LE -Dsaffron.default.nationalcharset=UTF-16LE -Dsaffron.default.collation.name=UTF-16LE$en_US'

It solved an issue with non-Latin characters in queries in my case.
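
One thing to watch when copying the setting: the value should be wrapped in plain ASCII single quotes, since forum software tends to turn them into typographic quotes. Single quotes also matter because the collation name ends in $en_US, which the shell would try to expand as a variable inside double quotes. A minimal sketch of the check, assuming the file lives at conf/dremio-env (the path is my assumption, adjust it for your install):

```shell
# Hypothetical excerpt from conf/dremio-env (the path is an assumption).
# Single quotes keep the "$en_US" suffix literal; inside double quotes the
# shell would try to expand it as a variable named en_US.
DREMIO_JAVA_SERVER_EXTRA_OPTS='-Dsaffron.default.charset=UTF-16LE -Dsaffron.default.nationalcharset=UTF-16LE -Dsaffron.default.collation.name=UTF-16LE$en_US'

# Print the value to confirm that the collation name survived intact.
printf '%s\n' "$DREMIO_JAVA_SERVER_EXTRA_OPTS"
```

If the printed value ends in UTF-16LE rather than UTF-16LE$en_US, the quoting was lost somewhere along the way.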

Hi, I tried as you suggested (in dremio-env file) and it worked!

Thank you

edit:
Unfortunately, while UTF queries now work, Dremio now fails to open many datasets and gives an error similar to this:

SYSTEM ERROR: AssertionError: mismatched type $8 VARCHAR(65536)

(com.dremio.exec.work.foreman.ForemanException) Unexpected exception during fragment initialization: mismatched type $8 VARCHAR(65536)
com.dremio.exec.work.foreman.AttemptManager.run():340
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748
Caused By (java.lang.AssertionError) mismatched type $8 VARCHAR(65536)
org.apache.calcite.rex.RexUtil$FixNullabilityShuttle.visitInputRef():2546
org.apache.calcite.rex.RexUtil$FixNullabilityShuttle.visitInputRef():2524
org.apache.calcite.rex.RexInputRef.accept():112
org.apache.calcite.rex.RexShuttle.visitList():153
org.apache.calcite.rex.RexShuttle.visitCall():102
org.apache.calcite.rex.RexShuttle.visitCall():36
org.apache.calcite.rex.RexCall.accept():107
org.apache.calcite.rex.RexShuttle.visitList():153
org.apache.calcite.rex.RexShuttle.visitCall():102
org.apache.calcite.rex.RexShuttle.visitCall():36
org.apache.calcite.rex.RexCall.accept():107
org.apache.calcite.rex.RexShuttle.apply():284
org.apache.calcite.rex.RexShuttle.mutate():243
org.apache.calcite.rex.RexShuttle.apply():261
org.apache.calcite.rex.RexUtil.fixUp():1652
com.dremio.exec.planner.acceleration.normalization.rules.FilterIntoJoinOnlyRule.onMatch():114
org.apache.calcite.plan.AbstractRelOptPlanner.fireRule():317
org.apache.calcite.plan.hep.HepPlanner.applyRule():556
org.apache.calcite.plan.hep.HepPlanner.applyRules():415
org.apache.calcite.plan.hep.HepPlanner.executeInstruction():280
org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute():74
org.apache.calcite.plan.hep.HepPlanner.executeProgram():211
org.apache.calcite.plan.hep.HepPlanner.findBestExp():198
com.dremio.exec.planner.acceleration.normalization.RuleAndShuttleNormalizer.normalize():67
com.dremio.exec.planner.acceleration.normalization.ChainedNormalizer.normalize():41
com.dremio.exec.planner.acceleration.substitution.DremioSubstitutionProvider.normalizeTarget():280
com.dremio.exec.planner.acceleration.substitution.DremioSubstitutionProvider.findSubstitutions():129
com.dremio.exec.planner.acceleration.substitution.AccelerationAwareSubstitutionProvider.findSubstitutions():67
com.dremio.exec.planner.DremioVolcanoPlanner.registerMaterializations():70
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():599
org.apache.calcite.tools.Programs$RuleSetProgram.run():368
com.dremio.exec.planner.sql.handlers.PrelTransformer.transform():451
com.dremio.exec.planner.sql.handlers.PrelTransformer.convertToDrel():220
com.dremio.exec.planner.sql.handlers.PrelTransformer.convertToDrel():278
com.dremio.exec.planner.sql.handlers.query.NormalHandler.getPlan():47
com.dremio.exec.planner.sql.handlers.commands.HandlerToExec.plan():69
com.dremio.exec.work.foreman.AttemptManager.run():292
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748