Connection String Options for Hive

I set up a Hive Source but believe I need to add Connection String Options in order to make it functional. In the source definition window there is a note:

These options will be added to your Hive connection string. Please see the Dremio documentation for a list of commonly used connection string options

I have searched in vain for this list; it is nowhere to be found.

The reason I believe I need to set some of these options is that “No Data” is displayed in the main window once the source is defined. I’ve tried running queries against the data source and they do not error out; instead they just appear to consume cycles, as I get a message with a timer saying the query is running.

New to Dremio and trying to build some knowledge step by step. Any help here would be most appreciated.

Usually the only information that needs to be provided is the host.

But it really depends on what additional properties you set/use to connect to the Hive Metastore (either through hive-site.xml or on the command line) for your other applications.

For instance, when I connect to Hive through DBViz I specify just the host, port, and my credentials. I am guessing there are connection string options that would allow me to include my user ID and password. I just find it odd that I’m not getting an authentication error when trying to connect.

Dremio is a service, so it is not going to connect to the Hive Metastore as the end user logged into Dremio, but rather as the user that is running Dremio, using impersonation.
So, as a user logged in to Dremio, you may not have permission to view certain data (or any at all).

One other thing to point out: Dremio connects to the Hive Metastore service, not the HiveServer2 service (which can accept a username/password as credentials). The Hive Metastore accepts only Kerberos credentials or no credentials; username/password is not an option when connecting to the Hive Metastore. You can check your hive-site.xml for the hostname of the Hive Metastore in the property hive.metastore.uris.
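For reference, that property in hive-site.xml typically looks like the following (the hostname here is a placeholder; 9083 is the default metastore port):

```xml
<!-- hive-site.xml: Thrift URI of the Hive Metastore service.
     The hostname below is a placeholder for your environment. -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore-host.example.com:9083</value>
</property>
```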

yufeldman and vvv, I greatly appreciate your replies, as you were right on target. I changed my connection string to use the Hive Metastore port (9083); once I did this I was able to see the directory tree in Hadoop and navigate to the folder that contained the Hive tables I was interested in. Now I just need to figure out why Dremio can’t read these tables; yufeldman, you alluded to the possibility of not having permission to view certain data going this route. What confuses me is that, since one does not connect to the Hive Metastore with their own credentials, how would access to tables be enabled?

Glad you were able to connect.

When you say that, do you see an error while trying to access the data, or something else?

[screenshot: Dremio error message]

Could you look under “Jobs” (in the example image it is in the top menu)? You will probably see the failed job marked with a red hexagon, and next to it a link to a more detailed page. From there, click on “Profile” (highlighted in red).

You will see something similar to:

Please examine the “Verbose Error Message”.

This is what I see when I examine the Verbose Error Message:

DATA_READ ERROR: Failure while attempting to read metadata for table ad_users

This lends credence to my suspicion that I would somehow need to pass my credentials as a connection string option to get this to work…

Here is the rest of the verbose error message:

Sql Query SELECT *
FROM “Hive (dev)”.adv_analytics.ad_users
(java.lang.IllegalArgumentException) java.net.UnknownHostException: edwbidevwar
org.apache.hadoop.security.SecurityUtil.buildTokenService():417
org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol():130
org.apache.hadoop.hdfs.DFSClient.&lt;init&gt;():343
org.apache.hadoop.hdfs.DFSClient.&lt;init&gt;():287
org.apache.hadoop.hdfs.DistributedFileSystem.initialize():156
org.apache.hadoop.fs.FileSystem.createFileSystem():2811
org.apache.hadoop.fs.FileSystem.access$200():100
org.apache.hadoop.fs.FileSystem$Cache.getInternal():2848
org.apache.hadoop.fs.FileSystem$Cache.get():2830
org.apache.hadoop.fs.FileSystem.get():389
org.apache.hadoop.fs.Path.getFileSystem():356
com.dremio.exec.store.dfs.FileSystemWrapper.get():128
com.dremio.exec.store.hive.DatasetBuilder.addInputPath():634
com.dremio.exec.store.hive.DatasetBuilder.buildSplits():455
com.dremio.exec.store.hive.DatasetBuilder.buildIfNecessary():298
com.dremio.exec.store.hive.DatasetBuilder.getDataset():217
com.dremio.exec.store.SimpleSchema.getTableFromDataset():283
com.dremio.exec.store.SimpleSchema.getTableWithRegistry():252
com.dremio.exec.store.SimpleSchema.getTable():345
org.apache.calcite.jdbc.SimpleCalciteSchema.getImplicitTable():67
org.apache.calcite.jdbc.CalciteSchema.getTable():219
org.apache.calcite.prepare.CalciteCatalogReader.getTableFrom():117
org.apache.calcite.prepare.CalciteCatalogReader.getTable():106
org.apache.calcite.prepare.CalciteCatalogReader.getTable():73
org.apache.calcite.sql.validate.EmptyScope.getTableNamespace():71
org.apache.calcite.sql.validate.DelegatingScope.getTableNamespace():189
org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():104
org.apache.calcite.sql.validate.AbstractNamespace.validate():84
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():910
org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():891
org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():2859
org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():2844
org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3077
org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
org.apache.calcite.sql.validate.AbstractNamespace.validate():84
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():910
org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():891
org.apache.calcite.sql.SqlSelect.validate():208
org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():866
org.apache.calcite.sql.validate.SqlValidatorImpl.validate():577
com.dremio.exec.planner.sql.SqlConverter.validate():187
com.dremio.exec.planner.sql.handlers.PrelTransformer.validateNode():167
com.dremio.exec.planner.sql.handlers.PrelTransformer.validateAndConvert():155
com.dremio.exec.planner.sql.handlers.query.NormalHandler.getPlan():43
com.dremio.exec.planner.sql.handlers.commands.HandlerToExec.plan():66
com.dremio.exec.work.foreman.AttemptManager.run():285
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():748
Caused By (java.net.UnknownHostException) edwbidevwar
org.apache.hadoop.security.SecurityUtil.buildTokenService():417
org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol():130
org.apache.hadoop.hdfs.DFSClient.&lt;init&gt;():343
org.apache.hadoop.hdfs.DFSClient.&lt;init&gt;():287
org.apache.hadoop.hdfs.DistributedFileSystem.initialize():156
org.apache.hadoop.fs.FileSystem.createFileSystem():2811
org.apache.hadoop.fs.FileSystem.access$200():100
org.apache.hadoop.fs.FileSystem$Cache.getInternal():2848
org.apache.hadoop.fs.FileSystem$Cache.get():2830
org.apache.hadoop.fs.FileSystem.get():389
org.apache.hadoop.fs.Path.getFileSystem():356
com.dremio.exec.store.dfs.FileSystemWrapper.get():128
com.dremio.exec.store.hive.DatasetBuilder.addInputPath():634
com.dremio.exec.store.hive.DatasetBuilder.buildSplits():455
com.dremio.exec.store.hive.DatasetBuilder.buildIfNecessary():298
com.dremio.exec.store.hive.DatasetBuilder.getDataset():217
com.dremio.exec.store.SimpleSchema.getTableFromDataset():283
com.dremio.exec.store.SimpleSchema.getTableWithRegistry():252
com.dremio.exec.store.SimpleSchema.getTable():345
org.apache.calcite.jdbc.SimpleCalciteSchema.getImplicitTable():67
org.apache.calcite.jdbc.CalciteSchema.getTable():219
org.apache.calcite.prepare.CalciteCatalogReader.getTableFrom():117
org.apache.calcite.prepare.CalciteCatalogReader.getTable():106
org.apache.calcite.prepare.CalciteCatalogReader.getTable():73
org.apache.calcite.sql.validate.EmptyScope.getTableNamespace():71
org.apache.calcite.sql.validate.DelegatingScope.getTableNamespace():189
org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():104
org.apache.calcite.sql.validate.AbstractNamespace.validate():84
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():910
org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():891
org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():2859
org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():2844
org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3077
org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
org.apache.calcite.sql.validate.AbstractNamespace.validate():84
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():910
org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():891
org.apache.calcite.sql.SqlSelect.validate():208
org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():866
org.apache.calcite.sql.validate.SqlValidatorImpl.validate():577
com.dremio.exec.planner.sql.SqlConverter.validate():187
com.dremio.exec.planner.sql.handlers.PrelTransformer.validateNode():167
com.dremio.exec.planner.sql.handlers.PrelTransformer.validateAndConvert():155
com.dremio.exec.planner.sql.handlers.query.NormalHandler.getPlan():43
com.dremio.exec.planner.sql.handlers.commands.HandlerToExec.plan():66
com.dremio.exec.work.foreman.AttemptManager.run():285
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():748

Is the host resolvable?

Isn’t the fact that I’m seeing the Hive folders in Dremio an indication that the host is resolvable? I am only running into this issue when trying to access the Hive tables.

Showing Hive folders and tables (from the Hive Metastore) is not the same as fetching the actual data from the NameNode/DataNodes. And it looks like the exception is related to NameNode host resolution.

Just an idea:
Could you include the following as an additional property while configuring the Hive source?
name: fs.defaultFS
value: hdfs://&lt;namenode_host&gt;:8020

That doesn’t seem to be the solution; it’s been trying to load for the last 20 minutes.

I did not mean replacing the Hive Metastore entry; I meant adding it under “Add Property”.

I did add it as a new property; the Hive Metastore entry was not changed.

Hey @LewG you need to be able to resolve the NameNode from all your nodes. I’d recommend configuring this either:

  • At the DNS level
  • By adding entries to the /etc/hosts file on all nodes
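For the /etc/hosts approach, the added line would look something like this (the IP address is a placeholder; edwbidevwar is the NameNode hostname from the stack trace above):

```
# /etc/hosts entry on every Dremio node; substitute your NameNode's real IP
10.0.0.15   edwbidevwar
```

After adding it, running `getent hosts edwbidevwar` on each node is a quick way to confirm the name now resolves.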

I think the problem could be with NameNode HA.
I have a similar problem:
Caused by: java.net.UnknownHostException: hdfs1
… 49 common frames omitted

Where hdfs1 is the cluster name.

@maver1ck I had the same problem, and updating the /etc/hosts file like Can mentioned was the solution.

@maver1ck if your HDFS cluster uses HA, you would have to provide the fully qualified hostname of the master NameNode.
Alternatively, it is probably possible to use the cluster name, but you would need to add your HDFS configuration (hdfs-site.xml or similar) under &lt;DREMIO_HOME&gt;/conf in order for Dremio to pick up the HA configuration for the HDFS cluster.
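For illustration, the HA-related entries in hdfs-site.xml for a nameservice named hdfs1 (the cluster name from the error above) generally look like this; the NameNode hostnames below are placeholders:

```xml
<!-- hdfs-site.xml HA fragment; hostnames are placeholders -->
<property>
  <name>dfs.nameservices</name>
  <value>hdfs1</value>
</property>
<property>
  <name>dfs.ha.namenodes.hdfs1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdfs1.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdfs1.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.hdfs1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

With this file available to the client, an HDFS URI like hdfs://hdfs1 can be resolved to whichever NameNode is currently active.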