Could not access tables inside hive source

I have installed latest version(12.0.0) of dremio. When i am trying to add hive, hdfs or mysql inside dremio, I am getting below error:
Something went wrong. Please check the log file for details, see Logs · Dremio.
java.net.UnknownHostException: HAhdfs

I have deployed dremio on CDH6.3.2.

Please let me know if dremio supports deployment on CDH.

@akanksha

it most likely looks like the core-site.xml fs.defaultFS is different, have you copied hdfs-site.xml, core-site.xml, hive-site.xml into the Dremio conf folder?

You also have to do the below for Hive source

http://docs.dremio.com/data-sources/hive.html#hive-configuration

@balaji.ramaswamy
I tried the above solution but still i am getting the same error.
image
Also, i am not getting sharing option inside Dremio in HDFS configuration.

@akanksha

Where id you see the sharing option? Thats an Enterprise only feature

BTW, What is the latest error you are getting? Can you please share server.log and the name of the source you tried?

Thanks
Bali

@balaji.ramaswamy
I have tried hdfs,hive and mysql sources.
server.zip (16.0 KB)

I have deployed dremio latest version on CDH6.3.2.

@akanksha

I see a few issues,

First Dremio starts up

2021-01-18 07:29:10,856 [main] INFO c.dremio.exec.catalog.PluginsManager - Result of storage plugin startup:
hive: success (294ms). Healthy
INFORMATION_SCHEMA: success (0ms). Healthy
__jobResultsStore: success (134ms). Healthy
__logs: success (97ms). Healthy
__support: success (179ms). Healthy
__datasetDownload: success (132ms). Healthy
sys: success (0ms). Healthy
$scratch: success (133ms). Healthy
test1: success (101ms). Healthy
Samples: success (569ms). Healthy
test3: success (101ms). Healthy
__home: success (140ms). Healthy
hdfs: success (712ms). Healthy
mysql: success (101ms). Healthy
__accelerator: success (138ms). Healthy

2021-01-18 07:41:19,837 [main] INFO com.dremio.dac.server.DremioServer - Started on http://localhost:9047

We can see that the source plugins are healthy

Then we see a Hive metastore exception

2021-01-18 07:55:18,500 [source-management14] WARN c.d.exec.store.hive.HiveClientImpl - Failure to run Hive command. Will retry once.
org.apache.hadoop.hive.metastore.api.MetaException: Could not connect to meta store using any of the URIs provided. Most recent failure: com.dremio.hive.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)

Are we able t connect to Hive shell or Beeline from the Dremio coordinator?

Then we see ZK connection issues. Wondering if the coordinator is doing a Full GC?

Can we check gc logs?

2021-01-18 07:57:32,087 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to SUSPENDED
2021-01-18 07:57:37,096 [Curator-Framework-0] ERROR org.apache.curator.ConnectionState - Connection timed out for connection string (10.3.2.13:2181) and timeout (5000) / elapsed (5007)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss

Have we made sure we are entering the Hivemetastore port in the source settings?

https://community.datafabric.hpe.com/s/question/0D50L00006BItKGSA1/not-able-to-connect-hive-in-a-secure-environment

@balaji.ramaswamy
Are we able to connect to Hive shell or Beeline from the Dremio coordinator? YES
Can we check gc logs?server (2).zip (8.3 KB)
Have we made sure we are entering the Hivemetastore port in the source settings? YES

@akanksha

I can see “Connection refused”, if you are able to send your hive-site, core-site and hdfs-site xml files along with the dremio.conf and screenshot of the source settings, we can review

Are you able to send this?

Thanks
Bali

@balaji.ramaswamy
I will not be able to send configuration files.
Can we have a quick call to check the issues?

@akanksha

I am unable to have a call but let’s make sure of the below

  • Open hive-site.xml on /etc/hadoop or /opt/hadoop and see the value for “hive.metastore.uris”
  • Make sure this is what is entered in Dremio source
  • Make sure we have the right port too
  • can u send me output of below command?
    ls -ltrh /conf
    ls -ltrh /plugins/connectors/hive2-ee.d
    ls -ltrh /plugins/connectors/hive3-ee.d

@balaji.ramaswamy
Yes “hive.metastore.uris” value is same in both the configuration.
can u send me output of below command? YES
ls -ltrh /conf
-rw-r–r-- 1 root root 5.6K Jan 18 19:10 yarn-site.xml
-rw-r–r-- 1 root root 315 Jan 18 19:10 ssl-client.xml
-rw-r–r-- 1 root root 5.3K Jan 18 19:10 mapred-site.xml
-rw-r–r-- 1 root root 314 Jan 18 19:10 log4j.properties
-rw-r–r-- 1 root root 2.9K Jan 18 19:10 hdfs-site.xml
-rw-r–r-- 1 root root 617 Jan 18 19:10 hadoop-env.sh
-rw-r–r-- 1 root root 5.6K Jan 18 19:10 core-site.xml
-rw-r–r-- 1 root root 70 Jan 18 19:10 cloudera_metadata
-rw-r–r-- 1 root root 21 Jan 18 19:10 cloudera_generation
-rw-r–r-- 1 root hadoop 662 Jan 18 19:12 topology.map
-rwxr-xr-x 1 root hadoop 1.6K Jan 18 19:12 topology.py

I could not find both the files
ls -ltrh /plugins/connectors/hive2-ee.d
ls -ltrh /plugins/connectors/hive3-ee.d

Instead of that i got these files inside
/opt/dremio/plugins/connectors
hive-exec-log4j2.properties
hive-log4j2.properties
hive.proto

@akanksha

Sorry, my bad, you are CE edition, can you please send me output of

ls -ltrh /plugins/connectors/hive2.d/conf
ls -ltrh /plugins/connectors/hive3.d/conf

if they do not exists, then do below

cd /opt/dremio/plugins/connectors
mkdir hive2.d (if source is Hive 2.x)
mkdir hive3.d (if source is Hive 3.x)

If Hive 2.x source then,

cd hive2.d
ln -s /opt/dremio/conf /opt/dremio/plugins/connectors/hive2.d/conf (if Hive 2.x)

If Hive 3.x source then,

cd hive3.d
ln -s /opt/dremio/conf /opt/dremio/plugins/connectors/hive3.d/conf (if Hive 3.x)

@balaji.ramaswamy
ls -ltrh /plugins/connectors/hive2.d/conf

total 44K
-rw-r–r-- 1 root root 2.7K Dec 22 08:54 dremio-env
-rw-r–r-- 1 root root 7.5K Dec 22 08:54 logback.xml
-rw-r–r-- 1 root root 2.2K Dec 22 08:54 logback-admin.xml
-rw-r–r-- 1 root root 1.8K Dec 22 08:54 logback-access.xml
-rw-r–r-- 1 root root 1005 Dec 28 16:14 dremio.conf
-rw-r–r-- 1 root root 2.9K Jan 18 17:57 hdfs-site.xml
-rw-r–r-- 1 root root 6.4K Jan 18 17:57 hive-site.xml
-rw-r–r-- 1 root root 5.4K Jan 18 18:40 core-site.xml

@akanksha

Are these the same files under conf? Can we create a symlink instead?

@balaji.ramaswamy
Are these the same files under conf? YES

Are you looking for this?
ls -lrth /opt/dremio/plugins/connectors/hive2.d/conf
lrwxrwxrwx 1 root root 16 Jan 27 16:29 conf → /opt/dremio/conf

@akanksha
Try this

Create a new Hive 2.x source
Name is testhive
for metastore host and port, look at the below parameter in hive-site.xml

hive.metastore.uris thrift://localhost:9083

Click save

@balaji.ramaswamy
I tried creating a new Hive 2.x source and when i tried to access tables in hive, i was getting below error:
[image]

User: dremio is not allowed to impersonate admin.

admin is the user id to login dremio dashboard.

@akanksha

Nice, we moved one step forward. you need to add Dremio to the proxy user, host, group list in core-site.xml and refresh the namenode

Note: This is not the client side core-site but server side, you Hadoop admin will know, the entry will be like below

Hadoop using YARN · Dremio

For now to workaround the issue, you can turn off impersonation at the Dremio level. This means whoever logons to Dremio, irrespective of that the query on Hive will run as “dremio”

To do this edit the hive source in Dremio, click on advanced and add the below

name - hive.server2.enable.doAs
value - false

Save the source, now try to run the same query

Thanks
Bali

@balaji.ramaswamy
Thanks for the support.
Its working now. I am able to see data in hive tables.

@akanksha

Glad it is working now