Dremio Kerberized Hive2 Connection Help

Hi all,

Background info:
I have a Helm deployment of Dremio EE 25.1.0 running on a Kubernetes cluster, and I’m trying to connect to a Hive 2.x metastore. I am looking for clarification on exactly what I need to change in my Helm chart to get this working. I already have a connection to HDFS working. For HDFS I added hdfs-site.xml, core-site.xml, and krb5.conf to the config directory of my Helm chart, created a Kubernetes secret with my HDFS principal’s keytab and referenced it in my values.yaml file, and added entries to the extraStartParams of the executor and coordinator sections to specify the keytab file path, Kerberos principal name, etc. All of that is working!

What I have tried for Hive:
I am now moving on to getting Hive connected (which uses a different principal than my HDFS connection). Here is what I have done so far:

  • I tried telnet’ing to the metastore host and port from my cluster nodes, which connects successfully, so there shouldn’t be a network issue.
  • Added my hive-site.xml and core-site.xml (not sure if I needed to include this one) files to the config/hive2 directory in my Helm chart (I confirmed that they are being loaded into the master pod at /opt/dremio/plugins/connectors/hive2-ee.d)
  • Created a new Kubernetes secret for my Hive principal’s keytab
  • Updated the values.yaml file to include that secret
  • In the Dremio UI I am using the metastore host on port 9083
  • I have Enable SASL checked and entered the Hive principal that matches the keytab I created the secret for
  • Updated the logback.xml to include more verbose hive error messages

My Questions:

  1. What else do I need to do?
  2. When I attempt to connect to Hive with the changes outlined above, it seems like it’s not reading from my hive-site.xml. From the error messages below, it’s saying the metastore URI is null (this is defined in the hive-site.xml), and it’s using auth:SIMPLE instead of Kerberos (Kerberos is specified as the auth type in the hive-site.xml). It also says it’s using the HDFS principal and keytab file, but I’m not sure if that’s expected behavior or not.
  3. Do I need to add entries to the extraStartParams in the coordinator and executor sections of the values.yaml to specify my Hive principal, keytab path, etc.?
  4. Do I need to include the yarn.resourcemanager.principal property in the Advanced Options tab when setting up the connection in the UI? If so, how do I provide the keytab for it?
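
For question 2, one way to rule out a bad or incomplete file is to read back the hive-site.xml that actually landed in the pod and check the properties I'm relying on. The snippet below is only a rough sketch: the property names are the standard Hive/Hadoop ones, but the expected values, the sample XML, and the hostname metastore.example.com are assumptions for illustration, not my real config.

```python
import xml.etree.ElementTree as ET

# Properties the questions above depend on. The expected values are
# assumptions for a Kerberized metastore, not values confirmed for this cluster.
EXPECTED = {
    "hive.metastore.uris": None,                # any non-empty value
    "hive.metastore.sasl.enabled": "true",
    "hive.metastore.kerberos.principal": None,  # any non-empty value
    "hadoop.security.authentication": "kerberos",
}


def read_hadoop_conf(xml_text):
    """Return {name: value} from Hadoop-style <configuration> XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name:
            props[name.strip()] = (value or "").strip()
    return props


def find_problems(props):
    """List properties that are missing, empty, or differ from EXPECTED."""
    problems = []
    for name, expected in EXPECTED.items():
        actual = props.get(name)
        if not actual:
            problems.append(f"{name} is missing or empty")
        elif expected is not None and actual.lower() != expected:
            problems.append(f"{name} is {actual!r}, expected {expected!r}")
    return problems


# Hypothetical sample; in practice, read the file copied out of the pod.
SAMPLE = """<configuration>
  <property><name>hive.metastore.uris</name><value>thrift://metastore.example.com:9083</value></property>
  <property><name>hive.metastore.sasl.enabled</name><value>true</value></property>
  <property><name>hadoop.security.authentication</name><value>simple</value></property>
</configuration>"""

if __name__ == "__main__":
    for problem in find_problems(read_hadoop_conf(SAMPLE)):
        print(problem)
```

This only tells me whether the file is well-formed and contains what I think it does. It says nothing about whether Dremio actually loads it.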

Errors:

DEBUG org.apache.hadoop.hive.conf.HiveConf - Found metastore URI of null
DEBUG c.d.exec.store.hive.HiveConfFactory - Linux host detected. Enabling ORC zero-copy feature
DEBUG c.d.exec.store.hive.HiveConfFactory - Setting fs.s3.impl to org.apache.hadoop.fs.s3a.S3AFileSystem
DEBUG c.d.exec.store.hive.HiveConfFactory - Setting fs.s3n.impl to org.apache.hadoop.fs.s3a.S3AFileSystem
INFO  c.d.e.store.hive.Hive3StoragePlugin - Setup Hadoop user info using kerberos principal <REDACTED HDFS PRINCIPAL> and keytab file /opt/dremio/keytabs/hdfs/dremio.keytab successful.
INFO  c.d.e.store.hive.Hive3StoragePlugin - Hive Metastore SASL enabled. Kerberos principal: <REDACTED HIVE PRINCIPAL>
DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction [as: dremio (auth:SIMPLE)][action: com.dremio.exec.store.hive.HiveClientImpl$$Lambda$4454/0x0000000842d0b040@47499bf3]
2024-10-10 12:40:48	
java.lang.Exception: null
INFO  o.a.h.h.m.HiveMetaStoreClient - Trying to connect to metastore with URI thrift://<REDACTED>:9083
INFO  o.a.h.h.m.HiveMetaStoreClient - HMSC::open(): Could not find delegation token. Creating KERBEROS-based thrift connection.
DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction [as: dremio (auth:SIMPLE)][action: org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Client$1@11227a04]
	java.lang.Exception: null
ERROR c.d.h.t.transport.TSaslTransport - SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed
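
To pick these lines out of a much larger server.log, a small filter like the one below could help. It is a sketch only: the patterns are copied from the excerpt above, while the hint text is my own reading of each message, not confirmed Dremio or Hive behavior, and the sample lines are shortened.

```python
import re

# Patterns come from the log excerpt above. The hints are interpretations
# (assumptions), not documented meanings.
CHECKS = [
    (re.compile(r"Found metastore URI of null"),
     "HiveConf saw no metastore URI at that point"),
    (re.compile(r"\(auth:SIMPLE\)"),
     "an action ran as a SIMPLE-auth user"),
    (re.compile(r"GSS initiate failed"),
     "the GSS-API (Kerberos) handshake failed"),
]


def summarize(log_text):
    """Count how many log lines match each known pattern."""
    counts = {}
    for line in log_text.splitlines():
        for pattern, hint in CHECKS:
            if pattern.search(line):
                counts[hint] = counts.get(hint, 0) + 1
    return counts


# Shortened sample lines based on the excerpt above.
SAMPLE_LOG = "\n".join([
    "DEBUG org.apache.hadoop.hive.conf.HiveConf - Found metastore URI of null",
    "DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction [as: dremio (auth:SIMPLE)]",
    "javax.security.sasl.SaslException: GSS initiate failed",
])

if __name__ == "__main__":
    for hint, count in summarize(SAMPLE_LOG).items():
        print(f"{count} x {hint}")
```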

@lewis133

Can you please provide screenshots of your Hive source settings? The general and advanced tabs?

Also, if you are using Enterprise Edition, please open a ticket on Zendesk so you can get enterprise-level support too.

Hey @balaji.ramaswamy, thanks for your response! I’ve attached the screenshots below with sensitive info redacted.

The IP address I’m using is one of two IPs for this HA environment and is for the Hive Metastore (not HiveServer). I’ve used telnet to test network connectivity from my Kubernetes cluster to both of the IPs, and both connections succeed.
In my Helm values.yaml file I have included a keytab file for the hive/@ service principal. I have not included one for the rm/@ YARN ResourceManager principal, as I’m still unsure whether it is even needed.


@lewis133 For the Hive Kerberos Principal, you have added the one from hive-site.xml, right?

Also, can you add the below as part of the Connection properties?

<name>hive.metastore.uris</name>
<value>thrift://metastore1:9083,thrift://metastore2:9083</value>
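
If it helps to double-check the value before pasting it in, here is a rough sketch that splits a hive.metastore.uris string into host/port pairs and can test each one over TCP, much like the telnet check. The function names are made up for this example, and falling back to port 9083 when a URI omits the port is an assumption.

```python
import socket
from urllib.parse import urlparse


def parse_metastore_uris(value):
    """Split a hive.metastore.uris value into (host, port) pairs."""
    endpoints = []
    for uri in value.split(","):
        uri = uri.strip()
        if not uri:
            continue
        parsed = urlparse(uri)
        if parsed.scheme != "thrift" or not parsed.hostname:
            raise ValueError(f"not a thrift:// URI: {uri!r}")
        # Assumption: use 9083 when the URI does not name a port.
        endpoints.append((parsed.hostname, parsed.port or 9083))
    return endpoints


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    uris = "thrift://metastore1:9083,thrift://metastore2:9083"
    for host, port in parse_metastore_uris(uris):
        print(host, port)
```

Running is_reachable(host, port) for each pair from inside the coordinator pod only confirms the port is open. It does not exercise SASL or Kerberos at all.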

Also, it looks like a SASL error. Can you send us the full server.log and server.out, as they should contain the full stack trace? If they do not, the next step is to enable Kerberos DEBUG logging.