Connection String Options for Hive

@laurent
I will try the second solution.
Setting the master address is not a solution, because with HA the master can change.

How do I connect to a Hive instance with Kerberos enabled? Besides filling in the “Kerberos principal”, is it required to set the properties below? (I am going by Presto here; I have no idea which keys should be specified.)

  • hive.metastore.service.principal
  • hive.metastore.client.keytab

I have installed Dremio 1.3 from the tarball on HDP 2.6 and added the config below to dremio.conf,

and filled in the Hive connection properties as:
dremio-hive

When I click the “hive-dev” source, the Dremio server.log shows this error:
2017-12-05 19:24:57,895 [qtp826104455-198] ERROR hive.log - Got exception: org.apache.thrift.transport.TTransportException null
org.apache.thrift.transport.TTransportException: null
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) ~[libthrift-0.9.2.jar:0.9.2]
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) ~[libthrift-0.9.2.jar:0.9.2]
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) ~[libthrift-0.9.2.jar:0.9.2]
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) ~[libthrift-0.9.2.jar:0.9.2]
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) ~[libthrift-0.9.2.jar:0.9.2]
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) ~[libthrift-0.9.2.jar:0.9.2]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_all_databases(ThriftHiveMetastore.java:739) ~[hive-metastore-1.2.1.jar:1.2.1]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_all_databases(ThriftHiveMetastore.java:727) ~[hive-metastore-1.2.1.jar:1.2.1]
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:1031) ~[hive-metastore-1.2.1.jar:1.2.1]
at com.dremio.exec.store.hive.HiveClient$2.run(HiveClient.java:154) [dremio-hive-plugin-1.3.0-201711211846350824-31f8d91.jar:1.3.0-201711211846350824-31f8d91]

Does anyone have any advice on connecting to Hive with Kerberos enabled?

My dremio.conf file is:

master: {
service
name: dev.com,
port: 45678
}

paths: {
local: ${DREMIO_HOME}"/data"
}

services: {
coordinator.enabled: true,
executor.enabled: true,
kerberos.principal: "ubuntu@DEV.COM",
kerberos.keytab.file.path: /etc/security/keytabs/ubuntu.headless.keytab
}

zookeeper: "dev.com:2181"
services: {
coordinator.embedded_master_zk.enabled:false
}

Did you click on “Enable SASL” to enable it? The box shows as empty on your screenshot.

As advised, SASL is now enabled, but I get the error below when clicking “Save”.

The server.log showed:
Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: GSS initiate failed
at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)

Dremio is running under the ubuntu account, which can connect to Hive via beeline.

Is it possible that something is wrong with my dremio.conf?

I don’t think it is related to dremio.conf. Do you have Hive Metastore running, and can you access it via the hive shell (I know you mentioned beeline, but just to make sure) using the Kerberos creds you added when configuring Hive in Dremio? Can you also check whether the Hive Metastore logs show anything?

Logging initialized using configuration in file:/etc/hive/2.6.2.0-205/0/hive-log4j.properties
hive> use dev;
OK
Time taken: 6.592 seconds
hive> select * from mytable;
OK
1 2 NULL

Hive metastore log showed:

Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:199)
at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
… 10 more

I have tried the Kerberos testing tool at https://github.com/Teradata/kylo/tree/master/core/kerberos/kerberos-test-client and it succeeds:

File Count: 12
Generating Kerberos ticket for principal: ubuntu@DEV.COM at key tab location: /etc/security/keytabs/ubuntu.headless.keytab

Sucessfully got a kerberos ticket in the JVM

Which user do you start Dremio with? Is it the same user you have the Kerberos principal for?
If you start Dremio as the “dremio” user but try to use a Kerberos principal for a different user, it is not going to work.
You can also enable Kerberos debug in Dremio to see the Kerberos messages:
in dremio-env uncomment
#DREMIO_JAVA_EXTRA_OPTS=
and set it to:

DREMIO_JAVA_EXTRA_OPTS=-Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true -Djavax.net.debug=all

Restart Dremio and see what additional info you get when trying to set up the Hive source.

Thanks for the advice, @yufeldman. I have updated dremio-env:
DREMIO_JAVA_EXTRA_OPTS="-Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true -Djavax.net.debug=all"

This time I am running Dremio under the hive service account, configured to use the hive Kerberos ticket, with the embedded ZooKeeper:

dremio.conf
services: {
coordinator.embedded_master_zk.enabled: true,
coordinator.embedded_master_zk.port: 12181
}

services.kerberos.principal: "hive/en1-dev1-tbdp.dev.com@DEV.COM"
services.kerberos.keytab.file.path: "/etc/security/keytabs/hive.service.keytab"

But the error is the same, and I don’t see much more information in server.log.
dremio-log.zip (13.4 KB)

Does anyone have any ideas? Thanks a ton.

I don’t think it is using Kerberos debug. Can you check the Dremio execution command to see whether those options are included, and included correctly (without quotes)?

ps -aef | grep Dremio
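
Reading a long ps line by eye is error-prone, so a small shell sketch like the one below could do the check instead. The has_flag helper and the shortened sample command line are illustrative assumptions, not part of Dremio; on a live host the command line would come from ps.

```shell
# has_flag is a hypothetical helper: it succeeds only if the exact option
# appears as a separate word in the given command line.
has_flag() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Shortened sample command line for illustration. On a live host use e.g.:
#   cmdline=$(ps -aef | grep '[D]remioDaemon')
cmdline="java -Xmx4096m -Dsun.security.krb5.debug=true -cp /opt/dremio-community-1.3/conf com.dremio.dac.daemon.DremioDaemon dremio start"

# Report each expected debug option as present or missing.
for flag in -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true; do
  if has_flag "$cmdline" "$flag"; then
    printf '%s: present\n' "$flag"
  else
    printf '%s: MISSING\n' "$flag"
  fi
done
```

If any option is reported as missing, the value set in dremio-env never reached the JVM.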

Also, do you have a user Kerberos principal (no host part)? I would recommend using it both in dremio.conf (which, BTW, is not really needed as long as you include a core-site.xml with those specified) and when creating the Hive source.

Without quotes:

/opt/dremio-community-1.3/conf/dremio-env: line 78: -Dsun.security.spnego.debug=true: command not found
starting dremio, logging to /opt/dremio-community-1.3/log/server.out

dremio@en1-dev1-tbdp:/opt/dremio-community-1.3$ ps -aef | grep Dremio
dremio 75218 75146 70 10:22 pts/1 00:00:13 /opt/jdk1.8.0_144/bin/java -Djava.util.logging.config.class=org.slf4j.bridge.SLF4JBridgeHandler -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/opt/dremio-community-1.3/log/server.gc -Ddremio.log.path=/opt/dremio-community-1.3/log -Xmx4096m -XX:MaxDirectMemorySize=8192m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/dremio-community-1.3/log -cp /opt/dremio-community-1.3/conf:/opt/dremio-community-1.3/jars/:/opt/dremio-community-1.3/jars/ext/:/opt/dremio-community-1.3/jars/3rdparty/* com.dremio.dac.daemon.DremioDaemon dremio start

With quotes:
dremio 77744 77686 14 10:27 pts/1 00:00:11 /opt/jdk1.8.0_144/bin/java -Djava.util.logging.config.class=org.slf4j.bridge.SLF4JBridgeHandler -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/opt/dremio-community-1.3/log/server.gc -Ddremio.log.path=/opt/dremio-community-1.3/log -Xmx4096m -XX:MaxDirectMemorySize=8192m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/dremio-community-1.3/log -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true -Djavax.net.debug=all -cp /opt/dremio-community-1.3/conf:/opt/dremio-community-1.3/jars/:/opt/dremio-community-1.3/jars/ext/:/opt/dremio-community-1.3/jars/3rdparty/* com.dremio.dac.daemon.DremioDaemon dremio start
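
The “command not found” message in the unquoted case is what a POSIX shell does with an unquoted assignment that contains spaces: the first word is taken as a temporary assignment and the next word as a command to run. Assuming dremio-env is sourced by a shell (which the “line 78” message suggests), the behaviour can be reproduced with throwaway files. The file names and the OPTS variable below are made up for this demo and are not Dremio settings.

```shell
# Demo only: mimics how a sourced env file treats quoted vs. unquoted values.
demo_dir=$(mktemp -d)

# Unquoted: the shell reads this as "run the command -Db=true with OPTS=-Da=true".
printf '%s\n' 'OPTS=-Da=true -Db=true' > "$demo_dir/unquoted-env"
# Quoted: the whole string, spaces included, is assigned to OPTS.
printf '%s\n' 'OPTS="-Da=true -Db=true"' > "$demo_dir/quoted-env"

load_opts() {
  # Source the file in a subshell and print the value OPTS ends up with.
  # The "command not found" error from the unquoted file is discarded.
  ( OPTS=""; . "$1" 2>/dev/null || true; printf '%s' "$OPTS" )
}

unquoted_value=$(load_opts "$demo_dir/unquoted-env")
quoted_value=$(load_opts "$demo_dir/quoted-env")

echo "unquoted: [$unquoted_value]"
echo "quoted:   [$quoted_value]"
rm -rf "$demo_dir"
```

Only the quoted form leaves both options in the variable, which matches the two ps listings: the debug flags show up on the java command line only in the quoted case. The quotes belong in dremio-env; they are consumed by the shell and do not appear in the process arguments.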

OK. As far as I can tell from the log you posted, the following is the error:

Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147) ~[na:1.8.0_131]
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122) ~[na:1.8.0_131]
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187) ~[na:1.8.0_131]
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224) ~[na:1.8.0_131]
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212) ~[na:1.8.0_131]
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) ~[na:1.8.0_131]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192) ~[na:1.8.0_131]
… 78 common frames omitted

But there is still no indication that Kerberos debug is on.
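
The “Failed to find any Kerberos tgt” message means the JVM had no ticket-granting ticket available when it tried to authenticate. One way to check, outside Dremio, whether the OS user that runs Dremio can obtain a ticket from the keytab is sketched below with the standard MIT Kerberos client tools (klist, kinit). The keytab path and principal are the ones quoted earlier in this thread; the run wrapper and the DRY_RUN variable are illustrative. This only rules out keytab and KDC problems and does not by itself show what Dremio does.

```shell
# Values taken from this thread; adjust for your own cluster.
KEYTAB=/etc/security/keytabs/ubuntu.headless.keytab
PRINCIPAL=ubuntu@DEV.COM

# Hypothetical wrapper: prints each command by default.
# Set DRY_RUN=0 to actually execute the commands.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

kerberos_check() {
  # List the principals stored in the keytab.
  run klist -kt "$KEYTAB" || return 1
  # Request a ticket-granting ticket using the keytab.
  run kinit -kt "$KEYTAB" "$PRINCIPAL" || return 1
  # Show the ticket cache, which should now contain a krbtgt entry.
  run klist
}

kerberos_check
```

Run it as the same OS user that starts Dremio, since the ticket cache is per user.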

I am wondering whether the Kerberos config in dremio.conf is right, since even with the hive service account I get the same error:

services.kerberos.principal: "hive/en1-dev1-tbdp.dev.com@DEV.COM"
services.kerberos.keytab.file.path: "/etc/security/keytabs/hive.service.keytab"

The Kerberos config in dremio.conf is not related to the connection to Hive. It is for connecting to HDFS, or for when you use HDFS for metadata.

It cannot connect to a Kerberized HDFS either; I get the same error. Does the Community version support connecting to a secured Hadoop cluster with Kerberos?
Community Edition vs Enterprise Edition

The Community version can connect to Hive when Kerberos is not enabled, but once Kerberos is enabled the connection fails.

I’m trying to connect to Hive. Normally I use Beeline to connect to Hive (with a username and password).
Could you please advise what I additionally need to configure or check in order to connect to Hive in Dremio?