Adding multiple Kerberos-enabled HIVE, HDFS sources in Dremio

Hi Team,

Can we add multiple Kerberos-enabled sources (like HDFS, HIVE) in Dremio?

When accessing any Kerberos-enabled source, we need to add the Kerberos principal and keytab file path to the dremio.conf file, and place that cluster's core-site.xml, hdfs-site.xml, and hive-site.xml files inside the /etc/dremio/ directory.

With a single HDFS source it works perfectly.

But when I tried adding two different HDFS sources in Dremio by specifying multiple keytab file paths in dremio.conf, it did not work; Dremio reads only one keytab file.
Also, I don't know how to merge the core-site.xml and hdfs-site.xml files of the two HDFS sources.

Could you please help me achieve this? What changes need to be made in the core-site.xml and dremio.conf files?

Thanks & Regards,
Atul

Hi @achounde

First, I would like to understand the setup for better visibility:

  1. Are you using YARN executors, or Dremio executors?

  2. How did you define the Kerberos principal and keytab in dremio.conf?

  3. What is in /etc/krb5.conf on the coordinator?

Merging the files can cause conflicts in several HDFS properties, such as fs.defaultFS and the namenode information.
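To illustrate the conflict: each cluster's core-site.xml carries its own fs.defaultFS, so a naive merge would leave two contradictory values for the same key (the hostnames below are hypothetical placeholders):

```xml
<!-- core-site.xml from cluster A -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode-a.dev.com:8020</value>
</property>

<!-- core-site.xml from cluster B: same property name, different value,
     so the two files cannot simply be concatenated -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode-b.prod.com:8020</value>
</property>
```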

Hi Venugopal,

Thank you for the quick reply. Please find the information below:

  1. I am using the Dremio executors.

  2. How did you define the Kerberos principal and keytab in dremio.conf?
    I added the property below to the dremio.conf file to access a single HDFS cluster:
    ===================
    services.kerberos: {
      principal: "testuser@DEV.COM",
      keytab.file.path: "/etc/dremio/testuser.keytab"
    }
    ===================
    Also copied the core-site.xml and hdfs-site.xml into the /etc/dremio/ directory.
    For a single cluster it works perfectly, but what should I do for multiple HDFS sources with different keytabs and principals?

  3. What is in /etc/krb5.conf on the coordinator?

  • Yes, I have edited the krb5.conf file on the coordinator node and added the details of the HDFS nodes.

Do you have a sample Dremio configuration used for accessing multiple Kerberos-enabled HDFS clusters?

Hi Team,

Do you have any idea how I can add multiple Kerberos-enabled HIVE and HDFS sources in the same Dremio cluster?

@achounde

You would have to add the properties individually and see if it works.

Thanks
Bali

I have tried adding the distinct properties while adding the source on the Dremio portal.

My source clusters are Kerberos-enabled, and Dremio reads only one *.keytab file.

@balaji.ramaswamy can you please explain a little what values we have to add? I also need to connect to 2 HDFS clusters at the same time. I know about the security part, but I'm not sure whether they are:
services.kerberos.principal value
services.kerberos.keytab.file.path value

or:
dfs.namenode.kerberos.principal value
dfs.namenode.keytab.file value

thank you

@hieuiph You would need a CROSS-REALM trust in your krb5.conf, as you are connecting from one REALM (the one your Dremio coordinator is attached to) to the other HDFS REALM.

@balaji.ramaswamy thank you for your reply. So I don't need to add the properties; the cross-realm trust is sufficient?

@balaji.ramaswamy Can you answer my question? I set up the cross-realm trust but still get an error, so I need to know whether the add-properties step is required or not.

thank you

@hieuiph

I was just conceptually letting you know that you need a cross-REALM trust. It is best to work with your Kerberos admin to first make sure that, from the Dremio coordinator node, you are able to connect to both HDFS clusters using the Dremio keytab (principal). Once you are able to do that outside of Dremio, you should be able to add the sources.

Thanks
Bali
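The verification Bali describes could be sketched as the commands below, run on the coordinator node. The keytab path and principal are taken from the dremio.conf posted earlier in this thread; the namenode hostnames and port are hypothetical placeholders for your two clusters:

```shell
# Obtain a Kerberos ticket with the Dremio keytab (principal from dremio.conf)
kinit -kt /etc/dremio/testuser.keytab testuser@DEV.COM

# Confirm the ticket was granted
klist

# List the root of each cluster explicitly by namenode URI;
# both commands should succeed before adding the sources in Dremio
hdfs dfs -ls hdfs://namenode-a.dev.com:8020/
hdfs dfs -ls hdfs://namenode-b.prod.com:8020/
```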

@balaji.ramaswamy
thank you for your update. When I tried the cross-realm trust following the article 11.5. Setting up Cross-Realm Kerberos Trusts, Red Hat Enterprise Linux 7 | Red Hat Customer Portal, I don't see them use a keytab, only the shared principal. So what do you mean by "you are able to connect to both HDFS using the Dremio keytab (principal)"?

thanks

@hieuiph You should find a file called krb5.conf where the cross-realm trust is defined:

https://docs.oracle.com/cd/E19253-01/816-4557/setup-87/index.html
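A minimal sketch of what such a krb5.conf might look like on the coordinator, assuming the coordinator lives in DEV.COM (as in the dremio.conf above) and the second cluster is in a hypothetical PROD.COM realm; realm names, KDC hosts, and domains are placeholders to substitute with your own:

```
[libdefaults]
    default_realm = DEV.COM

[realms]
    DEV.COM = {
        kdc = kdc.dev.com
        admin_server = kdc.dev.com
    }
    PROD.COM = {
        kdc = kdc.prod.com
        admin_server = kdc.prod.com
    }

[domain_realm]
    .dev.com = DEV.COM
    .prod.com = PROD.COM

[capaths]
    DEV.COM = {
        PROD.COM = .
    }
```

For the trust itself, matching krbtgt/PROD.COM@DEV.COM principals (with identical keys) must also exist in both KDCs, which is the shared-principal step described in the Red Hat and Oracle articles linked above.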

@balaji.ramaswamy thanks, it works now; we can connect to the 2 Hadoop clusters by creating the cross-realm trust.

That is an awesome update @hieuiph