Support for HDFS HA / namespaces

I'm getting the exception below when trying to access the default HDFS.

dremio-env has

DREMIO_CLASSPATH_USER_FIRST=/etc/hadoop/conf/

where /etc/hadoop/conf/ is populated by Cloudera Manager with the correct Hadoop client configuration.

So normally any native Hadoop application would resolve a bare hdfs:// URI to the default HDFS NameNode (via fs.defaultFS).

Am I missing something in the Dremio configuration? I would like to avoid pointing at a particular HDFS NameNode in the
configuration files, since this Hadoop cluster has two NameNodes (in an HA setup).
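For reference, an HA client config of the kind Cloudera Manager generates typically looks something like the sketch below (the property names are the standard HDFS HA ones; `namespace1` and the NameNode IDs are placeholders, not my actual values):

```xml
<!-- core-site.xml: the default filesystem points at the logical nameservice -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namespace1</value>
</property>

<!-- hdfs-site.xml: the nameservice resolves to the two NameNodes -->
<property>
  <name>dfs.nameservices</name>
  <value>namespace1</value>
</property>
<property>
  <name>dfs.ha.namenodes.namespace1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.namespace1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

With fs.defaultFS set like this, a plain `hdfs dfs -ls /` on the same node works fine, which is why I expected hdfs:/// to work in Dremio too.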

Any ideas how to fix this would be super helpful.

Thanks!

2018-05-29 23:54:59,394 [Plugin Startup: __home] ERROR c.dremio.exec.util.ImpersonationUtil - Failed to create FileSystemWrapper for proxy user: Incomplete HDFS URI, no host: hdfs:///
java.io.IOException: Incomplete HDFS URI, no host: hdfs:///
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:150) ~[hadoop-hdfs-client-2.8.0.jar:na]
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2811) ~[hadoop-common-2.8.0.jar:na]
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100) ~[hadoop-common-2.8.0.jar:na]
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2848) ~[hadoop-common-2.8.0.jar:na]
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2830) ~[hadoop-common-2.8.0.jar:na]
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389) ~[hadoop-common-2.8.0.jar:na]
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:181) ~[hadoop-common-2.8.0.jar:na]
at com.dremio.exec.store.dfs.FileSystemWrapper.&lt;init&gt;(FileSystemWrapper.java:96) ~[dremio-sabot-kernel-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.exec.util.ImpersonationUtil$2.run(ImpersonationUtil.java:210) ~[dremio-sabot-kernel-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.exec.util.ImpersonationUtil$2.run(ImpersonationUtil.java:206) ~[dremio-sabot-kernel-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_152]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_152]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807) ~[hadoop-common-2.8.0.jar:na]
at com.dremio.exec.util.ImpersonationUtil.createFileSystem(ImpersonationUtil.java:206) ~[dremio-sabot-kernel-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.exec.store.dfs.FileSystemPlugin.getFs(FileSystemPlugin.java:181) [dremio-sabot-kernel-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.exec.store.dfs.FileSystemPlugin.getFS(FileSystemPlugin.java:174) [dremio-sabot-kernel-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.exec.store.dfs.FileSystemPlugin.start(FileSystemPlugin.java:432) [dremio-sabot-kernel-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.dac.homefiles.HomeFileSystemStoragePlugin.start(HomeFileSystemStoragePlugin.java:80) [dremio-dac-backend-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.exec.catalog.ManagedStoragePlugin$2.run(ManagedStoragePlugin.java:238) [dremio-sabot-kernel-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.concurrent.RenamingRunnable.run(RenamingRunnable.java:36) [dremio-common-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.concurrent.SingletonRunnable.run(SingletonRunnable.java:41) [dremio-common-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.concurrent.SafeRunnable.run(SafeRunnable.java:40) [dremio-common-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]
at com.dremio.concurrent.Runnables$1.run(Runnables.java:45) [dremio-common-2.0.1-201804132205050000-10b1de0.jar:2.0.1-201804132205050000-10b1de0]

The issue was resolved by adding the namespace.
So instead of
hdfs:///…
I specified
hdfs://namespace1/…
The former syntax is valid too, but for some reason it is just not recognized by Dremio…
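The difference is visible at the plain URI level: a bare hdfs:/// URI simply has no host/authority component, which is exactly what the exception complains about. A minimal illustration with java.net.URI (the paths are placeholders):

```java
import java.net.URI;

public class HdfsUriCheck {
    public static void main(String[] args) {
        // hdfs:/// has an empty authority, so the host is null;
        // HDFS has to fall back on fs.defaultFS to fill it in.
        URI bare = URI.create("hdfs:///user/dremio");
        System.out.println("host of hdfs:///user/dremio -> " + bare.getHost());

        // hdfs://namespace1/... carries the nameservice as the "host",
        // so no default-filesystem lookup is needed.
        URI named = URI.create("hdfs://namespace1/user/dremio");
        System.out.println("host of hdfs://namespace1/user/dremio -> " + named.getHost());
    }
}
```

Since Dremio apparently doesn't consult fs.defaultFS when building the FileSystem, only the second form gives it a usable host.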


Yes, Dremio doesn’t rely on the default filesystem set in the Hadoop configuration, so the namespace ID has to be specified in the HDFS URL.