Support for YARN Resource Manager HA?

Dremio asks for a specific YARN ResourceManager hostname.
All our clusters have HA set up for YARN RMs.
Is there a way to add support for YARN RM HA?

Ideally, we would just point Dremio to /etc/yarn/conf, which has all the client configs
generated by Cloudera Manager that describe all available YARN RMs.

Same applies to HDFS NameNode - see screenshot below.

It would be hard for us to productionize Dremio, as every time a YARN RM or HDFS NN
fails over or switches over, we would need to update Dremio configs manually in several places.

Thanks.

Hi @Tagar

We are still working on handling HA for the RM. For the NN, you should be able to use the “HA-enabled logical URI” from hdfs-site.xml:

fs.defaultFS - the default path prefix used by the Hadoop FS client when none is given

Optionally, you may now configure the default path for Hadoop clients to use the new HA-enabled logical URI. If you used “mycluster” as the nameservice ID earlier, this will be the value of the authority portion of all of your HDFS paths.

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html
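
To check whether a cluster's client configs already use an HA logical URI, something like the following could be run against them. This is only a sketch: it assumes the usual Hadoop `<configuration><property>` layout, the function names `load_properties` and `ha_nameservice` are made up for illustration, and the sample XML merges values that normally live in two files (fs.defaultFS in core-site.xml, dfs.nameservices in hdfs-site.xml).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def load_properties(xml_text):
    """Return a dict of name -> value from the text of a Hadoop *-site.xml file."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def ha_nameservice(props):
    """Return the nameservice ID if fs.defaultFS is an HA logical URI, else None."""
    uri = urlparse(props.get("fs.defaultFS", ""))
    nameservices = [
        ns.strip()
        for ns in props.get("dfs.nameservices", "").split(",")
        if ns.strip()
    ]
    # A logical URI has a nameservice ID as its authority, with no port.
    if uri.scheme == "hdfs" and uri.netloc in nameservices:
        return uri.netloc
    return None


# Hypothetical sample; in practice, read and merge core-site.xml and hdfs-site.xml.
SAMPLE = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property>
  <property><name>dfs.nameservices</name><value>mycluster</value></property>
</configuration>"""

if __name__ == "__main__":
    print(ha_nameservice(load_properties(SAMPLE)))
```

If the function returns a nameservice ID, `hdfs://<that ID>` is the logical URI that keeps working across NameNode failovers; if it returns None, the configs point at a single NameNode host.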

Btw, working with Anthony from Dremio, we found that adding yarn.resourcemanager.cluster-id from yarn-site.xml instead of the YARN RM hostname makes Dremio aware of the YARN RM HA setup.
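
For anyone looking up that value, here is a rough sketch of pulling the cluster ID and RM hosts out of yarn-site.xml. The property names are the standard YARN RM HA settings; the helper names (`read_site_xml`, `yarn_ha_summary`) and the cluster ID, RM IDs, and hostnames in the sample are invented for illustration. Exactly where the value goes in Dremio is not covered here.

```python
import xml.etree.ElementTree as ET


def read_site_xml(xml_text):
    """Return a dict of name -> value from the text of yarn-site.xml."""
    root = ET.fromstring(xml_text)
    return {
        (p.findtext("name") or "").strip(): (p.findtext("value") or "").strip()
        for p in root.iter("property")
    }


def yarn_ha_summary(props):
    """Return the cluster ID and RM hosts if RM HA is enabled, else None."""
    if props.get("yarn.resourcemanager.ha.enabled", "false").lower() != "true":
        return None
    rm_ids = [
        rm.strip()
        for rm in props.get("yarn.resourcemanager.ha.rm-ids", "").split(",")
        if rm.strip()
    ]
    return {
        "cluster_id": props.get("yarn.resourcemanager.cluster-id"),
        # Each RM ID has its own yarn.resourcemanager.hostname.<rm-id> entry.
        "rm_hosts": {
            rm: props.get(f"yarn.resourcemanager.hostname.{rm}") for rm in rm_ids
        },
    }


# Hypothetical sample values, not taken from a real cluster.
YARN_SAMPLE = """<configuration>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.cluster-id</name><value>yarn-cluster</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>rm1.example.com</value></property>
  <property><name>yarn.resourcemanager.hostname.rm2</name><value>rm2.example.com</value></property>
</configuration>"""

if __name__ == "__main__":
    print(yarn_ha_summary(read_site_xml(YARN_SAMPLE)))
```

The `cluster_id` field is the yarn.resourcemanager.cluster-id value mentioned above; the host list is only there to confirm which RMs the ID covers.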

cc @anthony

Thanks for the update, @Tagar. We will document this.