I was trying to set up Dremio on AWS EMR and couldn't get the master node to submit the YARN job. It failed immediately with this error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at org.apache.twill.launcher.TwillLauncher.main(TwillLauncher.java:70)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 6 more
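After a lot of debugging, I discovered that the yarn-site.xml file I was providing to Dremio contained newlines inside the yarn.application.classpath value. Those newlines caused the failure in Apache Twill (the YARN helper library whose TwillLauncher appears in the stack trace), because Twill expects the whole classpath value to be on a single line. With the value broken across lines, the Hadoop jars did not end up on the launcher's classpath, hence the NoClassDefFoundError for org.apache.hadoop.conf.Configuration. Putting the entire value on one line fixed it. Hope this helps someone out there.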
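In case it saves someone the manual editing, here is a rough sketch of how the value could be flattened with a script. It is not anything Dremio or Twill ships; the function name `flatten_classpath` and the sample entries are made up for illustration, and it assumes the classpath entries are comma-separated, as they normally are in yarn-site.xml.

```python
import xml.etree.ElementTree as ET


def flatten_classpath(xml_text, prop_name="yarn.application.classpath"):
    """Return xml_text with the named property's value collapsed onto one line.

    Hypothetical helper: assumes a Hadoop-style <configuration> document
    whose classpath entries are separated by commas.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() != prop_name:
            continue
        value = prop.find("value")
        if value is not None and value.text:
            # Strip the newlines and indentation around each entry,
            # drop empty entries, and rejoin everything on one line.
            entries = [entry.strip() for entry in value.text.split(",")]
            value.text = ",".join(entry for entry in entries if entry)
    return ET.tostring(root, encoding="unicode")


# Illustrative input only; a real yarn-site.xml will have different entries.
sample = """<configuration>
  <property>
    <name>yarn.application.classpath</name>
    <value>
      $HADOOP_CONF_DIR,
      $HADOOP_COMMON_HOME/*,
      $HADOOP_COMMON_HOME/lib/*
    </value>
  </property>
</configuration>"""

print(flatten_classpath(sample))
```

One caveat: ElementTree drops XML comments when it parses the file, so if the comments in your yarn-site.xml matter, it may be simpler to just join the lines by hand.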