Error starting workers on HDP 2.6 with LzoCodec on HDFS

Hi,

We are experiencing the following exception when provisioning Dremio executors on YARN:

java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.

at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:139) ~[dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:179) ~[dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at com.dremio.exec.store.dfs.FileSystemWrapper.<init>(FileSystemWrapper.java:105) ~[dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at com.dremio.exec.store.dfs.FileSystemWrapper.<init>(FileSystemWrapper.java:100) ~[dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at com.dremio.exec.store.dfs.FileSystemWrapper.get(FileSystemWrapper.java:125) ~[dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at com.dremio.dac.daemon.SystemStoragePluginInitializer.initialize(SystemStoragePluginInitializer.java:61) ~[dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at com.dremio.dac.daemon.SystemStoragePluginInitializer.initialize(SystemStoragePluginInitializer.java:45) ~[dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at com.dremio.service.InitializerRegistry.start(InitializerRegistry.java:58) ~[dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at com.dremio.service.SingletonRegistry$AbstractServiceReference.start(SingletonRegistry.java:137) [dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at com.dremio.dac.daemon.NonMasterSingletonRegistry.start(NonMasterSingletonRegistry.java:54) [dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at com.dremio.dac.daemon.DACDaemon.startServices(DACDaemon.java:174) [dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at com.dremio.dac.daemon.DACDaemon.init(DACDaemon.java:180) [dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at com.dremio.dac.daemon.DremioDaemon.main(DremioDaemon.java:164) [dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_112]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_112]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_112]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_112]
at org.apache.twill.ext.BundledJarRunner.run(BundledJarRunner.java:119) [dremio-twill-shaded-1.4.9-201802191836310213-7195059.jar:1.4.9-201802191836310213-7195059]
at org.apache.twill.ext.BundledJarRunnable.run(BundledJarRunnable.java:57) [dremio-twill-shaded-1.4.9-201802191836310213-7195059.jar:1.4.9-201802191836310213-7195059]
at org.apache.twill.internal.container.TwillContainerService.doRun(TwillContainerService.java:222) [dremio-twill-shaded-1.4.9-201802191836310213-7195059.jar:1.4.9-201802191836310213-7195059]
at org.apache.twill.internal.AbstractTwillService.run(AbstractTwillService.java:189) [dremio-twill-shaded-1.4.9-201802191836310213-7195059.jar:1.4.9-201802191836310213-7195059]
at twill.com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) [dremio-twill-shaded-1.4.9-201802191836310213-7195059.jar:1.4.9-201802191836310213-7195059]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_112]
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2122) ~[dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132) ~[dremio-daemon-bundle.jar:1.4.9-201802191836310213-7195059]
… 22 common frames omitted

Does anyone know how we can configure Dremio’s workers to use these libs (LZO)?
Thanks
Sergio

Our specs:
HDP 2.6.0
Dremio deployment on YARN (2 coordinators in HA, 5 executors)
CentOS 6

I assume you copied the Hadoop XML files (core-site, hdfs-site, yarn-site) to the Dremio conf dir ($DREMIO_HOME/conf)? If not, please do that first.
Next, there are 2 options to try (don’t forget to restart the service afterwards):

  1. Find which properties use the LzoCodec. According to this online article, there is one in core-site and one in hdfs-site. Delete all mentions of LzoCodec from the XML files in the Dremio conf dir.
  2. If you know the exact location of the jar that contains the LzoCodec class, you can try copying it into $DREMIO_HOME/jars/3rdparty.
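
For option 1, a small script can list the properties that reference the codec instead of searching the files by hand. This is only a sketch: the sample XML below is made up for illustration (the host name is hypothetical), and `io.compression.codecs` is the standard Hadoop property that CompressionCodecFactory reads, which is where the exception in the stack trace originates.

```python
import xml.etree.ElementTree as ET


def find_codec_properties(xml_text, needle="LzoCodec"):
    """Return (name, value) pairs of <property> entries whose value mentions `needle`."""
    root = ET.fromstring(xml_text)
    hits = []
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if needle in value:
            hits.append((name, value))
    return hits


# Hypothetical core-site.xml content, for illustration only.
sample = """<configuration>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,com.hadoop.compression.lzo.LzoCodec</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://example-namenode:8020</value>
  </property>
</configuration>"""

for name, value in find_codec_properties(sample):
    print(name, "->", value)
```

To run it against a real installation, read each XML file in the Dremio conf dir and pass its text to `find_codec_properties`.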

Hi,

thank you for your answers. We resolved it by including the jar’s classes in the bundled JAR, following steps analogous to those described in

https://docs.dremio.com/deployment/yarn-hadoop/distribution-dependencies.html?h=wandisco
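
For anyone repeating this, one way to check that the class actually ended up in a JAR after rebuilding is to look for its entry in the archive. The sketch below is not part of the Dremio procedure; the function name and the throwaway demo JAR are made up for illustration, and the only assumption about the real setup is the class name from the exception.

```python
import os
import tempfile
import zipfile


def jar_contains_class(jar_path, class_name="com.hadoop.compression.lzo.LzoCodec"):
    """Return True if the JAR (a zip archive) has a .class entry for `class_name`."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Demo on a throwaway JAR so the sketch runs without a Dremio install.
with tempfile.TemporaryDirectory() as tmp:
    demo_jar = os.path.join(tmp, "demo-bundle.jar")  # hypothetical file name
    with zipfile.ZipFile(demo_jar, "w") as jar:
        jar.writestr("com/hadoop/compression/lzo/LzoCodec.class", b"")
    print(jar_contains_class(demo_jar))
```

On a real node, pass the path of the rebuilt bundle (dremio-daemon-bundle.jar in the stack trace above) instead of the demo file.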

I think this information may be useful for other installations/people, so I suggest expanding the documentation with a more general procedure.

Regards
Sergio