Error Starting Dremio

Hi Community:

I am trying to set up a small PoC Hadoop cluster for Dremio (Dremio on YARN), and I have come across an issue I haven’t been able to resolve.

Here’s what I am getting:
Catastrophic failure occurred. Exiting. Information follows: Failed to start services, daemon exiting.
com.dremio.common.exceptions.UserException: Tried to access non-existent source [__jobResultsStore].
at com.dremio.common.exceptions.UserException$Builder.build(UserException.java:746)
at com.dremio.exec.catalog.CatalogServiceImpl.synchronize(CatalogServiceImpl.java:407)
at com.dremio.exec.catalog.CatalogServiceImpl.getPlugin(CatalogServiceImpl.java:813)
at com.dremio.exec.catalog.CatalogServiceImpl.getSource(CatalogServiceImpl.java:841)
at com.dremio.dac.daemon.DACDaemonModule$4.get(DACDaemonModule.java:381)
at com.dremio.dac.daemon.DACDaemonModule$4.get(DACDaemonModule.java:376)
at com.dremio.service.jobs.LocalJobsService.start(LocalJobsService.java:258)
at com.dremio.service.SingletonRegistry$AbstractServiceReference.start(SingletonRegistry.java:137)
at com.dremio.service.ServiceRegistry.start(ServiceRegistry.java:74)
at com.dremio.service.SingletonRegistry.start(SingletonRegistry.java:33)
at com.dremio.dac.daemon.DACDaemon.startServices(DACDaemon.java:175)
at com.dremio.dac.daemon.DACDaemon.init(DACDaemon.java:181)
at com.dremio.dac.daemon.DremioDaemon.main(DremioDaemon.java:131)

Please take the following into consideration:

  1. This is a fresh install; however, I have already restarted and deleted/cleaned the installation multiple times due to other errors that have since been fixed.
  2. I have already cleaned Dremio’s local folders (e.g. rm -r /var/lib/dremio/db/); this helps with a different error.
  3. The Dremio user has full rights on its HDFS cache folder.
  4. The attached logs were generated with the “debug” option enabled in logback.xml.
  5. When I run “sudo service dremio start” it initially works, and the output of “sudo service dremio status” is “dremio is running.” However, after a few seconds it breaks with the catastrophic failure shown above.
  6. As per the official Dremio YARN configuration guide, all changes have been made only on the Executor node; no changes at all have been made on the worker nodes (except for the changes to core-site.xml, which I assume Ambari propagates across all nodes in the cluster). This relates to “Grant Dremio service user the privilege to impersonate the end user” (see the sketch after this list).
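
For reference, the impersonation grant mentioned in point 6 is done through hadoop.proxyuser entries in core-site.xml. The snippet below is only a sketch, assuming the service user is named dremio and using wildcards for illustration; the hosts/groups values should be tightened to match the environment.

    <!-- core-site.xml: allow the "dremio" service user to impersonate end users (sketch only) -->
    <property>
      <name>hadoop.proxyuser.dremio.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.dremio.groups</name>
      <value>*</value>
    </property>

After changing this through Ambari, the affected HDFS/YARN services have to be restarted so the new proxyuser settings take effect.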

Here are the details of my setup:

Dremio Version: latest Dremio Community Edition (RPM dremio-community-2.0.5-201806021755080191_767cfb5_1.noarch.rpm)
OS: CentOS Linux release 7.5.1804 (Core)
CPU: Intel® Xeon® CPU E5-2673 v4 @ 2.30GHz (checked and has AVX2 support)
Cluster Details: Azure / 1 master / 3 workers (to make things simpler the master node is used as namenode and Dremio Coordinator)
Hadoop distribution: HDP 2.6.5.0-292

For your reference, please find attached the configuration files and server logs: DremioSetup.zip (264.4 KB)

To begin with, thank you for being very thorough in your post and including almost everything. I see an issue with your dremio.conf under paths.dist. You currently have it as hdfs:///user/dremio; can you change it to the correct syntax, e.g. hdfs://host:8020/user/dremio?

FYI, according to your Hadoop files it should be hdfs://caprinalytics-m0.wghxeskrlamufe31qjspargjhh.px.internal.cloudapp.net:8020/user/dremio
However, I have seen issues with Azure, so please note: (1) you must make sure the port is open in the firewall, and (2) if it doesn’t work, you may have to replace the host with the IP address.
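
For example, the corrected block in dremio.conf would look roughly like this (a sketch only; substitute your actual NameNode host and port if they differ from the ones in your attached files):

    # dremio.conf on the coordinator: distributed storage pointing at HDFS (sketch only)
    paths: {
      dist: "hdfs://caprinalytics-m0.wghxeskrlamufe31qjspargjhh.px.internal.cloudapp.net:8020/user/dremio"
    }

A quick way to confirm the NameNode port is reachable from the Dremio node is to run hdfs dfs -ls hdfs://<host>:8020/user/dremio before restarting Dremio.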

Anthony,

You were absolutely spot on! Issue fixed: I was missing the port number and using an incorrect syntax. Thanks, mate!