ZooKeeper is not electing the slave node as the master node.
My Dremio cluster consists of:
1 ZooKeeper
1 master coordinator
2 coordinators
1 executor
If you have only one znode, then there is no master or slave, right? What is the exact issue? Is Dremio not coming up? Are you able to share your server.log?
2021-03-17 10:57:25,572 [scheduler-3] WARN c.d.s.s.LocalSchedulerService - Execution of task com.dremio.service.reflection.ReflectionServiceImpl$CacheRefresher@54802bf4 failed
com.dremio.datastore.DatastoreException: Failed to search on store id: materialization_store
at com.dremio.datastore.RemoteIndexedStore.find(RemoteIndexedStore.java:43)
at com.dremio.datastore.TracingKVStore$TracingIndexedStore.lambda$find$0(TracingKVStore.java:141)
at com.dremio.common.tracing.TracingUtils.lambda$trace$0(TracingUtils.java:116)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:99)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:115)
at com.dremio.datastore.TracingKVStore.trace(TracingKVStore.java:60)
at com.dremio.datastore.TracingKVStore$TracingIndexedStore.find(TracingKVStore.java:141)
at com.dremio.datastore.adapter.LegacyIndexedStoreAdapter.find(LegacyIndexedStoreAdapter.java:50)
at com.dremio.service.reflection.store.MaterializationStore.getAllDoneWhen(MaterializationStore.java:360)
at com.dremio.service.reflection.ReflectionServiceImpl.getValidMaterializations(ReflectionServiceImpl.java:449)
at com.dremio.service.reflection.ReflectionServiceImpl.access$1900(ReflectionServiceImpl.java:134)
at com.dremio.service.reflection.ReflectionServiceImpl$CacheHelperImpl.getValidMaterializations(ReflectionServiceImpl.java:1032)
at com.dremio.service.reflection.MaterializationCache.updateCache(MaterializationCache.java:129)
at com.dremio.service.reflection.MaterializationCache.compareAndSetCache(MaterializationCache.java:106)
at com.dremio.service.reflection.MaterializationCache.refresh(MaterializationCache.java:99)
at com.dremio.service.reflection.ReflectionServiceImpl.refreshCache(ReflectionServiceImpl.java:425)
at com.dremio.service.reflection.ReflectionServiceImpl$CacheRefresher.run(ReflectionServiceImpl.java:1134)
at com.dremio.service.scheduler.LocalSchedulerService$CancellableTask.run(LocalSchedulerService.java:187)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.dremio.exec.rpc.RpcException: master node is down
at com.dremio.datastore.DatastoreRpcService.newEndpoint(DatastoreRpcService.java:152)
at com.dremio.datastore.DatastoreRpcService.getSearchEndpoint(DatastoreRpcService.java:174)
at com.dremio.datastore.DatastoreRpcClient.find(DatastoreRpcClient.java:271)
at com.dremio.datastore.RemoteIndexedStore.find(RemoteIndexedStore.java:41)
… 24 common frames omitted
2021-03-17 10:57:55,573 [scheduler-16] WARN c.d.s.s.LocalSchedulerService - Execution of task com.dremio.service.reflection.ReflectionServiceImpl$CacheRefresher@54802bf4 failed
com.dremio.datastore.DatastoreException: Failed to search on store id: materialization_store
at com.dremio.datastore.RemoteIndexedStore.find(RemoteIndexedStore.java:43)
at com.dremio.datastore.TracingKVStore$TracingIndexedStore.lambda$find$0(TracingKVStore.java:141)
at com.dremio.common.tracing.TracingUtils.lambda$trace$0(TracingUtils.java:116)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:99)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:115)
at com.dremio.datastore.TracingKVStore.trace(TracingKVStore.java:60)
at com.dremio.datastore.TracingKVStore$TracingIndexedStore.find(TracingKVStore.java:141)
at com.dremio.datastore.adapter.LegacyIndexedStoreAdapter.find(LegacyIndexedStoreAdapter.java:50)
at com.dremio.service.reflection.store.MaterializationStore.getAllDoneWhen(MaterializationStore.java:360)
at com.dremio.service.reflection.ReflectionServiceImpl.getValidMaterializations(ReflectionServiceImpl.java:449)
at com.dremio.service.reflection.ReflectionServiceImpl.access$1900(ReflectionServiceImpl.java:134)
at com.dremio.service.reflection.ReflectionServiceImpl$CacheHelperImpl.getValidMaterializations(ReflectionServiceImpl.java:1032)
at com.dremio.service.reflection.MaterializationCache.updateCache(MaterializationCache.java:129)
at com.dremio.service.reflection.MaterializationCache.compareAndSetCache(MaterializationCache.java:106)
at com.dremio.service.reflection.MaterializationCache.refresh(MaterializationCache.java:99)
at com.dremio.service.reflection.ReflectionServiceImpl.refreshCache(ReflectionServiceImpl.java:425)
at com.dremio.service.reflection.ReflectionServiceImpl$CacheRefresher.run(ReflectionServiceImpl.java:1134)
at com.dremio.service.scheduler.LocalSchedulerService$CancellableTask.run(LocalSchedulerService.java:187)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.dremio.exec.rpc.RpcException: master node is down
at com.dremio.datastore.DatastoreRpcService.newEndpoint(DatastoreRpcService.java:152)
at com.dremio.datastore.DatastoreRpcService.getSearchEndpoint(DatastoreRpcService.java:174)
at com.dremio.datastore.DatastoreRpcClient.find(DatastoreRpcClient.java:271)
at com.dremio.datastore.RemoteIndexedStore.find(RemoteIndexedStore.java:41)
… 24 common frames omitted
2021-03-17 10:58:15,603 [scheduler-9] WARN c.d.s.s.LocalSchedulerService - Execution of task com.dremio.exec.server.options.SystemOptionManager$FetchSystemOptionTask@7bf149d7 failed
com.dremio.datastore.DatastoreException: Failed to get from store id: project_options
at com.dremio.datastore.RemoteKVStore.get(RemoteKVStore.java:118)
at com.dremio.datastore.TracingKVStore.lambda$get$0(TracingKVStore.java:71)
at com.dremio.common.tracing.TracingUtils.lambda$trace$0(TracingUtils.java:116)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:99)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:115)
at com.dremio.datastore.TracingKVStore.trace(TracingKVStore.java:60)
at com.dremio.datastore.TracingKVStore.get(TracingKVStore.java:71)
at com.dremio.datastore.adapter.LegacyKVStoreAdapter.get(LegacyKVStoreAdapter.java:54)
at com.dremio.exec.server.options.SystemOptionManager.getOptionProtoListFromStore(SystemOptionManager.java:270)
at com.dremio.exec.server.options.SystemOptionManager.populateCache(SystemOptionManager.java:260)
at com.dremio.exec.server.options.SystemOptionManager$FetchSystemOptionTask.run(SystemOptionManager.java:475)
at com.dremio.service.scheduler.LocalSchedulerService$CancellableTask.run(LocalSchedulerService.java:187)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.dremio.exec.rpc.RpcException: master node is down
at com.dremio.datastore.DatastoreRpcService.newEndpoint(DatastoreRpcService.java:152)
at com.dremio.datastore.DatastoreRpcService.getGetEndpoint(DatastoreRpcService.java:158)
at com.dremio.datastore.DatastoreRpcClient.get(DatastoreRpcClient.java:94)
at com.dremio.datastore.RemoteKVStore.get(RemoteKVStore.java:115)
… 18 common frames omitted
2021-03-17 10:58:25,574 [scheduler-17] WARN c.d.s.s.LocalSchedulerService - Execution of task com.dremio.service.reflection.ReflectionServiceImpl$CacheRefresher@54802bf4 failed
com.dremio.datastore.DatastoreException: Failed to search on store id: materialization_store
at com.dremio.datastore.RemoteIndexedStore.find(RemoteIndexedStore.java:43)
at com.dremio.datastore.TracingKVStore$TracingIndexedStore.lambda$find$0(TracingKVStore.java:141)
at com.dremio.common.tracing.TracingUtils.lambda$trace$0(TracingUtils.java:116)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:99)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:115)
at com.dremio.datastore.TracingKVStore.trace(TracingKVStore.java:60)
at com.dremio.datastore.TracingKVStore$TracingIndexedStore.find(TracingKVStore.java:141)
at com.dremio.datastore.adapter.LegacyIndexedStoreAdapter.find(LegacyIndexedStoreAdapter.java:50)
at com.dremio.service.reflection.store.MaterializationStore.getAllDoneWhen(MaterializationStore.java:360)
at com.dremio.service.reflection.ReflectionServiceImpl.getValidMaterializations(ReflectionServiceImpl.java:449)
at com.dremio.service.reflection.ReflectionServiceImpl.access$1900(ReflectionServiceImpl.java:134)
at com.dremio.service.reflection.ReflectionServiceImpl$CacheHelperImpl.getValidMaterializations(ReflectionServiceImpl.java:1032)
at com.dremio.service.reflection.MaterializationCache.updateCache(MaterializationCache.java:129)
at com.dremio.service.reflection.MaterializationCache.compareAndSetCache(MaterializationCache.java:106)
at com.dremio.service.reflection.MaterializationCache.refresh(MaterializationCache.java:99)
at com.dremio.service.reflection.ReflectionServiceImpl.refreshCache(ReflectionServiceImpl.java:425)
at com.dremio.service.reflection.ReflectionServiceImpl$CacheRefresher.run(ReflectionServiceImpl.java:1134)
at com.dremio.service.scheduler.LocalSchedulerService$CancellableTask.run(LocalSchedulerService.java:187)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.dremio.exec.rpc.RpcException: master node is down
at com.dremio.datastore.DatastoreRpcService.newEndpoint(DatastoreRpcService.java:152)
at com.dremio.datastore.DatastoreRpcService.getSearchEndpoint(DatastoreRpcService.java:174)
at com.dremio.datastore.DatastoreRpcClient.find(DatastoreRpcClient.java:271)
at com.dremio.datastore.RemoteIndexedStore.find(RemoteIndexedStore.java:41)
… 24 common frames omitted
2021-03-17 10:58:28,863 [catalog-source-synchronization] WARN c.dremio.exec.catalog.PluginsManager - Failure while synchronizing sources.
2021-03-17 10:58:55,575 [scheduler-2] WARN c.d.s.s.LocalSchedulerService - Execution of task com.dremio.service.reflection.ReflectionServiceImpl$CacheRefresher@54802bf4 failed
com.dremio.datastore.DatastoreException: Failed to search on store id: materialization_store
at com.dremio.datastore.RemoteIndexedStore.find(RemoteIndexedStore.java:43)
at com.dremio.datastore.TracingKVStore$TracingIndexedStore.lambda$find$0(TracingKVStore.java:141)
at com.dremio.common.tracing.TracingUtils.lambda$trace$0(TracingUtils.java:116)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:99)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:115)
at com.dremio.datastore.TracingKVStore.trace(TracingKVStore.java:60)
at com.dremio.datastore.TracingKVStore$TracingIndexedStore.find(TracingKVStore.java:141)
at com.dremio.datastore.adapter.LegacyIndexedStoreAdapter.find(LegacyIndexedStoreAdapter.java:50)
at com.dremio.service.reflection.store.MaterializationStore.getAllDoneWhen(MaterializationStore.java:360)
at com.dremio.service.reflection.ReflectionServiceImpl.getValidMaterializations(ReflectionServiceImpl.java:449)
at com.dremio.service.reflection.ReflectionServiceImpl.access$1900(ReflectionServiceImpl.java:134)
at com.dremio.service.reflection.ReflectionServiceImpl$CacheHelperImpl.getValidMaterializations(ReflectionServiceImpl.java:1032)
at com.dremio.service.reflection.MaterializationCache.updateCache(MaterializationCache.java:129)
at com.dremio.service.reflection.MaterializationCache.compareAndSetCache(MaterializationCache.java:106)
at com.dremio.service.reflection.MaterializationCache.refresh(MaterializationCache.java:99)
at com.dremio.service.reflection.ReflectionServiceImpl.refreshCache(ReflectionServiceImpl.java:425)
at com.dremio.service.reflection.ReflectionServiceImpl$CacheRefresher.run(ReflectionServiceImpl.java:1134)
at com.dremio.service.scheduler.LocalSchedulerService$CancellableTask.run(LocalSchedulerService.java:187)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.dremio.exec.rpc.RpcException: master node is down
at com.dremio.datastore.DatastoreRpcService.newEndpoint(DatastoreRpcService.java:152)
at com.dremio.datastore.DatastoreRpcService.getSearchEndpoint(DatastoreRpcService.java:174)
at com.dremio.datastore.DatastoreRpcClient.find(DatastoreRpcClient.java:271)
at com.dremio.datastore.RemoteIndexedStore.find(RemoteIndexedStore.java:41)
… 24 common frames omitted
2021-03-17 10:59:15,603 [scheduler-18] WARN c.d.s.s.LocalSchedulerService - Execution of task com.dremio.exec.server.options.SystemOptionManager$FetchSystemOptionTask@7bf149d7 failed
com.dremio.datastore.DatastoreException: Failed to get from store id: project_options
at com.dremio.datastore.RemoteKVStore.get(RemoteKVStore.java:118)
at com.dremio.datastore.TracingKVStore.lambda$get$0(TracingKVStore.java:71)
at com.dremio.common.tracing.TracingUtils.lambda$trace$0(TracingUtils.java:116)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:99)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:115)
at com.dremio.datastore.TracingKVStore.trace(TracingKVStore.java:60)
at com.dremio.datastore.TracingKVStore.get(TracingKVStore.java:71)
at com.dremio.datastore.adapter.LegacyKVStoreAdapter.get(LegacyKVStoreAdapter.java:54)
at com.dremio.exec.server.options.SystemOptionManager.getOptionProtoListFromStore(SystemOptionManager.java:270)
at com.dremio.exec.server.options.SystemOptionManager.populateCache(SystemOptionManager.java:260)
at com.dremio.exec.server.options.SystemOptionManager$FetchSystemOptionTask.run(SystemOptionManager.java:475)
at com.dremio.service.scheduler.LocalSchedulerService$CancellableTask.run(LocalSchedulerService.java:187)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.dremio.exec.rpc.RpcException: master node is down
at com.dremio.datastore.DatastoreRpcService.newEndpoint(DatastoreRpcService.java:152)
at com.dremio.datastore.DatastoreRpcService.getGetEndpoint(DatastoreRpcService.java:158)
at com.dremio.datastore.DatastoreRpcClient.get(DatastoreRpcClient.java:94)
at com.dremio.datastore.RemoteKVStore.get(RemoteKVStore.java:115)
… 18 common frames omitted
2021-03-17 10:59:25,576 [scheduler-10] WARN c.d.s.s.LocalSchedulerService - Execution of task com.dremio.service.reflection.ReflectionServiceImpl$CacheRefresher@54802bf4 failed
com.dremio.datastore.DatastoreException: Failed to search on store id: materialization_store
at com.dremio.datastore.RemoteIndexedStore.find(RemoteIndexedStore.java:43)
at com.dremio.datastore.TracingKVStore$TracingIndexedStore.lambda$find$0(TracingKVStore.java:141)
at com.dremio.common.tracing.TracingUtils.lambda$trace$0(TracingUtils.java:116)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:99)
at com.dremio.common.tracing.TracingUtils.trace(TracingUtils.java:115)
at com.dremio.datastore.TracingKVStore.trace(TracingKVStore.java:60)
at com.dremio.datastore.TracingKVStore$TracingIndexedStore.find(TracingKVStore.java:141)
at com.dremio.datastore.adapter.LegacyIndexedStoreAdapter.find(LegacyIndexedStoreAdapter.java:50)
at com.dremio.service.reflection.store.MaterializationStore.getAllDoneWhen(MaterializationStore.java:360)
at com.dremio.service.reflection.ReflectionServiceImpl.getValidMaterializations(ReflectionServiceImpl.java:449)
at com.dremio.service.reflection.ReflectionServiceImpl.access$1900(ReflectionServiceImpl.java:134)
at com.dremio.service.reflection.ReflectionServiceImpl$CacheHelperImpl.getValidMaterializations(ReflectionServiceImpl.java:1032)
at com.dremio.service.reflection.MaterializationCache.updateCache(MaterializationCache.java:129)
at com.dremio.service.reflection.MaterializationCache.compareAndSetCache(MaterializationCache.java:106)
at com.dremio.service.reflection.MaterializationCache.refresh(MaterializationCache.java:99)
at com.dremio.service.reflection.ReflectionServiceImpl.refreshCache(ReflectionServiceImpl.java:425)
at com.dremio.service.reflection.ReflectionServiceImpl$CacheRefresher.run(ReflectionServiceImpl.java:1134)
at com.dremio.service.scheduler.LocalSchedulerService$CancellableTask.run(LocalSchedulerService.java:187)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.dremio.exec.rpc.RpcException: master node is down
at com.dremio.datastore.DatastoreRpcService.newEndpoint(DatastoreRpcService.java:152)
at com.dremio.datastore.DatastoreRpcService.getSearchEndpoint(DatastoreRpcService.java:174)
at com.dremio.datastore.DatastoreRpcClient.find(DatastoreRpcClient.java:271)
at com.dremio.datastore.RemoteIndexedStore.find(RemoteIndexedStore.java:41)
… 24 common frames omitted
@balaji.ramaswamy This is the server.log of the slave coordinator.
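When reading a long excerpt like this one, it can help to tally which stores are failing and what the underlying "Caused by" message is. The following is only a rough sketch: the function name `summarize_failures` and both regular expressions are assumptions derived from the log lines posted in this thread, not part of any Dremio tooling.

```python
import re
from collections import Counter

# Patterns assumed from the log lines in this thread; not a Dremio API.
STORE_RE = re.compile(
    r"DatastoreException: Failed to (?:search on|get from) store id: (\S+)"
)
CAUSE_RE = re.compile(r"^Caused by: \S+: (.+)$")


def summarize_failures(log_text):
    """Count failing store ids and root-cause messages in a server.log excerpt."""
    stores = Counter()
    causes = Counter()
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        store_match = STORE_RE.search(line)
        if store_match:
            stores[store_match.group(1)] += 1
            continue
        cause_match = CAUSE_RE.match(line)
        if cause_match:
            causes[cause_match.group(1)] += 1
    return stores, causes


# A shortened sample built from lines that appear in the excerpt above.
SAMPLE = """\
com.dremio.datastore.DatastoreException: Failed to search on store id: materialization_store
at com.dremio.datastore.RemoteIndexedStore.find(RemoteIndexedStore.java:43)
Caused by: com.dremio.exec.rpc.RpcException: master node is down
com.dremio.datastore.DatastoreException: Failed to get from store id: project_options
at com.dremio.datastore.RemoteKVStore.get(RemoteKVStore.java:118)
Caused by: com.dremio.exec.rpc.RpcException: master node is down
"""

if __name__ == "__main__":
    stores, causes = summarize_failures(SAMPLE)
    for store, count in stores.most_common():
        print(f"store {store}: {count} failure(s)")
    for cause, count in causes.most_common():
        print(f"cause '{cause}': {count} occurrence(s)")
```

Run against the full excerpt, a tally like this would show whether all of the scheduler warnings trace back to the single "master node is down" cause, which is what the stack traces above suggest.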
Is the Dremio metadata on an NFS share? Also, ZooKeeper needs to be an external ZK. Would you mind sharing your dremio.conf file?