Hi Dremio community, I'm trying to create a K8s cluster with Dremio + Zookeeper, but I have some issues on both the Coordinator and Executor nodes. Here is the information:
I have distributed storage using an S3 bucket. It works on the master node; I even added the bucket as a source in Dremio. The master node is fully working, and I can connect to the dashboard (and watch the service run).
The problem on the other nodes happens when I start the service. Logs:
Mon Apr 9 09:38:02 UTC 2018 Starting dremio on dremio-app-executor-rhv9j
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 15738
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1048576
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Catastrophic failure occurred. Exiting. Information follows: Failed to start services, daemon exiting.
java.lang.RuntimeException: Failure while attempting to create com.dremio.service.users.SimpleUserService.
at com.dremio.service.BinderImpl$InjectableReference.get(BinderImpl.java:427)
at com.dremio.service.BinderImpl.lookup(BinderImpl.java:109)
at com.dremio.service.BinderImpl$DeferredProvider.get(BinderImpl.java:83)
at com.dremio.exec.server.ContextService.newSabotContext(ContextService.java:188)
at com.dremio.exec.server.ContextService.start(ContextService.java:145)
at com.dremio.service.SingletonRegistry$AbstractServiceReference.start(SingletonRegistry.java:137)
at com.dremio.dac.daemon.NonMasterSingletonRegistry.start(NonMasterSingletonRegistry.java:54)
at com.dremio.dac.daemon.DACDaemon.startServices(DACDaemon.java:174)
at com.dremio.dac.daemon.DACDaemon.init(DACDaemon.java:180)
at com.dremio.dac.daemon.DremioDaemon.main(DremioDaemon.java:164)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at com.dremio.service.BinderImpl$InjectableReference.get(BinderImpl.java:421)
... 9 more
Caused by: java.lang.NullPointerException: Unknown store creator com.dremio.service.users.SimpleUserService$UserGroupStoreBuilder
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:250)
at com.dremio.datastore.RemoteKVStoreProvider.getStore(RemoteKVStoreProvider.java:55)
at com.dremio.service.users.SimpleUserService.<init>(SimpleUserService.java:96)
... 14 more
My executor conf file (autogenerated by k8s):
paths: {
# the local path for dremio to store data.
local: "/var/lib/dremio"
# the distributed path Dremio data including job results, downloads, uploads,etc
#dist: "pdfs://"${paths.local}"/pdfs"
dist: "s3a://$MY_BUCKET_S3"
}
services: {
coordinator.enabled: false,
coordinator.master.enabled: false,
executor.enabled: true
}
zookeeper: "zookeeper:2181"
My coordinator conf file (also autogenerated by k8s):
paths: {
# the local path for dremio to store data.
local: "/var/lib/dremio"
# the distributed path Dremio data including job results, downloads, uploads,etc
#dist: "pdfs://"${paths.local}"/pdfs"
dist: "s3a://$MY_BUCKET_S3"
}
services: {
coordinator.enabled: true,
coordinator.master.enabled: false,
executor.enabled: false
}
zookeeper: "zookeeper:2181"
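Both confs point at `zookeeper:2181`. In case it helps rule out a connectivity problem from the non-master pods, here is a minimal sketch of how that address could be checked from inside a pod. It is not part of my setup: the function name `zookeeper_is_ok` is hypothetical, and it assumes Python 3 is available in the image and that ZooKeeper's four-letter `ruok` command is enabled on the server.

```python
import socket


def zookeeper_is_ok(host: str = "zookeeper", port: int = 2181,
                    timeout: float = 3.0) -> bool:
    """Send ZooKeeper's 'ruok' command and report whether it replies 'imok'.

    Hypothetical helper: host and port default to the values from the
    dremio.conf files above. Returns False on any connection error or timeout.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            reply = b""
            # The server answers with 4 bytes and then closes the connection.
            while len(reply) < 4:
                data = conn.recv(4 - len(reply))
                if not data:
                    break
                reply += data
            return reply == b"imok"
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling `zookeeper_is_ok()` from an executor or coordinator pod returns `True` only if the `zookeeper` service name resolves and the server answers. A `False` result would suggest a networking or service-discovery issue rather than a Dremio configuration issue.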
All my nodes share the same /etc/dremio/core-site.xml.
Dremio version: 1.4.9-201802191836310213-7195059
Java version: jre1.8.0_131
uname -a: 4.4.115-k8s #1 SMP Thu Feb 8 15:37:40 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
My cluster runs on AWS using Kops (so my pods are created on EC2 instances).
I hope I can solve this problem so that I can share my Docker images and my pod template, and the community can automatically deploy a cluster with a working Dremio.