Dremio in Kubernetes cluster not coming up

Hi All,

I am deploying Dremio in a Kubernetes cluster installed on Docker Enterprise Edition, using the Helm charts below.

After deployment, the pods are not coming up and remain in Pending status. I am getting: Error from server (BadRequest): pod dremio-executor-0 does not have a host assigned

Has anyone faced this kind of issue before? Do you have any thoughts on what's going wrong here?

PS C:\Users\E99887\Desktop\Kube\charts> kubectl get pods -o wide
dremio-7cb585bd6d-bfdqj 0/1 Pending 0 20h
dremio-executor-0 0/1 Pending 0 13s
dremio-executor-1 0/1 Pending 0 13s
dremio-master-0 0/1 Pending 0 13s
tiller-deploy-69c64d7945-g75lf 1/1 Running 0 8d s01cdk004
zk-0 0/1 Pending 0 13s
zk-1 0/1 Pending 0 13s
zk-2 0/1 Pending 0 10s

PS C:\Users\E99887\Desktop\Kube\charts> kubectl exec dremio-executor-0 -- ls -la /
Error from server (BadRequest): pod dremio-executor-0 does not have a host assigned
PS C:\Users\E99887\Desktop\Kube\charts>


It could be that your cluster does not have enough resources. Try

kubectl describe pod dremio-executor-0

and check why it is pending. If it states something to the effect that no resources are available to satisfy the CPU/memory requirements, the underlying resources are not sufficient. The default values.yaml contains values that we expect users to use in production environments. You can try reducing the CPU/memory and count in values.yaml and attempt it again. Note that, depending on what you are attempting to process, the amount of memory/CPU available to Dremio is important.
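For example, assuming the chart's values.yaml exposes cpu/memory/count keys for the coordinator and executors (the exact key names and units vary by chart version, so treat this as a hypothetical sketch and match it against your own values.yaml):

```yaml
# Hypothetical scaled-down values.yaml fragment for a small test cluster.
# Key names and units are assumptions; copy the real keys from your chart.
coordinator:
  cpu: 2          # CPU cores requested by the master/coordinator pod
  memory: 4096    # memory request in MB
executor:
  cpu: 2
  memory: 4096
  count: 1        # number of executor pods to spin up
zookeeper:
  cpu: 0.5
  memory: 1024
  count: 1
```

After editing, redeploy the chart and confirm the requests fit within a node's allocatable resources (visible in kubectl describe nodes).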

If the reason for Pending is something else, please post your error here.

As you said, the values.yaml file is created for a production deployment, so we customized the CPU/memory specifications and pod counts in values.yaml for the master/executor/ZooKeeper nodes, and limited the number of nodes we would like to spin up based on the resource availability in our cluster.

Does it have anything to do with the nodeSelector value? That is commented out in values.yaml; will the cluster node on which a pod spins up be decided dynamically while deploying releases?

Except for the warning below, I am not seeing any other error in the pod describe output.

Conditions:
  Type           Status
  PodScheduled   False
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                    From               Message
  Warning  FailedScheduling  43s (x296 over 4h50m)  default-scheduler  pod has unbound immediate PersistentVolumeClaims

nodeSelector allows you to schedule your pods on nodes with the labels specified.
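As a sketch (the node name and the label key/value here are placeholders, not anything your cluster already has):

```yaml
# First label a worker node, e.g.:
#   kubectl label nodes s01cdk004 dremio-role=worker
# Then reference that label from the pod (or deployment template) spec:
spec:
  nodeSelector:
    dremio-role: worker   # pod only schedules onto nodes carrying this label
```

Note that nodeSelector only restricts which nodes are eligible; it does not override taints, so a tainted node still needs a matching toleration.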

Do you have taints defined on the nodes in your Kubernetes cluster? It looks like no node is available for scheduling the pods. It also looks like you are running this from a Windows machine. Can you share some more details about your Kubernetes environment?

@nsen Thanks for all of your responses and updates so far; it feels like we have great community support in Dremio.

Our Kubernetes cluster is installed on Docker EE, version v1.11.9-docker-1, based on OS image Ubuntu 16.04.5 LTS.
I tried the approach below:

As I checked, the nodes have default taints pre-configured in the Docker EE cluster. I tried untainting the nodes; the command reports them as untainted (see below), but they do not actually get untainted.

kubectl taint nodes q01cdk001 com.docker.ucp.orchestrator.kubernetes-
node/q01cdk001 untainted
After untainting, the issue is the same: pods are not getting scheduled. As I said, the nodes are not actually untainted; the taints still show in the node describe output.

I used the nodeSelector option with a key/value pair to schedule the pods onto one node, but even that is not working; the pods are launched but not getting scheduled onto nodes.

Trying other options…

To get pods deployed on nodes that have taints on them, you have to add tolerations to the deployment templates. Let us know how it goes.
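For example, if the nodes carry the Docker UCP taint shown earlier (key com.docker.ucp.orchestrator.kubernetes), a toleration in the pod template might look like the sketch below. The effect value here is an assumption; copy the exact key, value, and effect from kubectl describe node output:

```yaml
# Hypothetical toleration fragment for the pod/template spec.
spec:
  tolerations:
  - key: "com.docker.ucp.orchestrator.kubernetes"
    operator: "Exists"     # tolerate the taint regardless of its value
    effect: "NoSchedule"   # must match the effect shown on the node's taint
```

A toleration only permits scheduling onto the tainted node; combine it with a nodeSelector if you also want to force pods onto specific nodes.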

I tried adding tolerations to the deployment template for both manager and orchestrator nodes, but it is not working; the pods are still in Pending status waiting for a host to be assigned. As I said, we are using Kubernetes integrated with a Docker EE UCP cluster. It seems tolerations are added automatically to Kubernetes objects while deploying, and even tolerations mentioned in the pod spec get overridden by the UCP tolerations. There may be some configuration changes needed in UCP to allow service accounts to schedule pods; exploring those options.

In the meantime, I tested deploying a Tomcat chart from the Helm repo and it is up and running, but the pods from the Dremio chart are not being scheduled. See below: the pod is scheduled by default onto a Kubernetes worker, and the Tiller pod is also scheduled on the same worker node. No workload can be scheduled on manager nodes as per the configuration.

edm-tomcat-7f6b48887-gtkwm 1/1 Running 0 36m s01cdk004

Is scheduling being refused because the persistent volumes are not mounted?

Tolerations:     node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                 From               Message
  Warning  FailedScheduling  18m                 default-scheduler  persistentvolumeclaim "dremio-executor-volume-dremio-executor-0" not found
  Warning  FailedScheduling  60s (x18 over 18m)  default-scheduler  pod has unbound immediate PersistentVolumeClaims

I think the issue is with persistent volume binding. When I tried deploying other Helm charts like Redis and Postgres, none of the pods came up, also due to PVCs. Could you please help or shed some light on configuring PVCs for Dremio?
As mentioned in values.yaml, it will use the default storage class if nothing is specified; not sure what else to define.

Thanks for all your help so far, appreciate it.

Can you check the status of the persistent volumes?

kubectl describe pvc <pvc-name>

Could be related to permission issues… I know Ranger requires some extra privileges. Your K8s may also require some extra privileges.

The issue is in PVC binding; below is the status from describe pvc. I think we need to specify a storage class and a persistent volume name.

Events:
  Type    Reason         Age                   From                         Message
  Normal  FailedBinding  93s (x3684 over 15h)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set
Mounted By: zk-1

Based on your error, check out the Docker docs on creating a storage class for your environment. For example, https://docs.docker.com/ee/ucp/kubernetes/storage/configure-aws-storage/ talks about creating storage classes in AWS. (Using AKS in Azure or EKS in AWS does create a default storage class.)
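For an on-prem cluster with no dynamic provisioner, one option is to create PersistentVolumes by hand so the chart's PVCs can bind. A sketch (the name, path, and size are placeholders; the size must be at least what the pending PVC requests):

```yaml
# Hypothetical hostPath PV for one pending claim. Since the event says
# "no storage class is set" on the claim, leave storageClassName unset
# here too so the binder can match them.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: dremio-executor-pv-0
spec:
  capacity:
    storage: 10Gi          # must be >= the PVC's requested size
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /mnt/dremio/executor-0   # directory must exist on the node
```

You would need one PV per unbound claim; kubectl get pvc shows which claims are still Pending.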

@nsen We are in the process of creating PVs in the Kubernetes cluster so the PVCs can be bound in the pod configuration. Will let you know how that goes, and thanks all for your feedback and input so far.
I think we have great Dremio community support and active work going on; loving it.

@nsen or @kelly_stirman

We are having an issue with dynamic storage provisioning in our Kubernetes cluster that we are trying to resolve,
so the Helm deployment is not working.
We would like to deploy manually, so could you please let us know where these yaml files are in the GitHub repo? It would be really helpful if you could send the path so we can take a look.

kubectl create -f zookeeper.yaml
kubectl create -f dremio-configmap-minimum.yaml
kubectl create -f dremio-master-volume-hostpath.yaml
kubectl create -f dremio-master-volume-pvc.yaml
kubectl create -f dremio-master.yaml
kubectl create -f dremio-service-ui.yaml
kubectl create -f dremio-service-client.yaml
kubectl create -f dremio-executor.yaml
kubectl scale --replicas=5 rs/dremio-executor

++ @kelly
Looping in Kelly for help.


Those files are not available. The direction is to use helm.

You can look at the templates directory for the templatized versions of those files. Or, you can do a helm install --debug --dry-run . to generate usable template files and go from there.