While running a query on Dremio 4.6.1 installed on Kubernetes, we get the following error message in the Dremio UI:
ExecutionSetupException: One or more nodes lost connectivity during query. Identified nodes were [dremio-executor-2.dremio-cluster-pod.dremio.svc.cluster.local:0].
Here are the logs from the worker mentioned in the error:
dremio-executor-2-logs.zip (6.7 KB)
The dremio-env config has the following settings:
DREMIO_MAX_HEAP_MEMORY_SIZE_MB is not set
We are using workers with 16 GB of memory and 8 CPU cores each (10 workers in total).
There is 1 master coordinator with the same configuration.
ZooKeeper runs with 1 GB of memory and 1 CPU core.
Any idea what's causing this behavior?
Looking at the logs of the crashing worker, this is the output just before the crash:
An irrecoverable stack overflow has occurred.
Please check if any of your loaded .so files has enabled executable stack (see man page execstack(8))
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f41cdac4fa8, pid=1, tid=0x00007f41dc2ed700
#
# JRE version: OpenJDK Runtime Environment (8.0_262-b10) (build 1.8.0_262-b10)
# Java VM: OpenJDK 64-Bit Server VM (25.262-b10 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  0x00007f41cdac4fa8
#
# Core dump written. Default location: /opt/dremio/core or core.1
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid1.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
[error occurred during error reporting , id 0xb]
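In case it helps with triage, here is a rough sketch of how the key fields of the crash banner above could be pulled out of an executor log. The helper name `summarize_crash_banner`, the `SAMPLE_BANNER` excerpt, and the `FRAME_KINDS` table are our own illustration and not part of Dremio or the JDK. The frame letters follow the legend HotSpot prints in its error reports (C = native code, J = compiled Java, j = interpreted, V/v = VM code).

```python
import re

# Excerpt of the banner from the executor log, used only as sample input.
SAMPLE_BANNER = """\
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f41cdac4fa8, pid=1, tid=0x00007f41dc2ed700
#
# Problematic frame:
# C  0x00007f41cdac4fa8
#
# The crash happened outside the Java Virtual Machine in native code.
"""

# Frame-type letters as listed in the HotSpot error report legend.
FRAME_KINDS = {
    "C": "native code",
    "J": "compiled Java code",
    "j": "interpreted Java code",
    "V": "VM code",
    "v": "VM code",
}


def summarize_crash_banner(text):
    """Hypothetical helper: extract signal, pc, pid and frame type from the banner."""
    summary = {}

    # Example match: "SIGSEGV (0xb) at pc=0x00007f41cdac4fa8, pid=1"
    sig = re.search(
        r"(SIG[A-Z]+) \((0x[0-9a-fA-F]+)\) at pc=(0x[0-9a-fA-F]+), pid=(\d+)", text
    )
    if sig:
        summary["signal"] = sig.group(1)
        summary["pc"] = sig.group(3)
        summary["pid"] = int(sig.group(4))

    # The frame description follows "Problematic frame:", either on the next
    # "#"-prefixed line or on the same line when the log has been flattened.
    frame = re.search(r"Problematic frame:\s*#\s*([CJjVv])\s+(\S+)", text)
    if frame:
        summary["frame_kind"] = FRAME_KINDS[frame.group(1)]
        summary["frame"] = frame.group(2)

    # HotSpot prints this sentence when the faulting frame is not JVM or Java code.
    summary["native_crash"] = "outside the Java Virtual Machine in native code" in text
    return summary


if __name__ == "__main__":
    print(summarize_crash_banner(SAMPLE_BANNER))
```

For our log this reports a SIGSEGV at pid 1 with a frame of type C, i.e. the fault was raised in native code rather than in Java code, which matches the last lines of the banner.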