Out of memory exception

May I know the best practice for allocating direct memory, max total memory and heap memory?
We configured Dremio from the Azure Marketplace. We are currently using a medium cluster with 1 coordinator node and 10 executors, but we have been getting “out of memory” errors.
The executors are Azure VMs of size Standard_E16s_v3 (128 GB RAM), yet Dremio keeps running out of memory. We tried setting DREMIO_MAX_MEMORY_SIZE_MB to about 116046 (MB), but it keeps throwing the error “Query exceeded memory limits set by admin”.
May I know what we are doing wrong here?

@reshma.cs

Can you paste the output of ps -ef | grep dremio from the Dremio coordinator and executor nodes?

Also, if you can share the query profile, that would help us look at the problem.

@Venugopal_Menda

Thanks, Venugopal. Below are the executor and master configurations.

Executor:

x86_64/jre/bin/java -Djava.util.logging.config.class=org.slf4j.bridge.SLF4JBridgeHandler -Djava.library.path=/opt/dremio/lib -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/dremio/server.gc -Ddremio.log.path=/var/log/dremio -Ddremio.plugins.path=/opt/dremio/plugins -Xmx10000m -XX:MaxDirectMemorySize=100000m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dremio -Dio.netty.maxDirectMemory=0 -DMAPR_IMPALA_RA_THROTTLE -DMAPR_MAX_RA_STREAMS=400 -cp /etc/dremio:/opt/dremio/jars/:/opt/dremio/jars/ext/:/opt/dremio/jars/3rdparty/* com.dremio.dac.daemon.DremioDaemon
DremioA+ 2970 2587 0 13:41 pts/0 00:00:00 grep --color=auto dremio

Master:

-Djava.util.logging.config.class=org.slf4j.bridge.SLF4JBridgeHandler -Djava.library.path=/opt/dremio/lib -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/dremio/server.gc -Ddremio.log.path=/var/log/dremio -Ddremio.plugins.path=/opt/dremio/plugins -Xmx8192m -XX:MaxDirectMemorySize=2048m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log dremio -Dio.netty.maxDirectMemory=0 -DMAPR_IMPALA_RA_THROTTLE -DMAPR_MAX_RA_STREAMS=400 -cp /etc/dremio:/opt/dremio/jars/:/opt/dremio/jars/ext/:/opt/dremio/jars/3rdparty/* com.dremio.dac.daemon.DremioDaemon
DremioA+ 67473 67398 0 13:50 pts/0 00:00:00 grep --color=auto dremio
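
As a quick sanity check on the flags above, the heap and direct memory sizes can be added up and compared with the RAM on the node. The following is a minimal sketch, not a Dremio tool: it copies only the two memory flags from each command line, handles only the `m` suffix used there, and assumes that "128 GB" means 128 GiB (131072 MB). The master's VM size is not stated in this thread, so headroom is computed for the executor only.

```python
import re

# Only the two memory flags from each command line above are reproduced here.
EXECUTOR_CMD = "-Xmx10000m -XX:MaxDirectMemorySize=100000m"
MASTER_CMD = "-Xmx8192m -XX:MaxDirectMemorySize=2048m"

# Assumption: "128 GB RAM" is read as 128 GiB, i.e. 131072 MB.
EXECUTOR_RAM_MB = 128 * 1024


def jvm_memory_mb(cmdline):
    """Return (heap_mb, direct_mb) parsed from -Xmx and -XX:MaxDirectMemorySize.

    Only the 'm' (megabyte) suffix is handled, which is what the output uses.
    """
    heap = re.search(r"-Xmx(\d+)m\b", cmdline)
    direct = re.search(r"-XX:MaxDirectMemorySize=(\d+)m\b", cmdline)
    if heap is None or direct is None:
        raise ValueError("expected -Xmx<N>m and -XX:MaxDirectMemorySize=<N>m")
    return int(heap.group(1)), int(direct.group(1))


for role, cmd in (("executor", EXECUTOR_CMD), ("master", MASTER_CMD)):
    heap_mb, direct_mb = jvm_memory_mb(cmd)
    print(f"{role}: heap={heap_mb} MB, direct={direct_mb} MB, "
          f"total={heap_mb + direct_mb} MB")

executor_total_mb = sum(jvm_memory_mb(EXECUTOR_CMD))
print(f"executor headroom: {EXECUTOR_RAM_MB - executor_total_mb} MB")
```

Under these assumptions the executor JVM is allowed 110000 MB in total, which fits within the node's RAM, so the flags themselves do not look oversubscribed.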

@reshma.cs
Can you also provide the query profile for the failed query?

@Venugopal_Menda

In the query profile, this is the error I have been getting: Out of memory while receiving incoming message. Message size: 1048576

Allocation outcome details:
allocator[op:1:0:incoming] reservation: 0 limit: 9223372036854775807 used: 18874368 requestedSize: 1048576 allocatedSize: 0 localAllocationStatus: success
allocator[frag:1:0] reservation: 9000000 limit: 5368709120 used: 5351289344 requestedSize: 1048576 allocatedSize: 0 localAllocationStatus: success
allocator[phase-1] reservation: 0 limit: 5368709120 used: 5351813632 requestedSize: 1048576 allocatedSize: 0 localAllocationStatus: success
allocator[query-20e65b23-014a-da38-c4a7-d699f1db5c00] reservation: 0 limit: 5368709120 used: 5368590336 requestedSize: 1048576 allocatedSize: 0 localAllocationStatus: fail
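
To see which level of the allocator chain refused the request, the lines above can be parsed and the remaining headroom (limit minus used) compared with the requested size. This is a rough sketch that relies only on the field layout visible in this profile; the regular expression and the `parse_allocators` helper are illustrative and not part of Dremio.

```python
import re

# Allocation outcome lines copied verbatim from the query profile above.
ALLOCATION_LINES = """\
allocator[op:1:0:incoming] reservation: 0 limit: 9223372036854775807 used: 18874368 requestedSize: 1048576 allocatedSize: 0 localAllocationStatus: success
allocator[frag:1:0] reservation: 9000000 limit: 5368709120 used: 5351289344 requestedSize: 1048576 allocatedSize: 0 localAllocationStatus: success
allocator[phase-1] reservation: 0 limit: 5368709120 used: 5351813632 requestedSize: 1048576 allocatedSize: 0 localAllocationStatus: success
allocator[query-20e65b23-014a-da38-c4a7-d699f1db5c00] reservation: 0 limit: 5368709120 used: 5368590336 requestedSize: 1048576 allocatedSize: 0 localAllocationStatus: fail
""".splitlines()

PATTERN = re.compile(
    r"allocator\[(?P<name>[^\]]+)\] reservation: (?P<reservation>\d+) "
    r"limit: (?P<limit>\d+) used: (?P<used>\d+) "
    r"requestedSize: (?P<requested>\d+) allocatedSize: (?P<allocated>\d+) "
    r"localAllocationStatus: (?P<status>\w+)"
)


def parse_allocators(lines):
    """Parse each allocator line into a dict and add headroom = limit - used."""
    rows = []
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # skip anything that does not look like an allocator line
        row = {"name": match["name"], "status": match["status"]}
        for key in ("reservation", "limit", "used", "requested", "allocated"):
            row[key] = int(match[key])
        row["headroom"] = row["limit"] - row["used"]
        rows.append(row)
    return rows


rows = parse_allocators(ALLOCATION_LINES)
for row in rows:
    print(f"{row['name']}: status={row['status']} "
          f"limit={row['limit']} headroom={row['headroom']} "
          f"requested={row['requested']}")

failed = [row for row in rows if row["status"] == "fail"]
```

Only the query-level allocator reports `fail`: its limit is 5368709120 bytes (5 GiB) and it has 118784 bytes left, less than the 1048576 bytes requested. That suggests the failure comes from a 5 GiB per-query cap rather than from the executor's 100000m direct memory being exhausted, although the profile excerpt alone does not show where that cap is configured.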

I use VisualVM to monitor Java memory. Configure it for remote connections, not local ones.

JConsole is also good. It breaks down the Java memory pools. I sent an email to our Oracle rep to ask whether we need to license the JDK when using JConsole. VisualVM is open source.