Dremio Aggregate reflections go OOM

The use case is as follows.

CASE 1:
I am trying to create reflections on 36 GB of Parquet files, partitioned by year + month. The raw reflection succeeded, creating some 1.5 GB of data (all hash-distributed). When Dremio tries to create the aggregate reflections, the job fails with:

2018-01-18 00:17:30,366 [FABRIC-rpc-event-queue] INFO  c.d.s.e.rpc.CoordToExecHandlerImpl - Received remote fragment start instruction for 25a065be-00d7-607f-e1e0-2cc4f154dc00:5:3
2018-01-18 00:19:31,666 [e0 - 25a065be-00d7-607f-e1e0-2cc4f154dc00:frag:5:1] ERROR com.dremio.sabot.driver.SmartOp - One or more nodes ran out of memory while executing the query.
com.dremio.common.exceptions.UserException: One or more nodes ran out of memory while executing the query.
        at com.dremio.common.exceptions.UserException$Builder.build(UserException.java:648) ~[dremio-common-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
        at com.dremio.sabot.driver.SmartOp.contextualize(SmartOp.java:125) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
        at com.dremio.sabot.driver.SmartOp$SmartSingleInput.consumeData(SmartOp.java:231) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
        at com.dremio.sabot.driver.StraightPipe.pump(StraightPipe.java:59) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
        at com.dremio.sabot.driver.Pipeline.doPump(Pipeline.java:82) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
        at com.dremio.sabot.driver.Pipeline.pumpOnce(Pipeline.java:72) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
        at com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run(FragmentExecutor.java:288) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
        at com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run(FragmentExecutor.java:284) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]



Caused by: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 26843545600, max: 26843545600)
        at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:510) ~[netty-common-4.0.49.Final.jar:4.0.49.Final]
        at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:464) ~[netty-common-4.0.49.Final.jar:4.0.49.Final]
        at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:766) ~[netty-buffer-4.0.49.Final.jar:4.0.49.Final]
        at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:742) ~[netty-buffer-4.0.49.Final.jar:4.0.49.Final]
        at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:244) ~[netty-buffer-4.0.49.Final.jar:4.0.49.Final]
        at io.netty.buffer.PoolArena.allocate(PoolArena.java:226) ~[netty-buffer-4.0.49.Final.jar:4.0.49.Final]
        at io.netty.buffer.PoolArena.allocate(PoolArena.java:146) ~[netty-buffer-4.0.49.Final.jar:4.0.49.Final]
        at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.newDirectBufferL(PooledByteBufAllocatorL.java:171) ~[arrow-memory-0.7.0-201711092339270134-6305826-dremio.jar:4.0.49.Final]
        at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.directBuffer(PooledByteBufAllocatorL.java:204) ~[arrow-memory-0.7.0-201711092339270134-6305826-dremio.jar:4.0.49.Final]
        at io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:55) ~[arrow-memory-0.7.0-201711092339270134-6305826-dremio.jar:4.0.49.Final]
        ... 23 common frames omitted

The aggregate reflections are partitioned by year + month (same as the Parquet files), and no month folder is larger than 15 GB. However, despite increasing the Dremio direct memory from 8 GB to 25 GB, the error persists. Any ideas? Since these are aggregations, I would expect the storage footprint during/after reflection creation to be very small.
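For reference, a sketch of where the direct-memory limit is raised: in `dremio-env` on each executor node. The sizes below are illustrative only; `DREMIO_MAX_DIRECT_MEMORY_SIZE_MB` is the setting that caps the direct allocations (Netty/Arrow buffers) that the `OutOfDirectMemoryError` above ran out of.

```shell
# dremio-env (sketch; values are illustrative, tune per node)

# Heap memory, used for planning and coordination work:
DREMIO_MAX_HEAP_MEMORY_SIZE_MB=4096

# Direct memory backs Arrow buffers during execution; the netty error
# above (used: 26843545600 bytes = 25 GB) means this limit was exhausted:
DREMIO_MAX_DIRECT_MEMORY_SIZE_MB=32768
```

Restart the Dremio service after changing these values for them to take effect.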

CASE2:
I assumed that if we can integrate dremio with mapr-yarn the issue may be resolved ('cos in that case yarn will be using the full 3 node cluster of mapr – 5.2.2)
As said in the doc: https://docs.dremio.com/deployment/yarn-deployment.html i configured the dremio. When dremio starts i see a running-job in my resource manager; however when i try to run a query on the parque files i get error that there are no active executors. i think its because of the below configuration executor.enabled=false.

services: {
  executor.enabled: false,
  coordinator.enabled: true
}

If I change executor.enabled to true, I don't see the aggregation work delegated to the MapR YARN Resource Manager. Any ideas? My dremio.conf is as below.

master: {
  name: localhost,
  port: 45678
}

paths: {
  local: "/var/lib/dremio",
  dist: "maprfs:///dremioVol/cacheDir"
}

zookeeper: "110.122.121.104:5181"

services: {
  coordinator.enabled: true,
  coordinator.embedded_master_zk.enabled: false,
  coordinator.auto-upgrade: false,
  coordinator.client-endpoint.port: 31050,
  executor.enabled: true
}

Any ideas please?

@ravi.eze some thoughts:

Case 1
Could you try increasing the memory further? In some cases the in-memory representation of the data at query time ends up larger than expected. It sounds like the aggregation query produces a large number of distinct groups, resulting in an OOM, as Dremio needs to keep all group combinations in memory for the aggregation. It would also be useful to share a query profile to confirm this.
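To see why distinct-group cardinality (rather than raw input size) is what drives memory here, a minimal Python sketch of hash aggregation (a toy model, not Dremio's actual implementation):

```python
from collections import defaultdict

# Toy hash aggregation: the hash table holds one entry per distinct
# group key, so memory scales with group cardinality, not input size.
def hash_aggregate(rows):
    sums = defaultdict(int)
    for key, value in rows:
        sums[key] += value  # one table entry per distinct key
    return sums

# A million input rows, but only 1000 distinct keys -> small table.
rows = [(i % 1000, 1) for i in range(1_000_000)]
agg = hash_aggregate(rows)
print(len(agg))  # 1000 groups held in memory, regardless of input size
```

With 36 GB of input, an aggregate reflection whose dimension columns have very high combined cardinality can need far more memory than the raw data size suggests.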

Case 2
I’ll let someone else comment on this. Might be useful if you share a screenshot of 1) your Provisioning configuration screen + node status screen in Dremio 2) Resource Manager screen with the Dremio job.

For Case 2: what you are going to achieve is that Dremio executors will run in the number of containers you specify during the provisioning process. And, yes, it will give better parallelization of the work if you have not installed Dremio on multiple nodes yet.
Changing dremio.conf is not going to help: executors started through YARN are always executors :).
Now, you say the job shows up in the RM, but your containers may not have fully initialized and started running.
Things to check:

  1. On the Provisioning screen: unless you see containers in "Provisioned" mode, they are not fully functioning.
  2. On the Node Activity screen in Dremio: if you don't see additional nodes, they are not functioning.
  3. In the RM, look inside the job: there should be more than one container (AppMaster + real workers). If you see just the master, that does not help much. Take a look at the container logs for clues.

One clue: if you are running MapR with no security, there is a bug in a third-party dependency we use that can cause ZK issues, and you may need to stop/start provisioning a few times for it to succeed.

Thank you for replying; I stayed up for this. Mine is an unsecured MapR Community cluster (deployed on 3 nodes). A screenshot of the configuration is below.

On the provisioning screen I see the below. The status never goes to Provisioned (even after 2-3 hours).

There are 2 containers created on 2 different nodes

However, when I look at the stdout log of the container, I see the below.


, com.dremio.exec.planner.acceleration.substitution, com.dremio.sabot.task.slicing.SlicingTaskPool, com.dremio.extras.plugins.elastic, com.dremio.plugins.mongo, com.dremio.exec.store.jdbc, com.dremio.plugins.elastic, com.dremio.plugins.mongo, com.dremio.exec.store, com.dremio.exec.ExecConstants, com.dremio.exec.compile, com.dremio.exec.expr, com.dremio.exec.physical, com.dremio.exec.planner.physical.PlannerSettings, com.dremio.exec.server.options, com.dremio.exec.store, com.dremio.exec.store.dfs.implicit.ImplicitFilesystemColumnFinder, com.dremio.exec.rpc.user.security, com.dremio.sabot, org.apache.hadoop.hive, com.dremio.exec.fn.hive, org.apache.hadoop.hive, com.dremio.exec.store.hbase, com.dremio.exec.expr.fn.impl.conv, com.dremio.exec.store.jdbc] in locations [jar:file:/opt/spark/usercache/mapr/appcache/application_1516191146405_0010/container_e36_1516191146405_0010_01_000002/dremio-daemon-bundle.jar!/] took 1841ms
2018-01-18 02:10:04,223 [TwillContainerService] INFO c.d.datastore.LocalKVStoreProvider - Starting LocalKVStoreProvider
2018-01-18 02:10:04,277 [TwillContainerService] INFO c.d.datastore.LocalKVStoreProvider - Stopping LocalKVStoreProvider
2018-01-18 02:10:04,278 [TwillContainerService] INFO c.d.datastore.LocalKVStoreProvider - Stopped LocalKVStoreProvider
2018-01-18 02:10:04,282 [TwillContainerService] ERROR o.apache.twill.ext.BundledJarRunner - Error while trying to run com.dremio.dac.daemon.DremioDaemon within /opt/spark/usercache/mapr/appcache/application_1516191146405_0010/container_e36_1516191146405_0010_01_000002/dremio-daemon-bundle.jar
java.lang.reflect.InvocationTargetException: null
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_151]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_151]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_151]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_151]
at org.apache.twill.ext.BundledJarRunner.run(BundledJarRunner.java:119) ~[dremio-twill-shaded-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at org.apache.twill.ext.BundledJarRunnable.run(BundledJarRunnable.java:57) [dremio-twill-shaded-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at org.apache.twill.internal.container.TwillContainerService.doRun(TwillContainerService.java:222) [dremio-twill-shaded-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at org.apache.twill.internal.AbstractTwillService.run(AbstractTwillService.java:189) [dremio-twill-shaded-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at twill.com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) [dremio-twill-shaded-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
Caused by: org.rocksdb.RocksDBException: lock /var/lib/dremio/db/catalog/LOCK: Resource temporarily unavailable
at org.rocksdb.RocksDB.open(Native Method) ~[dremio-daemon-bundle.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at org.rocksdb.RocksDB.open(RocksDB.java:286) ~[dremio-daemon-bundle.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at com.dremio.datastore.ByteStoreManager.start(ByteStoreManager.java:144) ~[dremio-daemon-bundle.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at com.dremio.datastore.CoreStoreProviderImpl.start(CoreStoreProviderImpl.java:167) ~[dremio-daemon-bundle.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at com.dremio.datastore.LocalKVStoreProvider.start(LocalKVStoreProvider.java:86) ~[dremio-daemon-bundle.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at com.dremio.dac.daemon.DremioDaemon.checkVersion(DremioDaemon.java:112) ~[dremio-daemon-bundle.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at com.dremio.dac.daemon.DremioDaemon.main(DremioDaemon.java:151) ~[dremio-daemon-bundle.jar:1.3.1-201712020435270019-a7af5c8-mapr]
… 10 common frames omitted

In fact, there is a file named LOCK (see the ls output below). Not sure why the LOCK resource is unavailable.

-rw-r--r-- 1 mapr mapr 1740655 Jan 18 02:09 /var/lib/dremio/db/catalog/000081.log
-rw-r--r-- 1 mapr mapr 16 Jan 18 01:49 /var/lib/dremio/db/catalog/CURRENT
-rw-r--r-- 1 mapr mapr 37 Jan 17 20:24 /var/lib/dremio/db/catalog/IDENTITY
-rw-r--r-- 1 mapr mapr 0 Jan 17 20:24 /var/lib/dremio/db/catalog/LOCK
-rw-r----- 1 mapr mapr 6576 Jan 18 02:10 /var/lib/dremio/db/catalog/LOG
-rw-r--r-- 1 mapr mapr 73142 Jan 17 23:34 /var/lib/dremio/db/catalog/LOG.old.1516212276559852
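The error ("Resource temporarily unavailable", i.e. EAGAIN) does not mean the LOCK file is missing: it means another process still holds an advisory lock on it. The file existing with size 0 is normal; it is the flock on the open file that matters. A small Python sketch of the same failure mode:

```python
import fcntl
import os
import tempfile

# Stand-in for RocksDB's LOCK file (the path here is illustrative).
lock_path = os.path.join(tempfile.mkdtemp(), "LOCK")

holder = open(lock_path, "w")                       # "first Dremio process"
fcntl.flock(holder, fcntl.LOCK_EX | fcntl.LOCK_NB)  # acquires the lock

contender = open(lock_path, "w")                    # "second process"
try:
    fcntl.flock(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
    blocked = False
except BlockingIOError:
    # errno EAGAIN -> "Resource temporarily unavailable"
    blocked = True
print(blocked)
```

So the symptom here suggests another Dremio process on the same host (e.g. a locally started coordinator/executor, or a stale previous run) still holds /var/lib/dremio/db/catalog/LOCK; running `lsof` or `fuser` on that path should identify it.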

A couple of things:

  1. Is the container with the errors in its log by any chance running on the same node as the coordinator? If yes, it is not going to work: we currently do not support running the coordinator and an executor as separate processes on the same host.
  2. Could you change "localhost" in the master block of your dremio.conf? Set it to the IP/hostname of the coordinator instead of "localhost".
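For example, a sketch of the corrected master block (the hostname is illustrative; use the actual coordinator node's resolvable hostname or IP):

```
master: {
  # resolvable hostname or IP of the coordinator node, not "localhost",
  # so YARN-provisioned executors on other nodes can reach it
  name: "node1.mapr.local",
  port: 45678
}
```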