The use case is as follows:
CASE1:
I am trying to create indexes on 36 GB of Parquet files, partitioned by year + month. The raw index was created successfully, producing about 1.5 GB of index (all are hash indexes).
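For context, this is roughly how the two kinds of index are defined. Here is a sketch in the reflection SQL described in the Dremio docs (I actually created mine through the UI, I am not certain this SQL form is available in 1.3.1, and the dataset/column names below are placeholders):

-- raw index, partitioned the same way as the Parquet files
ALTER DATASET myspace.mytable
CREATE RAW REFLECTION raw_by_month
USING DISPLAY (yr, mth, amount)
DISTRIBUTE BY (yr, mth)
PARTITION BY (yr, mth)

-- aggregate index over the same partitions
ALTER DATASET myspace.mytable
CREATE AGGREGATE REFLECTION agg_by_month
USING DIMENSIONS (yr, mth)
MEASURES (amount)
PARTITION BY (yr, mth)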
When I try to create the aggregate indexes, the job fails with:
2018-01-18 00:17:30,366 [FABRIC-rpc-event-queue] INFO c.d.s.e.rpc.CoordToExecHandlerImpl - Received remote fragment start instruction for 25a065be-00d7-607f-e1e0-2cc4f154dc00:5:3
2018-01-18 00:19:31,666 [e0 - 25a065be-00d7-607f-e1e0-2cc4f154dc00:frag:5:1] ERROR com.dremio.sabot.driver.SmartOp - One or more nodes ran out of memory while executing the query.
com.dremio.common.exceptions.UserException: One or more nodes ran out of memory while executing the query.
at com.dremio.common.exceptions.UserException$Builder.build(UserException.java:648) ~[dremio-common-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at com.dremio.sabot.driver.SmartOp.contextualize(SmartOp.java:125) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at com.dremio.sabot.driver.SmartOp$SmartSingleInput.consumeData(SmartOp.java:231) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at com.dremio.sabot.driver.StraightPipe.pump(StraightPipe.java:59) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at com.dremio.sabot.driver.Pipeline.doPump(Pipeline.java:82) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at com.dremio.sabot.driver.Pipeline.pumpOnce(Pipeline.java:72) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run(FragmentExecutor.java:288) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
at com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run(FragmentExecutor.java:284) [dremio-sabot-kernel-1.3.1-201712020435270019-a7af5c8-mapr.jar:1.3.1-201712020435270019-a7af5c8-mapr]
Caused by: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 26843545600, max: 26843545600)
at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:510) ~[netty-common-4.0.49.Final.jar:4.0.49.Final]
at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:464) ~[netty-common-4.0.49.Final.jar:4.0.49.Final]
at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:766) ~[netty-buffer-4.0.49.Final.jar:4.0.49.Final]
at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:742) ~[netty-buffer-4.0.49.Final.jar:4.0.49.Final]
at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:244) ~[netty-buffer-4.0.49.Final.jar:4.0.49.Final]
at io.netty.buffer.PoolArena.allocate(PoolArena.java:226) ~[netty-buffer-4.0.49.Final.jar:4.0.49.Final]
at io.netty.buffer.PoolArena.allocate(PoolArena.java:146) ~[netty-buffer-4.0.49.Final.jar:4.0.49.Final]
at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.newDirectBufferL(PooledByteBufAllocatorL.java:171) ~[arrow-memory-0.7.0-201711092339270134-6305826-dremio.jar:4.0.49.Final]
at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.directBuffer(PooledByteBufAllocatorL.java:204) ~[arrow-memory-0.7.0-201711092339270134-6305826-dremio.jar:4.0.49.Final]
at io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:55) ~[arrow-memory-0.7.0-201711092339270134-6305826-dremio.jar:4.0.49.Final]
... 23 common frames omitted
The aggregate indexes are partitioned by year + month (the same as the Parquet files), and no month folder is larger than 15 GB. However, despite increasing Dremio's direct memory from 8 GB to 25 GB, the error persists. Any ideas? Since these are aggregations, I would expect the storage footprint during and after index creation to be very small.
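Incidentally, the allocator limit in the stack trace (used: 26843545600, max: 26843545600 bytes) is exactly 25 GiB, so the raised limit is being picked up; it is simply being exhausted. For reference, this is how the memory is configured in conf/dremio-env on each node (the heap value here is a placeholder; the direct-memory line is the one I raised from 8 GB to 25 GB):

# conf/dremio-env on each Dremio node
# JVM heap used by Dremio itself (placeholder value)
DREMIO_MAX_HEAP_MEMORY_SIZE_MB=4096
# off-heap (direct) memory used by the Arrow buffers; raised from 8192 to 25600
DREMIO_MAX_DIRECT_MEMORY_SIZE_MB=25600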
CASE2:
I assumed that integrating Dremio with MapR YARN might resolve the issue (because YARN would then use the full 3-node MapR 5.2.2 cluster).
I configured Dremio as described in the doc at https://docs.dremio.com/deployment/yarn-deployment.html. When Dremio starts I see a running job in my Resource Manager; however, when I try to run a query on the Parquet files, I get an error saying there are no active executors. I think this is because of the configuration below (executor.enabled = false):
services: {
  executor.enabled: false,
  coordinator.enabled: true
}
If I change executor.enabled to true, I no longer see the aggregation work delegated to the MapR Resource Manager. Any ideas? My dremio.conf is below:
master: {
  name: localhost,
  port: 45678
}
paths: {
  local: "/var/lib/dremio",
  dist: "maprfs:///dremioVol/cacheDir"
}
zookeeper: "110.122.121.104:5181"
services: {
  coordinator.enabled: true,
  coordinator.embedded_master_zk.enabled: false,
  coordinator.auto-upgrade: false,
  coordinator.client-endpoint.port: 31050,
  executor.enabled: true
}
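For comparison, my reading of the YARN deployment doc is that the coordinator node keeps executor.enabled: false, and the executors are instead started as YARN containers from the Provisioning screen in the Dremio UI rather than through dremio.conf. A sketch of what I believe the coordinator's services block should look like in that mode (my interpretation of the doc, not a verified config):

services: {
  coordinator.enabled: true,
  # executors should come from YARN containers, not from this node
  executor.enabled: false
}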
Any ideas please?