Memory was leaked by query

Guys,

I’m getting a memory leak error when I try to create a VDS. My VDS is built by a script using the API and should bring in a lot of data. A query sample is included below.

My cluster is:

1x Coordinator - EC2 m5d.xlarge - 4 vCPU 16 GB
2x Executors - EC2 m5d.xlarge - 8 vCPU 32 GB

Dremio env and full log attached.

Isn’t Dremio supposed to spill this kind of query? If I break my query into more “levels” of VDSs, would it be lighter?

dremio-files.zip (6.1 KB)

Hi @allan.sene

We only spill query execution that uses direct memory, and only for certain operators. In your case you are running out of heap memory, see below:

2019-04-26 21:20:22,485 [233c8a37-062f-d6a9-de30-70332d63d700:foreman] ERROR ROOT - Dremio is exiting. There was insufficient heap memory to continue operating.
java.lang.OutOfMemoryError: Java heap space
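
To check which kind of memory ran out without reading the whole log, something like the sketch below could help. It only looks for the standard JVM `OutOfMemoryError` messages; `classify_oom` is a hypothetical helper name, and the log path in the usage note is an assumption about a default install.

```python
def classify_oom(log_lines):
    """Return the set of OutOfMemoryError kinds found in the given log lines."""
    kinds = set()
    for line in log_lines:
        if "java.lang.OutOfMemoryError" not in line:
            continue
        if "Java heap space" in line:
            # Heap exhaustion: spilling does not help here
            kinds.add("heap")
        elif "Direct buffer memory" in line:
            # Direct (off-heap) memory exhaustion
            kinds.add("direct")
        else:
            kinds.add("other")
    return kinds


if __name__ == "__main__":
    sample = [
        "ERROR ROOT - Dremio is exiting. There was insufficient heap memory to continue operating.",
        "java.lang.OutOfMemoryError: Java heap space",
    ]
    print(classify_oom(sample))
```

To run it against a real log, pass the open file instead of the sample list, for example `classify_oom(open("server.log"))`, adjusting the path to wherever your Dremio logs live.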

You should have a .hprof file generated. It should be under Dremio’s log folder, assuming this is a non-YARN deployment. You can use jhat to parse the .hprof file and see why so much heap was being used.

I also see that your memory limits are at the defaults: heap is at 4 GB and direct is at 8 GB. What is the total RAM on the box?

A simple calculation would be to leave 2 GB for the OS, give 8 GB of heap to Dremio, and give the remainder to direct memory.
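
As a rough sketch of that arithmetic, the snippet below computes the split for a given box and prints it as the two memory settings I understand `dremio-env` to use (`DREMIO_MAX_HEAP_MEMORY_SIZE_MB` and `DREMIO_MAX_DIRECT_MEMORY_SIZE_MB`). The function names are hypothetical, and the 2 GB OS reserve and 8 GB heap are just the rule of thumb above, not tuned values.

```python
def split_memory(total_ram_gb, heap_gb=8, os_reserve_gb=2):
    """Return (heap_mb, direct_mb): reserve RAM for the OS, then heap, rest is direct."""
    direct_gb = total_ram_gb - os_reserve_gb - heap_gb
    if direct_gb <= 0:
        raise ValueError("not enough RAM left for direct memory")
    return heap_gb * 1024, direct_gb * 1024


def dremio_env_lines(total_ram_gb, heap_gb=8, os_reserve_gb=2):
    """Format the split as dremio-env style assignments (values in MB)."""
    heap_mb, direct_mb = split_memory(total_ram_gb, heap_gb, os_reserve_gb)
    return [
        f"DREMIO_MAX_HEAP_MEMORY_SIZE_MB={heap_mb}",
        f"DREMIO_MAX_DIRECT_MEMORY_SIZE_MB={direct_mb}",
    ]


if __name__ == "__main__":
    # Example: a node with 32 GB of RAM
    for line in dremio_env_lines(32):
        print(line)
```

For a 32 GB node this gives 8192 MB of heap and 22528 MB of direct memory. On a 16 GB node the same rule leaves only 6144 MB for direct memory.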

Thanks
@balaji.ramaswamy

| Node | AWS EC2 Instance | vCPUs | RAM |
|---|---|---|---|
| 1x Coordinator | EC2 m5d.xlarge | 4 vCPU | 16 GB RAM |
| 2x Executors | EC2 m5d.xlarge | 8 vCPU | 32 GB RAM |

A simple calculation would be to leave 2 GB for the OS, give 8 GB of heap to Dremio and remaining to direct

Is this configuration meant for both node types (executor and coordinator)?

Thanks!

Hi @allan.sene

If you are working with a pretty big metadata set, then set the heap to 16 GB on the coordinator and to 8 GB on the executors.

Thanks
@balaji.ramaswamy

Ok, @balaji.ramaswamy. Gonna try this and bring back the results. Thanks for helping!

Didn’t work, @balaji.ramaswamy. Did you notice that my query has lots of UNION ALLs? Maybe it would work if I created some intermediate VDSs in between. Does that make sense? :thinking:

When I cut my SQL down so that it UNION ALLs fewer VDSs (on a yearly basis, for example), it works… I don’t really understand why. Wasn’t Dremio supposed to handle this by itself inside the query planner?
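
For reference, here is a minimal sketch of how the script could group the sources into one intermediate VDS per year and then UNION ALL only those. It just builds the SQL strings and does not call the Dremio API. The space name `staging`, the `events_<year>` VDS names, and the source table names are all made up for illustration. Whether this actually lowers heap usage is not something this thread confirms; the only observation so far is that unioning fewer VDSs works.

```python
def union_all(sources):
    """Build one SELECT ... UNION ALL ... statement over the given sources."""
    return "\nUNION ALL\n".join(f"SELECT * FROM {src}" for src in sources)


def plan_yearly_vds(tables_by_year, space="staging"):
    """Return (intermediate, final_sql).

    intermediate maps a hypothetical per-year VDS path to the SQL that defines it;
    final_sql unions only the per-year VDSs, in ascending year order.
    """
    intermediate = {}
    for year in sorted(tables_by_year):
        vds_path = f'{space}."events_{year}"'
        intermediate[vds_path] = union_all(tables_by_year[year])
    final_sql = union_all(list(intermediate))
    return intermediate, final_sql


if __name__ == "__main__":
    # Hypothetical source tables, grouped by year
    tables = {
        2017: ["src.events_2017_11", "src.events_2017_12"],
        2018: ["src.events_2018_01", "src.events_2018_02"],
    }
    per_year, final_sql = plan_yearly_vds(tables)
    for path, sql in per_year.items():
        print(f"-- {path}\n{sql}\n")
    print(f"-- final VDS\n{final_sql}")
```

The script would create each per-year VDS first and the final VDS last, so the top-level query unions a handful of VDSs instead of every source table at once.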