DREMIO - Query was cancelled because it exceeded the memory limits set by the administrator

Gerbasi · August 7, 2023, 12:59pm

Good morning,

We are having constant problems of lack of memory when executing this query, I am forwarding the profile, can you evaluate what we can do to solve this problem?

The error happens once or twice in a row, and sometimes on the third attempt it works, because when the error occurred the DREMIO processing was high, and when it works is because it has little processing?

How do I resolve this?

d56f8333-c284-40ce-a320-10e2299dc9b6.zip (1,3,MB)

Gerbasi · August 11, 2023, 10:42pm

Good night, friends, has anyone managed to solve this problem? I need help.

Thanks in advance

balaji.ramaswamy · August 15, 2023, 7:06am

@Gerbasi

You have 24 GB of direct memory per node, see error tab

Allocator(ROOT) 0/24639199972/24639690542/24639438848 (res/actual/peak/limit) numChildAllocators:18

out of this, the query that failed, job id# 1b2f1b5a-d5fe-de77-abf7-615bfc218600 used 16GB per node

    Allocator(query-1b2f1b5a-d5fe-de77-abf7-615bfc218600) 0/17171254708/17171598772/9223372036854775807 (res/actual/peak/limit) numChildAllocators:231

There were 10 queries running at this time

  Allocator(queue-25364dcc-ecf7-400d-a7a7-efc8f19617f7) 0/20877793812/24211071416/9223372036854775807 (res/actual/peak/limit) numChildAllocators:10

So you are falling short of direct memory, only Aggs and Sorts can spill at this time and hence if a Join is processing in memory and there is no direct memory left, it will run OOM

A few other query ID’s that were running at the same time

    Allocator(query-1b2f1bdd-e529-7102-791c-ef0a7ba00600) 0/1963532056/3708841128/9223372036854775807 (res/actual/peak/limit) numChildAllocators:19
    Allocator(query-1b2f21b8-6d1a-235f-e77b-7ffcad54d600) 0/1569822068/2081086528/9223372036854775807 (res/actual/peak/limit) numChildAllocators:7
    Allocator(query-1b2f21bc-dac1-7f8f-e502-0c83a37e3700) 0/1097226644/1303792836/9223372036854775807 (res/actual/peak/limit) numChildAllocators:8

Gerbasi · August 16, 2023, 11:30am

@balaji.ramaswamy, good morning,

We have 32 GB on this machine where DREMIO is installed on AWS, based on what you analyzed in this problem that we sent, do you have any recommendations on how we could parameterize the memory in the dremio-env.

balaji.ramaswamy · August 16, 2023, 4:58pm

@Gerbasi Really hard to tell as I do not know the concurrency you desire, SLA’s of the query and type of workload, for example, are these queries dash board queries or Ad hoc queries?

For example ad-hoc user queries may need a larger memory per node requirements while queries from dashboard usually select one or columns with aggregates using filters on partition columns. Creating an aggregate reflection on this would help. If you are limited on memory, another thing we can do is separate out workloads during different times of the day, for example run Ad-hoc queries after the dashboards are done. Increasing memory is also another option, by how much is hard to tell without studying the workload and looking at your profiles and see what is happening

Topic		Replies	Views
Query was cancelled because it exceeded the memory limits set by the administrator	26	9760	January 26, 2023
Query Failure (Query was cancelled because it exceeded the memory limits set by the administrator)	5	1749	April 8, 2022
Reflection memory limits & lost connectivity Dremio University	5	1570	November 18, 2021
1 of the 4 worker node dead	3	418	January 14, 2024
Out of memory exception	6	2368	July 23, 2020

DREMIO - Query was cancelled because it exceeded the memory limits set by the administrator

Related topics