Query exceeded memory limits

I’m having issues with some reflections crashing with the error: “Query was cancelled because it exceeded the memory limits set by the administrator.”

Can you please help me troubleshoot what is going on?

Attached query profiles:
tc_operator_work_history_rep.zip (145.5 KB)
mtms_service_request_items.zip (78.3 KB)
This one is very odd, since it is a VDS with only two columns and only 8 days’ worth of data, so I don’t understand how it can exceed the memory limit:
gt_latency_check.zip (101.6 KB)

Thank you very much for the help.

@martinocando

For the last query, there were a few other queries running on Dremio when this one went OOM. If you add up the actual values of all the query allocators, they should sum up to the actual on the ROOT allocator; if they don’t, add up the peak values instead. You can check the other two jobs the same way (see the parsing sketch after the dump below).

Allocator(ROOT) 0/23622266368/23636203136/23622320128 (res/actual/peak/limit) numChildAllocators:12

Allocator(query-1fff9cc4-46b1-4e26-ee5a-69fc6ad88800) 0/8953235264/9284486040/15032385536 (res/actual/peak/limit) numChildAllocators:1
Allocator(query-1fff9b9b-caaa-dac0-acc4-66042611f200) 0/5384800192/5388537472/15032385536 (res/actual/peak/limit) numChildAllocators:1
Allocator(query-1fff3681-14f6-0ee4-7535-023ce98d2b01) 0/5091755792/6536967236/15032385536 (res/actual/peak/limit) numChildAllocators:37
Allocator(query-1fff9bdd-55e8-412f-fca4-ed27da448500) 0/1969635200/2324823808/15032385536 (res/actual/peak/limit) numChildAllocators:5
Allocator(query-1ffe5d80-bb9d-515c-92b2-11ee7d38b600) 0/1222837104/1223059368/15032385536 (res/actual/peak/limit) numChildAllocators:7
Allocator(query-1ffe5d7f-e8a2-8927-c841-fc347c0e4400) 0/559598656/654668096/15032385536 (res/actual/peak/limit) numChildAllocators:4
Allocator(query-1fff2bf8-b471-3721-a6c6-6c9810845a00) 0/65536/623976312/15032385536 (res/actual/peak/limit) numChildAllocators:1
Allocator(query-1ffecbed-acf5-74c2-d4f0-4b1182fb0900) 0/65536/592038040/15032385536 (res/actual/peak/limit) numChildAllocators:1
Allocator(query-1fff16ec-3799-54d7-ce7e-8ae913c90400) 0/65536/592056144/15032385536 (res/actual/peak/limit) numChildAllocators:1
Allocator(query-1fff0d69-d734-9cb7-3753-0b89f817ae00) 0/49152/511220424/15032385536 (res/actual/peak/limit) numChildAllocators:1
Allocator(query-1ffe5d7f-6ce8-4d92-2df6-a4c2d96ace00) 0/334629056/334814400/15032385536 (res/actual/peak/limit) numChildAllocators:2
Allocator(query-1fff3644-eb34-831c-9fde-6edcc8060500) 0/32768/600452360/15032385536 (res/actual/peak/limit) numChildAllocators:2
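For reference, here is a minimal sketch (Python; the regex and field names are assumptions based on the dump format above, not a Dremio tool) that sums the per-query actual and peak values so you can compare them against the ROOT allocator:

```python
import re
import sys

# Sum per-query allocator usage from a dump like the one above and compare it
# to the ROOT allocator. Assumes each line looks like:
#   Allocator(name) res/actual/peak/limit (res/actual/peak/limit) numChildAllocators:N
LINE_RE = re.compile(
    r"Allocator\((?P<name>[^)]+)\)\s+"
    r"(?P<res>\d+)/(?P<actual>\d+)/(?P<peak>\d+)/(?P<limit>\d+)"
)

def summarize(dump_text: str) -> None:
    root = None
    children = []
    for m in LINE_RE.finditer(dump_text):
        entry = {
            "name": m.group("name"),
            "actual": int(m.group("actual")),
            "peak": int(m.group("peak")),
        }
        if entry["name"] == "ROOT":
            root = entry
        else:
            children.append(entry)

    actual_sum = sum(c["actual"] for c in children)
    peak_sum = sum(c["peak"] for c in children)
    print(f"ROOT actual           : {root['actual']:,}")
    print(f"sum of children actual: {actual_sum:,}")
    print(f"sum of children peak  : {peak_sum:,}")

if __name__ == "__main__":
    summarize(sys.stdin.read())
```

Usage would be something like `python summarize_allocators.py < allocator_dump.txt`, where `allocator_dump.txt` is a hypothetical file containing the lines pasted above.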

To see how much memory each node has consumed, go to the Jobs page and click on the job. In the right pane of the window, click on Profile, scroll down to Nodes, and expand the Nodes section.

Thanks, Balaji.
That is very useful information to help us troubleshoot our cluster. Overall, there were a lot of queries and reflections failing with the same error. Then one of the nodes got into a wedged state and had to be rebooted, and once that happened, all the jobs that had been failing started to run without issues.
Apparently we have some kind of memory leak.

Any advice on how we can troubleshoot this? Maybe look into some logs?

Thanks

@martinocando

It seems like you are running on 4 executors, so either reduce concurrency or increase the number of executors. One other thing you can do is upgrade to the latest version, and if the issue happens again, look at the server.gc logs to see why there was a Full GC. That will tell us whether this is just a capacity issue.
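As a starting point, here is a minimal sketch for scanning a GC log for Full GC events (the log file name, its location, and the exact “Full GC” wording depend on your JVM version and GC flags, so treat these as assumptions):

```python
import re
import sys

# Scan a JVM GC log and print the lines that record Full GC events, so you can
# see how often they happen. Pass the log file as the first argument (e.g. the
# server.gc file from the executor's log directory); the "Full GC" marker is
# the usual HotSpot wording, adjust the pattern if your GC flags differ.
FULL_GC_RE = re.compile(r"Full GC", re.IGNORECASE)

def report_full_gcs(path: str) -> None:
    count = 0
    with open(path, errors="replace") as gc_log:
        for line in gc_log:
            if FULL_GC_RE.search(line):
                count += 1
                print(line.rstrip())
    print(f"\n{count} Full GC event(s) found in {path}")

if __name__ == "__main__":
    report_full_gcs(sys.argv[1])
```

Run it against the executor’s GC log, e.g. `python full_gc_report.py server.gc`; frequent Full GCs that reclaim little memory would point at a capacity problem or a leak.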

Thanks
Bali

I had this same problem with part of my reflections, and in my case it was due to Elasticsearch field nesting. I compared them to the other reflections that were still working, and the difference was two nested fields in Elasticsearch (e.g. aaa.bbb) that were not present in the working reflections. I unnested these fields and that solved my problem.