Out of Memory / Dremio Configuration

Hello,

I am having some trouble with performance in my data pipeline using Dremio. Queries are very slow and jobs often run out of memory.

I built a data pipeline for a client that essentially moves a small number of files (< 50 files, each on the order of kilobytes) every 5 minutes to Azure Blob Storage, which Dremio points at. The files are placed in directories in Azure that reflect the date they were processed, and we surface those blobs in Dremio, including the date directories. We then use dbt to transform the blobs into consumable datasets, also in Dremio.
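To give a sense of the transformation step, a typical model is roughly like the sketch below (the model, source, and column names are made up for illustration; Dremio exposes each date directory level as a dir0/dir1-style column):

```sql
-- models/staging/stg_events.sql  (hypothetical model, source, and column names)
{{ config(materialized='table') }}

select
    event_id,
    event_type,
    event_timestamp,
    dir0 as load_date   -- Dremio surfaces the date directory as a dir0 column
from {{ source('azure_raw', 'events') }}
```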

Over the course of a year+, the files have piled up. In some of the datasets we have 100MM+ rows.

The queries we run are not complex, but given the number of rows, they can be large.

We are using the on-prem version 22.1.1-202208230402290397-a7010f28. I don't know the exact system configuration, but I think the executors are running on either 4-core x 32 GB or 8-core x 64 GB VMs, and we have three of them.

I am also using reflections, and they are being hit, but queries are still very slow and memory usage is high. I assume reflections don't help with memory, only speed.

Is this a use case that Dremio should be able to handle? Why am I getting OOMs? I attached a profile of one of them.

The query cost is really high, so I assume we are just exceeding the footprint we gave it. Is there a formula for sizing Dremio based on the average cost of the jobs?

Thanks,
Drew
ab007763-f120-40e3-8d71-a1ac8ad0ce9b.zip (105.7 KB)

@drew Reflections will help with both performance and memory if the reflection itself is smaller, which depends on the type of reflection and the display columns selected.

I see there were other queries running at the same time, for example Job ID 1c93d44d-474c-e4b7-3f1d-c477d90ca000.

Are you able to run just this query by itself so we can see how much memory it takes? We can then analyze the operators to see which specific operator is blowing up the memory and come up with ideas.

It is the HASH_AGGREGATE that is using up all the memory in phases 4, 5, and 7. I do see some spilling happening. Let me troubleshoot this further, but meanwhile, see if you can run only this query at a time when the system is not otherwise in use and send the profile over.

@balaji.ramaswamy Thank you. I will try to run it tonight when there should be less activity. It’s a production system and we don’t have an isolated instance to test against.

RE: Reflections. If “display columns” means the columns returned in the query, then we can't really trim them down, as the application needs all of them at this stage of the pipeline. I am building a raw and an aggregate reflection for each of our base staging tables to avoid going back to the raw files for every query. Unfortunately, every reflection seems to fail periodically. In the sys.reflections table they are all marked CANNOT_ACCELERATE, and the cause of the errors seems to be memory. I set the concurrency of the queue for large reflections to 1, but they still fail. I am not sure if there is a way to specify when reflections run, or whether we can pin all reflections and/or large jobs to a specific queue to isolate them from regular jobs.
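For reference, this is roughly how I'm checking the reflection status (the exact set of columns in sys.reflections varies by Dremio version, so I'm just filtering on the status column):

```sql
-- List reflections that are currently unable to accelerate queries.
-- Column set in sys.reflections differs between Dremio versions, hence SELECT *.
SELECT *
FROM   sys.reflections
WHERE  status = 'CANNOT_ACCELERATE';
```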

@drew Is using only aggregation reflections an option? I see your data is in Azure Storage and, from the scan, it is Parquet, ORC, or Avro. If you are selecting all columns for your raw reflection and your reflections are also stored in Azure, I am wondering why going back to the raw data is expensive. Is it too many small files? Basically, just build aggregation reflections on the semantic layer, on the VDS with the joins and aggregations, so that the dashboard query, which usually has measures and dimensions, uses the aggregation reflection.
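Something along these lines, for example; the VDS path and columns below are only placeholders, and the exact ALTER DATASET syntax should be checked against the docs for your version:

```sql
-- Placeholder VDS path and column names; adjust to your semantic-layer VDS.
ALTER DATASET "Semantic Layer"."vds_daily_positions"
CREATE AGGREGATE REFLECTION agg_daily_positions
USING
  DIMENSIONS (trade_date, instrument_id)
  MEASURES   (quantity (SUM), market_value (SUM, COUNT));
```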

We have lots of small (KB-sized) files in Azure that are surfaced through Dremio. We had issues running queries that ultimately hit the raw files. I forget the exact error off the top of my head, but it indicated that the problem was related to hitting too many files. We also started getting the error “Max Rel Metadata call count exceeded”.

Our pattern is to create a staging table in Dremio (stored in $scratch, not back in Azure) that mirrors the table that sits on top of the raw blobs. Then we add reflections (raw and aggregate) to each of those staging tables so that downstream tables/views/queries hit reflections instead of the raw blobs. This is not ideal, since the data is stale until the reflection refreshes, but we can live with it.
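Concretely, the pattern is roughly the following sketch (the Azure source path and the display columns are illustrative; the staging table and reflection names follow the naming we use):

```sql
-- Stage the raw blobs into $scratch so downstream queries avoid the tiny files.
-- Source path and columns below are illustrative placeholders.
CREATE TABLE "$scratch".stg_h_instrument AS
SELECT *
FROM   AzureBlobs."raw"."h_instrument";

-- Add a raw reflection on the staging table; downstream views read this instead
-- of going back to the raw blobs.
ALTER DATASET "$scratch".stg_h_instrument
CREATE RAW REFLECTION refl_stg_h_instrument__raw
USING DISPLAY (instrument_id, instrument_name, updated_at);
```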

We have since refactored some of our downstream queries, so the query profile we sent earlier in this thread is no longer in play. Given that reflections run on a schedule I cannot control, it is difficult to find a window to run a query with nothing else going on.

I attached a profile (e9706607-45ae-4a5c-8250-7ee0a7adda4f) of a “refresh reflection” job that took over an hour to complete. It did not run out of memory, although the profile did indicate “insufficient memory” in one of the attempts. As I mentioned, I set the large-jobs reflection queue to a concurrency of 1, so they should not run in parallel.

I also attached a profile (f5ce39a5-7921-4a47-91cd-6349ae284c06) of a reflection that failed due to running out of memory.

Are we just pushing Dremio to its limits?

e9706607-45ae-4a5c-8250-7ee0a7adda4f.zip (82.3 KB)
f5ce39a5-7921-4a47-91cd-6349ae284c06.zip (63.5 KB)

@drew The one that succeeded was accelerated using the reflection refl_stg_h_instrument__raw on the dataset stg_h_instrument.

If you look at the job that failed with an OOM, you will see it was not accelerated.

The one that was accelerated initially failed in the HashAgg (on 36 columns), but Dremio retried with a StreamAgg, which is less memory-intensive, and it completed.