Query plan spending lots of time in Dremio 4.0

Hello everyone.

I’m trying to optimize the total execution time in the following scenario:

PDSs* -> Base table (VDS) -> Derived table (VDS, more than 20 left joins in base table)
*PDSs are in parquet files located in S3

We’ve tested some combinations:

  1. Base table with reflection and Derived table with reflection
  2. Base table materialized with CTAS and Derived table with reflection

The total execution time is in the best combination (base table w/ CTAS and derived table w/ reflection is the following: query plan 17 and execution 5s = total 22s.

For my purpose, the response time goal must be much less than 10s, and I believe the query plan
is making it impossible. On the other hand, use CTAS for derived tables isn’t and option due to managing complexity.

b3f7f5bc-451c-4620-b354-c4cc1766455c.zip (115,8,KB)

Is there a way of reducing and optimizing the query plan?

Cheers,

Danilo from DataSprints

3 Likes

Hi there,

We haven’t found a solution for this yet.

Any tips, comments, best practices are really welcome.

Thank you.

1 Like

@dcmsdancosta

It looks like we ar spending time on the materialization (reflection replacement plans), to confirm that, kindly turn on verbose planning and send us the profile

admin-support-support key-“planner.verbose_profile”, show - enable - save

Also try to do the below and see if the planning time comes down?

admin-support-support key-“accelerator.enable_agg_join”-disable-save

@balaji.ramaswamy we’ve tested with your suggestions.

The verbose profile can be reached here: https://drive.google.com/open?id=1YGUw0P1Ewh-h1mEVtoAQQ5WYOpH-Zwlk

Thank you

1 Like

Hi Balaji,

Have you read the files that I’ve shared?

Cheers,

Danilo

1 Like

Hello,

Balaji, do you have any suggestions on that? Things still the same.

Thank you,

Danilo