Hello everyone.
I’m trying to optimize the total execution time in the following scenario:
PDSs* -> Base table (VDS) -> Derived table (VDS, more than 20 left joins in base table)
*PDSs are in parquet files located in S3
We’ve tested some combinations:
- Base table with reflection and Derived table with reflection
- Base table materialized with CTAS and Derived table with reflection
The total execution time is in the best combination (base table w/ CTAS and derived table w/ reflection is the following: query plan 17 and execution 5s = total 22s.
For my purpose, the response time goal must be much less than 10s, and I believe the query plan
is making it impossible. On the other hand, use CTAS for derived tables isn’t and option due to managing complexity.
b3f7f5bc-451c-4620-b354-c4cc1766455c.zip (115,8,KB)
Is there a way of reducing and optimizing the query plan?
Cheers,
Danilo from DataSprints
3 Likes
Hi there,
We haven’t found a solution for this yet.
Any tips, comments, best practices are really welcome.
Thank you.
1 Like
@dcmsdancosta
It looks like we ar spending time on the materialization (reflection replacement plans), to confirm that, kindly turn on verbose planning and send us the profile
admin-support-support key-“planner.verbose_profile”, show - enable - save
Also try to do the below and see if the planning time comes down?
admin-support-support key-“accelerator.enable_agg_join”-disable-save
@balaji.ramaswamy we’ve tested with your suggestions.
The verbose profile can be reached here: https://drive.google.com/open?id=1YGUw0P1Ewh-h1mEVtoAQQ5WYOpH-Zwlk
Thank you
1 Like
Hi Balaji,
Have you read the files that I’ve shared?
Cheers,
Danilo
1 Like
Hello,
Balaji, do you have any suggestions on that? Things still the same.
Thank you,
Danilo