Too much time planning

Hello, we are on the latest Dremio version and have one large, partitioned physical Iceberg table.

We have about 200 folders in the Application layer, and each folder contains 3 VDSs.
Example:

|- example.Application.Spaces.s66613f12842c9132256fd465
|---- posts (derived from the Iceberg table) (with raw reflection)
|---- only_posts (derived from the posts VDS above) (with aggregation reflection)
|---- only_comments (derived from the posts VDS above) (with aggregation reflection)

When we run a query against any of these VDSs, for some strange reason Dremio takes too long planning.

ddba9788-22ab-45de-b32b-d4c35aefc123.zip (122,3 KB)

I appreciate your help

@dacopan

Can you please turn on planner.verbose_profile, run the same query, turn it off again, and send the profile?
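
In case it helps, here is a rough sketch of that sequence. It assumes the support key can be set with ALTER SYSTEM from the SQL runner by an admin user (it can also be set in the UI under the support settings). The SELECT is only a placeholder built from the folder in the example above, so substitute the query that is actually slow.

```sql
-- Assumption: the support key can be set through ALTER SYSTEM (needs admin rights).
ALTER SYSTEM SET "planner.verbose_profile" = true;

-- Placeholder query: replace with the one that spends too long in planning.
SELECT *
FROM example.Application.Spaces.s66613f12842c9132256fd465.posts
LIMIT 10;

-- Turn it off again so other jobs do not produce oversized profiles.
ALTER SYSTEM SET "planner.verbose_profile" = false;
```

Then download the profile of that job and attach it here.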

The Acceleration tab shows that the planner is considering too many reflections when trying to match them into the query:

Substitution terminated after timeout of 30000 milliseconds.

Do you really need that many? The planner starts from every reflection defined on a view or table that overlaps with the query. A quick workaround is to use reflection hints to limit which reflections are considered or chosen.
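
As a rough illustration of what a hint could look like: the hint name and its placement right after SELECT are written from memory of the reflection hints documentation, so please verify them against your version. The reflection ID is a made-up placeholder, and the path is the folder from the example above.

```sql
-- Look up the IDs of the reflections defined on the VDS you are querying.
SELECT * FROM sys.reflections;

-- Assumed syntax: restrict the planner to the listed reflection only.
-- '<reflection-id-of-only_posts>' is a hypothetical placeholder, not a real ID.
SELECT /*+ CONSIDER_REFLECTIONS('<reflection-id-of-only_posts>') */ *
FROM example.Application.Spaces.s66613f12842c9132256fd465.only_posts;
```

With roughly 200 folders this means the client has to add the matching hint to each query, which is why it is only a quick workaround.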

Hello @Benny_Chow and @balaji.ramaswamy, yes, we need all these reflections because we have a VDS for each group, and our Superset client runs queries against the specific VDS of a group.
But if we run a query against a specific VDS, why does the Dremio planner consider the reflections of other VDSs?

Because both VDSs contain the same table. This is how algebraic matching works.

You can turn off join matching specifically by setting reflections.planning.algebraic_match to false. With that setting, the planner only matches reflections into queries that use the same views or the exact table. It skips the join matching part, which is probably what is timing out.
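
A minimal sketch, assuming the key can be changed with ALTER SYSTEM by an admin user (it can also be entered as a support key in the UI):

```sql
-- Assumption: settable via ALTER SYSTEM; this affects planning for all queries.
ALTER SYSTEM SET "reflections.planning.algebraic_match" = false;
```

After changing it, re-run one of the slow queries and check in the profile whether the substitution timeout message is gone.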
