Hi,
I have a VDS that I expect to query and get results within a second, so I created the appropriate reflection to accelerate it.
But when I execute the query, I get the results in 10 seconds, and most of the time is spent in Logical Planning phase.
Why does it take so long to do this planning while this is a simple group by query where the fields are covered by an acceleration ?
NB : in order to get my results in an acceptable duration, it’s a pity I had to create a scratch version of my VDS that I put in an EFS storage ( CREATE TABLE $scratch.MyVDS_SCRATCH AS SELECT * FROM MyVDS), and I create a new VDS referencing this new parquet dataset.
The same query returns incredibly faster (in less than 300 ms total):
It looks like you probably have a lot of reflections matching. You can see this by looking at the Acceleration tab, and seeing not just which reflections matched, but how many times each one matched.
There is a known issue where having lots of reflection matches results in high Normalization and Logical Planning times. This is something we are working to improve, but I do not currently have timeline for this.
If you want, you can send me a private message with the profile, and I can take a look. Your planning times are actually higher than I’ve seen before, so there may be something else going on.
Hi, I’m facing the same issue (takes 52seg), but it occurs when I add where to an aggregate query, something like this:
SELECT COUNT(*)
FROM model WHERE “fecha” >= '2019-02-20 00:00:00’ AND “fecha” <= '2019-02-27 00:00:00’
When I run this without the “Where”, the query runs perfect and takes only some seconds, the logical planning takes 7s in this case.
@Rosmar_Torres, also, to understand this better, kindly turn on verbose query planning. You can do this by going to Admin–>Support, add the key, planner.verbose_profile, enable it, and save your changes.
When you rerun a query, the profile will contain a much richer set of information now.
Hi,
We deactivated most of the reflection and in some ocasions the SQL runs ok, also we reformulated somethings in the model and now runs with the “Where” condition. But We notice that this same problem occurs when It’s executes a DISTINC SQL… the only way that we found to resolve the problem was making a view with “group by” and adding a constant “1” to sum it (to imitated the “count”) and making a reflection upon the view and that make it work now.
@Rosmar_Torres, we are currently working on some fixes related to long planning times when many reflections are applicable.
With regards to:
We notice that this same problem occurs when It’s executes a DISTINCT SQL… the only way that we found to resolve the problem was making a view with “group by” and adding a constant “1” to sum it (to imitated the “count”) and making a reflection upon the view and that make it work now.
Are you seeing the same long logical planning phase? Is it considering a large number of reflections?
I face similar issue.
A suggestion , can dremio skip some part of the planning phase, like checking all the reflectiosn etcc, for second execution of the same query?. Assuming that it already know which is the best plan and cache that information.