I’m running the same query on two separate dremio installations (same version, same configuration) with very different results concerning execution time.
On neither system the query gets accelarated.
The query is a simple select with predicates:
select * from table where column_a=foo and column_b=bar
On System A:
The query gets pushed down to the database basically as is. Execution takes about 2 minutes.
On System B:
The query that gets pushed down to the database looks like this:
select * from table
Rows in the result set that do not match the predicates seem to be eliminated later. Execution takes about 20 minutes, that is 10 times longer.
Can you help me understand how this is happening and also is there a way to force the behavior of system a for system b?
I’ve attached the query profiles: system_a.zip (17,5 KB) system_b.zip (15,0 KB)
What kind of source is bdb? Is there any difference between the way you have configured that source between the the two Dremio installs?
I do see that you have reflections enabled on bdb.public.dokument_neu in the environment where the filters are pushdown, but not the other. These reflections are note matched, however. We may be able to understand better if you enable verbose profiles.
In each Dremio environment, can you go to Admin --> Support and add and enable the following parameter: planner.verbose_profile
Please rerun the query on each system and attach the results here.
@kelly@ben
dremio version: “3.1.1-201901281837360699-30c9d74”
source: postgresql data source
configuration including reflection store setup is the same on both systems
@ben
I configured the data source the same way on both environments.
There is an aggregate reflection in one environment but it is not used by the query in question. I already tried building the same reflection on the other environment but it did not change the execution plan.
Query profiles with planner.verbose_profile enabled: system_a_verbose.zip (20,7 KB) system_b_verbose.zip (18,0 KB)
This is perplexing, @oliver, because it appears that both queries initially cost the Postgres scan in the same way, so there’s not an obvious reason why one system would plan for the pushdown while the other would choose a filter in Dremio.
The profiles indicate that there is 1 difference between the two Dremio configurations, namely, results.max.age_in_days is set to 3 in the profile of the query with pushdown. I have no reason to believe this would affect the planning, but you could try making every support parameter exactly the same between the instances.
You could also try disabling the reflections in instance without the pushdown. Perhaps that is affecting the planning path in a manner that is not recorded in the profile.
Is there any other difference you can think of between these 2 systems?
Is there any other difference you can think of between these 2 systems?
I really missed a difference. On the environment where the query gets pushed down Enable legacy dialect is checked in the advanced options of the postgresql data source.
Checking this option on environment B will result in the query getting pushed down to the database just as on environment A.