A single query across different data sources (MySQL, Postgres, CSV) is taking a long time.
The data is small. However, all the data sources, along with Dremio, are on a single machine.
Is there a memory configuration parameter that needs to be set?
Can you share a profile of the job that is not performant?
Please find the job profile attached.
query_1.zip (12.7 KB)
Most of the time in this query is taken by the query that’s pushed down into the MySQL source:
SELECT *
FROM `banking_356`.`banking_customer`, `banking_356`.`cust_product_acc`, `banking_356`.`currency_master`
This cross join returns 23 million records. It’s not clear why your filters are not being pushed down. Let me get back to you on that, as there may be a way to rewrite the query so that the filter and join conditions are pushed down.
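
To illustrate why the pushed-down query is so expensive, here is a minimal sketch using Python's built-in `sqlite3` module rather than Dremio or MySQL. The table names come from the query above; the columns (`cust_id`, `acc_id`, `currency_id`), the join keys, and the row counts are assumptions made up for the illustration and are not taken from the job profile. It only shows that a comma join with no conditions returns the product of the table sizes, while the same tables joined on keys return far fewer rows.

```python
import sqlite3


def build_demo(n_customers=30, n_accounts=40, n_currencies=5):
    """Create three small in-memory tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE banking_customer (cust_id INTEGER PRIMARY KEY);
        CREATE TABLE currency_master (currency_id INTEGER PRIMARY KEY);
        CREATE TABLE cust_product_acc (
            acc_id INTEGER PRIMARY KEY,
            cust_id INTEGER,
            currency_id INTEGER
        );
        """
    )
    conn.executemany(
        "INSERT INTO banking_customer VALUES (?)",
        [(i,) for i in range(n_customers)],
    )
    conn.executemany(
        "INSERT INTO currency_master VALUES (?)",
        [(i,) for i in range(n_currencies)],
    )
    # Each account points at exactly one customer and one currency.
    conn.executemany(
        "INSERT INTO cust_product_acc VALUES (?, ?, ?)",
        [(i, i % n_customers, i % n_currencies) for i in range(n_accounts)],
    )
    return conn


def cross_join_rows(conn):
    """Row count for the comma join with no join conditions."""
    sql = (
        "SELECT COUNT(*) "
        "FROM banking_customer, cust_product_acc, currency_master"
    )
    return conn.execute(sql).fetchone()[0]


def keyed_join_rows(conn):
    """Row count when the join conditions are part of the query."""
    sql = (
        "SELECT COUNT(*) "
        "FROM cust_product_acc a "
        "JOIN banking_customer c ON c.cust_id = a.cust_id "
        "JOIN currency_master m ON m.currency_id = a.currency_id"
    )
    return conn.execute(sql).fetchone()[0]


if __name__ == "__main__":
    conn = build_demo()
    print("cross join rows:", cross_join_rows(conn))  # 30 * 40 * 5 = 6000
    print("keyed join rows:", keyed_join_rows(conn))  # one row per account = 40
```

With real table sizes the same multiplication is what produces millions of rows from small tables, which is why getting the join conditions into the pushed-down query matters more than any memory setting.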