Query Pushdown in Joins

HLNA · August 26, 2018, 7:02am

I tested a scenario with two tables (in RedShift & MySQL) joined on a key to return single row. Both of the following queries involved full-table scan. Are we supposed to make any settings?

Qry-01:
SELECT a.*
FROM ds_redshift a JOIN ds_mysql b ON a.id = b.id
WHERE a.id = 123 and b.id = 123

Qry-02:
SELECT a.*
FROM ds_redshift a JOIN ds_mysql b ON a.id = b.id
WHERE a.id = 123

desidero · August 27, 2018, 4:16am

Do you have the query profiles? If so, can you attach them?

jduong · August 27, 2018, 4:28am

@HLNA

This is a known issue that will be fixed in an upcoming release. In the mean time, would you be able to restructure your query so that the filters explicitly happen before the join using sub-queries?

Eg for query 1:
SELECT a.* from (SELECT * from ds_redshift where id = 123) a join (SELECT * FROM ds_mysql where ID = 123) b ON a.id = b.id

query 2:
SELECT a.* from (SELECT * FROM ds_redshift where id = 123) a join ds_mysql b on a.id = b.id

HLNA · August 27, 2018, 5:57am

@jduong I don’t have it handy… but, the query I shared is exactly the one being executed. I confirmed on both DB Server consoles that full table scan is happening. While I will try to fetch query profile next time, I am not sure why full table scan is initiated for this query as it it not complex and the predicate is clearly on a single id column

jduong · August 27, 2018, 6:48am

The reason is that the join won’t get pushed down (since this is across two different sources), but the planner is not pushing the filter to each join operand (this is being fixed).

HLNA · August 27, 2018, 6:43pm

Thanks @desidero and @jduong

I am eagerly waiting for 2.1. Will test it.

HLNA · August 27, 2018, 10:46pm

hi @jduong I tested with sub-queries on the same large tables, which returned “No Data”. I created two test tables (across two datasources) and they returned the result (ds_redshift+ ds_mysql worked fine, which is test table with 3 or 4 records)
26%20PM

sudhko · September 5, 2018, 6:12am

I think the later query works, because the key in join operation happens to the first member of the select…this took us few iterations to figure out Not sure if this is getting addressed in the upcoming 2.1 release?

Rahul · July 29, 2019, 3:11pm

Dears any update on this?

Topic		Replies	Views
Predicate Pushdown	4	2006	August 21, 2019
How to force join pushdown?	4	2443	November 6, 2020
Some advice on the single table query （Elasticsearch）	3	1349	January 30, 2018
SQL joins performance	8	4127	February 16, 2018
Dremio SELECT * on joins	2	5218	November 6, 2018

Query Pushdown in Joins

Related topics