Is the complexity of Dremio joins O(N*M), O(N+M), or something else?

alaw · January 15, 2024, 8:17am

I would like to know how joins (e.g. LEFT JOIN, CROSS JOIN) are implemented. Is the complexity O(N*M)? Joins seem quite slow and use a lot of memory on large tables. Any insights would be welcome.

Benny_Chow · January 16, 2024, 1:37am

Most joins are implemented as shuffle or broadcast hash joins. Cross joins, however, always use nested loop joins. You can see which join implementation the cost-based optimizer chose by looking at the physical plan, which is in the query profile → planning tab. Do you have a profile?
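
To make the complexity difference concrete, here is a minimal single-threaded Python sketch of the two strategies. It is only an illustration of the general algorithms, not Dremio's actual implementation, which is vectorized and distributed. The function names `hash_join` and `nested_loop_join` and the tuple-based rows are assumptions made for the example.

```python
from collections import defaultdict


def hash_join(left, right, left_key, right_key):
    """Inner equi-join using a hash table (illustrative sketch only).

    Build phase: one pass over `right` (M rows).
    Probe phase: one pass over `left` (N rows).
    Cost is roughly O(N + M) plus the number of output rows.
    """
    table = defaultdict(list)
    for r in right:
        table[right_key(r)].append(r)

    out = []
    for l in left:
        # .get() avoids inserting empty buckets for keys with no match.
        for r in table.get(left_key(l), []):
            out.append((l, r))
    return out


def nested_loop_join(left, right, predicate):
    """Join by testing every (left, right) pair: always N * M comparisons."""
    out = []
    for l in left:
        for r in right:
            if predicate(l, r):
                out.append((l, r))
    return out


left = [(1, "a"), (2, "b"), (3, "c")]
right = [(1, "x"), (1, "y"), (3, "z")]

# Equi-join on the first column, computed both ways.
hashed = hash_join(left, right, lambda row: row[0], lambda row: row[0])
looped = nested_loop_join(left, right, lambda l, r: l[0] == r[0])

# A cross join has no join condition, so every pair is emitted.
crossed = nested_loop_join(left, right, lambda l, r: True)
```

Two things follow from the sketch. The build side of a hash join has to be held in a hash table, which is where much of the memory goes on large tables. And even a hash join can produce close to N*M output rows when many rows share the same key, so the join algorithm alone does not bound the cost.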
