Join performance

comphead · November 2, 2019, 11:37pm

Hi

I can see that the join operation takes ~80% of the overall query time, and during the run only 1 executor is really busy. Could it be that the join is not parallelized and is running on 1 executor only?

Can I find this out from the profile information?

capochiani · November 2, 2019, 11:44pm

Dear @comphead,

Have you tried looking at the data from the Hash Aggregate Operator Metrics?

comphead · November 3, 2019, 7:22pm

Hi @capochiani, thanks for the reply. Yes, I tried to use that kind of information.

I could see num_hash_partitions there, but it's still unclear whether that means 1 partition per executor or 1 partition per CPU on the same executor.
