I have a performance bottleneck on reflection and query operations that run EXTERNAL_SORT steps. It happens on both small (a few GB) and larger (>100 GB) data. The processing rate of this step is extremely low, and SPILL_TIME_NANO accounts for all of the wait time.
My configuration is 3 executors with 48 GB / 8 cores each. Other resource configurations (e.g. 4 executors with 32 GB / 10 cores) show the same problem.
Cloud volumes are fast (sequential write ~700 Mb/sec). I tested sorting on the same field both directly on the raw Parquet data and on a raw reflection with "display" on all the fields concerned, and it is equally slow in both cases.
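As a rough sanity check on the numbers above, the spill time reported in the profile can be turned into an effective spill throughput and compared with what the volume can sustain. This is only a sketch: the function names and the sample values are hypothetical, and it assumes that the spilled byte count and SPILL_TIME_NANO are read from the same operator and that the volume speed is expressed in megabytes per second.

```python
def effective_spill_throughput_mb_s(spilled_bytes: int, spill_time_nano: int) -> float:
    """Effective spill rate in MB/s implied by a spill time in nanoseconds."""
    if spill_time_nano <= 0:
        raise ValueError("spill_time_nano must be positive")
    seconds = spill_time_nano / 1e9          # nanoseconds -> seconds
    return (spilled_bytes / 1e6) / seconds   # bytes -> MB, then MB per second


def min_spill_seconds(spilled_bytes: int, disk_mb_s: float) -> float:
    """Lower bound on spill time if the disk were the only limit."""
    if disk_mb_s <= 0:
        raise ValueError("disk_mb_s must be positive")
    return (spilled_bytes / 1e6) / disk_mb_s


if __name__ == "__main__":
    # Hypothetical values, NOT taken from my job profile:
    spilled = 5_000_000_000            # 5 GB written to spill files
    spill_nano = 600_000_000_000       # 600 s reported as spill time
    print(f"effective rate: {effective_spill_throughput_mb_s(spilled, spill_nano):.2f} MB/s")
    print(f"disk-bound minimum: {min_spill_seconds(spilled, 700.0):.2f} s")
```

If the effective rate comes out far below the measured sequential write speed, the time is probably not being spent in raw disk writes, which is what I am trying to understand.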
I changed the value of planner.slice_target (30000, 50000, 100000) without any improvement.
Using CTAS to sort the data the same way as the reflection ran into the same problem.
What can cause such a slow EXTERNAL_SORT?
Which logs can I use to investigate this problem?
Example of a raw reflection: raw display on 10 fields, sorted on 1 field (job killed to get the logs):
[Single Parquet file of 2.62 GB, Snappy compressed, 1 row group]
With a raw reflection “display” on all 10 fields to accelerate:
Directly on raw data: