Hive queries taking more time, blocked on upstream

Hello Team,

Our Dremio HA cluster is behaving strangely with Hive queries.
Sometimes a query completes within 20 seconds, but sometimes the same query takes more than 20 minutes.

When I checked the job profile, I found the following under Thread Overview:
for Phase 0, the Min Blocked Time field shows 14 min and the minimum sleep time is 6 min.

Further, for Phase 0, the Blocked on Upstream field shows 14 min.

When I checked the detailed time summary of the job by operator, most of the time was taken by HIVE_SUB_SCAN, as shown below.

SqlOperatorImpl ID Type Avg Wait Time Max Wait Time
01-xx-03 HIVE_SUB_SCAN 29.576s 5m18s 13m12s

I am using the same Hive source with another, single-node Dremio cluster, and on that node I get stable performance.

Could you please help me find the root cause of the issue I am facing with the HA Dremio cluster?

Thanks,
Atul

@achounde

Can you supply profiles for the slow and fast runs? I would like to see the differences.

@achounde

Yes, sure. But could you please tell me how I can upload the profile here? I have already downloaded the profile but do not see any option to upload it.

Hello @achounde

Try dragging and dropping the file into the writing console.

Thanks,
@Rakesh_Malugu

@achounde

Most of your time is spent waiting for blocks to be read from IO (HDFS). If you expand the operator metrics under HIVE_SUB_SCAN and scroll to the right, do you see local and short-circuit reads? Are those columns populated? If not, then:
A) Validate that short-circuit reads are enabled in HDFS
B) Implement C3 cache on the Dremio side
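
For step A, here is a rough sketch of one way to check the HDFS client settings from a script. It assumes the standard HDFS property names `dfs.client.read.shortcircuit` and `dfs.domain.socket.path` in `hdfs-site.xml`. The `SAMPLE` content and the socket path in it are made up for illustration, so in practice you would read the real `hdfs-site.xml` used on your executor nodes instead.

```python
import xml.etree.ElementTree as ET


def read_hdfs_properties(xml_text):
    """Return a dict of property name -> value from hdfs-site.xml content."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def short_circuit_status(props):
    """Return (enabled, socket_path) for HDFS short-circuit local reads."""
    # A missing property is treated as disabled here.
    enabled = props.get("dfs.client.read.shortcircuit", "false").lower() == "true"
    socket_path = props.get("dfs.domain.socket.path", "")
    return enabled, socket_path


# Illustrative content only; replace with the text of your own hdfs-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.domain.socket.path</name>
    <value>/var/lib/hadoop-hdfs/dn_socket</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    enabled, socket_path = short_circuit_status(read_hdfs_properties(SAMPLE))
    print("short-circuit reads enabled:", enabled)
    print("domain socket path:", socket_path or "(not set)")
```

If the flag is off or the socket path is empty, that would fit with the local and short-circuit read columns being unpopulated in the profile, but the profile metrics are still the thing to confirm against.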