Job does not cancel at 60 seconds and never starts the reflection process

Hi Guys,

I would like to report an issue I hit while executing a query.

The job failed after 19 minutes with a timeout error saying that planning took longer than 60 seconds.

The strange thing is that it was not canceled at 60 seconds, as I would expect, but kept running for 19 minutes.
It is also strange that the job is marked as having found a reflection, yet it never starts to run.

Could someone shed some light on this?

Thanks & Regards
Federico

Hello @fmoine,

Can you attach the profile for the job to this ticket?

7d2026a2-2e77-45d8-bf13-bcd0cc6fa0fa.zip (137,0 KB)

Hi @Ben, see attached the profile.

Regards
Federico

@fmoine, the error message does not mean that Dremio terminates the query at the 60-second mark of planning, nor does Dremio check the state of planning every 60 seconds.

After some portion of the planning pipeline is completed, the planner reports back the time it took and if that value was longer than 60 seconds, the query is canceled. Most of the time, these planner phases take less than a second, but in some cases, like this one, they can run for a long time before completion.
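To illustrate why such a check cannot stop a query at the 60-second mark, here is a minimal sketch of a timeout that is only evaluated after a phase has finished. It is not Dremio's code: `run_phases`, `PlanningTimeout`, `FakeClock`, and the phase names are hypothetical, and the 60-second limit is taken from the error message in this thread. A fake clock stands in for real time so the example runs instantly.

```python
import time

# Assumed limit, taken from the error message discussed in this thread.
PLANNING_PHASE_LIMIT_SECONDS = 60.0


class PlanningTimeout(Exception):
    """Hypothetical error raised when a finished phase took too long."""


class FakeClock:
    """Manually advanced clock so the demo does not have to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def run_phases(phases, limit=PLANNING_PHASE_LIMIT_SECONDS, clock=time.monotonic):
    """Run each phase to completion, then compare its duration to the limit.

    Nothing interrupts a phase while it is running, so a single slow phase
    can exceed the limit by any amount before the check is reached.
    """
    for name, phase in phases:
        started = clock()
        phase()  # runs to completion; the limit is not consulted here
        elapsed = clock() - started
        if elapsed > limit:
            raise PlanningTimeout(
                f"{name} took {elapsed:.0f}s, limit is {limit:.0f}s"
            )


clock = FakeClock()
phases = [
    ("logical planning", lambda: clock.advance(0.5)),       # typical fast phase
    ("physical planning", lambda: clock.advance(19 * 60)),  # the slow phase
]

try:
    run_phases(phases, clock=clock)
except PlanningTimeout as err:
    print(err)  # physical planning took 1140s, limit is 60s
```

In this sketch the slow phase runs for its full 19 minutes of simulated time before the check fires, which matches the behavior reported above. A timeout that could cancel at 60 seconds would need a separate watchdog or a cooperative check inside the phase itself.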

The profile indicates that most of those 19 minutes were spent in physical planning. This looks similar to a bug we’ve fixed in an upcoming release. We can verify this if you run jstack against the Dremio process on the coordinator host while this query is stuck in planning, and attach the results here. jstack should already be installed if you have a Java 8 JDK:

    $ ps -ef | grep Dremio    # note the Dremio PID, then
    $ jstack <Dremio PID here> >> dremio-threads.txt