Dremio jobs taking too long to finish/cancel

Hi guys.
Sometimes, when I edit a query, Dremio creates a job before allowing me to edit it. Sometimes this job takes forever to finish, and I’m not able to edit the query until it finishes. When I try to cancel it, Dremio says that a cancellation request was sent, but the job is not canceled. Only when I restart the executor/coordinator node does the job finish.

Is that a bug? Is there some workaround to make it stop?
I’m using Dremio 3.0.6

Hi @Paulo_Vasconcellos

We have made some UI enhancements in the upcoming 3.1 release, which should be out very shortly. We have decoupled editing the SQL from the preview completing. This is especially useful when previewing a fairly complex VDS: you do not have to wait for the preview to complete before editing the SQL. Kindly let us know how it goes once you upgrade.

We are planning more enhancements to the SQL editing experience and faster previews in later releases.

Thanks
@balaji.ramaswamy

That’s awesome, @balaji.ramaswamy. Thank you so much for that. I’ll wait for 3.1 :wink:

@balaji.ramaswamy This problem persists even with Dremio 3.1.0. After pressing the job cancel button, it takes too long to actually cancel the job :sleepy:.

@pollyannaogoncalves

What @Paulo_Vasconcellos was experiencing is different: there, the preview was taking too long to load the SQL. The preview experience is much better now. In your case, I guess it is canceling queries that takes time? Does this happen frequently? Is it always the same query? Has this query been running for a long time?

Thanks
@balaji.ramaswamy

In fact, the problem was the same one reported by @Paulo; I could even use the same words to report it :sweat_smile: “Sometimes this job takes forever to finish, when I try to cancel it, Dremio says that a cancellation request was sent, but the job is not canceled. Just when I restart the executor/coordinator node the job is finished.”

Same problem here. Using version 3.1.6.

@allan.sene, we’ve made a number of changes to query execution and cancellation since 3.1.6. It would be worthwhile to upgrade to our latest 4.0 version.

The same issue persists in 4.1 as well. I am seeing it while canceling a reflection refresh job: it says the cancellation is submitted, but the job actually keeps running forever.

@balaji.ramaswamy I am using 4.1 and facing this issue. When I cancel the job (a preview job in my case), I see “Job cancellation request sent”, but the job is still not canceled. How long does it take to cancel a job?

@balaji.ramaswamy

I am also still experiencing this issue: jobs run for days and never cancel automatically.

In Jobs it is still running.
In Jobs > Profile > Query it says ‘Cancelled’.
When using a GET request at api/v3/job/ it is still marked as ‘running’.
When I POST a request at api/v3/job//cancel, I get a 204, but nothing ever cancels.

Version of Dremio I am running: 4.0.4.
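
For anyone who wants to reproduce the REST behaviour described above, here is a rough sketch of the two calls: it sends the cancel request, then polls the job status until it reaches a terminal state. Several things here are assumptions rather than details from this thread: the `DREMIO_URL` and `DREMIO_TOKEN` environment variables and the job id argument are placeholders you supply, the token is assumed to go in an `_dremio<token>` Authorization header, the status JSON is assumed to carry a `jobState` field, and `COMPLETED`, `CANCELED` and `FAILED` are assumed to be the terminal states. Please check these against the API documentation for your version.

```shell
#!/bin/sh
# Sketch: cancel a Dremio job over REST, then poll its state.
# Assumptions (not from the thread): DREMIO_URL, DREMIO_TOKEN and the job id
# are supplied by the caller; the header format and the "jobState" field
# should be checked against the docs for your Dremio version.

# Build the job status URL for server $1 and job id $2.
job_url() {
  printf '%s/api/v3/job/%s' "${1%/}" "$2"
}

# Build the cancel URL for the same job.
cancel_url() {
  printf '%s/cancel' "$(job_url "$1" "$2")"
}

# Read a status JSON document on stdin and print its jobState value.
job_state() {
  sed -n 's/.*"jobState"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Run the network calls only when a job id is passed as the first argument.
if [ "$#" -ge 1 ]; then
  base="${DREMIO_URL:-http://localhost:9047}"
  auth="Authorization: _dremio${DREMIO_TOKEN}"

  # Send the cancellation request and print the HTTP status code.
  curl -s -o /dev/null -w 'cancel HTTP status: %{http_code}\n' \
    -X POST -H "$auth" "$(cancel_url "$base" "$1")"

  # Poll once per second, for up to a minute, until a terminal state appears.
  i=0
  while [ "$i" -lt 60 ]; do
    state="$(curl -s -H "$auth" "$(job_url "$base" "$1")" | job_state)"
    echo "jobState: ${state:-unknown}"
    case "$state" in
      COMPLETED|CANCELED|FAILED) break ;;
    esac
    i=$((i + 1))
    sleep 1
  done
fi
```

Saved under a hypothetical name such as `cancel_job.sh`, it would be run as `DREMIO_TOKEN=... sh cancel_job.sh <job-id>`. If the loop keeps printing a non-terminal state after the cancel call returns 204, that matches the behaviour reported in this thread.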

@yeshreddy

As in other databases, canceling a job is only a request to cancel. It would help if we could get the server.log from the executors and, if possible, some jstacks from the executor to see whether the query is stuck on something. To take the jstacks, log in to the executor as the same user that runs the Dremio process and run the script below. Send us the server.log and the jstack outputs captured while the problem is actually happening.

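# DREMIO_PID is a placeholder: set it to the process ID of the Dremio JVM first.
for i in $(seq -w 3 1 300)
do
  # Write a thread dump with lock information to a zero-padded, numbered file.
  jstack -l "$DREMIO_PID" > "ThreadDump$i.txt"
  # Wait one second between dumps.
  sleep 1
done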