Job cancellation requested failure

Hi there,

I’m having trouble cancelling some long-running jobs: when I try to cancel them from the Jobs UI, I get the "Job cancellation requested" notification, but then nothing happens and the job keeps running.
I have to restart the master pod to get rid of them.

Our Dremio is running on a K8s cluster. The release info is:
Build 17.0.0-202107060524010627-31b5222b
Edition Community Edition
Build Time 06/07/2021 07:41:44
Change Hash 31b5222bbf667ebdfb277c2c1b727eb9943083d3
Change Time 05/07/2021 22:51:06

Is there another way to cancel them?

Also, I can see in the documentation that a query runtime limit can be set, but I do not know how to set it. Is it possible to do that?

Thanks,

Luc

@lucbaro The query runtime limit can be set under the WLM queue settings.

If the query is not cancelling, setting a runtime limit may not help. We need to find out which fragment is still running.
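To check from outside the UI whether a job is really still running, and to re-send the cancel request, something like the following could be tried. It is only a sketch: it assumes the Dremio REST API is reachable on the coordinator, and it relies on the login (`POST /apiv2/login`), job status (`GET /api/v3/job/{id}`) and cancel (`POST /api/v3/job/{id}/cancel`) endpoints. The base URL, credentials, job ID and helper names are placeholders made up for illustration. The cancel call most likely goes through the same path as the UI button, so it may not help if a fragment is stuck.

```python
import json
import urllib.request


def login_request(base_url, user, password):
    """Build the login request (POST /apiv2/login). Helper name is hypothetical."""
    body = json.dumps({"userName": user, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/apiv2/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def job_request(base_url, token, job_id, cancel=False):
    """Build GET /api/v3/job/{id}, or POST /api/v3/job/{id}/cancel if cancel=True."""
    url = f"{base_url}/api/v3/job/{job_id}" + ("/cancel" if cancel else "")
    return urllib.request.Request(
        url,
        data=b"" if cancel else None,
        headers={
            # Dremio expects the login token prefixed with "_dremio".
            "Authorization": f"_dremio{token}",
            "Content-Type": "application/json",
        },
        method="POST" if cancel else "GET",
    )


def send(request):
    """Send a prepared request and decode the JSON response (empty body -> {})."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        payload = resp.read()
    return json.loads(payload) if payload else {}


# Example usage (placeholders, not run here):
#   base = "http://dremio-master:9047"
#   token = send(login_request(base, "admin", "secret"))["token"]
#   print(send(job_request(base, token, "<job-id>"))["jobState"])
#   send(job_request(base, token, "<job-id>", cancel=True))
```

If `jobState` still shows the job as running some time after the cancel call, that points back to a stuck fragment rather than a UI problem.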

In the job profile, expand each phase and look at the status of each thread. If you find any that are RUNNING, log on to the executor it was running on and see if that executor's logs and GC logs have anything to tell us, such as Full GC events or ZK being suspended.
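A quick way to sift through the executor logs for those events could look like the sketch below. The search patterns are assumptions, not something confirmed in this thread: `Full GC` / `Pause Full` are how HotSpot GC logs usually mark full collections, and `State change: SUSPENDED` / `LOST` is what the Curator ZooKeeper client typically logs. Adjust them to whatever your logs actually contain. The function names are made up for illustration.

```python
import re

# Assumed patterns; tune these to the actual log format on your executors.
PATTERNS = {
    "full_gc": re.compile(r"Full GC|Pause Full"),
    "zk_suspended": re.compile(r"State change: (SUSPENDED|LOST)"),
}


def scan_lines(lines):
    """Return {pattern_name: [(line_number, line), ...]} for matching lines."""
    hits = {name: [] for name in PATTERNS}
    for lineno, line in enumerate(lines, start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append((lineno, line.rstrip("\n")))
    return hits


def scan_file(path):
    """Scan one log file; undecodable bytes are replaced rather than raising."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return scan_lines(handle)
```

Running `scan_file` on the executor's server log and GC log (copied out of the pod, for example) and comparing the timestamps of the hits with the time the cancel was requested should show whether the executor was paused or disconnected at that moment.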

Thanks @balaji.ramaswamy,

Are WLM settings available in Enterprise Edition only?
I can only see the Queue timeout setting under Dremio Admin > Settings > Engines > Queue Control,
but no query runtime limit…

@lucbaro My bad, yes, WLM is Enterprise Edition only.