I am running Dremio in standalone mode on an Ubuntu box. I ran a big query unioning several multi-million-row MySQL tables, and this seemed to cause Dremio to crash. I restarted Dremio with ./bin/dremio restart, but when I look at the running jobs, the job still shows as running. If I click “Cancel” I get “Unable to cancel job started on [hostname of Ubuntu box]. It may have completed before cancellation was request.”
Is there any way to cancel this, perhaps via the command line?
When you restart a Dremio node, all tasks that were running on that node are immediately stopped. What you are experiencing is most likely a bug in the jobs UI where we fail to detect that the query is no longer running.
It seems that the job is no longer running, but its final status was never recorded because of the crash. I don’t know if there is a way to update the job status in this case, which means the job will continue to show as Running in the jobs list even though it is not actually running anymore.
You can confirm it is no longer running by checking the CPU usage of the Dremio process, or by taking a thread dump with jstack.
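If it helps, here is a rough sketch of how a jstack dump could be scanned for threads that are still doing work. It assumes you have saved the output of jstack <pid> to a file; the function name runnable_threads and the name_filter parameter are made up for illustration, and I am not assuming anything about how Dremio names its query threads, so you would need to pick a filter string that fits what you see in your own dump.

```python
import re

# A jstack thread entry starts with the quoted thread name...
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# ...followed by an indented line giving the thread state.
THREAD_STATE = re.compile(r'^\s+java\.lang\.Thread\.State: (?P<state>\w+)')


def runnable_threads(dump: str, name_filter: str = "") -> list[str]:
    """Return names of RUNNABLE threads whose name contains name_filter.

    name_filter is a hypothetical convenience: an empty string matches
    every thread.
    """
    names = []
    current = None  # name of the thread entry being read, if any
    for line in dump.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group("name")
            continue
        state = THREAD_STATE.match(line)
        if state and current is not None:
            if state.group("state") == "RUNNABLE" and name_filter in current:
                names.append(current)
            current = None  # state consumed; wait for the next header
    return names
```

For example, after running jstack <pid> > dump.txt you could call runnable_threads(open("dump.txt").read()). Keep in mind that an idle JVM server normally has a few RUNNABLE threads (socket listeners and the like), so this is only a rough indicator and works best alongside the CPU usage check.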
Upon restarting, shouldn’t the UI mark all previous jobs as terminated or cancelled so it won’t be confusing to users? Maybe an enhancement?
I believe that should be the case now: upon restart of a coordinator, the job status should no longer be “Running”.