Sometimes, for various housekeeping tasks,
we need to shut down our Dremio instance cleanly.
I’m looking for the correct procedure.
My system is in YARN deployment mode.
From what I understand, I need to:
- stop the workers
- stop Dremio via its systemd unit
- check the log files for any issues
- start Dremio via systemd
- start the workers and wait for them to be provisioned
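The log check in the list above could be sketched roughly like this in Python. It is only a minimal sketch: it assumes the log lines contain a level token such as ERROR, FATAL or WARN, and the function names are made up for illustration, not part of any Dremio tooling.

```python
import re
from pathlib import Path

# Assumption: each problem line carries a level token like ERROR, FATAL or WARN.
LEVEL_PATTERN = re.compile(r"\b(ERROR|FATAL|WARN)\b")


def find_problem_lines(lines):
    """Return (line_number, text) pairs for lines that mention a problem level."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if LEVEL_PATTERN.search(line)
    ]


def scan_log(path):
    """Scan one log file; undecodable bytes are replaced rather than raising."""
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        return find_problem_lines(handle)
```

Calling `scan_log()` with the path of the server log on the coordinator host (the location depends on the installation) right after the stop and again after the start would show whether the shutdown or startup logged anything suspicious.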
My question is: is there an API call or shell command to change the workers' status (stop/start)?
I'd also like to open a think-tank discussion about how to control and monitor this kind of infrastructure to prevent errors and stale states.
Thanks in advance,
I would suggest the below order for clean maintenance.
- Stop the YARN provisioning
- Shut down all Dremio executors
- Shut down Dremio coordinator nodes
- Do housekeeping/maintenance
- Start Dremio coordinators
- Start Dremio executors
- Start YARN provisioning
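To make the order harder to get wrong, the steps above could be wrapped in a small runbook script. This is only a sketch under assumptions: it assumes the coordinator runs under a systemd unit named `dremio` on the host where the script is executed, and it leaves the YARN and executor steps as manual prompts because no command is assumed for them. `MAINTENANCE_STEPS` and `run_steps` are hypothetical names.

```python
import subprocess

# Assumption: the coordinator is managed by a systemd unit with this name.
UNIT = "dremio"

# (description, command) pairs in the suggested order.
# A command of None marks a step that is done by hand.
MAINTENANCE_STEPS = [
    ("Stop the YARN provisioning", None),
    ("Shut down all Dremio executors", None),
    ("Shut down Dremio coordinator nodes", ["systemctl", "stop", UNIT]),
    ("Do housekeeping/maintenance", None),
    ("Start Dremio coordinators", ["systemctl", "start", UNIT]),
    ("Start Dremio executors", None),
    ("Start YARN provisioning", None),
]


def run_steps(steps, run=subprocess.run, confirm=input):
    """Walk through the steps in order, pausing on the manual ones."""
    for number, (description, command) in enumerate(steps, start=1):
        if command is None:
            # Manual step: wait until the operator confirms it is done.
            confirm(f"Step {number}: {description}. Do this manually, then press Enter. ")
        else:
            print(f"Step {number}: {description} -> {' '.join(command)}")
            # check=True aborts the runbook if the command fails.
            run(command, check=True)
```

Running `run_steps(MAINTENANCE_STEPS)` as a user allowed to manage the unit walks through the sequence. The `run` and `confirm` parameters are injectable so the order can be dry-run without touching the service.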
Thank you for your fast and complete reply.
Do you also know how to perform steps 2 and 6 (shutting down and starting the executors) via the API, dremio-admin, or other shell commands?
Currently, items (2) “Shut down all Dremio executors” and (6) “Start Dremio executors” can only be done through the YARN provisioning window of the Dremio UI. You could go to each host with a Dremio executor running, find the process ID of each executor, and kill those containers, but this is not advised.