Hi team,
We ran into an issue where we could not connect to the Dremio UI, so we restarted the service as follows:
cd ${DREMIO_HOME}/bin
./dremio stop
cd ${DREMIO_HOME}/bin
./dremio start
After that we could connect to the Dremio UI again, and we could see the different spaces and the views under them.
But when I ran a view, I got the error “The default engine is not online”.
About 20 minutes later the error no longer appeared and views ran normally. We would like to understand how to find the cause of the error “The default engine is not online”.
Are you using AWSE? If so, it could simply be that AWS takes a few minutes to procure and initialize the EC2 instances for the default engine.
Next time, go to Settings → Elastic Engines and see what the state of the default engine is.
There are also auto-start and auto-stop capabilities that you can configure in the Engine Settings page.
We are not using AWSE. From server.out I can see that after I started the Dremio service, Dremio started on our cluster right away and we could connect to the Dremio UI, but we got the error “The default engine is not online”. About 10 minutes later, server.out shows Dremio starting a second time, even though I did not trigger a service start:
“Mon Oct 18 03:34:33 UTC 2021 Terminating dremio pid 21169
Mon Oct 18 03:34:43 UTC 2021 Starting dremio on xxxxxxx”
After that, when we connected to the Dremio UI and ran views, the error no longer occurred. Can we find something in the logs that explains why this happened? From my side, after the first Dremio start I could not see our nodes in “Node Activity”; only our cluster (the master coordinator) was shown there. There was also no host running under “Elastic Engines” for the engine we use (we have two engines configured; one is in use and the other is not, which I assume does not affect us).
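One way to check whether the service was restarted without anyone triggering it is to pull the stop/start lines out of server.out. Below is a minimal sketch, assuming the file contains lines shaped like the two quoted above ("Terminating dremio pid ..." and "Starting dremio on ..."). The function names are made up for illustration, and the path in the usage comment is an assumption; point it at wherever your installation writes server.out.

```shell
# Hypothetical helpers (not part of Dremio): they only grep server.out
# for the two message shapes quoted in this thread.

# Print every stop/start event recorded in the given server.out file.
dremio_restart_events() {
  grep -E 'Terminating dremio pid|Starting dremio on' "$1"
}

# Print how many times the service was started according to that file.
dremio_start_count() {
  grep -c 'Starting dremio on' "$1"
}

# Usage (the path is an assumption, adjust it to your log directory):
#   dremio_restart_events "${DREMIO_HOME}/log/server.out"
#   dremio_start_count "${DREMIO_HOME}/log/server.out"
```

Comparing the printed timestamps with the times you restarted the service by hand shows which restarts were not manual. It does not explain why they happened, so the lines logged just before each "Terminating" entry are still worth reading.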
The same thing happened again today. Our cluster had a problem and was restarted; after that we started the MapR service and then the Dremio service. Again, after the first Dremio start we could connect to the UI but got the error “The default engine is not online”. After 10 minutes, I saw in server.out that Dremio had started again, without anyone triggering it. This time, however, the problem was still there after the automatic restart. About 30 minutes later we started Dremio manually, and that solved the problem. Is there anything we can use to track down the cause of this issue?
@dolphinlei Auto-start of an engine happens when a query is fired: the engine wakes up automatically at that point, not when the coordinator is started. Have you tried executing a query to see whether the engine comes up? Is auto-start enabled?