I did a fresh install of Dremio Community Edition (dremio-community-4.0.1-201909191652190301-211720e) on my Mac.
I created some sources (MySQL, Excel).
I can see the schemas and tables, but every query fails after running for a long time (about 1 minute) with this error: Error setting up remote fragment execution
Any idea what the issue is?
I haven’t been able to run a single query so far. I have downloaded the query profile, but who do I send it to, given that I am using the Community Edition?
It looks like there was a problem with node “au10739”, which is acting as both coordinator and executor. Please check the Dremio logs on that server for error messages. Also look at /var/log/messages or dmesg for any errors.
at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:225) ~[curator-client-2.12.0.jar:na]
at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:94) ~[curator-client-2.12.0.jar:na]
at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:117) ~[curator-client-2.12.0.jar:na]
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:835) ~[curator-framework-2.12.0.jar:na]
at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809) ~[curator-framework-2.12.0.jar:na]
at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64) ~[curator-framework-2.12.0.jar:na]
at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267) ~[curator-framework-2.12.0.jar:na]
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[na:na]
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[na:na]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[na:na]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[na:na]
at java.base/java.lang.Thread.run(Thread.java:835) ~[na:na]
2019-09-25 20:15:16,406 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to RECONNECTED
2019-09-25 20:15:16,432 [Curator-ServiceCache-0] INFO c.d.s.c.TaskLeaderStatusListener - New Leader node for task MASTER au10739:45678 registered itself
It looks like you lost the connection to ZooKeeper. Does this happen every time? What is the total RAM on your Mac?
8 GB, and I haven’t had a successful query via Dremio yet. The MySQL database I am connecting to via Dremio runs in a Docker container. I can connect to the database directly via MySQL Workbench and query it without any problem.
I’m seeing the same problems with our Linux tar install after we upgraded to 4.0.0. We didn’t change anything in the configuration file. Are there new defaults or settings in 4.0.0 that the upgrade process may have missed?
Eventually the Master Coordinator loses touch with ZK and the Dremio server crashes.
server.log:2019-09-25 11:47:41,571 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to LOST
server.log:2019-09-25 11:47:41,581 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to RECONNECTED
server.log:2019-09-25 11:49:39,098 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to SUSPENDED
server.log: at com.dremio.service.coordinator.zk.ZKClusterClient$1$1.call(ZKClusterClient.java:233) [dremio-services-coordinator-4.0.1-201909191652190301-211720e.jar:4.0.1-201909191652190301-211720e]
server.log: at com.dremio.service.coordinator.zk.ZKClusterClient$1$1.call(ZKClusterClient.java:217) [dremio-services-coordinator-4.0.1-201909191652190301-211720e.jar:4.0.1-201909191652190301-211720e]
server.log:2019-09-25 11:50:22,846 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to LOST
server.log:2019-09-25 11:50:22,861 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to RECONNECTED
server.log:2019-09-25 11:51:09,625 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to SUSPENDED
I see your coordinator is constantly losing ZK connectivity:
> 2020-02-24 02:02:00,838 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to SUSPENDED
> 2020-02-24 02:02:02,726 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to LOST
> 2020-02-24 02:02:02,748 [Curator-ConnectionStateManager-0] INFO c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to RECONNECTED
This can be caused by either of the following:
#1 Your ZooKeeper is unstable. Check the ZK logs.
#2 Your coordinator is constantly doing garbage collection. Check the server.gc logs in your coordinator’s log folder for the same time period.
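To compare against the ZK or GC logs "for the same time", it can help to first pull the exact disconnect windows out of server.log. Below is a minimal Python sketch, assuming the log lines look like the ones quoted in this thread. The helper names `zk_state_changes` and `outage_windows` are made up for illustration and are not part of Dremio. The GC log format depends on your JVM settings, so the script only produces the time windows and leaves the comparison to you.

```python
import re
from datetime import datetime

# Matches lines like the ones quoted in this thread, with or without a
# leading "server.log:" (grep output) or "> " (forum quote) prefix.
STATE_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"ZK connection state changed to (\w+)"
)


def zk_state_changes(lines):
    """Yield (timestamp, state) for every ZK connection state change."""
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            yield ts, match.group(2)


def outage_windows(changes):
    """Pair the first SUSPENDED/LOST of an outage with the next RECONNECTED."""
    windows, start = [], None
    for ts, state in changes:
        if state in ("SUSPENDED", "LOST") and start is None:
            start = ts  # outage begins at the first bad state
        elif state == "RECONNECTED" and start is not None:
            windows.append((start, ts))
            start = None
    return windows


# Sample input: the three lines quoted in the reply above.
sample = [
    "> 2020-02-24 02:02:00,838 [Curator-ConnectionStateManager-0] INFO "
    "c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to SUSPENDED",
    "> 2020-02-24 02:02:02,726 [Curator-ConnectionStateManager-0] INFO "
    "c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to LOST",
    "> 2020-02-24 02:02:02,748 [Curator-ConnectionStateManager-0] INFO "
    "c.d.s.coordinator.zk.ZKClusterClient - ZK connection state changed to RECONNECTED",
]

for start, end in outage_windows(zk_state_changes(sample)):
    print(start, "->", end, f"({(end - start).total_seconds():.3f}s)")
```

For the three sample lines this gives a single window of roughly 1.9 seconds. To run it on a real log, pass an open file instead of `sample`, for example `outage_windows(zk_state_changes(open("server.log")))`, then look in server.gc and the ZK logs for entries inside each window.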