Issue with Standby Coordinator Not Taking Over as Master in Dremio Cluster

Hi everyone,

I’m currently setting up a Dremio cluster with multiple coordinator and executor nodes. However, I’ve encountered an issue where the standby coordinator does not automatically take over as the master when the master node goes down.

Dremio Cluster Setup

My Dremio cluster consists of:

  1. 1 Master Coordinator Node
  2. 2 Standby Coordinator Nodes
  3. 2 Executor Nodes
  4. 3 External Zookeeper Nodes
  5. NFS for metadata storage
  6. MinIO bucket for distributed storage

Dremio Configuration

Below are the configurations for each node:

1. Master Coordinator Node (dremio.conf)

paths: {
  local: "/data/dremio-metadata"
  dist: "dremioS3:///dremio-distributed"
}

services: {
  web-admin {
    host: "0.0.0.0"
    port: 9191
  }

  coordinator {
    enabled: true
    master.enabled: true
    master.embedded-zookeeper.enabled: false
    client-endpoint {
      port: 31010
    }
  }

  executor {
    enabled: false
  }

  flight {
    use_session_service: true
  }
}

zookeeper: "dremio-1.example.internal:2181,dremio-2.example.internal:2181,dremio-3.example.internal:2181/dremio"

2 & 3. Standby Coordinator Nodes (dremio.conf)

paths: {
  local: "/data/dremio-metadata"
  dist: "dremioS3:///dremio-distributed"
}

services: {
  web-admin {
    host: "0.0.0.0"
    port: 9191
  }

  coordinator {
    enabled: true
    master.enabled: false
    master.embedded-zookeeper.enabled: false
    client-endpoint {
      port: 31010
    }
  }

  executor {
    enabled: false
  }

  flight {
    use_session_service: true
  }
}

zookeeper: "dremio-1.example.internal:2181,dremio-2.example.internal:2181,dremio-3.example.internal:2181/dremio"

4 & 5. Executor Nodes (dremio.conf)

paths: {
  local: "/data/dremio-metadata"
  dist: "dremioS3:///dremio-distributed"
}

services: {
  web-admin {
    host: "0.0.0.0"
    port: 9191
  }

  coordinator {
    enabled: false
    master.enabled: false
    master.embedded-zookeeper.enabled: false
    client-endpoint {
      port: 31010
    }
  }

  executor {
    enabled: true
  }

  flight {
    use_session_service: true
  }
}

zookeeper: "dremio-1.example.internal:2181,dremio-2.example.internal:2181,dremio-3.example.internal:2181/dremio"

6. Zookeeper Configuration (zoo.cfg)

tickTime=2000
initLimit=5
syncLimit=2
dataDir=/data/zookeeper
dataLogDir=/var/log/zookeeper
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
maxClientCnxns=60
standaloneEnabled=false
admin.enableServer=true

server.1=dremio-1.example.internal:2888:3888;2181
server.2=dremio-2.example.internal:2888:3888;2181
server.3=dremio-3.example.internal:2888:3888;2181

Each Zookeeper node has a unique myid file:

  • Zookeeper Node 1: 1
  • Zookeeper Node 2: 2
  • Zookeeper Node 3: 3
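
For reference, a minimal sketch of how the myid files can be created and the quorum checked; the ZooKeeper install path (/opt/zookeeper) is just an assumption from my environment:

# On each ZooKeeper node, write that node's id into dataDir (node 1 shown here)
echo 1 | sudo tee /data/zookeeper/myid

# Each node should report Mode: leader or Mode: follower
/opt/zookeeper/bin/zkServer.sh status

# Once Dremio is up, its znodes should appear under the /dremio chroot used above
/opt/zookeeper/bin/zkCli.sh -server dremio-1.example.internal:2181 ls /dremio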

Issue

The cluster starts normally with this configuration, and everything works as expected. However, when I simulate a master node failure, the standby coordinator nodes do not automatically take over as the master.
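
For context, a rough sketch of how the failure can be simulated; the service name and log path are assumptions that depend on how Dremio was installed:

# On the master node: stop Dremio to simulate a crash
sudo systemctl stop dremio        # or: /opt/dremio/bin/dremio stop

# On a standby coordinator: watch the server log to see whether it takes over
tail -f /opt/dremio/log/server.log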

Has anyone encountered a similar issue? Am I missing any configuration to enable automatic failover?

Any help would be greatly appreciated!

Best regards,
Arman

[SOLVED] Understanding the Difference Between HA/Failover and Load Balancing for Coordinator Nodes

I just realized that HA/Failover (Standby/Backup Coordinator Node) and Load Balancing for Coordinator Nodes (Secondary Coordinator Node) are different concepts. I initially assumed they were the same, which led to incorrect deployment assumptions.

To clarify for others who might face the same confusion, I have adjusted my deployment to the following configuration:

Deployment Configuration

  1. 2 Master Coordinator Nodes
  • One active, one standby (backup)
  • This setup provides HA (High Availability) / Failover.
  • If the active master fails, the standby coordinator is automatically elected by Dremio (handled via Zookeeper).
  2. 1 Secondary Coordinator Node
  • Helps with load balancing for query planning, serving UI, handling JDBC connections, etc.
  • Does not participate in failover (only Master Coordinators do).
  3. 2 Executor Nodes
  • Responsible for query execution only.

With this setup, we ensure High Availability (failover) while also optimizing workload distribution with a Secondary Coordinator Node.


Updated Configuration Files

1. Master Coordinator Nodes (2 Nodes, 1 Active + 1 Standby)

paths: {  
  local: "/data/dremio-metadata"  
  dist: "dremioS3:///dremio-distributed"
}
services: {  
  web-admin {  
    host: "0.0.0.0"  
    port: 9191  
  }  
  coordinator {  
    enabled: true  
    master.enabled: true  
    master.embedded-zookeeper.enabled: false  
    client-endpoint { port: 31010 }  
  }  
  executor { enabled: false }  
  flight { use_session_service: true }  
}
zookeeper: "dremio-1.example.internal:2181,dremio-2.example.internal:2181,dremio-3.example.internal:2181/dremio"

2. Secondary Coordinator Node (Load Balancing Only, Not HA)

paths: {  
  local: "/data/dremio-metadata"  
  dist: "dremioS3:///dremio-distributed"
}
services: {  
  web-admin {  
    host: "0.0.0.0"  
    port: 9191  
  }  
  coordinator {  
    enabled: true  
    master.enabled: false  
    master.embedded-zookeeper.enabled: false  
    client-endpoint { port: 31010 }  
  }  
  executor { enabled: false }  
  flight { use_session_service: true }  
}
zookeeper: "dremio-1.example.internal:2181,dremio-2.example.internal:2181,dremio-3.example.internal:2181/dremio"

3. Executor Nodes (2 Nodes for Query Execution Only)

paths: {  
  local: "/data/dremio-metadata"  
  dist: "dremioS3:///dremio-distributed"
}
services: {  
  web-admin {  
    host: "0.0.0.0"  
    port: 9191  
  }  
  coordinator {  
    enabled: false  
    master.enabled: false  
    master.embedded-zookeeper.enabled: false  
    client-endpoint { port: 31010 }  
  }  
  executor { enabled: true }  
  flight { use_session_service: true }  
}
zookeeper: "dremio-1.example.internal:2181,dremio-2.example.internal:2181,dremio-3.example.internal:2181/dremio"

Key Takeaways

  • HA/Failover is provided by the 2 Master Coordinator Nodes (1 Active, 1 Standby).
  • Load Balancing for query planning, UI, and JDBC/ODBC connections is handled by the Secondary Coordinator Node.
  • Executor Nodes remain dedicated to processing queries.

Hope this helps others who might have had the same confusion!

@armandwipangestu That is correct.

For HA you need shared storage such as a NAS (an NFS v4 mount with file locking): only one coordinator is active while the other waits on the lock. When the primary goes down, the lock is released and the standby takes over. On Kubernetes you do not need this, because when a pod goes down it automatically comes back up on another physical node. The shared mount is only needed for standalone deployments, or for Hadoop systems where the coordinator runs on an edge node.
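
For a standalone deployment, a minimal sketch of such a mount, assuming a hypothetical NFS server and export path (NFSv4 handles file locking natively, so no separate lock daemon is needed):

# Mount the same export at the metadata path on both master coordinators
sudo mount -t nfs -o vers=4.1 nfs.example.internal:/export/dremio /data/dremio-metadata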

Scale-out coordinators are for load balancing and only support JDBC and ODBC queries.