What is the benefit of having multiple coordinators

Sneha_Krishnaswamy · December 26, 2018, 4:41am

Hi,

What is the role and benefit of having multiple coordinator nodes in a Dremio cluster?

balaji.ramaswamy · December 26, 2018, 4:53am

If you run concurrent queries and want to distribute the planning time, then having multiple coordinators might help. As a first step: are you running multiple queries at the same time and seeing high planning times?

Thanks
@balaji.ramaswamy

Sneha_Krishnaswamy · December 26, 2018, 4:57am

I’m just exploring Dremio as of now, so right now we don’t have concurrent queries.
So, essentially, all coordinator nodes can perform query planning and not just the master node, is that right?

balaji.ramaswamy · December 26, 2018, 6:20am

That’s right @Sneha_Krishnaswamy,

A coordinator can be the master or just a coordinator. But only one coordinator can be the master (the one that talks to RocksDB), and you should not configure two coordinators to both be masters.

services: {

  coordinator: {
    enabled: true,

    # Auto-upgrade Dremio at startup if needed
    auto-upgrade: false,

    master: {
      enabled: true,
      # configure an embedded ZooKeeper server on the same node as master
      embedded-zookeeper: {
        enabled: true,
        port: 2181,
        path: ${paths.local}/zk
      }
    }
  }
}

balaji.ramaswamy · December 27, 2018, 2:58am

Hi @Sneha_Krishnaswamy,

Just to clarify: when I said both coordinators cannot be masters, I meant a non-HA setup where both coordinators are active and planning queries. Of course, you can have an HA setup with 2 masters on 2 coordinators, but only one master coordinator will be active and able to plan queries at a time.

Thanks
@balaji.ramaswamy

Sneha_Krishnaswamy · December 27, 2018, 3:02am

Understood, thank you for the detailed explanation.

Sneha_Krishnaswamy · December 27, 2018, 9:45am

One last question though, could you direct me to some documentation on how to create an HA setup?

anthony · December 27, 2018, 2:10pm

https://docs.dremio.com/advanced-administration/high-availability.html

Ming · November 30, 2022, 12:44am

@balaji.ramaswamy
In what situations will JDBC and ODBC connections run on the secondary coordinator? For example, when the master coordinator is under heavy load?

balaji.ramaswamy · November 30, 2022, 5:06am

@Ming At any given time Dremio can plan at most (number of CPU cores - 1) queries in parallel, so if you have 32 cores on the coordinator, Dremio can plan 31 at once. Say a dashboard fires 155 queries in parallel: the first 31 get planned (command pool time), while the remaining 124 queries are in PENDING state. If each query takes 1s to plan, the queries are planned in five batches of 31, and the last batch waits 4s before it can be planned. So, depending on your concurrency and where the jobs come from (JDBC, ODBC or REST), you can add a secondary coordinator that is used only for planning, and currently only for JDBC and ODBC queries.
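
The batch arithmetic in this post can be checked with a few lines of Python. This is only a rough sketch: the function name `planning_wait` is made up for illustration, and it assumes every query takes the same time to plan and that planning proceeds in strict batches of (cores - 1), which a real coordinator will not follow exactly.

```python
import math


def planning_wait(cores, queries, plan_seconds):
    """Estimate planning batches under the simplified model from the post.

    Assumptions (not Dremio internals): cores - 1 planning slots,
    identical planning time per query, strict batch-by-batch planning.
    Returns (slots, batches, seconds the last batch waits before planning).
    """
    slots = max(cores - 1, 1)                  # queries planned in parallel
    batches = math.ceil(queries / slots)       # batches needed for all queries
    last_wait = (batches - 1) * plan_seconds   # earlier batches must finish first
    return slots, batches, last_wait


slots, batches, last_wait = planning_wait(cores=32, queries=155, plan_seconds=1.0)
print(f"{slots} slots, {batches} batches, last batch waits {last_wait:.0f}s")
```

With the numbers from the post (32 cores, 155 queries, 1s per plan) this gives 31 slots, 5 batches and a 4s wait for the last batch. Plugging in your own core count and expected concurrency gives a rough feel for whether planning is likely to queue up.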

Ming · November 30, 2022, 8:29am

@balaji.ramaswamy
I wrote a Python script to test this. Our coordinator has 8 cores. I simulated 10,000 concurrent sessions connecting to the coordinator, but I do not see any messages in server.log on the secondary coordinator. How can I tell whether the secondary coordinator is being used?

balaji.ramaswamy · December 9, 2022, 3:09am

@Ming I assume the Python code uses ODBC (or JDBC). The job profile should tell you the name of the coordinator.
