Hello there,
I’m trying to create an on-prem Dremio cluster.
I’m using Ubuntu 22.04 and running a MinIO S3 instance for storage. Dremio runs directly on the servers; I’m not using Docker or similar.
I intend to configure a 3-node cluster, with a single master-coordinator and a couple of executors. I don’t require HA, just a setup that scales performance-wise.
It took some work, but I have successfully configured my master-coordinator instance and have the distributed file store working, complete with S3 buckets created by Dremio (e.g. ‘accelerator’, ‘metadata’).
To set up dremio.conf, I used the template here: Configuring via dremio.conf | Dremio Documentation
For the master-coordinator, I set coordinator enabled: true, master enabled: true and executor enabled: false.
I’m running embedded zookeeper on the master, on port 2181.
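For reference, the role-related part of the master's dremio.conf boils down to roughly the following. This is a trimmed sketch of the settings described above, not the full file: the web port, paths and the remaining template keys are omitted.

```
services: {
  coordinator: {
    enabled: true,
    master: {
      enabled: true,
      # embedded ZooKeeper runs on the master node
      embedded-zookeeper: {
        enabled: true,
        port: 2181
      }
    }
  },
  executor: {
    enabled: false
  }
}
```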
After starting the Dremio service, the web portal is available (I had to use a different port to 8080 due to other software running on this instance), and it shows the master-coordinator in the node listing screen. Zookeeper is listening on port 2181.
All good. I’ve not yet tested setting up a data source connection; I will get to that later. For the moment, though, it looks like the master-coordinator is active and online.
The issue arises when adding my first executor node.
I copied core-site.xml, dremio.conf and dremio-env to my second node.
Then, in dremio.conf on the second node, I amended the following (this is only an excerpt, as the full file follows the dremio.conf template linked above and is quite lengthy):
services: {
  coordinator: {
    enabled: false,
    # Auto-upgrade Dremio at startup if needed
    auto-upgrade: false,
    master: {
      enabled: false,
      # configure an embedded ZooKeeper server on the same node as master
      embedded-zookeeper: {
        enabled: false,
        port: 2181,
        path: ${paths.local}/zk
      }
    },
    executor: {
      enabled: true
    },
I also double-checked the zookeeper settings, and the hostname and port are both correct. I used the ‘nc -zv’ command to confirm that node 2 can reach the master-coordinator node on port 2181, and the test succeeded.
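For context, the ZooKeeper setting on node 2 amounts to something like the sketch below. The hostname `master-host` is a placeholder rather than my real one, and the top-level `zookeeper` key is the one from the dremio.conf template.

```
# node 2: point at the ZooKeeper instance embedded in the master-coordinator
# "master-host" is a placeholder hostname, not the real one
zookeeper: "master-host:2181"
```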
I also set web enabled: false because, for the moment, I want to force all admin through the web interface on the master-coordinator.
Upon trying to start the Dremio service, I receive the following error:
com.dremio.service.namespace.RemoteNamespaceException: dataset listing failed: com.dremio.common.exceptions.UserRemoteException: SYSTEM ERROR: UnsupportedOperationException: non-master coordinators or executors do not support dataset listing
Just out of curiosity, I then set the following for a coordinator-only role:
services: {
  coordinator: {
    enabled: true,
    # Auto-upgrade Dremio at startup if needed
    auto-upgrade: false,
    master: {
      enabled: false,
      # configure an embedded ZooKeeper server on the same node as master
      embedded-zookeeper: {
        enabled: false,
        port: 2181,
        path: ${paths.local}/zk
      }
    },
    executor: {
      enabled: false
    },
Upon starting the service I get the following error:
com.google.common.cache.CacheLoader$InvalidCacheLoadException: CacheLoader returned null for key class com.dremio.exec.server.options.SystemOptionManager$OptionStoreCreator
I’ve pored over the logs and the online documentation, but I must be missing something.
I’ve also tried removing large chunks of the dremio.conf template to get back to something closer to the examples in this documentation (mainly to simplify the config):
Configuring Dremio Services | Dremio Documentation
To no avail.
I’ve been going round in circles for a couple of days now. I’m hoping someone can assist me with this; otherwise I may have to stick with the single-node setup, which would be less than ideal.
Thanks in advance!