Hi everyone,
I would like to share my current setup and experience running Dremio in a clustered environment, and I also have some questions for those who might have encountered similar situations.
Current Dremio Cluster Setup:
- Master Coordinator Node (1 node)
  - 16 vCPU
  - 64 GB RAM
  - NOTE: I use an external ZooKeeper, but it runs on this same node
- Executor Node (1 node)
  - 32 vCPU
  - 126 GB RAM
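For context, this role split corresponds to dremio.conf settings along these lines on each node; a minimal sketch, with the ZooKeeper hostname as a placeholder rather than my literal config:

```
# dremio.conf sketch: master coordinator node
services.coordinator.enabled: true
services.coordinator.master.enabled: true
services.executor.enabled: false
# external ZooKeeper (in my case it runs on this same host; hostname is a placeholder)
zookeeper: "coordinator-1.example.internal:2181"

# dremio.conf sketch: executor node
services.coordinator.enabled: false
services.executor.enabled: true
zookeeper: "coordinator-1.example.internal:2181"
```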
As we know, Dremio requires centralized NFSv4-compatible storage with working file locking, especially for:
- `rocksdb` (the key-value metadata store)
- `c3cache` (the columnar cloud cache, C3)
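On the Dremio side, the metadata store location is pointed at the shared mount via paths.local; a sketch, assuming the mount points shown in the df output further down:

```
# dremio.conf sketch: metadata store on the shared NFS mount
# (mount point matches the df output below; the C3 cache directory is
#  configured separately and its key can vary by Dremio version, so I
#  have left it out rather than guess)
paths.local: "/mnt/nfs/dremio/rocksdb"
```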
Storage Backend Configuration:
I’m using Ceph with a single 1 TB OSD, with the pool split like this:
- `rocksdb` = 900 GB
- `c3cache` = 100 GB
The pools are exported via NFS Ganesha and then mounted on each Dremio node. The export configuration:
```
EXPORT
{
    Export_Id = 101;
    Path = "/volumes/dremio_group/rocksdb";
    Pseudo = "/dremio/rocksdb";
    Squash = "No_root_squash";
    SecType = "sys";
    Protocols = 4;

    CLIENT {
        Clients = ip_address_node_dremio_1, ip_address_node_dremio_2;
        Access_Type = RW;
    }

    FSAL {
        Name = CEPH;
        User_Id = "ganesha";
        Secret_Access_Key = "XXXXXXXXXXXXXXXXXXXXXXXX";
    }
}

EXPORT
{
    Export_Id = 102;
    Path = "/volumes/dremio_group/c3cache";
    Pseudo = "/dremio/c3cache";
    Squash = "No_root_squash";
    SecType = "sys";
    Protocols = 4;

    CLIENT {
        Clients = ip_address_node_dremio_1, ip_address_node_dremio_2;
        Access_Type = RW;
    }

    FSAL {
        Name = CEPH;
        User_Id = "ganesha";
        Secret_Access_Key = "XXXXXXXXXXXXXXXXXXXXXXXX";
    }
}
```
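On each Dremio node the exports are mounted as plain NFSv4, roughly like this (the mount options are one reasonable choice, not necessarily what you want in production):

```
# Mount the Ganesha exports on a Dremio node
sudo mount -t nfs4 -o rw,hard,noatime nfs-lakehouse-1.example.internal:/dremio/rocksdb /mnt/nfs/dremio/rocksdb
sudo mount -t nfs4 -o rw,hard,noatime nfs-lakehouse-1.example.internal:/dremio/c3cache /mnt/nfs/dremio/c3cache
```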
The resulting mounts, per df -h:

```
nfs-lakehouse-1.example.internal:/dremio/rocksdb nfs4 1000G 277G 724G 28% /mnt/nfs/dremio/rocksdb
nfs-lakehouse-1.example.internal:/dremio/c3cache nfs4 1000G 277G 724G 28% /mnt/nfs/dremio/c3cache
```
The Issue:
When I copy files into the mounted NFS path from the Dremio nodes via NFS Ganesha, the transfer speed is capped at around 3–4 MB/s.
NOTE: I researched this on the official NFS Ganesha GitHub repository, and it turns out many users have reported similar performance problems. From what I could find, this has been a known issue since at least 2019, with several open discussions reporting slow throughput for large or high-frequency I/O workloads.
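For a number that is more repeatable than a plain file copy, a quick sequential-write test can be run against the Ganesha mount; a sketch (the path matches my setup, the sizes and flags are arbitrary choices):

```
# Sequential write to the Ganesha-backed mount, bypassing the page cache
dd if=/dev/zero of=/mnt/nfs/dremio/rocksdb/ddtest bs=1M count=1024 oflag=direct status=progress

# The same test with fio, if it is installed
fio --name=seqwrite --directory=/mnt/nfs/dremio/rocksdb \
    --rw=write --bs=1M --size=1G --numjobs=1 --direct=1

# Clean up the test files afterwards
rm -f /mnt/nfs/dremio/rocksdb/ddtest /mnt/nfs/dremio/rocksdb/seqwrite.*
```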
To troubleshoot, I tested mounting CephFS directly on the Dremio node, without going through NFS Ganesha, using this command:

```
sudo mount -t ceph nfs-lakehouse-1.example.internal:6789:/ /mnt/test -o name=admin,secretfile=admin.key
```
When copying the same files, the transfer speed jumped significantly to around 150–250 MB/s.
This indicates a possible bottleneck in the NFS Ganesha layer.
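If a direct CephFS mount turns out to be safe and supported for Dremio (one of my questions below), the persistent equivalent of that test mount would look roughly like this in /etc/fstab; the secretfile path here is an assumption:

```
# /etc/fstab sketch: kernel CephFS mount, persistent across reboots
# (secretfile location is an assumption; point it at wherever the key actually lives)
nfs-lakehouse-1.example.internal:6789:/  /mnt/test  ceph  name=admin,secretfile=/etc/ceph/admin.key,noatime,_netdev  0  0
```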
My Questions:
- Is there a recommended NFS server that works well and performs efficiently for Dremio clustering, especially for `rocksdb` and `c3cache`?
- Is it safe and supported to mount CephFS directly on the Dremio nodes, without using NFS Ganesha?
- Does CephFS natively support the NFSv4-style locking Dremio requires? (A quick way to sanity-check locking is sketched after this list.)
- Has anyone successfully run a production Dremio cluster using a direct CephFS mount?
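On the locking question, the simplest sanity check I know of is taking the same lock from two nodes on the shared mount with flock(1). This only exercises advisory locking, not every NFSv4 locking feature, but it catches the basic failure mode; a sketch, with placeholder paths:

```
# Node 1: take an exclusive lock on a file on the shared mount and hold it for a while
flock -x /mnt/test/lock.test -c 'echo "node 1 holds the lock"; sleep 60'

# Node 2 (while node 1 is sleeping): try the same lock without blocking;
# with working cross-client locking this should print "lock held elsewhere"
flock -xn /mnt/test/lock.test -c 'echo "node 2 got the lock"' || echo "lock held elsewhere"
```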
I’m looking for a scalable solution since I plan to:
- Add more OSDs or disk capacity (a rough sketch of this is after the list)
- Add more coordinator (standby or secondary) and executor nodes
- Expand storage and compute without needing large migrations
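For the OSD part, my understanding is that a cephadm-managed cluster can take new disks without any data migration on my side, roughly like this; host and device names are placeholders:

```
# Add a single OSD on a specific host/device (cephadm-managed cluster)
ceph orch daemon add osd ceph-node-2:/dev/sdb

# Or let the orchestrator pick up every available, unused device
ceph orch apply osd --all-available-devices
```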
Thank you in advance. I hope this discussion is also helpful for others working on similar setups.
Best regards,
Arman