Multiple executors on the same node

Hi,

I would like to know how to achieve the following deployment setup (for Dremio Community Edition v2.0.5):

A node with

  • one coordinator process that also acts as an executor
  • another Dremio process on the same node that acts as a standalone executor

The config for the coordinator looks like this:

 services: {
     coordinator.enabled: true,
     coordinator.master.enabled: false,
     executor.enabled: true,
     executor.enabled.embedded-zookeeper.port: 2182
 }
 zookeeper: "masternode:2181,localhost:2182,localhost:2183"

The config for the executor that I want to add on the same node looks like this:

 services: {
     coordinator.enabled: false,
     coordinator.master.enabled: false,
     executor.enabled: true,
     executor.enabled.embedded-zookeeper.port: 2183
 }
 zookeeper: "masternode:2181,localhost:2182,localhost:2183"

The coordinator starts just fine, discovers the master, and shows up in the admin UI.

When I attempt to start the executor process, it fails with "dremio process is already running, stop it first", presumably because a coordinator+executor Dremio process is already running on the node.

Is this setup possible? If so, can you point me to the config needed for the extra executor?

At this moment you can't have a coordinator and an executor co-located, but you can have multiple executors co-located.
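For what it's worth, here is a minimal sketch of how two co-located executor processes are usually kept apart; the directory paths below are illustrative assumptions (not from this thread), and the variable names come from the stock dremio-env template, so double-check them against your install:

  # dremio-env for the second executor instance (illustrative paths)
  # Each instance needs its own pid and log directories, otherwise the
  # "dremio process is already running" check trips over the shared pid file.
  DREMIO_PID_DIR=/var/run/dremio-executor2
  DREMIO_LOG_DIR=/var/log/dremio-executor2
  # The second instance's dremio.conf must also point its local data path
  # (paths.local) at a separate directory, and its ports must not collide
  # with those of the first instance.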

What is your motivation for running multiple executors on the same node?

Will disabling the coordinator in the above config work? That way we would be running two Dremio executors on the same physical host.

We have a machine with 64 GB of memory on which we want to try running two executors. The motivation is to verify whether Java garbage collection becomes a performance bottleneck in a single-executor versus a two-executor configuration.

@kranthikiran01 we do not recommend running multiple executors on the same physical host. You'd be paying extra overhead per node (extra heap for each JVM instance) without any specific advantage in most cases. Since Dremio uses direct memory for query execution, you typically only need to allocate a small amount of heap (e.g. 4-8 GB), which should help with GC. I'd start with 4 GB of heap and give everything else to direct memory.
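As a rough sketch of that split in dremio-env (variable names from the stock dremio-env template; the 48 GB direct-memory figure is just an illustration, not a recommendation from this thread):

  # Executor memory split: keep the heap small, give the rest to direct memory
  DREMIO_MAX_HEAP_MEMORY_SIZE_MB=4096      # ~4 GB heap, as suggested above
  DREMIO_MAX_DIRECT_MEMORY_SIZE_MB=49152   # e.g. ~48 GB of direct memory on a large node

As far as I know, these two values end up being passed to the JVM as -Xmx and -XX:MaxDirectMemorySize, respectively.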

@can can you please elaborate on "Dremio uses direct memory for execution"? I have 16 GB of heap allocated per node, but I've been advised (by Dremio professional services) to have a minimum of 32 GB of heap made available per executor.

Dremio execution largely relies on direct memory rather than heap. So when you are told to make 32 GB available per executor, most likely that means direct memory.

In the following example, -Xmx4096m -XX:MaxDirectMemorySize=8192m, heap is 4 GB (4096 MB) while direct memory (off-heap) is 8 GB (8192 MB).

@Aravind in most situations Dremio executors are direct memory intensive. However, in some edge cases (e.g. reading Avro files from Hive), we may recommend higher heap allocation as well.

What we have done is increase the YARN container size to 32 GB. I am pretty sure we didn't do this. I guess increasing the YARN container size isn't the right way to do it. Am I correct?

If you are using a YARN deployment, the container size will be the sum of heap and direct memory.
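For example (illustrative numbers only, not from this thread):

  # YARN container sizing: container size = heap + direct memory
  #   e.g. 4096 MB heap + 28672 MB direct = 32768 MB, i.e. a 32 GB container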

Could you elaborate more on what you wanted to ask/say?

In 2.1 and later versions of Dremio, there is one memory setting and Dremio will automatically determine the best balance of direct and heap memory.

See here: https://docs.dremio.com/release-notes/21-release-notes.html#execution
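If I read the 2.1 release notes right, that combined setting lives in dremio-env; something like the following (the value shown is illustrative):

  # Dremio 2.1+: a single memory setting; Dremio splits it between heap and direct memory automatically
  DREMIO_MAX_MEMORY_SIZE_MB=32768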

Kelly, one follow-up on this. Let's say I have a machine with 64 GB of RAM; what is the recommended amount of memory to allocate to Dremio? Can I assume 75% is a good start? Also, what is the threshold after which data is spilled to disk (I see there is a spill folder in the data storage folder)? What exactly goes there, and when? What is the trigger point for this spill?

Further, there are many folders under the data directory that store metadata, spill, reflections, etc. I want to understand which folder contains which data, when that data gets written to these folders, what the limits on the data size are, and whether there is any expiry strategy for them. For what data does Dremio use RocksDB as storage? How do multiple coordinator nodes find out which node has which metadata? Is there documentation that can help me understand this?

Hey @sambitdixit I’ll take a first pass:

Let's say I have a machine with 64 GB of RAM; what is the recommended amount of memory to allocate to Dremio? Can I assume 75% is a good start?
– As long as you leave enough for the OS and any other processes/applications running there, you should be good. Somewhere around 2-4 GB is probably enough for non-Dremio use, if no other app is running.
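So for the 64 GB node in the question, the arithmetic might look like this (a sketch reusing the combined 2.1+ setting mentioned above; how much to reserve for the OS is a judgment call):

  # 64 GB node, reserving ~4 GB for the OS and other processes
  #   65536 MB - 4096 MB = 61440 MB left for Dremio
  DREMIO_MAX_MEMORY_SIZE_MB=61440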

Also, what is the threshold after which data is spilled to disk (I see there is a spill folder in the data storage folder)? What exactly goes there, and when? What is the trigger point for this spill?
– Dremio currently spills sort operations and some types of aggregations (hang tight for more updates on this in our next major release). Spilling is triggered automatically when there is memory pressure; this happens behind the scenes. Queries should clean up their own spill files after the fact. Spill files are in Arrow format.

Further, there are many folders under the data directory that store metadata, spill, reflections, etc. I want to understand which folder contains which data, when that data gets written to these folders, what the limits on the data size are, and whether there is any expiry strategy for them. For what data does Dremio use RocksDB as storage?

– Here is an overview:

  • Metastore (RocksDB), default location /db: All Dremio metadata, including dataset, space and source definitions, permission information, dataset history, metadata for Dataset Discovery and Dataset Details, job profiles, etc.
  • Results, default location /results: Query results for queries run in the UI. By default, job results older than 30 days are cleaned up.
  • Spilling, default location /spill: Spilling data. Queries clean up their own spill files.
  • Logs, default location /log: System and query logs. Advanced logging configuration can be done via the logback.xml configuration file.
  • Reflections, default location /accelerator: Reflection data. Cleaned up when reflections are dropped or expire.
  • Scratch, default location /scratch: $scratch table data. Only cleaned up using the DROP command.
  • Downloads, default location /downloads: Staging location for downloads.
  • Uploads, default location /upload: User uploads. Cleaned up when files are deleted.

How do multiple coordinator nodes find out which node has which metadata?
– Our metastore only interacts with one coordinator at a time, called the master node. In case of a master node failure, a new master is elected and started from among the standby masters through a ZooKeeper-based election mechanism. In HA deployments, the metastore is stored on a shared network drive so that multiple nodes can mount/access it.

Hope this helps!

Hi @can,
What about hosts with more memory? For example, virtual hosts based on the TidalScale solution.
We are trying to test Dremio in such an environment right now. The Dremio host has 4.2 TB of RAM.
I am bringing up a few Docker containers with executors after getting an OutOfMemoryError when the JVM started with -XX:MaxDirectMemorySize (env DREMIO_MAX_DIRECT_MEMORY_SIZE_MB) set higher than any single TidalScale node has. What bugs might I run into?
I don't know, maybe JDK 1.8 is not compatible with TidalScale Coherent Shared Memory, or it is some other problem. For now I am looking at the 'Docker on a TidalScale guest' option for testing purposes.