AWS minimum instance types

The AWS CloudFormation template uses r5d.4xlarge instances, which are pretty large (and expensive). The documentation states this is a minimum. I'm just curious why, and whether this is the minimum required or the minimum recommended.

Worker node instance type (minimum): r5d.4xlarge
(16 core, 128 GiB memory, and 2 x 300 GB NVMe SSD)



Are you evaluating or is this for production use?

When I started with Dremio 1, the minimum requirements were not as high.
I have not changed the servers; my production is running on
2 x m5d.2xlarge (8 cores, 32 GiB memory, and 1 x 300 GB NVMe SSD each)

So far it’s running pretty well in Dremio Community v4.0.2.
All my accelerations are stored within S3.


@dfleckinger Thanks for the info! Good to know. I am planning to configure to store accelerations on S3 as well.

@balaji.ramaswamy Currently evaluating. No doubt those specs would run really well for production. But I also think it might be overkill for a mid-size company like mine. I would like to understand why that is the chosen instance type. Do the attached SSDs play an important role? I’m guessing for fast access and adequate storage for accelerations. If you configure accelerations, etc. to be stored to S3 as @dfleckinger has, are the SSDs as relevant?

@mpcarter It is a recommendation, but it really depends on workloads, concurrency, and data size; no single size fits everyone's use case. It is always good to keep that in mind when evaluating. The query profiles will show, for example, whether loading reflections is slow. We usually recommend at least 16 GB of RAM.

For reflections, it's important to remember that if they are stored on S3, we have to load them from S3 each time, and S3 can be slow, which can lead to performance issues. That is why we added the cloud cache for reflections in 4.0, to help improve performance (it's included in the community edition).

What CPU and memory settings are used for the coordinator and executor when running on m5d.2xlarge? I am not able to get it to work with the following.

memory: 61440
cpu: 8
count: 0
port: 9047
port: 31010
volumeSize: 10Gi
memory: 61440
cpu: 8
count: 1
volumeSize: 10Gi

I appreciate your help. Using r5d.4xlarge is blowing up our expenses.
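One thing that may be worth checking here is whether the requested resources fit on the node at all. The sketch below is only an illustration, not part of any Dremio tooling: the function name fits_on_instance and the INSTANCE_SPECS table are hypothetical, the instance figures are the ones quoted earlier in this thread, and it assumes the memory value in the configuration is expressed in MB. It also ignores memory and CPU reserved for the operating system and the orchestrator, so a real node has less available than shown.

```python
# Instance specs as quoted in this thread (memory in GiB).
INSTANCE_SPECS = {
    "r5d.4xlarge": {"cores": 16, "memory_gib": 128},
    "m5d.2xlarge": {"cores": 8, "memory_gib": 32},
}


def fits_on_instance(instance_type, requested_memory_mb, requested_cpu):
    """Check one resource request against one node of the given type.

    Assumption: requested_memory_mb is in MB (treated as MiB here).
    System overhead on the node is NOT subtracted, so this is optimistic.
    Returns (fits, reasons), where reasons lists every limit exceeded.
    """
    spec = INSTANCE_SPECS[instance_type]
    available_mb = spec["memory_gib"] * 1024
    reasons = []
    if requested_memory_mb > available_mb:
        reasons.append(
            f"memory {requested_memory_mb} MB > {available_mb} MB on node"
        )
    if requested_cpu > spec["cores"]:
        reasons.append(f"cpu {requested_cpu} > {spec['cores']} cores on node")
    return (not reasons, reasons)


# The values posted above: memory 61440, cpu 8, on an m5d.2xlarge node.
ok, why = fits_on_instance("m5d.2xlarge", 61440, 8)
print(ok, why)
```

Under these assumptions, a request of 61440 MB is roughly 60 GiB, which is more than the 32 GiB an m5d.2xlarge provides, so the request cannot be satisfied on that instance type. A CPU request equal to the full 8 cores may also fail in practice once system reservations are subtracted, so somewhat lower values for both would be the first thing to try.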


You can configure the memory via dremio-env, but as far as cores go, we use whatever is available. That can also be throttled via dremio-env using the below, where 8 is the number of cores per executor.


How much memory and how many cores you require depends entirely on your workloads: the types of queries, the types of joins, how wide the tables are, and so on.