Hello, I’m trying to determine the performance requirements and the recommended number of nodes for Dremio to efficiently execute queries on massive datasets, such as those in the petabyte range.
For example, how many nodes does Dremio need to execute a query? Any suggestions?
Hi @Hamzabouazza,
For example, how many nodes does Dremio need to execute a query?
The simplest answer is one node. However, depending on how long you want the query to take, things can be tuned according to the resources you are willing to commit in terms of CPU, RAM, storage, and network. All of that applies to running a single query. If you have multiple different queries accessing that dataset, then you should also consider the type of dataset and how the data is partitioned, and configure the workload manager and, possibly, reflections: Managing Job Workloads | Dremio Documentation
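To make the "it depends" a bit more concrete, here is a rough back-of-envelope sizing sketch. This is not a Dremio formula; the per-node scan rate and the amount of data actually scanned after partition pruning or reflection matching are assumptions you would need to measure on your own cluster and data.

```python
# Back-of-envelope sizing: roughly how many executor nodes are needed
# to hit a target query time, assuming the query is scan-bound.
# All numbers are illustrative assumptions, not Dremio benchmarks.

def estimate_nodes(data_scanned_tb: float,
                   target_seconds: float,
                   per_node_scan_gbps: float = 2.0) -> int:
    """Estimate executor count from the data actually scanned
    (after partition pruning / reflection matching), given an
    assumed effective scan throughput per node."""
    data_gb = data_scanned_tb * 1024
    seconds_on_one_node = data_gb / per_node_scan_gbps
    return max(1, round(seconds_on_one_node / target_seconds))

# Example: a query that scans 5 TB after pruning, with a 60 s target
# and an assumed 2 GB/s effective scan rate per executor.
print(estimate_nodes(data_scanned_tb=5, target_seconds=60))  # ~43 nodes
```

In practice you would validate any estimate like this against the actual job profiles in the Dremio UI, since joins, exchanges, and concurrency can dominate long before raw scan throughput does.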
Hope this helps.
Thanks, Bogdan