I am testing query performance and executor load balancing in Dremio and need guidance to understand whether I am facing a configuration or architectural bottleneck.
Environment Setup
- Source: MySQL table with ~10 million rows
- Dremio object:
  - Created a view on the MySQL table
  - Enabled a Raw Reflection on the view:
    - Partitioned by: `event_date` (20 distinct days)
    - Sorted by: `msisdn`
- The reflection is accelerated and queries are served from the Iceberg reflection table
Cluster Topology
| Node | Role | Physical Cores | Logical CPUs | Total RAM | Available RAM |
|---|---|---|---|---|---|
| server35 | Executor only | 16 | 32 | 31 GB | 9.1 GB |
| srvr82 | Executor only | 8 | 16 | 31 GB | 14 GB |
| server179 | Master + Coordinator | — | 16 | 62 GB | 18 GB |
No executor is running on the coordinator node.
Query Under Test
Executed continuously using JMeter with 50 threads in an infinite loop, using random parameter values:

SELECT * FROM test_case.test_04 WHERE event_date = '2025/02/11' AND msisdn = 8030043017;
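For reference, the randomized load roughly corresponds to the sketch below; the date pool and `msisdn` range are made-up placeholders, not my real test data:

```python
import random

# Illustrative sketch of the randomized point-lookup load driven by JMeter.
# The 20-day date pool and the msisdn range are hypothetical placeholders.
EVENT_DATES = [f"2025/02/{day:02d}" for day in range(1, 21)]  # 20 distinct days


def build_query() -> str:
    """Build one point-lookup query with random parameter values."""
    event_date = random.choice(EVENT_DATES)
    msisdn = random.randint(8_000_000_000, 8_099_999_999)
    return (
        "SELECT * FROM test_case.test_04 "
        f"WHERE event_date = '{event_date}' AND msisdn = {msisdn};"
    )


if __name__ == "__main__":
    print(build_query())
```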
Observed Behavior
- TPS achieved: ~19–20 queries/sec
- Issues:
  - TPS does not increase even with 2 executors
  - CPU usage appears skewed, not evenly balanced across the executors
  - One executor is often noticeably more loaded than the other
- The reflection is definitely being used (verified via the query profile)

For the range query below I run into memory issues and get only ~1–3 TPS:

SELECT * FROM test_case.test_02 WHERE event_date BETWEEN '2023-11-01' AND '2023-11-20' AND msisdn = 8030461446;
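A quick back-of-envelope on why the two queries behave so differently, assuming test_02 is laid out like the test_04 reflection (~10M rows over 20 daily partitions; this is an assumption on my part):

```python
# Back-of-envelope scan sizes, assuming ~10M rows split evenly
# across 20 daily event_date partitions (hypothetical layout).
total_rows = 10_000_000
partitions = 20

rows_per_partition = total_rows // partitions  # 500_000

# Point lookup: event_date prunes the scan to a single partition.
point_lookup_scan = rows_per_partition

# 20-day BETWEEN range: touches every partition, i.e. the whole table.
range_scan = 20 * rows_per_partition

print(point_lookup_scan, range_scan)  # 500000 10000000
```

So if partition pruning is the main accelerator here, the range query effectively scans the entire reflection, which would explain the memory pressure and the drop to ~1–3 TPS.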
Questions / Help Needed
- Is ~20 TPS expected for this query pattern (high-concurrency point lookups on Iceberg reflections)?
- Why might queries not be evenly distributed across both executors?
- Could any of the following be causing bottlenecks?
  - Single-row lookups (`msisdn` + `event_date`)
  - The reflection partitioning strategy
  - Executor core/memory imbalance (16 cores vs 8 cores)
  - Coordinator planning or fragment-assignment behavior
- How can I correctly verify executor-level load balancing?
  - Which query-profile metrics should I check?
  - Which executor CPU / fragment stats should I look at?
- Are there recommended reflection designs or tuning settings for high-TPS lookup workloads?
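To make the load-balancing question concrete, this is roughly how I would quantify skew from per-executor totals (e.g. summed fragment busy time taken from a query profile); the numbers below are hypothetical, not measured on my cluster:

```python
# Hypothetical per-executor busy-time totals (milliseconds), e.g. summed
# fragment busy time per node from a query profile. Not real measurements.
busy_millis = {
    "server35": 9_200,
    "srvr82": 4_100,
}

avg = sum(busy_millis.values()) / len(busy_millis)
# Ratio of the busiest executor to the average: 1.0 means perfectly balanced.
skew = max(busy_millis.values()) / avg

print(f"avg={avg:.0f} ms, skew={skew:.2f}")
```

What I am unsure about is which profile metrics are the right inputs for this kind of comparison.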
Goal
I want to:
- Achieve better TPS
- Ensure work is evenly balanced between both executors
- Understand whether this use case is realistically scalable in Dremio, or whether I'm hitting a design limitation
Any guidance, tuning suggestions, or best-practice recommendations would be greatly appreciated.
Dremio version: 25.2.0
229561ab-ed25-412f-946c-a7d77bb7c9e3.zip (13.5 KB)
Thanks in advance